Atomic handling for disaggregated 3D structured SoCs

US12373348B2 · US · B2

Patent metadata
Publication number: US-12373348-B2
Application number: US-202117551681-A
Country: US
Kind code: B2
Filing date: Dec 15, 2021
Priority date: Oct 7, 2021
Publication date: Jul 29, 2025
Grant date: Jul 29, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In a further embodiment, a system on a chip integrated circuit (SoC) is provided that includes an active base die including a first cache memory, a first die mounted on and coupled with the active base die, and a second die mounted on the active base die and coupled with the active base die and the first die. The first die includes an interconnect fabric, an input/output interface, and an atomic operation handler. The second die includes an array of graphics processing elements and an interface to the first cache memory of the active base die. At least one of the graphics processing elements are configured to perform, via the atomic operation handler, an atomic operation to a memory device.
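The routing described in the abstract can be pictured with a small Python model. This is a minimal sketch under stated assumptions, not the patented hardware: the class names, the `atomic_add` method, and the address `0x100` are all hypothetical, the active base die and its first cache memory are left out for brevity, and a plain dictionary stands in for the memory device.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryDevice:
    """Hypothetical stand-in for a memory device behind the I/O interface."""
    cells: dict = field(default_factory=dict)


@dataclass
class AtomicOperationHandler:
    """Modeled on the first die; applies one operation at a time to memory."""
    memory: MemoryDevice

    def atomic_add(self, address: int, value: int) -> int:
        # The whole read-modify-write happens inside the handler, so in this
        # single-threaded model no requester sees a half-finished update.
        old = self.memory.cells.get(address, 0)
        self.memory.cells[address] = old + value
        return old


@dataclass
class FirstDie:
    """Holds the atomic operation handler (fabric and I/O are not modeled)."""
    handler: AtomicOperationHandler


@dataclass
class GraphicsProcessingElement:
    """One element of the array on the second die."""
    element_id: int
    first_die: FirstDie

    def atomic_add(self, address: int, value: int) -> int:
        # The element never touches memory directly; it forwards the
        # request to the handler on the first die.
        return self.first_die.handler.atomic_add(address, value)


memory = MemoryDevice()
first_die = FirstDie(AtomicOperationHandler(memory))
elements = [GraphicsProcessingElement(i, first_die) for i in range(4)]

# Each element increments the same counter; the handler serializes them.
previous = [e.atomic_add(0x100, 1) for e in elements]
```

Funnelling every request through one handler object mirrors the abstract's "via the atomic operation handler" wording: each element gets back the value it replaced, and the counter ends at the number of requests issued.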

First claim

Opening claim text (preview).

What is claimed is:
1. A system on a chip integrated circuit (SoC) including: an active base die including a first cache memory; a first die mounted on and coupled with the active base die, the first die including an interconnect fabric, an input/output interface, and an atomic operation handler; and a second die mounted on the active base die and coupled with the active base die and the first die, the second die including an array of graphics processing elements and an interface to the first cache memory of the active base die, wherein at least one of the graphics processing elements is configured to perform, via the atomic operation handler, an atomic operation to a memory device accessible via the input/output interface, a second cache memory is configured to cache data associated with the atomic operation, and the second cache memory is associated with a burst buffer cache that is enabled when a rate of incoming atomic requests exceeds a burst rate threshold.
2. The SoC of claim 1, wherein the first die includes a media engine configured to perform an atomic operation to the memory device.
3. The SoC of claim 2, wherein the memory device is coupled with or accessible to the second die via the first die.
4. The SoC of claim 3, additionally comprising device memory coupled with the first die.
5. The SoC of claim 4, wherein the first cache memory is configured to cache accesses to the device memory.
6. The SoC of claim 5, wherein the memory device accessible via the input/output interface is a host memory configured to couple with a host processor.
7. The SoC of claim 5, wherein the memory device accessible via the input/output interface includes memory accessible via a compute express link protocol.
8. The SoC of claim 7, wherein in response to a determination that the rate of incoming atomic requests exceeds the burst rate threshold, control circuitry associated with the second cache memory is configured to: adjust a cache replacement policy associated with the second cache memory to deprioritize eviction of modified cache lines; and allocate a cache line in the burst buffer cache to store data for an incoming atomic request.
9. The SoC of claim 7, wherein the burst buffer cache is a reserved portion of the second cache memory.
10. The SoC of claim 1, wherein the atomic operation to the memory device is a read-modify-write operation, and wherein: the atomic operation handler is configured to perform a read-for-ownership operation to obtain coherency ownership of data associated with the atomic operation in response to a request from the at least one of the graphics processing elements; the at least one of the graphics processing elements is to modify the data associated with the atomic operation; and the atomic operation handler is to perform a write operation to write modified data to the memory device.
11. A data processing system comprising: a system interconnect to facilitate communication with a host processor device, the host processor device coupled with a host memory; and a system on a chip integrated circuit (SoC) coupled with the system interconnect, the SoC including: an active base die including a first cache memory; a first die mounted on and coupled with the active base die, the first die including an interconnect fabric, an input/output interface, an atomic operation handler, and a memory interface to a device memory; and a second die mounted on the active base die and coupled with the active base die and the first die, the second die including an array of graphics processing elements and an interface to the first cache memory of the active base die, wherein at least one of the graphics processing elements are configured to perform, via the atomic operation handler, a first atomic operation to the host memory and a second atomic operation to the device memory, wherein the host memory is accessible via the input/output interface, a second cache memory is configured to cache data associated with the first atomic operation, and the second cache memory is associated with a burst buffer cache that is enabled when a rate of incoming atomic requests exceeds a burst rate threshold.
12. The data processing system as in claim 11, wherein the first die includes a media engine configured to perform a third atomic operation to the host memory and a fourth atomic operation to the device memory.
13. The data processing system as in claim 12, wherein the device memory is coupled with the second die via the first die.
14. The data processing system of claim 13, wherein the device memory couples with the SoC and the first cache memory is configured to cache accesses to the device memory.
15. The data processing system of claim 13, wherein the host memory is accessible to the second die via the first die.
16. The data processing system of claim 15, further comprising an external memory device accessible via the input/output interface, the external memory device distinct from the host memory.
17. The data processing system of claim 16, wherein external memory is accessible via a compute express link protocol.
18. The data processing system of claim 17, wherein in response to a determination that the rate of incoming atomic requests exceeds the burst rate threshold, control circuitry associated with the second cache memory is configured to: adjust a cache replacement policy associated with the second cache memory to deprioritize eviction of modified cache lines; and allocate a cache line in the burst buffer cache to store data for an incoming atomic request.
19. The data processing system of claim 17, wherein the burst buffer cache is a reserved portion of the second cache memory.
20. The data processing system of claim 11, wherein the first atomic operation or the second atomic operation is a read-modify-write operation, and wherein: the atomic operation handler is configured to perform a read-for-ownership operation to obtain coherency ownership of data associated with the first atomic operation or the second atomic operation in response to a request from the at least one of the graphics processing elements; the at least one of the graphics processing elements is to modify the data associated with the first atomic operation or the second atomic operation; and the atomic operation handler is to perform a write operation to write modified data to the device memory or the host memory.

Assignees

Intel Corp

Inventors

Classifications

  • System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package · CPC title

  • using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title

  • Details of memory controller · CPC title

  • using clearing, invalidating or resetting means · CPC title

  • Atomic · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12373348B2 cover?
In a further embodiment, a system on a chip integrated circuit (SoC) is provided that includes an active base die including a first cache memory, a first die mounted on and coupled with the active base die, and a second die mounted on the active base die and coupled with the active base die and the first die. The first die includes an interconnect fabric, an input/output interface, and an atomi…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/0871. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 29, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).