Memory Controller with Programmable Atomic Operations
US-2021149600-A1 · May 20, 2021 · US
US12373348B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12373348-B2 |
| Application number | US-202117551681-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 15, 2021 |
| Priority date | Oct 7, 2021 |
| Publication date | Jul 29, 2025 |
| Grant date | Jul 29, 2025 |
Official abstract text for this publication.
In a further embodiment, a system on a chip integrated circuit (SoC) is provided that includes an active base die including a first cache memory, a first die mounted on and coupled with the active base die, and a second die mounted on the active base die and coupled with the active base die and the first die. The first die includes an interconnect fabric, an input/output interface, and an atomic operation handler. The second die includes an array of graphics processing elements and an interface to the first cache memory of the active base die. At least one of the graphics processing elements is configured to perform, via the atomic operation handler, an atomic operation to a memory device.
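As a reading aid, the three-step read-modify-write flow that claims 10 and 20 recite (the handler performs a read-for-ownership, the graphics processing element modifies the data, the handler writes the result back) can be sketched in Python. This is only an illustrative software model, not the patented hardware: the class names `MemoryDevice` and `AtomicOperationHandler`, the `atomic_add` helper, the ownership-tracking dictionary, and the choice of addition as the modify step are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryDevice:
    """Hypothetical stand-in for a host or device memory (address -> value)."""
    cells: dict = field(default_factory=dict)
    owner: dict = field(default_factory=dict)  # address -> current coherency owner


class AtomicOperationHandler:
    """Assumed model of the handler's role in a read-modify-write."""

    def __init__(self, memory):
        self.memory = memory

    def read_for_ownership(self, requester, address):
        # Step 1: obtain coherency ownership, then return the current data.
        current = self.memory.owner.get(address)
        if current is not None and current != requester:
            raise RuntimeError(f"address {address:#x} is owned by {current}")
        self.memory.owner[address] = requester
        return self.memory.cells.get(address, 0)

    def write(self, requester, address, value):
        # Step 3: write the modified data back and release ownership.
        if self.memory.owner.get(address) != requester:
            raise RuntimeError("write attempted without ownership")
        self.memory.cells[address] = value
        del self.memory.owner[address]


def atomic_add(handler, element_id, address, operand):
    """Illustrative atomic add issued by one graphics processing element."""
    old = handler.read_for_ownership(element_id, address)
    new = old + operand  # Step 2: the processing element modifies the data.
    handler.write(element_id, address, new)
    return old


memory = MemoryDevice({0x40: 5})
handler = AtomicOperationHandler(memory)
previous = atomic_add(handler, "gpe0", 0x40, 3)
```

In this model a second requester that asks for the same address between steps 1 and 3 is rejected, which is one simple way to picture why ownership is obtained before the modify step; real coherency protocols would typically stall or retry rather than raise an error.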
Opening claim text (preview).
What is claimed is:
1. A system on a chip integrated circuit (SoC) including: an active base die including a first cache memory; a first die mounted on and coupled with the active base die, the first die including an interconnect fabric, an input/output interface, and an atomic operation handler; and a second die mounted on the active base die and coupled with the active base die and the first die, the second die including an array of graphics processing elements and an interface to the first cache memory of the active base die, wherein at least one of the graphics processing elements is configured to perform, via the atomic operation handler, an atomic operation to a memory device accessible via the input/output interface, a second cache memory is configured to cache data associated with the atomic operation, and the second cache memory is associated with a burst buffer cache that is enabled when a rate of incoming atomic requests exceeds a burst rate threshold.
2. The SoC of claim 1, wherein the first die includes a media engine configured to perform an atomic operation to the memory device.
3. The SoC of claim 2, wherein the memory device is coupled with or accessible to the second die via the first die.
4. The SoC of claim 3, additionally comprising device memory coupled with the first die.
5. The SoC of claim 4, wherein the first cache memory is configured to cache accesses to the device memory.
6. The SoC of claim 5, wherein the memory device accessible via the input/output interface is a host memory configured to couple with a host processor.
7. The SoC of claim 5, wherein the memory device accessible via the input/output interface includes memory accessible via a compute express link protocol.
8. The SoC of claim 7, wherein in response to a determination that the rate of incoming atomic requests exceeds the burst rate threshold, control circuitry associated with the second cache memory is configured to: adjust a cache replacement policy associated with the second cache memory to deprioritize eviction of modified cache lines; and allocate a cache line in the burst buffer cache to store data for an incoming atomic request.
9. The SoC of claim 7, wherein the burst buffer cache is a reserved portion of the second cache memory.
10. The SoC of claim 1, wherein the atomic operation to the memory device is a read-modify-write operation, and wherein: the atomic operation handler is configured to perform a read-for-ownership operation to obtain coherency ownership of data associated with the atomic operation in response to a request from the at least one of the graphics processing elements; the at least one of the graphics processing elements is to modify the data associated with the atomic operation; and the atomic operation handler is to perform a write operation to write modified data to the memory device.
11. A data processing system comprising: a system interconnect to facilitate communication with a host processor device, the host processor device coupled with a host memory; and a system on a chip integrated circuit (SoC) coupled with the system interconnect, the SoC including: an active base die including a first cache memory; a first die mounted on and coupled with the active base die, the first die including an interconnect fabric, an input/output interface, an atomic operation handler, and a memory interface to a device memory; and a second die mounted on the active base die and coupled with the active base die and the first die, the second die including an array of graphics processing elements and an interface to the first cache memory of the active base die, wherein at least one of the graphics processing elements are configured to perform, via the atomic operation handler, a first atomic operation to the host memory and a second atomic operation to the device memory, wherein the host memory is accessible via the input/output interface, a second cache memory is configured to cache data associated with the first atomic operation, and the second cache memory is associated with a burst buffer cache that is enabled when a rate of incoming atomic requests exceeds a burst rate threshold.
12. The data processing system as in claim 11, wherein the first die includes a media engine configured to perform a third atomic operation to the host memory and a fourth atomic operation to the device memory.
13. The data processing system as in claim 12, wherein the device memory is coupled with the second die via the first die.
14. The data processing system of claim 13, wherein the device memory couples with the SoC and the first cache memory is configured to cache accesses to the device memory.
15. The data processing system of claim 13, wherein the host memory is accessible to the second die via the first die.
16. The data processing system of claim 15, further comprising an external memory device accessible via the input/output interface, the external memory device distinct from the host memory.
17. The data processing system of claim 16, wherein external memory is accessible via a compute express link protocol.
18. The data processing system of claim 17, wherein in response to a determination that the rate of incoming atomic requests exceeds the burst rate threshold, control circuitry associated with the second cache memory is configured to: adjust a cache replacement policy associated with the second cache memory to deprioritize eviction of modified cache lines; and allocate a cache line in the burst buffer cache to store data for an incoming atomic request.
19. The data processing system of claim 17, wherein the burst buffer cache is a reserved portion of the second cache memory.
20. The data processing system of claim 11, wherein the first atomic operation or the second atomic operation is a read-modify-write operation, and wherein: the atomic operation handler is configured to perform a read-for-ownership operation to obtain coherency ownership of data associated with the first atomic operation or the second atomic operation in response to a request from the at least one of the graphics processing elements; the at least one of the graphics processing elements is to modify the data associated with the first atomic operation or the second atomic operation; and the atomic operation handler is to perform a write operation to write modified data to the device memory or the host memory.
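The burst-buffer behavior recited in claims 1, 8, 9, 18, and 19 can be pictured with a small Python model. It is a sketch under stated assumptions rather than the claimed circuitry: the claims do not say how the request rate is measured or which replacement policy is used, so the sliding window counted in cycles, the least-recently-used victim choice, and every name below (`AtomicCacheControl`, `on_atomic_request`, `pick_victim`) are hypothetical.

```python
from collections import deque


class AtomicCacheControl:
    """Assumed model of control circuitry for the second cache memory."""

    def __init__(self, burst_rate_threshold, window, burst_lines):
        self.burst_rate_threshold = burst_rate_threshold  # requests per window (assumed unit)
        self.window = window            # sliding-window length in cycles (assumption)
        self.burst_lines = burst_lines  # size of the reserved burst-buffer portion
        self.arrivals = deque()         # arrival cycles inside the current window
        self.burst_buffer = {}          # address -> data held in the burst buffer
        self.burst_enabled = False

    def on_atomic_request(self, cycle, address, data):
        # Track the incoming request rate over the sliding window.
        self.arrivals.append(cycle)
        while self.arrivals and self.arrivals[0] <= cycle - self.window:
            self.arrivals.popleft()
        # Enable the burst buffer only while the rate exceeds the threshold.
        self.burst_enabled = len(self.arrivals) > self.burst_rate_threshold
        has_room = address in self.burst_buffer or len(self.burst_buffer) < self.burst_lines
        if self.burst_enabled and has_room:
            self.burst_buffer[address] = data  # allocate a burst-buffer line
            return "burst_buffer"
        return "second_cache"

    def pick_victim(self, lines):
        """Choose an eviction victim from (address, modified, last_use) tuples."""
        if self.burst_enabled:
            # Adjusted policy: prefer clean lines so modified lines stay resident.
            clean = [line for line in lines if not line[1]]
            if clean:
                lines = clean
        return min(lines, key=lambda line: line[2])[0]  # least recently used


ctrl = AtomicCacheControl(burst_rate_threshold=2, window=10, burst_lines=4)
placements = [ctrl.on_atomic_request(cycle, 0x100 + cycle, cycle) for cycle in (0, 1, 2)]
```

With a threshold of two requests per ten-cycle window, the third back-to-back request is the first one routed to the burst buffer, and while the burst is active `pick_victim` skips modified lines whenever a clean line is available.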
System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package · CPC title
using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title
Details of memory controller · CPC title
using clearing, invalidating or resetting means · CPC title
Atomic · CPC title