Efficient merging of atomic operations at computing devices
US-2018300846-A1 · Oct 18, 2018 · US
US11726918B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11726918-B2 |
| Application number | US-202117361145-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2021 |
| Priority date | Jun 28, 2021 |
| Publication date | Aug 15, 2023 |
| Grant date | Aug 15, 2023 |
Official abstract text for this publication.
Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a memory-local compute unit. The atomic memory accesses may be coalesced in the same cache line or atomic memory accesses in different cache lines may be coalesced using a multicast memory-local processing command.
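The same-cache-line case described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class names `CoalescingLine` and `CoalescingCache`, the 64-byte line and 4-byte operand sizes, the set of supported operations, and the policy of flushing a line when an incompatible operation arrives are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64  # assumed cache line size
WORD_BYTES = 4   # assumed operand width

# Assumed set of coalescable operations: each combines two operands into one.
OPS = {"add": lambda a, b: a + b, "max": max}


@dataclass
class CoalescingLine:
    """A cache line held in a hypothetical 'atomic coalescing' state."""
    base: int                                   # line-aligned address
    op: str                                     # operation accumulated in this line
    slots: dict = field(default_factory=dict)   # word offset -> combined operand


class CoalescingCache:
    def __init__(self):
        self.lines = {}  # base address -> CoalescingLine
        self.sent = []   # commands "sent" to the memory controller

    def atomic(self, addr, op, operand):
        base = addr - addr % LINE_BYTES
        offset = (addr % LINE_BYTES) // WORD_BYTES
        line = self.lines.get(base)
        if line is not None and line.op != op:
            # Not a candidate for coalescing: flush the old line (assumed policy).
            self.flush(base)
            line = None
        if line is None:
            # Allocate without loading data from memory.
            line = CoalescingLine(base=base, op=op)
            self.lines[base] = line
        if offset in line.slots:
            # Coalesce: apply the operation to the operand already stored here.
            line.slots[offset] = OPS[op](line.slots[offset], operand)
        else:
            # First access to this location: store the operand itself.
            line.slots[offset] = operand

    def flush(self, base):
        """Triggering event (e.g. eviction): emit one command for the line."""
        line = self.lines.pop(base)
        self.sent.append((line.op, line.base, dict(sorted(line.slots.items()))))
```

In this model, issuing `atomic(0x1000, "add", 5)`, `atomic(0x1000, "add", 3)`, and `atomic(0x1008, "add", 2)` followed by `flush(0x1000)` produces a single command carrying the combined operands `{0: 8, 2: 2}` rather than three separate requests.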
Opening claim text (preview).
What is claimed is:

1. A method of dynamically coalescing atomic memory operations for memory-local computing comprising: determining whether a first atomic memory access and a second atomic memory access are candidates for coalescing; and coalescing the first atomic memory access and the second atomic memory access in a cache line allocated in an atomic coalescing state prior to requesting memory-local processing by a memory-local compute unit.

2. The method of claim 1 further comprising: determining that the first atomic memory access is a candidate for coalescing; allocating a cache line in the atomic coalescing state without loading data from memory; and storing an operand of the first atomic memory access in the cache line at a location targeted by the first atomic memory access; and wherein determining whether the first atomic memory access and the second atomic memory access are candidates for coalescing includes: determining that the second atomic memory access is a candidate for coalescing with the first atomic memory access based on the operand of the first atomic memory access and an operand of the second atomic memory access.

3. The method of claim 2, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by a memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access by performing an operation of the second atomic memory access, using the operand of the second atomic memory access, on data at a location in the cache line targeted by the second atomic memory access.

4. The method of claim 3 further comprising sending, to a memory controller in response to a triggering event, one or more memory-local processing commands for the first atomic memory access and the second atomic memory access.

5. The method of claim 2 further comprising determining, based on one or more metrics, whether to allocate the cache line in the atomic coalescing state for the first atomic memory access.

6. The method of claim 1, wherein determining whether the first atomic memory access and the second atomic memory access are candidates for coalescing includes: determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules.

7. The method of claim 6, wherein determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules includes: determining, in response to a triggering event, whether a first cache line that includes the first atomic memory access can be coalesced with a second cache line that includes the second atomic memory access, wherein the first cache line and the second cache line are in the atomic coalescing state.

8. The method of claim 7, wherein determining, in response to a triggering event, whether a first cache line that includes the first atomic memory access can be coalesced with a second cache line that includes the second atomic memory access, wherein the first cache line and the second cache line are in the atomic coalescing state, includes: tracking cache lines that are candidates for coalescing.

9. The method of claim 6, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by the memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access using a multi-module memory-local processing command.

10. The method of claim 1, wherein the memory-local compute unit is a processing-in-memory (PIM) unit.

11. A computing device for dynamically coalescing atomic memory operations for memory-local computing, the computing device comprising: a cache including a plurality of cache lines; and cache logic configured to: determine whether a first atomic memory access and a second atomic memory access are candidates for coalescing; and coalesce the first atomic memory access and the second atomic memory access in a cache line allocated in an atomic coalescing state prior to requesting memory-local processing by a memory-local compute unit.

12. The computing device of claim 11, wherein the cache logic is further configured to: determine that the first atomic memory access is a candidate for coalescing; store an operand of the first atomic memory access in the cache line at a location targeted by the first atomic memory access; and wherein determining whether the first atomic memory access and a second atomic memory access are candidates for coalescing includes: determining that the second atomic memory access is a candidate for coalescing with the first atomic memory access based on the operand of the first atomic memory access and an operand of the second atomic memory access.

13. The computing device of claim 12, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by the memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access by performing an operation of the second atomic memory access, using an operand of the second atomic memory access, on data at a location in the cache line targeted by the second atomic memory access.

14. The computing device of claim 13, wherein the cache logic is further configured to: send, to a memory controller in response to a triggering event, one or more memory-local processing commands for the first atomic memory access and the second atomic memory access.

15. The computing device of claim 11, wherein determining whether the first atomic memory access and the second atomic memory access are candidates for coalescing includes: determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules.

16. The computing device of claim 15, wherein determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules includes: determining, in response to a triggering event, whether a first cache line that includes the first atomic memory access can be coalesced with a second cache line that includes the second atomic memory access, wherein the first cache line and the second cache line are in the atomic coalescing state.

17. The computing device of claim 16, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by the memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access using a multi-module memory-local processing command.

18. A system for dynamically coalescing atomic memory operations for memory-local computing, the system comprising: a memory device including at least one memory-local compute unit; one or more processor cores configured to issue memory access requests; and a cache coupled to the one or more processor cores, the cache including a plurality of cache lines and cache logic, wherein the cache logic is configured to: determine whether a first atomic memory access and a second atomic memory access are candidates for coalescing; and coalesce the first atomic memory access and the second atomic memory access in a cache l
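Claims 6 through 9 and 15 through 17 concern coalescing across different memory modules using a multi-module command. The sketch below shows one way such grouping could work at a triggering event. It rests on assumptions the claims do not spell out: the address interleaving in `module_of` and `local_line` is hypothetical, and "symmetric access" is interpreted here as meaning the same operation, the same module-local line, and the same operands in each module.

```python
from collections import defaultdict

LINE_BYTES = 64   # assumed cache line size
NUM_MODULES = 4   # assumed number of memory modules


def module_of(base):
    """Hypothetical interleaving: consecutive lines map to consecutive modules."""
    return (base // LINE_BYTES) % NUM_MODULES


def local_line(base):
    """Hypothetical module-local line index for a line-aligned address."""
    return base // (LINE_BYTES * NUM_MODULES)


def build_commands(lines):
    """Group coalescing-state lines into memory-local processing commands.

    lines: list of (base, op, slots), where slots maps word offset -> operand.
    Lines that share the operation, module-local line, and operands but sit in
    different modules are merged into one multicast command (assumed criterion).
    """
    groups = defaultdict(list)
    for base, op, slots in lines:
        key = (op, local_line(base), tuple(sorted(slots.items())))
        groups[key].append(module_of(base))
    commands = []
    for (op, local, slots), modules in groups.items():
        kind = "multicast" if len(modules) > 1 else "single"
        commands.append((kind, op, local, tuple(sorted(modules)), dict(slots)))
    return commands
```

With three tracked lines at addresses 0, 64, and 128, where the first two carry the same `add` operand and the third carries a different one, this model emits one multicast command covering modules 0 and 1 and one single-module command for module 2.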
Allocation or management of cache space · CPC title
by multiple requestors · CPC title
Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory · CPC title
Cache with multiple tag or data arrays being simultaneously accessible · CPC title
Burst mode · CPC title