Dynamically coalescing atomic memory operations for memory-local computing

US11726918B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11726918-B2
Application number  US-202117361145-A
Country             US
Kind code           B2
Filing date         Jun 28, 2021
Priority date       Jun 28, 2021
Publication date    Aug 15, 2023
Grant date          Aug 15, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a memory-local compute unit. The atomic memory accesses may be coalesced in the same cache line or atomic memory accesses in different cache lines may be coalesced using a multicast memory-local processing command.
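
The same-cache-line case described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `CoalescingCache`, `CoalescingLine`, `COMBINE`, and `LINE_BYTES` are hypothetical, the 64-byte line size is assumed, and the candidate rule used here (same operation type, restricted to commutative and associative operations) is a simplification chosen for illustration.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64  # assumed cache line size

# Assumption: only commutative, associative operations are merged in the cache.
COMBINE = {
    "add": lambda a, b: a + b,
    "max": max,
    "min": min,
}


@dataclass
class CoalescingLine:
    """A cache line held in a hypothetical 'atomic coalescing' state."""
    base: int                                     # line-aligned address
    op: str                                       # the one operation this line accumulates
    operands: dict = field(default_factory=dict)  # byte offset -> combined operand


class CoalescingCache:
    def __init__(self):
        self.lines = {}  # line base address -> CoalescingLine
        self.sent = []   # commands handed to the (simulated) memory controller

    def atomic(self, op, addr, operand):
        base, offset = addr - addr % LINE_BYTES, addr % LINE_BYTES
        line = self.lines.get(base)
        if line is not None and line.op != op:
            # Not a candidate under the assumed rule: flush, then start a new line.
            self.flush(base)
            line = None
        if line is None:
            # Allocate without loading data from memory; the line holds operands only.
            line = self.lines[base] = CoalescingLine(base, op)
        if offset in line.operands:
            # Apply the new access's operation to the operand already stored there.
            line.operands[offset] = COMBINE[op](line.operands[offset], operand)
        else:
            line.operands[offset] = operand

    def flush(self, base):
        """Simulated triggering event (for example, eviction): one command per line."""
        line = self.lines.pop(base)
        self.sent.append((line.op, line.base, dict(line.operands)))


cache = CoalescingCache()
cache.atomic("add", 0x1000, 5)
cache.atomic("add", 0x1000, 7)   # merged with the first access
cache.atomic("add", 0x1008, 1)   # same line, different offset
cache.flush(0x1000)
print(cache.sent)  # [('add', 4096, {0: 12, 8: 1})]
```

In this model, three atomic accesses produce a single command for the memory-local compute unit instead of three, which is the traffic reduction the abstract is describing.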

First claim

Opening claim text (preview).

What is claimed is:

1. A method of dynamically coalescing atomic memory operations for memory-local computing comprising: determining whether a first atomic memory access and a second atomic memory access are candidates for coalescing; and coalescing the first atomic memory access and the second atomic memory access in a cache line allocated in an atomic coalescing state prior to requesting memory-local processing by a memory-local compute unit.

2. The method of claim 1 further comprising: determining that the first atomic memory access is a candidate for coalescing; allocating a cache line in the atomic coalescing state without loading data from memory; and storing an operand of the first atomic memory access in the cache line at a location targeted by the first atomic memory access; and wherein determining whether the first atomic memory access and the second atomic memory access are candidates for coalescing includes: determining that the second atomic memory access is a candidate for coalescing with the first atomic memory access based on the operand of the first atomic memory access and an operand of the second atomic memory access.

3. The method of claim 2, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by a memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access by performing an operation of the second atomic memory access, using the operand of the second atomic memory access, on data at a location in the cache line targeted by the second atomic memory access.

4. The method of claim 3 further comprising sending, to a memory controller in response to a triggering event, one or more memory-local processing commands for the first atomic memory access and the second atomic memory access.

5. The method of claim 2 further comprising determining, based on one or more metrics, whether to allocate the cache line in the atomic coalescing state for the first atomic memory access.

6. The method of claim 1, wherein determining whether the first atomic memory access and the second atomic memory access are candidates for coalescing includes: determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules.

7. The method of claim 6, wherein determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules includes: determining, in response to a triggering event, whether a first cache line that includes the first atomic memory access can be coalesced with a second cache line that includes the second atomic memory access, wherein the first cache line and the second cache line are in the atomic coalescing state.

8. The method of claim 7, wherein determining, in response to a triggering event, whether a first cache line that includes the first atomic memory access can be coalesced with a second cache line that includes the second atomic memory access, wherein the first cache line and the second cache line are in the atomic coalescing state, includes: tracking cache lines that are candidates for coalescing.

9. The method of claim 6, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by the memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access using a multi-module memory-local processing command.

10. The method of claim 1, wherein the memory-local compute unit is a processing-in-memory (PIM) unit.

11. A computing device for dynamically coalescing atomic memory operations for memory-local computing, the computing device comprising: a cache including a plurality of cache lines; and cache logic configured to: determine whether a first atomic memory access and a second atomic memory access are candidates for coalescing; and coalesce the first atomic memory access and the second atomic memory access in a cache line allocated in an atomic coalescing state prior to requesting memory-local processing by a memory-local compute unit.

12. The computing device of claim 11, wherein the cache logic is further configured to: determine that the first atomic memory access is a candidate for coalescing; store an operand of the first atomic memory access in the cache line at a location targeted by the first atomic memory access; and wherein determining whether the first atomic memory access and a second atomic memory access are candidates for coalescing includes: determining that the second atomic memory access is a candidate for coalescing with the first atomic memory access based on the operand of the first atomic memory access and an operand of the second atomic memory access.

13. The computing device of claim 12, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by the memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access by performing an operation of the second atomic memory access, using an operand of the second atomic memory access, on data at a location in the cache line targeted by the second atomic memory access.

14. The computing device of claim 13, wherein the cache logic is further configured to: send, to a memory controller in response to a triggering event, one or more memory-local processing commands for the first atomic memory access and the second atomic memory access.

15. The computing device of claim 11, wherein determining whether the first atomic memory access and the second atomic memory access are candidates for coalescing includes: determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules.

16. The computing device of claim 15, wherein determining whether the first atomic memory access can be coalesced with the second atomic memory access based on a symmetric access to different memory modules includes: determining, in response to a triggering event, whether a first cache line that includes the first atomic memory access can be coalesced with a second cache line that includes the second atomic memory access, wherein the first cache line and the second cache line are in the atomic coalescing state.

17. The computing device of claim 16, wherein coalescing the first atomic memory access and the second atomic memory access in the cache line prior to requesting memory-local processing by the memory-local compute unit includes: coalescing the first atomic memory access and the second atomic memory access using a multi-module memory-local processing command.

18. A system for dynamically coalescing atomic memory operations for memory-local computing, the system comprising: a memory device including at least one memory-local compute unit; one or more processor cores configured to issue memory access requests; and a cache coupled to the one or more processor cores, the cache including a plurality of cache lines and cache logic, wherein the cache logic is configured to: determine whether a first atomic memory access and a second atomic memory access are candidates for coalescing; and coalesce the first atomic memory access and the second atomic memory access in a cache l
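
The cross-module case in the claims (symmetric accesses to different memory modules, combined into a multi-module command) can be sketched in the same spirit. Everything specific here is an assumption made for illustration: the line-interleaved address mapping in `decode`, the constants `LINE_BYTES` and `NUM_MODULES`, the `(op, line base address, {offset: operand})` tuple format for pending lines, and the symmetry rule itself (same operation, same in-module location, identical operands, different modules). The claim text does not spell out these details.

```python
from collections import defaultdict

LINE_BYTES = 64   # assumed cache line size
NUM_MODULES = 4   # assumed number of memory modules, interleaved by line


def decode(base):
    """Hypothetical address mapping: (module id, location inside the module)."""
    line_index = base // LINE_BYTES
    return line_index % NUM_MODULES, line_index // NUM_MODULES


def coalesce_across_modules(pending):
    """Group pending coalescing-state lines into multi-module commands.

    `pending` is a list of (op, line base address, {offset: operand}) tuples.
    Lines are merged only when they are symmetric under the assumed rule.
    """
    groups = defaultdict(list)  # key -> list of module sets, one set per command
    for op, base, operands in pending:
        module, local = decode(base)
        key = (op, local, tuple(sorted(operands.items())))
        for modules in groups[key]:
            if module not in modules:
                modules.add(module)  # join an existing multicast command
                break
        else:
            # Same module seen again (or first entry): needs its own command.
            groups[key].append({module})
    return [
        {"op": op, "local": local, "operands": dict(operands), "modules": sorted(modules)}
        for (op, local, operands), command_sets in groups.items()
        for modules in command_sets
    ]


pending = [
    ("add", 0x0000, {0: 1}),  # module 0, local 0
    ("add", 0x0040, {0: 1}),  # module 1, local 0: symmetric with the first
    ("add", 0x0080, {0: 2}),  # module 2, local 0: different operand, kept separate
]
for command in coalesce_across_modules(pending):
    print(command)
```

With these inputs the first two lines collapse into one command addressed to modules 0 and 1, while the third line gets its own command because its operand differs.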

Assignees

Advanced Micro Devices Inc

Inventors

Classifications

  • Allocation or management of cache space · CPC title

  • by multiple requestors · CPC title

  • Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory · CPC title

  • Cache with multiple tag or data arrays being simultaneously accessible · CPC title

  • Burst mode · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11726918B2 cover?
Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a …
Who is the assignee on this patent?
Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/0871. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 15, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).