Hardware support for dual-memory atomic operations
US-2020401412-A1 · Dec 24, 2020 · US
US12066941B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12066941-B2 |
| Application number | US-202217961146-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 6, 2022 |
| Priority date | Sep 2, 2020 |
| Publication date | Aug 20, 2024 |
| Grant date | Aug 20, 2024 |
Official abstract text for this publication.
Described are methods and a system for atomic memory operations with contended cache lines. A processing system includes at least two cores, each core having a local cache, and a lower level cache in communication with each local cache. One local cache configured to request a cache line to execute an atomic memory operation (AMO) instruction, receive the cache line via the lower level cache, receive a probe downgrade due to other local cache requesting the cache line prior to execution of the AMO, and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
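The flow in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class names `LocalCache` and `LowerLevelCache`, the three-state coherence enum, the `probe_arrives_first` flag, and the choice of an add as the AMO are all hypothetical and chosen for illustration. The model shows one local cache obtaining a line, losing write permission to a probe downgrade before the AMO executes, and forwarding the AMO to the lower level cache instead of re-acquiring the line.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    """Simplified coherence states (an assumption; real protocols have more)."""
    INVALID = auto()
    SHARED = auto()
    UNIQUE = auto()


@dataclass
class LowerLevelCache:
    """Hypothetical lower level cache that can execute AMOs remotely."""
    memory: dict = field(default_factory=dict)

    def read_line(self, addr):
        return self.memory.get(addr, 0)

    def write_back(self, addr, value):
        self.memory[addr] = value

    def remote_amo_add(self, addr, operand):
        # Read-modify-write performed at the lower level; returns the old value.
        old = self.memory.get(addr, 0)
        self.memory[addr] = old + operand
        return old


@dataclass
class LocalCache:
    """Hypothetical per-core cache."""
    lower: LowerLevelCache
    state: dict = field(default_factory=dict)  # addr -> State
    data: dict = field(default_factory=dict)   # addr -> cached value

    def request_line(self, addr):
        # Fetch the line with write permission unless it is already held UNIQUE.
        if self.state.get(addr) is not State.UNIQUE:
            self.data[addr] = self.lower.read_line(addr)
            self.state[addr] = State.UNIQUE

    def probe_downgrade(self, addr):
        # Another cache wants the line: write back and drop write permission.
        if self.state.get(addr) is State.UNIQUE:
            self.lower.write_back(addr, self.data[addr])
            self.state[addr] = State.SHARED

    def amo_add(self, addr, operand, probe_arrives_first=False):
        self.request_line(addr)
        if probe_arrives_first:
            # Models a probe downgrade landing before the AMO executes.
            self.probe_downgrade(addr)
        if self.state.get(addr) is State.UNIQUE:
            old = self.data[addr]
            self.data[addr] = old + operand
            return old, "local"
        # Write permission was lost: forward the AMO for remote execution.
        # The local read-only copy would become stale, so drop it.
        self.state[addr] = State.INVALID
        self.data.pop(addr, None)
        return self.lower.remote_amo_add(addr, operand), "remote"
```

In this model, calling `amo_add` without a probe executes the add in the local cache, while calling it with `probe_arrives_first=True` sends the operation to the lower level cache, so the contended line does not have to bounce back to the requesting core.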
Opening claim text (preview).
What is claimed is:

1. A processing system comprising: at least two cores, each core having a local cache; a lower level cache in communication with both local caches; one local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive the cache line via the lower level cache, wherein the lower level cache is configured to determine availability of the cache line based on input from other caches or memory structures associated with the cache line; receive a probe downgrade due to a remaining local cache requesting the requested cache line prior to execution of the AMO; and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
2. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on inclusive cache presence bits.
3. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on matching a transaction in flight or buffered from another cache.
4. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on whether the lower level cache has the cache line at all.
5. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on whether the lower level cache has the cache line in a Shared or Unique coherence state.
6. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on matching a probe from a later-level cache.
7. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on matching an eviction from the lower level cache.
8. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on a predictor table of recently-accessed cache lines that are likely to be contended.
9. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on bloom filter of cache lines that are not likely to be contended.
10. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on at least a Least Recently Used (LRU) algorithm.
11. The processing system of claim 1, wherein the lower level cache is further configured to: check with the other caches or memory structures associated with the cache line regarding a willingness to give up the cache line; send a contended cache line message to the one local cache based on a variety of factors; and send the AMO instruction for remote execution in response to the contended cache line message.
12. A processing system comprising: a core with a local cache; a shared cache in communication with the local cache of the core and at least another cache of at least another core; the local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive a message from the shared cache that the requested cache line is unavailable, wherein the shared cache is configured to determine an availability of the cache line based on input from at least the at least another cache of the at least another core and inclusive cache presence bits; and send the AMO instruction to the shared cache for remote execution in response to the message.
13. The processing system of claim 12, wherein the shared cache configured to determine an availability of the cache line also based on latency, a Least Recently Used (LRU) algorithm, a predictor table of recently-accessed cache lines that are likely to be contended, and a bloom filter of cache lines that are not likely to be contended.
14. The processing system of claim 13, wherein the shared cache configured to determine an availability of the cache line also based on input from at least the at least another cache of the at least another core.
15. The processing system of claim 14, wherein the shared cache configured to determine an availability of the cache line also based on inclusive cache presence bits.
16. The processing system of claim 15, wherein the shared cache configured to determine an availability of the cache line also based on matching a transaction in flight or buffered from another cache.
17. The processing system of claim 16, wherein the shared cache configured to determine an availability of the cache line also based on whether the shared cache has the cache line at all.
18. The processing system of claim 17, wherein the shared cache configured to determine an availability of the cache line also based on whether the shared cache has the cache line in a Shared or Unique coherence state.
19. The processing system of claim 18, wherein the shared cache configured to determine an availability of the cache line also based on matching a probe from later-level cache and on matching an eviction from the shared cache.
20. A method for executing atomic memory operation (AMO) instructions, the method comprising: requesting, by a local cache, a cache line for an AMO instruction from a lower level memory structure; determining, by the lower level memory structure, availability of a requested cache line based on a Least Recently Used (LRU) algorithm and latency; receiving, by the local cache from the lower level memory structure, the requested cache line from the lower level memory structure when available; receiving a downgrade probe due to another cache request for the received cache line prior to AMO instruction execution, wherein the probe downgrade changes a cache coherency status of the cache line; sending, by the local cache to the lower level memory structure, the AMO instruction for remote execution in response to the probe downgrade; receiving, by the local cache from the lower level memory structure, a contended cache line message from the lower level memory structure when not available; and sending by the local cache to the lower level memory structure, the AMO instruction to the lower level memory structure for remote execution in response to the contended cache line message.
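The second path in the claims, where the shared cache reports a line as unavailable and the local cache responds by forwarding the AMO, can be sketched in the same spirit. Everything here is an illustrative assumption: the names `SharedCache` and `issue_amo_add`, the `"LINE"` and `"CONTENDED"` reply strings, and in particular the rule for combining factors. The claims list presence bits, in-flight transactions, and a predictor table among the possible inputs but do not say how they are weighed, so this sketch simply treats any one of them as enough to call the line contended.

```python
from dataclasses import dataclass, field


@dataclass
class SharedCache:
    """Hypothetical shared cache deciding whether a line can be handed out."""
    memory: dict = field(default_factory=dict)
    presence_bits: dict = field(default_factory=dict)      # addr -> set of core ids
    in_flight: set = field(default_factory=set)            # addrs with pending transactions
    predicted_contended: set = field(default_factory=set)  # stand-in for a predictor table

    def line_available(self, addr, requester):
        # Assumed policy: any single signal marks the line as contended.
        other_holders = self.presence_bits.get(addr, set()) - {requester}
        return not (
            other_holders
            or addr in self.in_flight
            or addr in self.predicted_contended
        )

    def handle_request(self, addr, requester):
        if self.line_available(addr, requester):
            self.presence_bits.setdefault(addr, set()).add(requester)
            return "LINE", self.memory.get(addr, 0)
        return "CONTENDED", None

    def remote_amo_add(self, addr, operand):
        # Simplification: a real design would first obtain the newest copy
        # of the line from any cache that holds it.
        old = self.memory.get(addr, 0)
        self.memory[addr] = old + operand
        return old


def issue_amo_add(shared, core_id, addr, operand, local_lines):
    """Run an AMO add locally if the line is granted, otherwise remotely."""
    if addr in local_lines:
        # Line already held by this core: execute in place.
        old = local_lines[addr]
        local_lines[addr] = old + operand
        return old, "local"
    reply, value = shared.handle_request(addr, core_id)
    if reply == "LINE":
        local_lines[addr] = value + operand
        return value, "local"
    # Contended cache line message: forward the AMO instead of waiting.
    return shared.remote_amo_add(addr, operand), "remote"
```

A caller passes a plain dictionary as `local_lines` to stand in for the core's local cache. When the address is marked in flight or predicted contended, the add is applied in `shared.memory`; otherwise the line is granted and the result stays in `local_lines`.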
in a single cavity · CPC title
Optical microcavities, e.g. cavity dimensions comparable to the wavelength · CPC title
controlled by temperature · CPC title
for applying modulation to the laser · CPC title
using optoacoustic interaction with the material, e.g. laser radiation, photoacoustics (photoacoustic cells G01N21/1702; measuring characteristics of vibrations by using radiation-sensitive means G01H9/00; acousto-optical conversion techniques for short-range imaging G01S15/8965; sound-producing devices using laser bundle G10K15/046) · CPC title