Data cache with hybrid writeback and writethrough
US-11023375-B1 · Jun 1, 2021 · US
US11467962B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11467962-B2 |
| Application number | US-202017009876-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 2, 2020 |
| Priority date | Sep 2, 2020 |
| Publication date | Oct 11, 2022 |
| Grant date | Oct 11, 2022 |
Official abstract text for this publication.
Described are methods and a system for atomic memory operations with contended cache lines. A processing system includes at least two cores, each core having a local cache, and a lower level cache in communication with each local cache. One local cache configured to request a cache line to execute an atomic memory operation (AMO) instruction, receive the cache line via the lower level cache, receive a probe downgrade due to other local cache requesting the cache line prior to execution of the AMO, and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
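The flow in the abstract can be sketched as a small software model. This is an illustrative sketch only, not the patented hardware design: the class names (`LocalCache`, `LowerLevelCache`), the three-state coherence model, and the choice to drop the local copy after forwarding the AMO are all assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    INVALID = auto()
    SHARED = auto()
    UNIQUE = auto()


class LowerLevelCache:
    """Hypothetical shared cache that can execute an AMO on behalf of a core."""

    def __init__(self):
        self.memory = {}  # address -> value

    def execute_amo_remotely(self, addr, op, operand):
        old = self.memory.get(addr, 0)
        self.memory[addr] = op(old, operand)
        return old  # an AMO returns the value held before the update


class LocalCache:
    """Hypothetical per-core cache (assumed names and structure)."""

    def __init__(self, lower):
        self.lower = lower
        self.lines = {}  # address -> (State, value)

    def fill(self, addr):
        # Receive the requested line from the lower level cache in Unique state.
        self.lines[addr] = (State.UNIQUE, self.lower.memory.get(addr, 0))

    def probe_downgrade(self, addr):
        # Another local cache requested the line: lose write permission.
        if addr not in self.lines:
            return
        state, value = self.lines[addr]
        if state is State.UNIQUE:
            self.lower.memory[addr] = value  # write back before downgrading
        self.lines[addr] = (State.SHARED, value)

    def amo(self, addr, op, operand):
        state, value = self.lines.get(addr, (State.INVALID, 0))
        if state is State.UNIQUE:
            # Line still held exclusively: execute the AMO locally.
            self.lines[addr] = (State.UNIQUE, op(value, operand))
            return value, "local"
        # Line was downgraded before the AMO ran: forward the AMO to the
        # lower level cache instead of re-requesting the line.
        # Assumption: the now-stale local copy is dropped.
        self.lines.pop(addr, None)
        return self.lower.execute_amo_remotely(addr, op, operand), "remote"
```

In this model, calling `fill`, then `probe_downgrade`, then `amo` on the same address takes the remote path, while `fill` followed directly by `amo` executes locally.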
Opening claim text (preview).
What is claimed is:
1. A processing system comprising: at least two cores, each core having a local cache; a lower level cache in communication with each local cache; one local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive the cache line via the lower level cache, wherein the lower level cache is configured to determine availability of the cache line based on a variety of factors including at least a Least Recently Used (LRU) algorithm, latency, input from other caches or memory structures associated with the cache line, inclusive cache presence bits, matching a transaction in flight or buffered from another cache, whether the lower level cache has the cache line at all, whether the lower level cache has the cache line in a Shared or Unique coherence state, matching a probe from a later-level cache, matching an eviction from a lower level cache, a predictor table of recently-accessed cache lines that are likely to be contended, and a bloom filter of cache lines that are not likely to be contended; receive a probe downgrade due to other local cache requesting the cache line prior to execution of the AMO, wherein the probe downgrade changes a cache coherency status of the cache line; and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
2. The processing system of claim 1, wherein the request is for the cache line in an event of a cache miss at the one local cache.
3. The processing system of claim 1, wherein the request is for a cache coherence state upgrade in an event of a cache hit at the one local cache.
4. The processing system of claim 1, wherein the lower level cache is further configured to: check with other caches or memory structures associated with the cache line regarding a willingness to give up the cache line.
5. The processing system of claim 1, wherein the lower level cache is further configured to: send a contended cache line message to the one local cache based on the variety of factors.
6. The processing system of claim 5, wherein the one local cache is further configured to: send the AMO instruction to the lower level cache for remote execution in response to the contended cache line message.
7. A processing system comprising: a core with a local cache; a shared cache in communication with the local cache of the core and at least another cache of at least another core; the local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive a message from the shared cache that the cache line is unavailable, wherein the shared cache line is configured to determine an availability of the cache line based on a variety of factors including at least a Least Recently Used (LRU) algorithm, latency, input from at least the at least another cache of at least another core, and inclusive cache presence bits, matching a transaction in flight or buffered from another cache, whether the shared cache has the cache line at all, whether the shared cache has the cache line in a Shared or Unique coherence state, matching a probe from later-level cache, and matching an eviction from the shared cache; and send the AMO instruction to the shared cache for remote execution in response to the message.
8. The processing system of claim 7, wherein the request is for the cache line in an event of a cache miss at the local cache.
9. The processing system of claim 7, wherein the request is for a cache coherence state upgrade in an event of a cache hit at the local cache.
10. The processing system of claim 7, wherein the shared cache is further configured to: check with the at least another cache of the at least another core regarding willingness to give up the cache line.
11. A method for executing atomic memory operation (AMO) instructions, the method comprising: requesting, by a local cache, a cache line for an AMO instruction from a lower level memory structure; determining, by the lower level memory structure, availability of a requested cache line based on a variety of factors including at least a Least Recently Used (LRU) algorithm, latency, input from other caches or memory structures associated with the cache line, inclusive cache presence bits, matching a transaction in flight or buffered from another cache, whether the lower level memory structure has the cache line at all, whether the lower level memory structure has the cache line in a Shared or Unique coherence state, matching a probe from a later-level memory structure, matching an eviction from the lower level memory structure; receiving, by the local cache from the lower level memory structure, the cache line from the lower level memory structure when available; receiving a downgrade probe due to another cache request for the cache line prior to AMO instruction execution, wherein the probe downgrade changes a cache coherency status of the cache line; sending, by the local cache to the lower level memory structure, the AMO instruction for remote execution in response to the probe downgrade; receiving, by the local cache from the lower level memory structure, a contended cache line message from the lower level memory structure when not available; and sending by the local cache to the lower level memory structure, the AMO instruction to the lower level memory structure for remote execution in response to the contended cache line message.
12. The method of claim 11, wherein the request is for the cache line in an event of a cache miss at the local cache.
13. The method of claim 11, wherein the request is for a cache coherence state upgrade in an event of a cache hit at the local cache.
14. The method of claim 11, the method further comprising: checking with other caches or memory structures associated with the cache line regarding willingness to give up the cache line.
15. The method of claim 11, the method further comprising: acknowledging, by the local cache, the downgrade probe.
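The "contended cache line message" path of claims 5 to 7 and 11 can be illustrated with a second small model. The claims list the factors that feed the availability decision but do not say how they are combined, so the policy below is purely an assumption: it uses only three of the listed factors (a transaction in flight, presence bits showing another holder, and a predictor table of likely-contended lines) and treats any one of them as enough to report the line unavailable. All names (`SharedCache`, `is_available`, `run_amo`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SharedCache:
    """Hypothetical lower level cache; the policy here is an assumption."""

    lines: dict = field(default_factory=dict)      # address -> value
    presence: dict = field(default_factory=dict)   # address -> set of core ids
    in_flight: set = field(default_factory=set)    # addresses with a pending transaction
    predictor: set = field(default_factory=set)    # addresses predicted to be contended

    def is_available(self, addr, requester):
        # Assumed policy: any single signal of contention makes the line unavailable.
        if addr in self.in_flight or addr in self.predictor:
            return False
        other_holders = self.presence.get(addr, set()) - {requester}
        return not other_holders

    def request_line(self, addr, requester):
        if self.is_available(addr, requester):
            self.presence.setdefault(addr, set()).add(requester)
            return "line", self.lines.get(addr, 0)
        return "contended", None  # the contended cache line message

    def remote_amo(self, addr, op, operand):
        old = self.lines.get(addr, 0)
        self.lines[addr] = op(old, operand)
        return old


def run_amo(cache, core, addr, op, operand):
    """Request the line; fall back to remote execution if it is contended."""
    kind, value = cache.request_line(addr, core)
    if kind == "contended":
        return cache.remote_amo(addr, op, operand), "remote"
    # Line granted: execute locally. For simplicity this sketch writes the
    # result straight through so the shared copy stays authoritative.
    cache.lines[addr] = op(value, operand)
    return value, "local"
```

A real implementation would weigh the remaining claimed factors (LRU state, latency, coherence state, probes, evictions, a bloom filter) as well; they are left out here to keep the sketch short.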
with two or more cache hierarchy levels (with multilevel cache hierarchies G06F12/0811) · CPC title
Cache with multiple tag or data arrays being simultaneously accessible · CPC title
with multilevel cache hierarchies · CPC title
with a network or matrix configuration · CPC title
using clearing, invalidating or resetting means · CPC title