Method for executing atomic memory operations when contested

US11467962B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11467962-B2
Application number  US-202017009876-A
Country             US
Kind code           B2
Filing date         Sep 2, 2020
Priority date       Sep 2, 2020
Publication date    Oct 11, 2022
Grant date          Oct 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described are methods and a system for atomic memory operations with contended cache lines. A processing system includes at least two cores, each core having a local cache, and a lower level cache in communication with each local cache. One local cache configured to request a cache line to execute an atomic memory operation (AMO) instruction, receive the cache line via the lower level cache, receive a probe downgrade due to other local cache requesting the cache line prior to execution of the AMO, and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
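
The sequence in the abstract (request the line, receive it, lose it to a probe downgrade, then forward the AMO) can be illustrated with a small software model. This is a minimal sketch, not the patented hardware: the class names `LocalCache` and `LowerLevelCache`, the three-state coherence enum, the fetch-and-add operation, and the choice to downgrade to Shared are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Simplified coherence states (assumed for this sketch).
    INVALID = "I"
    SHARED = "S"
    UNIQUE = "U"


@dataclass
class LowerLevelCache:
    """Hypothetical shared cache that can execute an AMO for a core."""
    memory: dict = field(default_factory=dict)

    def remote_amo_add(self, addr: int, operand: int) -> int:
        # Fetch-and-add performed at the shared level; returns the old value.
        old = self.memory.get(addr, 0)
        self.memory[addr] = old + operand
        return old


@dataclass
class LocalCache:
    """Hypothetical per-core cache issuing the AMO."""
    lower: LowerLevelCache
    state: State = State.INVALID
    log: list = field(default_factory=list)

    def request_line(self) -> None:
        # Request the line and receive it with write permission.
        self.state = State.UNIQUE
        self.log.append("line received")

    def probe_downgrade(self) -> None:
        # Another local cache asked for the line before the AMO executed,
        # so write permission is lost (downgrade target is an assumption).
        self.state = State.SHARED
        self.log.append("probe downgrade")

    def amo_add(self, addr: int, operand: int) -> int:
        if self.state is State.UNIQUE:
            # Uncontended case: execute locally. The shared dict stands in
            # for the local data array to keep the model small.
            self.log.append("local execution")
            old = self.lower.memory.get(addr, 0)
            self.lower.memory[addr] = old + operand
            return old
        # Contended case: do not re-request the line; send the AMO to the
        # lower level cache for remote execution.
        self.log.append("remote execution")
        return self.lower.remote_amo_add(addr, operand)


l2 = LowerLevelCache(memory={0x40: 5})
l1 = LocalCache(lower=l2)
l1.request_line()
l1.probe_downgrade()
old_value = l1.amo_add(0x40, 3)
```

In this model the requesting cache never asks for the line a second time after the downgrade, which is the behavior the abstract describes as sending the AMO "in response to the probe downgrade".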

First claim

Opening claim text (preview).

What is claimed is:

1. A processing system comprising: at least two cores, each core having a local cache; a lower level cache in communication with each local cache; one local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive the cache line via the lower level cache, wherein the lower level cache is configured to determine availability of the cache line based on a variety of factors including at least a Least Recently Used (LRU) algorithm, latency, input from other caches or memory structures associated with the cache line, inclusive cache presence bits, matching a transaction in flight or buffered from another cache, whether the lower level cache has the cache line at all, whether the lower level cache has the cache line in a Shared or Unique coherence state, matching a probe from a later-level cache, matching an eviction from a lower level cache, a predictor table of recently-accessed cache lines that are likely to be contended, and a bloom filter of cache lines that are not likely to be contended; receive a probe downgrade due to other local cache requesting the cache line prior to execution of the AMO, wherein the probe downgrade changes a cache coherency status of the cache line; and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.

2. The processing system of claim 1, wherein the request is for the cache line in an event of a cache miss at the one local cache.

3. The processing system of claim 1, wherein the request is for a cache coherence state upgrade in an event of a cache hit at the one local cache.

4. The processing system of claim 1, wherein the lower level cache is further configured to: check with other caches or memory structures associated with the cache line regarding a willingness to give up the cache line.

5. The processing system of claim 1, wherein the lower level cache is further configured to: send a contended cache line message to the one local cache based on the variety of factors.

6. The processing system of claim 5, wherein the one local cache is further configured to: send the AMO instruction to the lower level cache for remote execution in response to the contended cache line message.

7. A processing system comprising: a core with a local cache; a shared cache in communication with the local cache of the core and at least another cache of at least another core; the local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive a message from the shared cache that the cache line is unavailable, wherein the shared cache line is configured to determine an availability of the cache line based on a variety of factors including at least a Least Recently Used (LRU) algorithm, latency, input from at least the at least another cache of at least another core, and inclusive cache presence bits, matching a transaction in flight or buffered from another cache, whether the shared cache has the cache line at all, whether the shared cache has the cache line in a Shared or Unique coherence state, matching a probe from later-level cache, and matching an eviction from the shared cache; and send the AMO instruction to the shared cache for remote execution in response to the message.

8. The processing system of claim 7, wherein the request is for the cache line in an event of a cache miss at the local cache.

9. The processing system of claim 7, wherein the request is for a cache coherence state upgrade in an event of a cache hit at the local cache.

10. The processing system of claim 7, wherein the shared cache is further configured to: check with the at least another cache of the at least another core regarding willingness to give up the cache line.

11. A method for executing atomic memory operation (AMO) instructions, the method comprising: requesting, by a local cache, a cache line for an AMO instruction from a lower level memory structure; determining, by the lower level memory structure, availability of a requested cache line based on a variety of factors including at least a Least Recently Used (LRU) algorithm, latency, input from other caches or memory structures associated with the cache line, inclusive cache presence bits, matching a transaction in flight or buffered from another cache, whether the lower level memory structure has the cache line at all, whether the lower level memory structure has the cache line in a Shared or Unique coherence state, matching a probe from a later-level memory structure, matching an eviction from the lower level memory structure; receiving, by the local cache from the lower level memory structure, the cache line from the lower level memory structure when available; receiving a downgrade probe due to another cache request for the cache line prior to AMO instruction execution, wherein the probe downgrade changes a cache coherency status of the cache line; sending, by the local cache to the lower level memory structure, the AMO instruction for remote execution in response to the probe downgrade; receiving, by the local cache from the lower level memory structure, a contended cache line message from the lower level memory structure when not available; and sending by the local cache to the lower level memory structure, the AMO instruction to the lower level memory structure for remote execution in response to the contended cache line message.

12. The method of claim 11, wherein the request is for the cache line in an event of a cache miss at the local cache.

13. The method of claim 11, wherein the request is for a cache coherence state upgrade in an event of a cache hit at the local cache.

14. The method of claim 11, the method further comprising: checking with other caches or memory structures associated with the cache line regarding willingness to give up the cache line.

15. The method of claim 11, the method further comprising: acknowledging, by the local cache, the downgrade probe.
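
The "contended cache line message" branch of claim 11 can be pictured as follows. This is a rough sketch under stated assumptions: `LowerLevel`, `line_available`, and `run_amo` are hypothetical names, and only two of the claimed availability factors are modeled (a matching in-flight transaction and a stand-in for the predictor table of likely-contended lines). The claims do not say how the factors are weighed, so the simple "any factor blocks the grant" rule here is an assumption.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class LowerLevel:
    """Hypothetical lower level memory structure, e.g. a shared L2."""
    memory: dict = field(default_factory=dict)
    in_flight: set = field(default_factory=set)         # lines with a buffered transaction
    likely_contended: set = field(default_factory=set)  # stand-in for a predictor table

    def line_available(self, line: int) -> bool:
        # Assumed policy: any modeled factor that fires makes the line unavailable.
        return line not in self.in_flight and line not in self.likely_contended

    def execute_amo(self, line: int, op, operand: int) -> int:
        # Remote execution at the lower level; returns the old value.
        old = self.memory.get(line, 0)
        self.memory[line] = op(old, operand)
        return old


def run_amo(lower: LowerLevel, line: int, op, operand: int):
    """Return (old_value, where), with where being 'local' or 'remote'."""
    if not lower.line_available(line):
        # The requester receives a contended cache line message and sends
        # the AMO down for remote execution instead of waiting for the line.
        return lower.execute_amo(line, op, operand), "remote"
    # The line is granted; the AMO executes in the local cache. The shared
    # dict stands in for the local copy to keep the model small.
    old = lower.memory.get(line, 0)
    lower.memory[line] = op(old, operand)
    return old, "local"


shared = LowerLevel(memory={1: 10, 2: 20}, in_flight={2})
uncontended = run_amo(shared, 1, operator.add, 1)
contended = run_amo(shared, 2, operator.add, 5)
```

Either way the requester gets the old value back; what changes is where the read-modify-write happens and whether the cache line has to move between caches.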

Assignees

SiFive Inc

Inventors

Classifications

  • with two or more cache hierarchy levels (with multilevel cache hierarchies G06F12/0811)

  • Cache with multiple tag or data arrays being simultaneously accessible

  • with multilevel cache hierarchies

  • with a network or matrix configuration

  • using clearing, invalidating or resetting means

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11467962B2 cover?
Described are methods and a system for atomic memory operations with contended cache lines. A processing system includes at least two cores, each core having a local cache, and a lower level cache in communication with each local cache. One local cache configured to request a cache line to execute an atomic memory operation (AMO) instruction, receive the cache line via the lower level cache, re…
Who is the assignee on this patent?
SiFive Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/0811. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 11, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).