Method for executing atomic memory operations when contested

US12066941B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12066941-B2
Application number: US-202217961146-A
Country: US
Kind code: B2
Filing date: Oct 6, 2022
Priority date: Sep 2, 2020
Publication date: Aug 20, 2024
Grant date: Aug 20, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described are methods and a system for atomic memory operations with contended cache lines. A processing system includes at least two cores, each core having a local cache, and a lower level cache in communication with each local cache. One local cache configured to request a cache line to execute an atomic memory operation (AMO) instruction, receive the cache line via the lower level cache, receive a probe downgrade due to other local cache requesting the cache line prior to execution of the AMO, and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
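The flow in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: the class names `LowerLevelCache` and `LocalCache`, the `probed_before_execute` flag, and the choice to model a probe downgrade as losing the line entirely are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

AmoOp = Callable[[int, int], int]  # (old value, operand) -> new value


@dataclass
class LowerLevelCache:
    """Hypothetical shared cache that can also execute an AMO for a core."""
    data: Dict[int, int] = field(default_factory=dict)
    remote_amos: int = 0

    def read_line(self, addr: int) -> int:
        return self.data.get(addr, 0)

    def write_line(self, addr: int, value: int) -> None:
        self.data[addr] = value

    def execute_amo(self, addr: int, op: AmoOp, operand: int) -> int:
        # Remote execution: the read-modify-write happens at the shared cache.
        old = self.read_line(addr)
        self.write_line(addr, op(old, operand))
        self.remote_amos += 1
        return old


@dataclass
class LocalCache:
    """Hypothetical per-core cache."""
    lower: LowerLevelCache
    lines: Dict[int, int] = field(default_factory=dict)
    local_amos: int = 0

    def probe_downgrade(self, addr: int) -> None:
        # Assumption: the downgrade is modeled as writing back and dropping the line.
        if addr in self.lines:
            self.lower.write_line(addr, self.lines.pop(addr))

    def amo(self, addr: int, op: AmoOp, operand: int,
            probed_before_execute: bool = False) -> int:
        # Request the cache line and receive it via the lower level cache.
        self.lines[addr] = self.lower.read_line(addr)
        # Another local cache may request the line before the AMO executes.
        if probed_before_execute:
            self.probe_downgrade(addr)
        # Line lost to the probe: send the AMO down for remote execution.
        if addr not in self.lines:
            return self.lower.execute_amo(addr, op, operand)
        # Line still held: execute the AMO locally.
        old = self.lines[addr]
        self.lines[addr] = op(old, operand)
        self.local_amos += 1
        return old


def amo_add(old: int, operand: int) -> int:
    return old + operand


shared = LowerLevelCache(data={0x40: 10})
core0 = LocalCache(lower=shared)
first = core0.amo(0x40, amo_add, 5, probed_before_execute=True)  # contended
second = core0.amo(0x40, amo_add, 1)                             # uncontended
```

In this sketch the contended AMO is counted in `shared.remote_amos` and the uncontended one in `core0.local_amos`, which makes the two paths easy to tell apart.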

First claim

Opening claim text (preview).

What is claimed is:

1. A processing system comprising: at least two cores, each core having a local cache; a lower level cache in communication with both local caches; one local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive the cache line via the lower level cache, wherein the lower level cache is configured to determine availability of the cache line based on input from other caches or memory structures associated with the cache line; receive a probe downgrade due to a remaining local cache requesting the requested cache line prior to execution of the AMO; and send the AMO instruction to the lower level cache for remote execution in response to the probe downgrade.
2. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on inclusive cache presence bits.
3. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on matching a transaction in flight or buffered from another cache.
4. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on whether the lower level cache has the cache line at all.
5. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on whether the lower level cache has the cache line in a Shared or Unique coherence state.
6. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on matching a probe from a later-level cache.
7. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on matching an eviction from the lower level cache.
8. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on a predictor table of recently-accessed cache lines that are likely to be contended.
9. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on bloom filter of cache lines that are not likely to be contended.
10. The processing system of claim 1, wherein the lower level cache configured to determine an availability of the cache line also based on at least a Least Recently Used (LRU) algorithm.
11. The processing system of claim 1, wherein the lower level cache is further configured to: check with the other caches or memory structures associated with the cache line regarding a willingness to give up the cache line; send a contended cache line message to the one local cache based on a variety of factors; and send the AMO instruction for remote execution in response to the contended cache line message.
12. A processing system comprising: a core with a local cache; a shared cache in communication with the local cache of the core and at least another cache of at least another core; the local cache configured to: request a cache line to execute an atomic memory operation (AMO) instruction; receive a message from the shared cache that the requested cache line is unavailable, wherein the shared cache is configured to determine an availability of the cache line based on input from at least the at least another cache of the at least another core and inclusive cache presence bits; and send the AMO instruction to the shared cache for remote execution in response to the message.
13. The processing system of claim 12, wherein the shared cache configured to determine an availability of the cache line also based on latency, a Least Recently Used (LRU) algorithm, a predictor table of recently-accessed cache lines that are likely to be contended, and a bloom filter of cache lines that are not likely to be contended.
14. The processing system of claim 13, wherein the shared cache configured to determine an availability of the cache line also based on input from at least the at least another cache of the at least another core.
15. The processing system of claim 14, wherein the shared cache configured to determine an availability of the cache line also based on inclusive cache presence bits.
16. The processing system of claim 15, wherein the shared cache configured to determine an availability of the cache line also based on matching a transaction in flight or buffered from another cache.
17. The processing system of claim 16, wherein the shared cache configured to determine an availability of the cache line also based on whether the shared cache has the cache line at all.
18. The processing system of claim 17, wherein the shared cache configured to determine an availability of the cache line also based on whether the shared cache has the cache line in a Shared or Unique coherence state.
19. The processing system of claim 18, wherein the shared cache configured to determine an availability of the cache line also based on matching a probe from later-level cache and on matching an eviction from the shared cache.
20. A method for executing atomic memory operation (AMO) instructions, the method comprising: requesting, by a local cache, a cache line for an AMO instruction from a lower level memory structure; determining, by the lower level memory structure, availability of a requested cache line based on a Least Recently Used (LRU) algorithm and latency; receiving, by the local cache from the lower level memory structure, the requested cache line from the lower level memory structure when available; receiving a downgrade probe due to another cache request for the received cache line prior to AMO instruction execution, wherein the probe downgrade changes a cache coherency status of the cache line; sending, by the local cache to the lower level memory structure, the AMO instruction for remote execution in response to the probe downgrade; receiving, by the local cache from the lower level memory structure, a contended cache line message from the lower level memory structure when not available; and sending by the local cache to the lower level memory structure, the AMO instruction to the lower level memory structure for remote execution in response to the contended cache line message.
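The claims name several signals the shared cache may consult when deciding whether a line is available (presence bits, in-flight transactions, a predictor table), but they do not say how those signals are combined. The sketch below is one possible reading under stated assumptions: `SharedCache`, `is_available`, and `amo_add` are hypothetical names, a plain set stands in for the predictor table, any single negative signal makes the line unavailable, and the local path writes through to the shared cache for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class SharedCache:
    """Hypothetical shared cache that grants lines or reports contention."""
    values: Dict[int, int] = field(default_factory=dict)
    presence: Dict[int, Set[int]] = field(default_factory=dict)  # addr -> core ids
    in_flight: Set[int] = field(default_factory=set)             # buffered transactions
    likely_contended: Set[int] = field(default_factory=set)      # predictor stand-in

    def is_available(self, addr: int, requester: int) -> bool:
        # Assumption: any one signal is enough to treat the line as contended.
        if self.presence.get(addr, set()) - {requester}:
            return False  # presence bits show another core holds the line
        if addr in self.in_flight:
            return False  # matches a transaction in flight from another cache
        if addr in self.likely_contended:
            return False  # predictor says the line was recently contended
        return True

    def request_line(self, addr: int, requester: int) -> Tuple[str, Optional[int]]:
        if self.is_available(addr, requester):
            self.presence.setdefault(addr, set()).add(requester)
            return ("line", self.values.get(addr, 0))
        self.likely_contended.add(addr)  # remember the contention
        return ("contended", None)       # contended cache line message

    def execute_amo_add(self, addr: int, operand: int) -> int:
        old = self.values.get(addr, 0)
        self.values[addr] = old + operand
        return old


def amo_add(cache: SharedCache, core_id: int, addr: int, operand: int) -> Tuple[str, int]:
    """Run an atomic add locally if the line is granted, remotely otherwise."""
    kind, value = cache.request_line(addr, core_id)
    if kind == "contended":
        return ("remote", cache.execute_amo_add(addr, operand))
    # Assumption: write-through keeps this sketch free of write-back modeling.
    cache.values[addr] = value + operand
    return ("local", value)


cache = SharedCache(values={0x80: 7}, presence={0x80: {1}})
r1 = amo_add(cache, 0, 0x80, 3)   # core 1 already holds the line
r2 = amo_add(cache, 0, 0x100, 1)  # nobody else holds the line
```

Here the first request is answered with a contended message and the add runs at the shared cache, while the second is granted and runs locally. A real design could weigh latency, LRU state, or a bloom filter as the claims mention; those are left out of this sketch.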

Assignees

Sifive Inc

Inventors

Classifications

  • G06F12/0811 · Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12066941B2 cover?
Described are methods and a system for atomic memory operations with contended cache lines. A processing system includes at least two cores, each core having a local cache, and a lower level cache in communication with each local cache. One local cache configured to request a cache line to execute an atomic memory operation (AMO) instruction, receive the cache line via the lower level cache, re…
Who is the assignee on this patent?
Sifive Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/0811. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 20, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).