Techniques for handling cache coherency traffic for contended semaphores

US11768771B2 · US · B2

Patent metadata

Publication number: US-11768771-B2
Application number: US-202117547148-A
Country: US
Kind code: B2
Filing date: Dec 9, 2021
Priority date: Sep 19, 2016
Publication date: Sep 26, 2023
Grant date: Sep 26, 2023

Abstract

Official abstract text for this publication.

The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
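For orientation, the software pattern the abstract refers to can be sketched as a test-and-test-and-set spinlock: a plain load spins on the semaphore word, and a lock-compare-and-exchange follows once the word looks free. This is a minimal sketch only. The type `SpinSemaphore`, its members, and the 0/1 encoding are assumptions made for illustration and do not come from the patent text.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical test-and-test-and-set semaphore (illustrative names only).
struct SpinSemaphore {
    std::atomic<int> word{0};  // assumed encoding: 0 = free, 1 = held

    // Single attempt: the lock-compare-and-exchange step.
    // On x86 this typically compiles to a LOCK CMPXCHG instruction.
    bool try_acquire() {
        int expected = 0;
        return word.compare_exchange_strong(expected, 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed);
    }

    void acquire() {
        for (;;) {
            // Spin-loop load: a plain (non-lock) read of the semaphore word.
            while (word.load(std::memory_order_relaxed) != 0) {
                // Busy-wait until the current holder releases the semaphore.
            }
            // The word looked free, so try to take it atomically.
            if (try_acquire()) {
                return;
            }
        }
    }

    void release() {
        assert(word.load(std::memory_order_relaxed) == 1);  // must be held
        word.store(0, std::memory_order_release);
    }
};
```

Between the spin-loop load seeing the word as free and the compare-and-exchange executing, another core can request the cache line. The abstract describes the core answering such requests with negative acknowledgments so the line stays in an exclusive state until the acquisition completes.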

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: in response to a load in a spin-loop executing a load instruction for a contended semaphore, the contended semaphore being released, and a cache line storing the contended semaphore being loaded in an exclusive state, setting, for a first state machine, a first state machine value corresponding to an address storing the contended semaphore and preventing external requests for the cache line from being satisfied; and executing a lock-compare-and-exchange instruction.

2. The method of claim 1, further comprising determining that an address for the semaphore is considered contended.

3. The method of claim 2, wherein the address for the semaphore is considered contended due to two or more threads attempting to access the address in a given amount of time.

4. The method of claim 1, further comprising detecting that the load in the spin-loop executes the load instruction for the contended semaphore by detecting that a non-lock load hits an address storing the contended semaphore.

5. The method of claim 1, further comprising detecting that the contended semaphore is released by detecting that a cache line associated with an address storing the contended semaphore has been evicted.

6. The method of claim 1, further comprising setting a second state machine value for the state machine for the address in response to the contended semaphore being released.

7. The method of claim 6, further comprising setting a third state machine value for the state machine for the address in response to detecting that the cache line is loaded in an exclusive state.

8. The method of claim 7, wherein preventing external requests for the cache line from being satisfied occurs in response to no out-of-sequence events occurring for the state machine.

9. A system comprising: a processing core including a load/store unit; and a cache, wherein the load/store unit is configured to handle cache coherency traffic for a contended semaphore by: in response to a load in a spin-loop executing a load instruction for a contended semaphore, the contended semaphore being released, and a cache line storing the contended semaphore being loaded in an exclusive state, setting, for a first state machine, a first state machine value corresponding to an address storing the contended semaphore and preventing external requests for the cache line from being satisfied; and executing a lock-compare-and-exchange instruction.

10. The system of claim 9, wherein the load/store unit is further configured to determine that an address for the semaphore is considered contended.

11. The system of claim 10, wherein the address for the semaphore is considered contended due to two or more threads attempting to access the address in a given amount of time.

12. The system of claim 9, further comprising detecting that the load in the spin-loop executes the load instruction for the contended semaphore by detecting that a non-lock load hits an address storing the contended semaphore.

13. The system of claim 9, further comprising detecting that the contended semaphore is released by detecting that a cache line associated with an address storing the contended semaphore has been evicted.

14. The system of claim 9, wherein the load/store unit is further configured to set a second state machine value for the state machine for the address in response to the contended semaphore being released.

15. The system of claim 14, wherein the load/store unit is further configured to set a third state machine value for the state machine for the address in response to detecting that the cache line is loaded in an exclusive state.

16. The system of claim 15, wherein preventing external requests for the cache line from being satisfied occurs in response to no out-of-sequence events occurring for the state machine.

17. A processor, comprising: a plurality of processing cores coupled together, each processing core including a load/store unit; and a plurality of caches, each cache associated with a respective processing core of the plurality of processing cores, wherein the load/store unit of each processing core of the plurality of processing cores is configured to handle cache coherency traffic for a contended semaphore by: in response to a load in a spin-loop executing a load instruction for a contended semaphore, the contended semaphore being released, and a cache line storing the contended semaphore being loaded in an exclusive state, setting, for a first state machine, a first state machine value corresponding to an address storing the contended semaphore and preventing external requests for the cache line from being satisfied; and executing a lock-compare-and-exchange instruction.

18. The processor of claim 17, wherein each load/store unit is further configured to determine that an address for the semaphore is considered contended.
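
The sequence of events in the claims can be pictured as a small per-address state machine. The sketch below is a software model, not the hardware design: the state names, the event names, the `ContentionTable` type, and the choice to reset to `Idle` on any out-of-sequence event are all assumptions made for illustration. The claims state only that state machine values are set as the semaphore is released and the cache line is loaded in an exclusive state, and that external requests are prevented from being satisfied when no out-of-sequence events occur.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical states for one tracked address (names are not from the patent).
enum class LockState { Idle, SpinLoadSeen, Released, ExclusiveHeld };

// Hypothetical events observed by the load/store unit.
enum class Event {
    NonLockLoadHit,       // spin-loop load hits the contended address
    LineEvicted,          // cache line evicted: treated as "semaphore released"
    LineFilledExclusive,  // cache line loaded in an exclusive state
    LockCmpxchgDone       // lock-compare-and-exchange has executed
};

struct ContentionTable {
    // Only addresses already considered contended are tracked.
    std::unordered_map<std::uint64_t, LockState> entries;

    void mark_contended(std::uint64_t addr) {
        entries.emplace(addr, LockState::Idle);
    }

    void on_event(std::uint64_t addr, Event ev) {
        auto it = entries.find(addr);
        if (it == entries.end()) return;  // untracked address: ignore
        LockState& s = it->second;
        switch (s) {
            case LockState::Idle:
                // Stay idle until a spin-loop load is seen.
                if (ev == Event::NonLockLoadHit) s = LockState::SpinLoadSeen;
                break;
            case LockState::SpinLoadSeen:
                if (ev == Event::LineEvicted) s = LockState::Released;
                else if (ev != Event::NonLockLoadHit) s = LockState::Idle;
                break;
            case LockState::Released:
                // Anything other than an exclusive fill is out of sequence.
                s = (ev == Event::LineFilledExclusive) ? LockState::ExclusiveHeld
                                                       : LockState::Idle;
                break;
            case LockState::ExclusiveHeld:
                // Assumed: any further event ends the protected window.
                s = LockState::Idle;
                break;
        }
    }

    // True while external requests for the line would be answered with a NACK.
    bool should_nack(std::uint64_t addr) const {
        auto it = entries.find(addr);
        return it != entries.end() && it->second == LockState::ExclusiveHeld;
    }
};
```

In this model, `should_nack` returns true only after the full in-order sequence (spin-loop load, eviction, exclusive fill) has been observed for a tracked address. Any other ordering returns the entry to `Idle`, so external requests are served normally.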

Assignees

Advanced Micro Devices Inc

Inventors

Classifications

  • Multiple simultaneous or quasi-simultaneous cache accessing

  • Barrier synchronisation

  • Cache consistency protocols

  • Cache access modes

  • Correctness of operation, e.g. memory ordering

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11768771B2 cover?
The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by…
Who is the assignee on this patent?
Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/0844. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 26, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).