What technology area does this patent fall under?

Primary CPC classification G06F12/0844. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 26, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Techniques for handling cache coherency traffic for contended semaphores

US11768771B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11768771-B2
Application number	US-202117547148-A
Country	US
Kind code	B2
Filing date	Dec 9, 2021
Priority date	Sep 19, 2016
Publication date	Sep 26, 2023
Grant date	Sep 26, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
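To make the abstract's "load by the core in a spin-loop associated with semaphore acquisition" concrete, here is a minimal software sketch of the access pattern the hardware is described as tracking: a plain load that spins until the semaphore looks free, followed by an atomic compare-and-exchange. The `SpinSemaphore` type and its method names are assumptions for illustration and do not come from the patent; on x86, `compare_exchange_strong` on an `int` is typically compiled to a `LOCK CMPXCHG` instruction, though that depends on the compiler and target.

```cpp
#include <atomic>
#include <thread>

// Hypothetical software semaphore (not from the patent): 0 = free, 1 = held.
struct SpinSemaphore {
    std::atomic<int> word{0};

    // One acquisition attempt using an atomic compare-and-exchange.
    bool try_acquire() {
        int expected = 0;
        return word.compare_exchange_strong(expected, 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed);
    }

    void acquire() {
        for (;;) {
            // Spin-loop load: a plain (non-lock) load that waits for release.
            while (word.load(std::memory_order_relaxed) != 0) {
                std::this_thread::yield();
            }
            // The semaphore looked free; race other threads for it.
            if (try_acquire()) {
                return;
            }
        }
    }

    void release() { word.store(0, std::memory_order_release); }
};
```

Several threads calling `acquire()` and `release()` around a critical section all contend for the cache line holding `word`. Between the spin-loop load seeing the value 0 and the compare-and-exchange executing, another core can pull the line away, which is the window the abstract's negative acknowledgments are meant to close.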

First claim

Opening claim text (preview).

What is claimed is:
1. A method, comprising: in response to a load in a spin-loop executing a load instruction for a contended semaphore, the contended semaphore being released, and a cache line storing the contended semaphore being loaded in an exclusive state, setting, for a first state machine, a first state machine value corresponding to an address storing the contended semaphore and preventing external requests for the cache line from being satisfied; and executing a lock-compare-and-exchange instruction.
2. The method of claim 1, further comprising determining that an address for the semaphore is considered contended.
3. The method of claim 2, wherein the address for the semaphore is considered contended due to two or more threads attempting to access the address in a given amount of time.
4. The method of claim 1, further comprising detecting that the load in the spin-loop executes the load instruction for the contended semaphore by detecting that a non-lock load hits an address storing the contended semaphore.
5. The method of claim 1, further comprising detecting that the contended semaphore is released by detecting that a cache line associated with an address storing the contended semaphore has been evicted.
6. The method of claim 1, further comprising setting a second state machine value for the state machine for the address in response to the contended semaphore being released.
7. The method of claim 6, further comprising setting a third state machine value for the state machine for the address in response to detecting that the cache line is loaded in an exclusive state.
8. The method of claim 7, wherein preventing external requests for the cache line from being satisfied occurs in response to no out-of-sequence events occurring for the state machine.
9. A system comprising: a processing core including a load/store unit; and a cache, wherein the load/store unit is configured to handle cache coherency traffic for a contended semaphore by: in response to a load in a spin-loop executing a load instruction for a contended semaphore, the contended semaphore being released, and a cache line storing the contended semaphore being loaded in an exclusive state, setting, for a first state machine, a first state machine value corresponding to an address storing the contended semaphore and preventing external requests for the cache line from being satisfied; and executing a lock-compare-and-exchange instruction.
10. The system of claim 9, wherein the load/store unit is further configured to determine that an address for the semaphore is considered contended.
11. The system of claim 10, wherein the address for the semaphore is considered contended due to two or more threads attempting to access the address in a given amount of time.
12. The system of claim 9, further comprising detecting that the load in the spin-loop executes the load instruction for the contended semaphore by detecting that a non-lock load hits an address storing the contended semaphore.
13. The system of claim 9, further comprising detecting that the contended semaphore is released by detecting that a cache line associated with an address storing the contended semaphore has been evicted.
14. The system of claim 9, wherein the load/store unit is further configured to set a second state machine value for the state machine for the address in response to the contended semaphore being released.
15. The system of claim 14, wherein the load/store unit is further configured to set a third state machine value for the state machine for the address in response to detecting that the cache line is loaded in an exclusive state.
16. The system of claim 15, wherein preventing external requests for the cache line from being satisfied occurs in response to no out-of-sequence events occurring for the state machine.
17. A processor, comprising: a plurality of processing cores coupled together, each processing core including a load/store unit; and a plurality of caches, each cache associated with a respective processing core of the plurality of processing cores, wherein the load/store unit of each processing core of the plurality of processing cores is configured to handle cache coherency traffic for a contended semaphore by: in response to a load in a spin-loop executing a load instruction for a contended semaphore, the contended semaphore being released, and a cache line storing the contended semaphore being loaded in an exclusive state, setting, for a first state machine, a first state machine value corresponding to an address storing the contended semaphore and preventing external requests for the cache line from being satisfied; and executing a lock-compare-and-exchange instruction.
18. The processor of claim 17, wherein each load/store unit is further configured to determine that an address for the semaphore is considered contended.
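The sequence of events in the claims (a spin-loop load hits the contended address, the cache line is evicted to signal release, the line is refilled in an exclusive state, and external requests are then refused until the lock-compare-and-exchange executes) can be modeled as a small per-address state machine. The sketch below is a software model under stated assumptions, not the patented hardware: the state names, event names, the `LockAddressContentionTable` class, and the choice to reset to an idle state on any out-of-sequence event are all hypothetical.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical event and state names; the claim text does not name them.
enum class Event { SpinLoadHit, LineEvicted, LineFilledExclusive, LockCmpxchgDone };
enum class State { Idle, SpinLoadSeen, Released, ExclusiveHeld };

// Advance one table entry. Assumption: any out-of-sequence event resets to Idle.
State step(State s, Event e) {
    switch (s) {
        case State::Idle:
            if (e == Event::SpinLoadHit) return State::SpinLoadSeen;
            break;
        case State::SpinLoadSeen:
            if (e == Event::SpinLoadHit) return State::SpinLoadSeen;  // still spinning
            if (e == Event::LineEvicted) return State::Released;      // treated as release
            break;
        case State::Released:
            if (e == Event::LineFilledExclusive) return State::ExclusiveHeld;
            break;
        case State::ExclusiveHeld:
            if (e == Event::SpinLoadHit) return State::ExclusiveHeld; // load before cmpxchg
            break;  // LockCmpxchgDone (or anything else) ends the protected window
    }
    return State::Idle;
}

// Software model of a per-core table keyed by semaphore address.
class LockAddressContentionTable {
public:
    // Start tracking an address that has been judged contended.
    void mark_contended(std::uint64_t addr) { entries_.emplace(addr, State::Idle); }

    // Feed an event to the entry for addr; untracked addresses are ignored.
    void on_event(std::uint64_t addr, Event e) {
        auto it = entries_.find(addr);
        if (it != entries_.end()) it->second = step(it->second, e);
    }

    // True when an incoming request for addr would get a negative acknowledgment.
    bool should_nak(std::uint64_t addr) const {
        auto it = entries_.find(addr);
        return it != entries_.end() && it->second == State::ExclusiveHeld;
    }

private:
    std::unordered_map<std::uint64_t, State> entries_;
};
```

In this model, `should_nak` returns true only after the three events arrive in order for a tracked address, and it returns false again once `LockCmpxchgDone` is delivered. How the real design decides that an address is contended (claims 2 and 3) is left outside the sketch.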

Assignees

Advanced Micro Devices Inc

Inventors

Classifications

G06F12/0844 · Primary
Multiple simultaneous or quasi-simultaneous cache accessing · CPC title
G06F9/522
Barrier synchronisation · CPC title
G06F12/0815
Cache consistency protocols · CPC title
G06F12/0877
Cache access modes · CPC title
G06F2212/1008
Correctness of operation, e.g. memory ordering · CPC title

Patent family

Related publications grouped by family.

View patent family 61621111

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11768771B2 cover?: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by…
Who is the assignee on this patent?: Advanced Micro Devices Inc