Utilization of register checkpointing mechanism with pointer swapping to resolve multithreading mis-speculations
US-9940138-B2 · Apr 10, 2018 · US
US10095637B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10095637-B2 |
| Application number | US-201615267094-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 15, 2016 |
| Priority date | Sep 15, 2016 |
| Publication date | Oct 9, 2018 |
| Grant date | Oct 9, 2018 |
Official abstract text for this publication.
Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.
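The flow described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the `Core` class, its fields, and its method names are all hypothetical, and registers are modeled as a plain dictionary. It shows a checkpoint being saved when the lock instruction retires, a younger instruction retiring speculatively, and a rollback followed by a slow-mode retry in which younger instructions must wait for the lock's store to commit.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Core:
    """Toy model of one core. All names here are hypothetical."""
    regs: dict = field(default_factory=dict)
    pc: int = 0
    retired: list = field(default_factory=list)
    checkpoint: Optional[tuple] = None  # (register copy, pc, retired count)
    lock_pending: bool = False          # lock retired, store not yet committed
    slow_mode: bool = False

    def retire_lock(self) -> None:
        # Fast mode: save a checkpoint so speculation can be undone later.
        if not self.slow_mode:
            self.checkpoint = (dict(self.regs), self.pc, len(self.retired))
        self.lock_pending = True
        self.retired.append("lock")
        self.pc += 1

    def retire_younger(self, name: str, reg: str, value: int) -> bool:
        # Slow mode: younger instructions wait for the lock's store to commit.
        if self.slow_mode and self.lock_pending:
            return False
        # Fast mode: retire speculatively even though the store is pending.
        self.regs[reg] = value
        self.retired.append(name)
        self.pc += 1
        return True

    def commit_lock_store(self) -> None:
        # No violation was seen, so the checkpoint can be released.
        self.lock_pending = False
        self.checkpoint = None
        self.slow_mode = False

    def rollback(self) -> None:
        # Violation seen: restore registers and program counter, cancel the
        # lock and everything retired after it, then retry in slow mode.
        saved_regs, saved_pc, saved_len = self.checkpoint
        self.regs, self.pc = dict(saved_regs), saved_pc
        del self.retired[saved_len:]
        self.checkpoint = None
        self.lock_pending = False
        self.slow_mode = True


core = Core(regs={"rax": 1}, pc=100)
core.retire_lock()
core.retire_younger("add", "rax", 2)   # retires speculatively
core.rollback()                        # pretend a violation was detected
core.retire_lock()                     # slow-mode retry
blocked = core.retire_younger("add", "rax", 2)  # must wait in slow mode
core.commit_lock_store()
allowed = core.retire_younger("add", "rax", 2)  # may retire after commit
```

In this model the checkpoint is a full copy of the register dictionary, chosen for simplicity; the source does not say how the real processor stores its checkpoint.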
Opening claim text (preview).
What is claimed is:
1. A method for speculatively retiring instructions younger than a lock instruction in a core of a processor, the method comprising: retiring the lock instruction, wherein the lock instruction remains pending after retirement until a store portion of the lock instruction is deemed to be committed to memory; saving a checkpoint state of the processor responsive to retiring the lock instruction, the checkpoint state including data for rolling back state of the processor in the event of violation of an atomic property of the lock instruction or a fencing property of the lock instruction; speculatively retiring an instruction younger than the lock instruction after retiring the lock instruction but before the store portion of the lock instruction is committed to memory; detecting violation of one of the atomic property or the fencing property of the lock instruction by detecting, from a thread other than the thread executing the lock instruction, a request to allow a store to occur to either the memory address specified by the lock instruction or to the memory address specified by a completed load instruction from the same thread as the lock instruction that is younger than the lock instruction; and responsive to detecting the violation, rolling back state of the processor via the checkpoint state.
2. The method of claim 1, wherein detecting the violation comprises: detecting violation of the atomic property prior to the store portion of the lock instruction being deemed to be committed to memory, by detecting a request to store to a memory address associated with the lock instruction.
3. The method of claim 1, wherein detecting the violation comprises: detecting violation of the fencing property prior to the store portion of the lock instruction being deemed to be committed to memory, by detecting a request to store to a memory address associated with a load instruction younger than the lock instruction.
4. The method of claim 1, wherein rolling back state of the processor via the checkpoint state comprises: restoring architectural register values based on the checkpoint state; restoring a program counter of the processor based on the checkpoint state; and canceling pending instructions younger than the lock instruction.
5. The method of claim 1, further comprising: executing the lock instruction in a slow mode, wherein in the slow mode, the lock instruction is not executed until stores older than the lock instruction have had values committed to memory, and wherein in the slow mode, instructions younger than the lock instruction are not permitted to execute until the store portion of the lock instruction is deemed to be committed to memory.
6. The method of claim 1, further comprising: retiring a second lock instruction, wherein the second lock instruction remains pending after retirement until a store portion of the second lock instruction is deemed to be committed to memory; saving a second checkpoint state of the processor responsive to retiring the second lock instruction, the second checkpoint state including data for rolling back state of the processor in the event of a violation of an atomic property of the second lock instruction or a fencing property of the second lock instruction; speculatively retiring an instruction younger than the second lock instruction after retiring the second lock instruction but before the store portion of the second lock instruction is committed to memory; detecting no violation of the atomic property or the fencing property prior to the store portion of the lock instruction being deemed to be committed to memory; and responsive to detecting no violation, releasing the checkpoint state and removing post-retirement entries from a load ordering queue.
7. The method of claim 1, wherein the checkpoint state includes values of architectural registers and flags of the processor.
8. The method of claim 1, further comprising: preventing retirement of instructions that write to registers not saved by the checkpoint state prior to the store portion of the lock instruction being deemed to be committed to memory.
9. A processor for speculatively retiring instructions younger than a lock instruction, the processor comprising: a retire unit configured to retire the lock instruction, wherein the lock instruction remains pending after retirement until a store portion of the lock instruction is committed to memory; a checkpoint unit configured to save a checkpoint state of the processor responsive to the lock instruction retiring, the checkpoint state including data for rolling back state of the processor in the event of violation of an atomic property of the lock instruction or a fencing property of the lock instruction; and a load/store unit, wherein the retire unit is also configured to speculatively retire an instruction younger than the lock instruction after retiring the lock instruction but before the store portion of the lock instruction is deemed to be committed to memory, and wherein the load/store unit is configured to: detect violation of one of the atomic property or the fencing property of the lock instruction by detecting, from a thread other than the thread executing the lock instruction, a request to allow a store to occur to either the memory address specified by the lock instruction or to the memory address specified by a completed load instruction from the same thread as the lock instruction that is younger than the lock instruction, and responsive to detecting the violation, roll back state of the processor via the checkpoint state.
10. The processor of claim 9, wherein the load/store unit is configured to detect the violation by: detecting violation of the atomic property prior to the store portion of the lock instruction being deemed to be committed to memory, by detecting a request to store to a memory address associated with the lock instruction.
11. The processor of claim 9, wherein the load/store unit is configured to detect the violation by: detecting violation of the fencing property prior to the store portion of the lock instruction being deemed to be committed to memory, by detecting a request to store to a memory address associated with a load instruction younger than the lock instruction.
12. The processor of claim 9, wherein rolling back state of the processor via the checkpoint state comprises: restoring architectural register values based on the checkpoint state; restoring a program counter of the processor based on the checkpoint state; and canceling pending instructions younger than the lock instruction.
13. The processor of claim 9, wherein: the load/store unit is configured to cause at least one of the load/store unit and functional units of the processor to execute the lock instruction in a slow mode, wherein in the slow mode, the lock instruction is not executed until stores older than the lock instruction have had values committed to memory, and wherein in the slow mode, instructions younger than the lock instruction are not permitted to execute until the store portion of the lock instruction is deemed to be committed to memory.
14. The processor of claim 9, wherein: the retire unit is further configured to retire a second lock instruction, wherein the second lock instruction remains pending after retirement until a store portion of the second lock instruction is deemed to be committed to memory; the checkpoint unit is further configured to save a second checkpoint state of the processor responsive to retiring the second lock instruction, the second checkpoint state including data for rolling
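The violation check recited in claims 1 to 3 and 9 to 11 can be sketched as a simple classification of incoming store requests. This is a hedged illustration, not the load/store unit's actual logic: `PendingLock`, `classify_probe`, and the exact-address comparison are assumptions made for the example, and real hardware may track addresses at a different granularity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PendingLock:
    """Hypothetical record of a retired lock whose store is uncommitted."""
    thread_id: int
    lock_addr: int
    # Addresses read by completed loads younger than the lock (same thread).
    younger_load_addrs: frozenset


def classify_probe(lock: PendingLock, probe_thread: int,
                   probe_addr: int) -> Optional[str]:
    """Return the property violated by another thread's store request."""
    if probe_thread == lock.thread_id:
        return None          # only requests from other threads count
    if probe_addr == lock.lock_addr:
        return "atomic"      # store request targets the lock's own address
    if probe_addr in lock.younger_load_addrs:
        return "fencing"     # store request targets a younger load's address
    return None              # unrelated address: speculation stays valid


pending = PendingLock(thread_id=0, lock_addr=0x1000,
                      younger_load_addrs=frozenset({0x2000, 0x2040}))
result = classify_probe(pending, probe_thread=1, probe_addr=0x2000)
```

Any non-`None` result would trigger the rollback via the checkpoint state that the claims describe.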
by using speculative mechanisms · CPC title
Key-lock mechanism · CPC title
using multiple copies of the architectural state, e.g. shadow registers · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Speculative instruction execution · CPC title