Memory performance when speculation control is enabled, and instruction therefor
US-2015378915-A1 · Dec 31, 2015 · US
US10409612B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10409612-B2 |
| Application number | US-201514998249-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 26, 2015 |
| Priority date | Feb 2, 2012 |
| Publication date | Sep 10, 2019 |
| Grant date | Sep 10, 2019 |
Official abstract text for this publication.
An apparatus and method is described herein for providing robust speculative code section abort control mechanisms. Hardware is able to track speculative code region abort events, conditions, and/or scenarios, such as an explicit abort instruction, a data conflict, a speculative timer expiration, a disallowed instruction attribute or type, etc. And hardware, firmware, software, or a combination thereof makes an abort determination based on the tracked abort events. As an example, hardware may make an initial abort determination based on one or more predefined events or choose to pass the event information up to a firmware or software handler to make such an abort determination. Upon determining an abort of a speculative code region is to be performed, hardware, firmware, software, or a combination thereof performs the abort, which may include following a fallback path specified by hardware or software. And to enable testing of such a fallback path, in one implementation, hardware provides software a mechanism to always abort speculative code regions.
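The abort determination described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the bit positions, the names `AbortEvent`, `should_abort`, and `run_region`, and the rule that any tracked event triggers an abort are all hypothetical. The abstract lists the event types and the always-abort mechanism but does not fix an encoding or a decision policy.

```python
from enum import IntFlag


class AbortEvent(IntFlag):
    """Hypothetical bit layout; the abstract names these events, not their positions."""
    EXPLICIT_ABORT = 1 << 0    # explicit abort instruction
    DATA_CONFLICT = 1 << 1     # data conflict
    TIMER_EXPIRED = 1 << 2     # speculative timer expiration
    DISALLOWED_INSTR = 1 << 3  # disallowed instruction attribute or type
    ALWAYS_ABORT = 1 << 4      # test aid: abort every speculative region


def should_abort(tracked: AbortEvent, observed: AbortEvent) -> bool:
    """Assumed policy: abort if always-abort is set or any tracked event was observed."""
    if tracked & AbortEvent.ALWAYS_ABORT:
        return True
    return bool(tracked & observed)


def run_region(speculative, fallback, tracked: AbortEvent, observed: AbortEvent):
    """Run the speculative path, or the fallback path when an abort is decided."""
    if should_abort(tracked, observed):
        return fallback()
    return speculative()
```

Setting `ALWAYS_ABORT` in the tracked mask forces `run_region` onto the fallback path regardless of what was observed, which mirrors how the abstract says software can exercise a fallback path for testing.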
Opening claim text (preview).
What is claimed is:
1. A processor comprising: a plurality of cores, one or more of the plurality of cores to concurrently execute multiple threads; one or more of the plurality of cores to perform out-of-order execution of instructions of the multiple threads; and one or more of the plurality of cores comprising: instruction fetch circuitry to fetch the instructions of one or more of the multiple threads, instruction decode circuitry to decode the instructions, register renaming circuitry to rename one or more registers within a register file, a data cache to cache data, a translation lookaside buffer to store virtual to physical address translations, a second level cache unit to cache instructions and data, transaction processing circuitry to process a transactional region of instructions including load instructions and store instructions, the transaction processing circuitry to process a transaction end instruction to indicate an end of a transaction execution region and to cause memory transactions to be atomically committed or aborted, wherein the transactional region of instructions is validated and the transactional region of instructions is committed or aborted based on the validation in response to the transaction end instruction, wherein the transaction end instruction is globally ordered and atomic, transaction checkpoint circuitry to store an architectural state responsive to initiation of the transactional region of instructions, transaction status circuitry including a programmable failure indication associated with one or more transactions, a first transaction to fail or proceed based on its associated failure indication, the transaction status circuitry including an abort events register to define a plurality of abort events to be tracked, the abort events register including a bit map of bit positions that each represent a different abort condition, the bit map including an always abort bit that causes all speculative code regions to abort when set, the transaction status circuitry including an enable/disable register to enable/disable tracking of abort events, wherein different levels of access to the enable/disable register are provided based on privilege level, wherein a first privilege level can control a first entry in the enable/disable register and a second privilege level can control a second entry in the enable disable register, wherein when at least one abort event is detected it is sent to a microcode handler in a firmware layer to determine whether to abort the speculative code regions, circuitry to roll back operations performed by the first transaction using the architectural state stored by the transaction checkpoint circuitry responsive to a failure of the first transaction, and lock elision circuitry to cause critical sections of instructions to execute as transactions on multiple threads without acquiring a lock, the lock elision circuitry to cause one or more of the critical sections to be re-executed non-speculatively using one or more locks in response to detecting a transaction failure, the lock elision circuitry including a lock elision buffer including a memory address and a lock value to be stored thereto and used to perform a late lock acquire or subsequent execution.
2. The processor of claim 1 further comprising: commit circuitry to make results generated by the transactional region of instructions globally visible to one or more of the multiple threads including one or more other transactional regions of instructions.
3. The processor of claim 2 wherein the commit circuitry is to make the results globally visible only when no failure indication is detected.
4. The processor of claim 3 wherein the transaction status circuitry comprises a failure register to store the failure indication.
5. A processor comprising: means for executing multiple threads on a plurality of cores; out-of-order instruction execution means of at least one of the plurality of cores to perform out-of-order execution of instructions of the multiple threads; one or more of the plurality of cores comprising: instruction fetch means to fetch the instructions of one or more of the multiple threads, instruction decode means to decode the instructions, register renaming means to rename one or more registers within a register file, data cache means to cache data, translation lookaside buffer means to store virtual to physical address translations, second level cache means to cache instructions and data, and transaction processing means to process a transactional region of instructions including load instructions and store instructions, the transaction processing means to process a transaction end instruction to indicate an end of a transaction execution region and to cause memory transactions to be atomically committed or aborted, wherein the transactional region of instructions is validated and the transactional region of instructions is committed or aborted based on the validation in response to the transaction end instruction, wherein the transaction end instruction is globally ordered and atomic; transaction checkpoint means to store an architectural state responsive to initiation of the transactional region of instructions; transaction status means including a programmable failure indication associated with one or more transactions, a first transaction to fail or proceed based on its associated failure indication, the transaction status means including an abort events register to define a plurality of abort events to be tracked, the abort events register including a bit map of bit positions that each represent a different abort condition, the bit map including an always abort bit that causes all speculative code regions to abort when set, the transaction status means including an enable/disable register to enable/disable tracking of abort events, wherein different levels of access to the enable/disable register are provided based on privilege level, wherein a first privilege level can control a first entry in the enable/disable register and a second privilege level can control a second entry in the enable disable register, wherein when at least one abort event is detected it is sent to a microcode handler in a firmware layer to determine whether to abort the speculative code regions; circuitry to roll back operations performed by the first transaction using the architectural state stored by the transaction checkpoint means responsive to a failure of the first transaction; lock elision means to cause critical sections of instructions to execute as transactions on multiple threads without acquiring a lock, the lock elision means to cause one or more of the critical sections to be re-executed non-speculatively using one or more locks in response to detecting a transaction failure, the lock elision means including a lock elision buffer including a memory address and a lock value to be stored thereto and used to perform a late lock acquire or subsequent execution.
6. The processor of claim 5 further comprising: commit means to make results generated by the transactional region of instructions globally visible to one or more of the multiple threads including one or more other transactional regions of instructions.
7. The processor of claim 6 wherein the commit means is to make the results globally visible only when no failure indication is detected.
8. The processor of claim 7 wherein the transaction status means comprises a failure register to store the failure indication.
9. A method comprising: performing out-of-order execution of instructions for multiple threads on a plurality of cores; fetching instructions of one or more of the multiple threads, decoding the instructions, renaming
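Claims 1 and 5 recite an enable/disable register in which a first privilege level controls a first entry and a second privilege level controls a second entry. A rough software model of that access rule might look like the following. The class name `EnableDisableRegister`, the entry indices, and the privilege labels `"user"` and `"supervisor"` are hypothetical illustrations; the claims do not name the privilege levels or specify how a disallowed write is handled (raising an error here is an assumption).

```python
class EnableDisableRegister:
    """Toy model: each entry can be written only by the privilege level that owns it."""

    def __init__(self, owners):
        # owners maps entry index -> privilege level allowed to write it (assumed layout)
        self._owners = dict(owners)
        self._bits = {entry: False for entry in self._owners}

    def write(self, entry, value, privilege):
        """Set or clear an entry; reject writes from a non-owning privilege level."""
        if self._owners.get(entry) != privilege:
            raise PermissionError(f"{privilege!r} may not write entry {entry}")
        self._bits[entry] = bool(value)

    def tracking_enabled(self, entry):
        """Report whether abort-event tracking is enabled for this entry."""
        return self._bits[entry]
```

For example, with `EnableDisableRegister({0: "user", 1: "supervisor"})`, a `"user"` write to entry 0 succeeds while a `"user"` write to entry 1 is rejected.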
Maintaining memory consistency · CPC title
Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title
Synchronisation or serialisation instructions · CPC title
Speculative instruction execution · CPC title
on a serial bus, e.g. I2C bus, SPI bus (on daisy chain buses G06F13/4247) · CPC title