Memory performance when speculation control is enabled, and instruction therefor
US-2015378915-A1 · Dec 31, 2015 · US
US9830158B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9830158-B2 |
| Application number | US-201113289643-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 4, 2011 |
| Priority date | Nov 4, 2011 |
| Publication date | Nov 28, 2017 |
| Grant date | Nov 28, 2017 |
Official abstract text for this publication.
One embodiment of the present invention sets forth a technique for speculatively issuing instructions to allow a processing pipeline to continue to process some instructions during rollback of other instructions. A scheduler circuit issues instructions for execution assuming that, several cycles later, when the instructions reach multithreaded execution units, dependencies between the instructions will be resolved, resources will be available, operand data will be available, and other conditions will not prevent execution of the instructions. When a rollback condition exists at the point of execution for an instruction for a particular thread group, the instruction is not dispatched to the multithreaded execution units. However, other instructions issued by the scheduler circuit for execution by different thread groups, and for which a rollback condition does not exist, are executed by the multithreaded execution units. The instruction incurring the rollback condition is reissued after the rollback condition no longer exists.
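The behavior described in the abstract can be illustrated with a small cycle-level model. This is a minimal sketch under stated assumptions, not the patented circuit: the function `simulate`, the round-robin issue policy, the fixed `depth` between issue and dispatch, the per-group `epoch` counter used to recognize stale in-flight instructions, and the `blocked` table (which pretends the cycle at which a rollback condition clears is known in advance) are all hypothetical and chosen only to keep the example short.

```python
from collections import deque

def simulate(programs, blocked, depth=4, max_cycles=100):
    """Toy model of speculative issue with per-thread-group rollback.

    programs: {group_id: number_of_instructions}
    blocked:  {(group_id, pc): first cycle at which the instruction may execute}
    Returns (executed, discarded) as lists of (group_id, pc).
    """
    next_pc = {g: 0 for g in programs}    # scheduler's speculative issue pointer
    epoch = {g: 0 for g in programs}      # bumped on every rollback of the group
    resume_at = {g: 0 for g in programs}  # group may not issue before this cycle
    in_flight = deque()                   # (arrival_cycle, group, pc, epoch)
    executed, discarded = [], []
    order = sorted(programs)
    turn = 0

    for cycle in range(max_cycles):
        # Dispatcher: examine the instruction reaching the execution units.
        if in_flight and in_flight[0][0] <= cycle:
            _, g, pc, ep = in_flight.popleft()
            if ep != epoch[g]:
                # Issued before the group's rollback: discard it.
                discarded.append((g, pc))
            elif cycle < blocked.get((g, pc), 0):
                # Rollback condition: do not dispatch, stop issuing for this
                # group, and rewind so the instruction is reissued later.
                epoch[g] += 1
                next_pc[g] = pc
                resume_at[g] = blocked[(g, pc)]
            else:
                executed.append((g, pc))

        # Scheduler: issue one instruction from the next eligible group.
        for i in range(len(order)):
            g = order[(turn + i) % len(order)]
            if cycle >= resume_at[g] and next_pc[g] < programs[g]:
                in_flight.append((cycle + depth, g, next_pc[g], epoch[g]))
                next_pc[g] += 1
                turn = (turn + i + 1) % len(order)
                break

        if not in_flight and all(next_pc[g] == programs[g] for g in programs):
            break
    return executed, discarded

# Group 0's first instruction hits a rollback condition that clears at cycle 10.
executed, discarded = simulate({0: 3, 1: 3}, {(0, 0): 10})
print("executed:", executed)
print("discarded:", discarded)
```

In this run, group 1 keeps executing while group 0 is rolled back, and group 0's instructions execute in program order once they are reissued.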
Opening claim text (preview).
The invention claimed is:
1. A method of performing rollback of speculatively issued instructions, the method comprising: issuing a first set of instructions for at least a first portion of threads in a first thread group that comprises a plurality of threads concurrently executing within a processing core; issuing a second set of instructions for at least a second portion of threads in the first thread group or for a second thread group that also comprises a plurality of threads concurrently executing within the processing core; detecting, by a dispatcher, a first rollback condition for at least one thread included in the at least a first portion of the first thread group during pre-execution processing of an instruction in the first set of instructions; transmitting, by the dispatcher to a scheduler, a rollback code identifying a cause of the first rollback condition; in response to receiving the rollback code, stopping, by the scheduler, issuing of additional instructions for the at least a first portion of threads in the first thread group; discarding in-flight instructions that have issued and have not yet begun executing as part of the first set of instructions; and while discarding the in-flight instructions, executing the second set of instructions.
2. The method of claim 1, further comprising: issuing a third set of instructions for a third thread group that also comprises a plurality of threads concurrently executing within the processing core; detecting a partial rollback condition for at least one thread in the third thread group during pre-execution processing of a first instruction in the third set of instructions; storing a partial rollback active mask indicating a first portion of threads in the third thread group that diverge for the first instruction; and executing the first instruction for a second portion of the threads in the third thread group that do not diverge for the first instruction.
3. The method of claim 2, further comprising, reissuing the first instruction for the third thread group; and executing the first instruction for the first portion of threads in the third thread group based on the partial rollback active mask.
4. The method of claim 1, wherein the instruction in the first set of instructions specifies an invalid super-scalar-pair of two operations that cannot be performed in parallel.
5. The method of claim 4, further comprising: issuing a first operation of the invalid super-scalar-pair as a first instruction for the at least a first portion of threads in the first thread group; and issuing a second operation of the invalid super-scalar-pair as a second instruction for the at least a first portion of threads in the first thread group.
6. The method of claim 1, wherein the instruction in the first set of instructions is a barrier synchronization instruction configured to synchronize the at least a first portion of threads in the first thread group with the at least a second portion of threads in the first thread group or the second thread group.
7. The method of claim 1, wherein the instruction in the first set of instructions specifies an operand that is corrupted.
8. The method of claim 1, further comprising: determining that the rollback condition is removed; and reissuing the first set of instructions for the at least a first portion of threads in the first thread group.
9. The method of claim 1, wherein the in-flight instructions are discarded after completing the pre-execution processing.
10. The method of claim 1, further comprising: determining that the rollback condition is removed; and reissuing the first set of instructions for the at least a first portion of threads in the first thread group before all of the in-flight instructions are discarded.
11. A system for scheduling compute tasks for execution, the system comprising: a memory that stores a first set of instructions for at least a first portion of threads in a first thread group and a second set of instructions for at least a second portion of threads in the first thread group or for a second thread group; a scheduler that: issues the first set of instructions for the at least a first portion of threads in the first thread group that comprises a plurality of threads concurrently executing within a processing core; issues the second set of instructions for the at least a second portion of threads in the first thread group or for the second thread group that also comprises a plurality of threads concurrently executing within the processing core; and in response to receiving a rollback code from a dispatcher, stops issuing additional instructions for the at least a first portion of threads in the first thread group when a first rollback condition is detected; the dispatcher that: detects the first rollback condition for at least one thread included in the at least a first portion of threads in the first thread group during pre-execution processing of an instruction in the first set of instructions; and transmits, to the scheduler, the rollback code identifying a cause of the first rollback condition; discards in-flight instructions that have issued and have not yet begun executing as part of the first set of instructions; and multiple execution units within the processing core that, while discarding the in-flight instructions, execute the second set of instructions.
12. The system of claim 11, wherein the scheduler further: issues a third set of instructions for a third thread group, the dispatcher further: detects a partial rollback condition for at least one thread in the third thread group during pre-execution processing of a first instruction in the third set of instructions; and stores a partial rollback active mask indicating a first portion of threads in the third thread group that diverge for the first instruction, and the multiple execution units: execute the first instruction for a second portion of the threads in the third thread group that do not diverge for the first instruction.
13. The system of claim 12, wherein the scheduler reissues the first instruction for the third thread group and the multiple execution units execute the first instruction for the first portion of threads in the third thread group based on the partial rollback active mask.
14. The system of claim 11, wherein the instruction in the first set of instructions specifies an invalid super-scalar-pair of two operations that cannot be performed in parallel.
15. The system of claim 14, wherein the scheduler: issues a first operation of the invalid super-scalar-pair as a first instruction for the at least a first portion of threads in the first thread group; and issues a second operation of the invalid super-scalar-pair as a second instruction for the at least a first portion of threads in the first thread group.
16. The system of claim 11, wherein the instruction in the first set of instructions is a barrier synchronization instruction configured to synchronize the at least a first portion of threads in the first thread group with the at least a second portion of threads in the first thread group or the second thread group.
17. The system of claim 11, wherein the instruction in the first set of instructions specifies an operand that is corrupted.
18. The system of claim 11, wherein the scheduler reissues the first set of instructions for the at least a first portion of threads in the first thread group after the rollback condition is removed.
19. The system of claim 11 , wherein th
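The partial rollback active mask recited in claims 2, 3, 12, and 13 can be pictured as a bitmask split over the threads of one thread group. The following is a minimal sketch, assuming an 8-thread group and a caller-supplied set of diverging lanes; the helper names `lanes`, `split_for_partial_rollback`, and `run`, and the use of a Python list as a per-thread register file, are hypothetical and not taken from the patent. What makes a thread "diverge" for a given instruction is outside the scope of the sketch.

```python
def lanes(mask, width):
    """Thread indices whose bit is set in mask."""
    return [t for t in range(width) if (mask >> t) & 1]

def split_for_partial_rollback(active_mask, diverging_lanes, width):
    """Split the active threads into those that execute on the first issue
    and those recorded in the partial rollback active mask for the reissue."""
    rollback_mask = 0
    for t in diverging_lanes:
        if 0 <= t < width:
            rollback_mask |= 1 << t
    rollback_mask &= active_mask              # only active threads can roll back
    first_pass_mask = active_mask & ~rollback_mask
    return first_pass_mask, rollback_mask

def run(op, regs, mask):
    """Apply op to the register of every thread selected by mask."""
    for t in lanes(mask, len(regs)):
        regs[t] = op(regs[t])

regs = list(range(8))                         # one register per thread
add100 = lambda x: x + 100

# First issue: threads 2 and 5 diverge, so only the other threads execute.
first_pass, pending = split_for_partial_rollback(0xFF, {2, 5}, 8)
run(add100, regs, first_pass)
after_first_issue = list(regs)

# Reissue: the stored mask selects exactly the threads that were held back.
run(add100, regs, pending)
print(after_first_issue, regs)
```

Each active thread executes the instruction exactly once across the two issues, which is the property the stored mask is meant to guarantee in this model.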
Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title
from multiple instruction streams, e.g. multistreaming · CPC title
Speculative instruction execution · CPC title
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title
Divergence aspects · CPC title