US9600288B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9600288-B1 |
| Application number | US-201213465372-A |
| Country | US |
| Kind code | B1 |
| Filing date | May 7, 2012 |
| Priority date | Jul 18, 2011 |
| Publication date | Mar 21, 2017 |
| Grant date | Mar 21, 2017 |
Official abstract text for this publication.
A system and method for efficiently accessing operands in a datapath. An apparatus includes a data operand register file and an execution pipeline with multiple stages. In addition, the apparatus includes a result bypass cache configured to store data results conveyed by at least the final stage of the execution pipeline stage. Control logic is included which is configured to determine whether source operands for an instruction entering the pipeline are available in the last stage of the pipeline or in the result bypass cache. If the source operands are available in the last stage of the pipeline or the result bypass cache, they may be obtained from one of those locations rather than reading from the register file. If the source operands are not available from the last stage or the result bypass cache, then they may be obtained from the data operand register file.
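The operand lookup described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Datapath` class, its field names, and the choice to check the last pipeline stage before the bypass cache are all assumptions made for the example (the abstract does not state a priority between the two).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Datapath:
    """Hypothetical model of the three operand sources named in the abstract."""
    register_file: Dict[int, int] = field(default_factory=dict)
    bypass_cache: Dict[int, int] = field(default_factory=dict)
    # (destination register, value) currently held in the final pipeline stage.
    last_stage: Optional[Tuple[int, int]] = None

    def read_operand(self, reg: int) -> Tuple[str, int]:
        """Return (source name, value) for a source operand.

        Assumption: the last stage is checked first because it would hold
        the most recently produced value for a register.
        """
        if self.last_stage is not None and self.last_stage[0] == reg:
            return "last_stage", self.last_stage[1]
        if reg in self.bypass_cache:
            return "bypass_cache", self.bypass_cache[reg]
        # Fall back to the register file only when neither bypass source hits.
        return "register_file", self.register_file[reg]
```

In this sketch a register-file read happens only on a miss in both bypass sources, which is the access-saving behavior the abstract describes.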
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a data operand register file; one or more execution pipeline stages; an operand cache configured to store data results conveyed by the one or more execution pipeline stages; and control logic, wherein the control logic is configured to: select a source for a data operand of an instruction from the operand cache or a given stage of the pipeline, in response to determining the data operand is available in the operand cache or the given stage of the pipeline; select a source for a data operand of an instruction from the data operand register file, in response to determining the data operand is not available in either the operand cache or the given stage of the pipeline; write a result of a producer instruction into the operand cache from the one or more execution pipeline stages when a distance between the producer instruction and a consumer instruction is less than a given distance, wherein the distance is measured as a number of instructions in-program-order between the producer instruction and the consumer instruction; and prevent writing the result into the operand cache and write the result into the data operand register file when the distance is greater than the given distance.
2. The apparatus as recited in claim 1, wherein the operand cache stores data comprising one or more of result data conveyed by producer instructions and data conveyed from the operand register file prior to the one or more execution pipeline stages.
3. The apparatus as recited in claim 1, wherein in response to determining a data result is selected to be removed from the operand cache, the control logic is further configured to: prevent the data result from being written into the data operand register file, in response to determining the data result stored in the operand cache is marked with a last-use indication that indicates a last consumer instruction of the data result is within a number of instructions of an instruction that produces the data result; and write the data result into the data operand register file, in response to determining the data result is not marked with the last-use indication.
4. The apparatus as recited in claim 1, wherein the given distance is based on a size of the operand cache.
5. The apparatus as recited in claim 1, wherein the control logic is further configured to store a data value read from the register file in the operand cache, in response to detecting a hint that the data value is to be prefetched for a subsequent instruction.
6. The apparatus as recited in claim 1, wherein said result of the producer instruction is written into the operand cache when the producer instruction is in the last stage of the pipeline.
7. The apparatus as recited in claim 3, wherein each stage of the one or more execution pipeline stages executes an instruction of a thread different from another thread in any other stage of the one or more execution pipeline stages.
8. The apparatus as recited in claim 1, wherein the given distance is based on a number of results permitted to be stored in the operand cache for an associated thread of a plurality of threads.
9. A method comprising: storing data results conveyed by one or more execution pipeline stages of a pipeline in an operand cache; selecting a source for a data operand of an instruction from the operand cache or a given stage of the one or more execution pipeline stages of the pipeline, in response to determining the data operand is available in the operand cache or the given stage of the pipeline; selecting a source for a data operand of an instruction from the data operand register file, in response to determining the data operand is not available in either the operand cache or the given stage of the pipeline; writing a result of a producer instruction into the operand cache from the one or more execution pipeline stages when a distance between the producer instruction and a consumer instruction is less than a given distance, wherein the distance is measured as a number of instructions in-program-order between the producer instruction and the consumer instruction; and preventing writing the result into the operand cache and writing the result into the data operand register file when the distance is greater than the given distance.
10. The method as recited in claim 9, wherein in response to determining a data result is selected to be removed from the operand cache, the method further comprises: preventing the data result from being written into the data operand register file, in response to determining the data result stored in the operand cache is marked with a last-use indication that indicates a last consumer instruction of the data result is within a number of instructions of an instruction that produces the data result; and writing the data result into the data operand register file, in response to determining the data result is not marked with the last-use indication.
11. The method as recited in claim 9, further comprising storing a data value read from the register file in the operand cache, in response to detecting a hint that the data value is to be prefetched for a subsequent instruction.
12. The method as recited in claim 11, in response to receiving last-use hint information from a compiler for a third data result, further comprising canceling a write of the first data result to the data operand register file.
13. The method as recited in claim 11, further comprising storing data results conveyed by only the operand cache to the data operand register file.
14. The method as recited in claim 11, further comprising executing a maximum number of different threads equal to a number of the one or more execution pipeline stages.
15. The method as recited in claim 14, further comprising executing single-instruction-multiple-data (SIMD) instructions in the one or more execution pipeline stages.
16. The method as recited in claim 9, further comprising storing data in the operand cache comprising one or more of result data conveyed by producer instructions and data conveyed from the operand register file prior to the one or more execution pipeline stages.
17. A processor comprising: a first execution core configured to execute general-purpose instructions; a second execution core comprising one or more pipeline stages of a pipeline; a scheduler configured to issue a given instruction either to the first or to the second execution core; wherein the second execution core is configured to: store data results conveyed one or more execution pipeline stages of the pipeline in an operand cache; select a source for a data operand of an instruction from the operand cache or a given stage of the one or more execution pipeline stages, in response to determining the data operand is available in the operand cache or the given stage of the pipeline; and select a source for a data operand of an instruction from a data operand register file, in response to determining the data operand is not available in either the operand cache or the given stage of the pipeline; write a result of a producer instruction into the operand cache from the one or more execution pipeline stages when a distance between the producer instruction and a consumer instruction is less than a given distance, wherein the distance is measured as a number of instructions in-program-order between the producer instruction and the consumer instruction; and prevent writing the result into the operand cache and write the result into the data operand register file when the distance is greater th
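The write policy recited in the independent claims, together with the last-use eviction rule of claims 3 and 10, might be modeled roughly as below. This is a sketch under stated assumptions: the `OperandCache` class and its method names are hypothetical, eviction is first-in-first-out (the claims do not name a replacement policy), and a distance exactly equal to the given distance is sent to the register file (the claims only address "less than" and "greater than").

```python
from collections import OrderedDict
from typing import Dict


class OperandCache:
    """Hypothetical model of the claimed producer/consumer distance policy."""

    def __init__(self, capacity: int, register_file: Dict[int, int]) -> None:
        self.capacity = capacity
        self.register_file = register_file
        # Maps register number -> (value, last_use flag), oldest entry first.
        self.entries = OrderedDict()

    def write_result(self, reg: int, value: int, distance: int,
                     given_distance: int, last_use: bool = False) -> str:
        """Route a producer result; returns the name of the destination."""
        if distance < given_distance:
            # Consumer is close: keep the result in the operand cache.
            if reg not in self.entries and len(self.entries) >= self.capacity:
                self._evict_oldest()
            self.entries[reg] = (value, last_use)
            self.entries.move_to_end(reg)
            return "operand_cache"
        # Consumer is far away: bypass the cache and write the register file.
        # Assumption: any stale cached copy of this register is dropped.
        self.entries.pop(reg, None)
        self.register_file[reg] = value
        return "register_file"

    def _evict_oldest(self) -> None:
        """Remove the oldest entry (FIFO is an assumption of this sketch)."""
        reg, (value, last_use) = self.entries.popitem(last=False)
        if not last_use:
            # Not marked last-use: a later consumer may still need the value.
            self.register_file[reg] = value
        # Marked last-use: the register-file write is skipped entirely.
```

Per claim 4, `given_distance` could be derived from `capacity`; the sketch leaves it as a caller-supplied parameter so that either choice can be explored.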
Operand accessing · CPC title
Operand prefetching (cache prefetching G06F12/0862) · CPC title
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title
from multiple instruction streams, e.g. multistreaming · CPC title
controlled by a single instruction for multiple data lanes [SIMD] · CPC title