Result bypass cache

US9600288B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9600288-B1
Application number: US-201213465372-A
Country: US
Kind code: B1
Filing date: May 7, 2012
Priority date: Jul 18, 2011
Publication date: Mar 21, 2017
Grant date: Mar 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for efficiently accessing operands in a datapath. An apparatus includes a data operand register file and an execution pipeline with multiple stages. In addition, the apparatus includes a result bypass cache configured to store data results conveyed by at least the final stage of the execution pipeline stage. Control logic is included which is configured to determine whether source operands for an instruction entering the pipeline are available in the last stage of the pipeline or in the result bypass cache. If the source operands are available in the last stage of the pipeline or the result bypass cache, they may be obtained from one of those locations rather than reading from the register file. If the source operands are not available from the last stage or the result bypass cache, then they may be obtained from the data operand register file.
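The operand lookup order described in the abstract can be illustrated with a small software model. This is only a sketch: the class name `Datapath`, the method `read_operand`, and the dictionary-based storage are hypothetical and not taken from the patent. Checking the last pipeline stage before the bypass cache is also an assumption, since the abstract does not state a priority between those two sources.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Datapath:
    """Toy model of the three operand sources named in the abstract."""
    register_file: Dict[int, int] = field(default_factory=dict)
    bypass_cache: Dict[int, int] = field(default_factory=dict)
    # (destination register, value) held in the last pipeline stage, if any.
    last_stage: Optional[Tuple[int, int]] = None

    def read_operand(self, reg: int) -> Tuple[str, int]:
        """Return (source name, value) for one source operand."""
        # Assumption: the in-flight result in the last stage is checked first.
        if self.last_stage is not None and self.last_stage[0] == reg:
            return "last_stage", self.last_stage[1]
        # Next, try the result bypass cache.
        if reg in self.bypass_cache:
            return "bypass_cache", self.bypass_cache[reg]
        # Otherwise fall back to reading the register file.
        return "register_file", self.register_file[reg]


dp = Datapath(
    register_file={1: 10, 2: 20, 3: 30},
    bypass_cache={2: 21},
    last_stage=(3, 33),
)
sources = [dp.read_operand(r) for r in (1, 2, 3)]
```

In this model only register 1 requires a register file read; registers 2 and 3 are served from the bypass cache and the last stage respectively.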

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a data operand register file; one or more execution pipeline stages; an operand cache configured to store data results conveyed by the one or more execution pipeline stages; and control logic, wherein the control logic is configured to: select a source for a data operand of an instruction from the operand cache or a given stage of the pipeline, in response to determining the data operand is available in the operand cache or the given stage of the pipeline; select a source for a data operand of an instruction from the data operand register file, in response to determining the data operand is not available in either the operand cache or the given stage of the pipeline; write a result of a producer instruction into the operand cache from the one or more execution pipeline stages when a distance between the producer instruction and a consumer instruction is less than a given distance, wherein the distance is measured as a number of instructions in-program-order between the producer instruction and the consumer instruction; and prevent writing the result into the operand cache and write the result into the data operand register file when the distance is greater than the given distance.

2. The apparatus as recited in claim 1, wherein the operand cache stores data comprising one or more of result data conveyed by producer instructions and data conveyed from the operand register file prior to the one or more execution pipeline stages.

3. The apparatus as recited in claim 1, wherein in response to determining a data result is selected to be removed from the operand cache, the control logic is further configured to: prevent the data result from being written into the data operand register file, in response to determining the data result stored in the operand cache is marked with a last-use indication that indicates a last consumer instruction of the data result is within a number of instructions of an instruction that produces the data result; and write the data result into the data operand register file, in response to determining the data result is not marked with the last-use indication.

4. The apparatus as recited in claim 1, wherein the given distance is based on a size of the operand cache.

5. The apparatus as recited in claim 1, wherein the control logic is further configured to store a data value read from the register file in the operand cache, in response to detecting a hint that the data value is to be prefetched for a subsequent instruction.

6. The apparatus as recited in claim 1, wherein said result of the producer instruction is written into the operand cache when the producer instruction is in the last stage of the pipeline.

7. The apparatus as recited in claim 3, wherein each stage of the one or more execution pipeline stages executes an instruction of a thread different from another thread in any other stage of the one or more execution pipeline stages.

8. The apparatus as recited in claim 1, wherein the given distance is based on a number of results permitted to be stored in the operand cache for an associated thread of a plurality of threads.

9. A method comprising: storing data results conveyed by one or more execution pipeline stages of a pipeline in an operand cache; selecting a source for a data operand of an instruction from the operand cache or a given stage of the one or more execution pipeline stages of the pipeline, in response to determining the data operand is available in the operand cache or the given stage of the pipeline; selecting a source for a data operand of an instruction from the data operand register file, in response to determining the data operand is not available in either the operand cache or the given stage of the pipeline; writing a result of a producer instruction into the operand cache from the one or more execution pipeline stages when a distance between the producer instruction and a consumer instruction is less than a given distance, wherein the distance is measured as a number of instructions in-program-order between the producer instruction and the consumer instruction; and preventing writing the result into the operand cache and writing the result into the data operand register file when the distance is greater than the given distance.

10. The method as recited in claim 9, wherein in response to determining a data result is selected to be removed from the operand cache, the method further comprises: preventing the data result from being written into the data operand register file, in response to determining the data result stored in the operand cache is marked with a last-use indication that indicates a last consumer instruction of the data result is within a number of instructions of an instruction that produces the data result; and writing the data result into the data operand register file, in response to determining the data result is not marked with the last-use indication.

11. The method as recited in claim 9, further comprising storing a data value read from the register file in the operand cache, in response to detecting a hint that the data value is to be prefetched for a subsequent instruction.

12. The method as recited in claim 11, in response to receiving last-use hint information from a compiler for a third data result, further comprising canceling a write of the first data result to the data operand register file.

13. The method as recited in claim 11, further comprising storing data results conveyed by only the operand cache to the data operand register file.

14. The method as recited in claim 11, further comprising executing a maximum number of different threads equal to a number of the one or more execution pipeline stages.

15. The method as recited in claim 14, further comprising executing single-instruction-multiple-data (SIMD) instructions in the one or more execution pipeline stages.

16. The method as recited in claim 9, further comprising storing data in the operand cache comprising one or more of result data conveyed by producer instructions and data conveyed from the operand register file prior to the one or more execution pipeline stages.

17. A processor comprising: a first execution core configured to execute general-purpose instructions; a second execution core comprising one or more pipeline stages of a pipeline; a scheduler configured to issue a given instruction either to the first or to the second execution core; wherein the second execution core is configured to: store data results conveyed one or more execution pipeline stages of the pipeline in an operand cache; select a source for a data operand of an instruction from the operand cache or a given stage of the one or more execution pipeline stages, in response to determining the data operand is available in the operand cache or the given stage of the pipeline; and select a source for a data operand of an instruction from a data operand register file, in response to determining the data operand is not available in either the operand cache or the given stage of the pipeline; write a result of a producer instruction into the operand cache from the one or more execution pipeline stages when a distance between the producer instruction and a consumer instruction is less than a given distance, wherein the distance is measured as a number of instructions in-program-order between the producer instruction and the consumer instruction; and prevent writing the result into the operand cache and write the result into the data operand register file when the distance is greater th
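The write policy of claim 1 and the last-use eviction rule of claim 3 can be sketched in software as follows. This is a hedged illustration, not the claimed hardware: `OperandCacheModel`, `write_result`, `evict_oldest`, and the `capacity` parameter are hypothetical names. The oldest-first eviction order is an assumption, as is treating a distance equal to the given distance like the "greater than" case, which the claims leave unspecified.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entry:
    value: int
    last_use: bool  # True if the last consumer is marked as being nearby


class OperandCacheModel:
    def __init__(self, given_distance: int, capacity: int) -> None:
        self.given_distance = given_distance
        self.capacity = capacity  # assumption: fixed number of cache entries
        self.cache: "OrderedDict[int, Entry]" = OrderedDict()
        self.register_file: Dict[int, int] = {}

    def write_result(self, reg: int, value: int, distance: int,
                     last_use: bool = False) -> str:
        """Route a producer's result based on producer-consumer distance."""
        if distance < self.given_distance:
            # Near consumer: keep the result in the operand cache.
            if reg not in self.cache and len(self.cache) >= self.capacity:
                self.evict_oldest()
            self.cache[reg] = Entry(value, last_use)
            self.cache.move_to_end(reg)
            return "operand_cache"
        # Far consumer (or equal distance, by assumption): skip the cache.
        self.cache.pop(reg, None)  # drop any stale cached copy
        self.register_file[reg] = value
        return "register_file"

    def evict_oldest(self) -> None:
        """Remove the oldest entry; write back unless marked last-use."""
        reg, entry = self.cache.popitem(last=False)
        if not entry.last_use:
            self.register_file[reg] = entry.value


m = OperandCacheModel(given_distance=4, capacity=2)
m.write_result(1, 100, distance=1, last_use=True)  # cached, marked last-use
m.write_result(2, 200, distance=2)                 # cached
m.write_result(3, 300, distance=9)                 # goes to the register file
m.write_result(4, 400, distance=1)                 # evicts reg 1, no write-back
m.write_result(5, 500, distance=1)                 # evicts reg 2, written back
```

In this run, the value for register 1 never reaches the register file because it was marked last-use, which is the write traffic the last-use indication is meant to avoid.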

Classifications

  • Operand accessing · CPC title

  • G06F9/383 · Primary

    Operand prefetching (cache prefetching G06F12/0862) · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • from multiple instruction streams, e.g. multistreaming · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9600288B1 cover?
A system and method for efficiently accessing operands in a datapath. An apparatus includes a data operand register file and an execution pipeline with multiple stages. In addition, the apparatus includes a result bypass cache configured to store data results conveyed by at least the final stage of the execution pipeline stage. Control logic is included which is configured to determine whether …
Who is the assignee on this patent?
Potter Terence M, Olson Timothy A, Blomgren James S, and 4 more
What technology area does this patent fall under?
Primary CPC classification G06F9/383. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 21, 2017 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).