Coprocessor Register Renaming
US-2024045680-A1 · Feb 8, 2024 · US
US2016246728A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016246728-A1 |
| Application number | US-201514625956-A |
| Country | US |
| Kind code | A1 |
| Filing date | Feb 19, 2015 |
| Priority date | Feb 19, 2015 |
| Publication date | Aug 25, 2016 |
| Grant date | — |
Official abstract text for this publication.
Techniques are disclosed relating to register caching techniques for thread switches. In one embodiment, an apparatus includes a register file and caching circuitry. In this embodiment, the register file includes a plurality of registers and the caching circuitry is configured to store information that indicates threads that correspond to data stored in respective ones of the plurality of registers. In this embodiment, the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers. In some embodiments, the disclosed techniques may reduce context switch latency, reduce pressure on a data cache, and/or allow smaller slices of thread execution, for example.
Opening claim text (preview).
What is claimed is:
1. An apparatus, comprising: a register file comprising a plurality of registers; and caching circuitry configured to store information that indicates threads that correspond to data stored in registers of the plurality of registers; wherein the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers.
2. The apparatus of claim 1, further comprising: a memory, wherein the apparatus is configured to store data from the register file for the first and second threads in respective first and second backing memory regions in the memory based on context switches between threads; wherein the caching circuitry is configured to store the second valid data in the second backing memory region in response to an access of the second register by the first thread.
3. The apparatus of claim 2, wherein the caching circuitry is configured to restore the second valid data to the second register from the second backing memory region in response to an access of the second register by the second thread.
4. The apparatus of claim 2, wherein, at the point in time, the apparatus is configured to store information specifying that the first valid data is not currently stored in the first backing memory region and that the second valid data is not currently stored in the second backing memory region.
5. The apparatus of claim 2, wherein the caching circuitry is configured to store tag information for the second valid data when the second valid data is in the second register, wherein the tag information corresponds to a location in the second backing memory region for the second register; and wherein the caching circuitry is configured to store the second valid data in the location in the second backing memory region in response to the access of the second register by the first thread, wherein the access by the first thread uses a tag that does not match the tag information.
6. The apparatus of claim 2, wherein the memory comprises an on-chip memory.
7. The apparatus of claim 2, further comprising: a data cache configured to cache data for the register file, wherein the first and second backing memory regions are accessible to the caching circuitry without using the data cache.
8. The apparatus of claim 1, wherein the apparatus is configured to execute instructions of the first thread and not instructions of the second thread during a time interval that includes the point in time.
9. The apparatus of claim 1, wherein the caching circuitry includes a valid field, a modified field, and a tag field for one or more registers of the plurality of registers.
10. The apparatus of claim 1, further comprising: a plurality of different processing elements configured to separately execute threads using a plurality of respective register files; wherein a first one of the plurality of different processing elements that includes the register file is configured, in response to a write of particular data to a register corresponding to the second register in a different register file of the plurality of respective register files by a second one of the plurality of different processing elements, to write the particular data to the second register or to invalidate the second valid data in the second register.
11. The apparatus of claim 1, further comprising: a plurality of different processing elements configured to separately execute threads using a plurality of respective register files; wherein a first one of the plurality of different processing elements that includes the register file is configured to store valid, modified data in the register file for a thread that is executing on a second one of the plurality of different processing elements.
12. A method, comprising: caching circuitry, in a computing system that includes a register file that comprises a plurality of registers, storing information indicating threads that correspond to data stored in respective ones of the plurality of registers; and the computing system storing, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers.
13. The method of claim 12, further comprising: storing the second valid data in a backing memory region in a memory in response to an instruction in the first thread that accesses the second register.
14. The method of claim 13, further comprising: refraining from storing the second valid data in the backing memory region at least until an instruction from a thread other than the second thread is determined to access the second register.
15. The method of claim 13, further comprising: restoring the second valid data to the second register in response to an access of the second register by the second thread.
16. The method of claim 13, wherein the storing the second valid data in the backing memory region comprises storing without using a low-level data cache.
17. The method of claim 12, further comprising: maintaining validity information, modified information, and tag information for the register file.
18. An apparatus, comprising: a register file comprising a plurality of registers; caching circuitry configured to store information specifying threads that correspond to data stored in respective ones of the plurality of registers; a memory, wherein the apparatus is configured to store data from the register file for first and second threads in respective first and second backing memory regions in the memory; wherein the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers; and wherein the apparatus is configured to store the second valid data in the second backing memory region in response to an access of the second register by the first thread.
19. The apparatus of claim 18, wherein the second valid data is modified data.
20. The apparatus of claim 18, wherein the apparatus is configured to retrieve the second valid data from the backing memory region in response to an access of the second register by the second thread, subsequent to the apparatus storing the second valid data.
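The behavior recited in claims 1 to 5 and 9 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuitry the publication describes: the names `Entry`, `CachedRegisterFile`, `read`, `write`, and `spills` are hypothetical, the tag is simplified to the owning thread's id (the claims describe a tag that corresponds to a location in the backing memory region), and each backing memory region is modeled as a plain Python list.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Per-register bookkeeping: valid, modified, and tag fields (cf. claim 9)."""
    valid: bool = False
    modified: bool = False
    tag: int = -1   # assumption: the owning thread id stands in for the tag
    value: int = 0


class CachedRegisterFile:
    """Hypothetical model of a register file that is spilled lazily."""

    def __init__(self, num_regs, thread_ids):
        self.entries = [Entry() for _ in range(num_regs)]
        # One backing memory region per thread, modeled as a list of values.
        self.backing = {t: [0] * num_regs for t in thread_ids}
        self.spills = 0  # number of write-backs to backing memory

    def _claim(self, thread, reg, fill):
        """Make register `reg` belong to `thread`, spilling the old owner if needed."""
        e = self.entries[reg]
        if e.valid and e.tag == thread:
            return e  # tag match: the register already holds this thread's data
        if e.valid and e.modified:
            # Tag mismatch on modified data: save it to the old owner's region.
            self.backing[e.tag][reg] = e.value
            self.spills += 1
        e.tag, e.valid, e.modified = thread, True, False
        # A read restores from this thread's region; a write overwrites anyway.
        e.value = self.backing[thread][reg] if fill else 0
        return e

    def read(self, thread, reg):
        return self._claim(thread, reg, fill=True).value

    def write(self, thread, reg, value):
        e = self._claim(thread, reg, fill=False)
        e.value = value
        e.modified = True


rf = CachedRegisterFile(num_regs=4, thread_ids=[1, 2])
rf.write(2, 1, 222)          # thread 2 runs and writes register 1
rf.write(1, 0, 111)          # switch to thread 1: nothing is saved up front
print(rf.spills)             # registers 0 and 1 hold valid data of different threads
rf.read(1, 1)                # thread 1 touches register 1: thread 2's data is spilled
print(rf.read(2, 1))         # thread 2 touches register 1 again: data is restored
```

In this model a context switch by itself moves no data. A register is written back only when a different thread actually accesses it, which is the deferral described in claim 14, and it is reloaded only when its original thread accesses it again.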
according to context, e.g. thread buffers · CPC title
Instruction code · CPC title
Register arrangements · CPC title
with dedicated cache, e.g. instruction or stack · CPC title
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title