Register caching techniques for thread switches

US9817664B2 · US · B2

Patent metadata
Publication number: US-9817664-B2
Application number: US-201514625956-A
Country: US
Kind code: B2
Filing date: Feb 19, 2015
Priority date: Feb 19, 2015
Publication date: Nov 14, 2017
Grant date: Nov 14, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed relating to register caching techniques for thread switches. In one embodiment, an apparatus includes a register file and caching circuitry. In this embodiment, the register file includes a plurality of registers and the caching circuitry is configured to store information that indicates threads that correspond to data stored in respective ones of the plurality of registers. In this embodiment, the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers. In some embodiments, the disclosed techniques may reduce context switch latency, reduce pressure on a data cache, and/or allow smaller slices of thread execution, for example.
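The behavior described in the abstract can be illustrated with a small software model. The following is a minimal sketch only, not the patented circuitry: the class names (`RegEntry`, `CachedRegisterFile`), the dictionary used as per-thread backing memory, and the `spills` counter are all hypothetical names introduced for illustration. It shows registers tagged with an owning thread, data from two threads resident at the same time, and a spill to backing memory only when a different thread accesses an occupied register.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegEntry:
    """One physical register plus hypothetical per-register cache metadata."""
    value: int = 0
    valid: bool = False
    modified: bool = False
    owner: Optional[int] = None  # thread id acting as the tag


class CachedRegisterFile:
    """Illustrative model: the register file behaves like a cache of
    per-thread backing regions (assumption: one dict per thread)."""

    def __init__(self, num_regs: int):
        self.regs = [RegEntry() for _ in range(num_regs)]
        self.backing = {}  # thread id -> {register index: value}
        self.spills = 0    # number of write-backs to backing memory

    def _claim(self, tid: int, idx: int) -> RegEntry:
        entry = self.regs[idx]
        if entry.valid and entry.owner != tid:
            # Another thread's data is resident: save it only if modified.
            if entry.modified:
                self.backing.setdefault(entry.owner, {})[idx] = entry.value
                self.spills += 1
            entry.valid = False
        if not entry.valid:
            # Fill from the accessing thread's backing region (0 if absent).
            entry.value = self.backing.get(tid, {}).get(idx, 0)
            entry.valid, entry.modified, entry.owner = True, False, tid
        return entry

    def read(self, tid: int, idx: int) -> int:
        return self._claim(tid, idx).value

    def write(self, tid: int, idx: int, value: int) -> None:
        entry = self._claim(tid, idx)
        entry.value, entry.modified = value, True


rf = CachedRegisterFile(4)
rf.write(1, 0, 11)  # thread 1 writes register 0
rf.write(2, 1, 22)  # thread 2 writes register 1; register 0 is untouched
both_resident = rf.regs[0].owner == 1 and rf.regs[1].owner == 2
spills_before_conflict = rf.spills
rf.write(1, 1, 33)          # thread 1 touches register 1: thread 2's data spills
restored = rf.read(2, 1)    # thread 2 returns: its value is restored
```

In this model a thread switch by itself moves no data; a register is saved only when another thread actually needs it, which is one way to picture the reduced context switch latency the abstract mentions.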

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a first register file comprising a plurality of registers; caching circuitry configured to store information that indicates threads that correspond to data stored in registers of the plurality of registers; and a memory, wherein the apparatus is configured to use first and second backing memory regions in the memory for respective first and second threads; wherein the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers; and wherein the apparatus is configured to, based on information stored in the caching circuitry, store the second valid data in the second backing memory region in response to an access of the second register by the first thread; and wherein the apparatus is configured to: migrate the first thread from a first processor core that includes the first register file for execution on a second, different processor core that includes a second register file; maintain the first valid data in the first register while executing at least a portion of the first thread on the second processor core; and access the first valid data in the first register file after migrating the first thread back to the first processor core.

2. The apparatus of claim 1, wherein the caching circuitry is configured to restore the second valid data to the second register from the second backing memory region in response to an access of the second register by the second thread.

3. The apparatus of claim 1, wherein, at the point in time, the apparatus is configured to store information specifying that the first valid data is not currently stored in the first backing memory region and that the second valid data is not currently stored in the second backing memory region.

4. The apparatus of claim 1, wherein the caching circuitry is configured to store tag information for the second valid data when the second valid data is in the second register, wherein the tag information corresponds to a location in the second backing memory region for the second register; and wherein the caching circuitry is configured to store the second valid data in the location in the second backing memory region in response to the access of the second register by the first thread, wherein the access by the first thread uses a tag that does not match the tag information.

5. The apparatus of claim 1, wherein the memory comprises an on-chip memory.

6. The apparatus of claim 1, further comprising: a data cache configured to cache data for the first register file, wherein the first and second backing memory regions are accessible to the caching circuitry without using the data cache.

7. The apparatus of claim 1, wherein the apparatus is configured to execute instructions of the first thread and not instructions of the second thread during a time interval that includes the point in time.

8. The apparatus of claim 1, wherein the caching circuitry includes a valid field, a modified field, and a tag field for one or more registers of the plurality of registers.

9. The apparatus of claim 1, further comprising: a plurality of different processing elements configured to separately execute threads using a plurality of respective register files; wherein a first one of the plurality of different processing elements that includes the first register file is configured, in response to a write of particular data to a register corresponding to the second register in a different register file of the plurality of respective register files by a second one of the plurality of different processing elements, to write the particular data to the second register or to invalidate the second valid data in the second register.

10. The apparatus of claim 1, wherein the second valid data is modified data.

11. The apparatus of claim 1, wherein the apparatus is configured to retrieve the second valid data from the backing memory region in response to an access of the second register by the second thread, subsequent to the apparatus storing the second valid data.

12. A method, comprising: storing, by caching circuitry, in an computing system that includes a first register file that comprises a plurality of registers, storing information indicating threads that correspond to data stored in respective ones of the plurality of registers; and storing, by the computing system storing, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers; and storing, in response to an access of the second register by the first thread and based on information stored in the caching circuitry, the second valid data in a backing region in a memory for the second thread; migrating the first thread from a first processor core that includes the first register file for execution on a second, different processor core that includes a second register file; maintaining the first valid data in the first register while executing at least a portion of the first thread on the second processor core; and accessing the first valid data in the first register file after migrating the first thread back to the first processor core.

13. The method of claim 12, further comprising: refraining from storing the second valid data in the backing memory region at least until an instruction from a thread other than the second thread is determined to access the second register.

14. The method of claim 12, further comprising: restoring the second valid data to the second register in response to an access of the second register by the second thread.

15. The method of claim 12, wherein the storing the second valid data in the backing memory region comprises storing without using a low-level data cache.

16. The method of claim 12, further comprising: maintaining validity information, modified information, and tag information for the first register file.

17. An apparatus, comprising: a first register file comprising a plurality of registers; caching circuitry configured to store information that indicates threads that correspond to data stored in registers of the plurality of registers, wherein the caching circuitry includes a valid field, a modified field, and a tag field for one or more registers of the plurality of registers; wherein the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers; wherein the apparatus is configured to store, in response to an access of the second register by the first thread and based on information stored in the caching circuitry, the second valid data in a backing region in a memory for the second thread; and wherein the apparatus is configured to: migrate the first thread from a first processor core that includes the first register file for execution on a second, different processor core that includes a second register file; maintain the first valid data in the first register while executing at least a portion of the first thread on the second processor core; and access the first valid data in the first register file after migrating the first thread back to the first processor core.
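The migration language in the independent claims, together with the invalidate option of claim 9, can be pictured with a toy two-core model. This is a hedged sketch under strong simplifying assumptions that are not taken from the patent: writes go straight through to backing memory (so no modified field is needed here), and the names `Core`, `MultiCoreModel`, and `fills` are hypothetical. It only illustrates that a thread's register data can stay valid in the first core's register file while the thread runs elsewhere, and be reused without a refill when the thread returns.

```python
class Core:
    """One core's register file with a tag (owning thread) and valid bit
    per register. Hypothetical structure for illustration."""

    def __init__(self, num_regs: int):
        self.tag = [None] * num_regs
        self.valid = [False] * num_regs
        self.data = [0] * num_regs


class MultiCoreModel:
    def __init__(self, num_cores: int, num_regs: int):
        self.cores = [Core(num_regs) for _ in range(num_cores)]
        # Assumption: write-through backing store keyed by (thread, register).
        self.backing = {}
        self.fills = 0  # number of register refills from backing memory

    def read(self, core_id: int, tid: int, reg: int) -> int:
        core = self.cores[core_id]
        if not (core.valid[reg] and core.tag[reg] == tid):
            # Miss: fetch this thread's value from its backing location.
            core.data[reg] = self.backing.get((tid, reg), 0)
            core.tag[reg], core.valid[reg] = tid, True
            self.fills += 1
        return core.data[reg]

    def write(self, core_id: int, tid: int, reg: int, value: int) -> None:
        # Invalidate copies of this thread's register held by other cores
        # (modeled on the "invalidate" alternative in claim 9).
        for i, other in enumerate(self.cores):
            if i != core_id and other.valid[reg] and other.tag[reg] == tid:
                other.valid[reg] = False
        core = self.cores[core_id]
        core.data[reg], core.tag[reg], core.valid[reg] = value, tid, True
        self.backing[(tid, reg)] = value  # write-through (assumption)


system = MultiCoreModel(num_cores=2, num_regs=4)
system.write(0, 1, 0, 10)   # thread 1 on core 0 writes r0
system.write(0, 1, 1, 20)   # thread 1 on core 0 writes r1
a = system.read(1, 1, 0)    # thread 1 migrated to core 1: r0 misses there
system.write(1, 1, 1, 21)   # write on core 1 invalidates core 0's copy of r1
b = system.read(0, 1, 0)    # migrated back: r0 is still valid on core 0
fills_after_return = system.fills
c = system.read(0, 1, 1)    # r1 was invalidated, so it is refilled
```

Only the register written on the second core needs a refill after the thread migrates back; the untouched register is served from the first core's register file.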

Assignees

Apple Inc

Inventors

Classifications

  • according to context, e.g. thread buffers · CPC title

  • Register arrangements · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • from multiple instruction streams, e.g. multistreaming · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9817664B2 cover?
Techniques are disclosed relating to register caching techniques for thread switches. In one embodiment, an apparatus includes a register file and caching circuitry. In this embodiment, the register file includes a plurality of registers and the caching circuitry is configured to store information that indicates threads that correspond to data stored in respective ones of the plurality of regis…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/30123. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 14, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).