Register caching techniques for thread switches

US2016246728A1 · US · A1

Patent metadata
Publication number: US-2016246728-A1
Application number: US-201514625956-A
Country: US
Kind code: A1
Filing date: Feb 19, 2015
Priority date: Feb 19, 2015
Publication date: Aug 25, 2016
Grant date: (none listed)

Abstract

Official abstract text for this publication.

Techniques are disclosed relating to register caching techniques for thread switches. In one embodiment, an apparatus includes a register file and caching circuitry. In this embodiment, the register file includes a plurality of registers and the caching circuitry is configured to store information that indicates threads that correspond to data stored in respective ones of the plurality of registers. In this embodiment, the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers. In some embodiments, the disclosed techniques may reduce context switch latency, reduce pressure on a data cache, and/or allow smaller slices of thread execution, for example.
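
The core idea in the abstract, a register file that records which thread each register's contents belong to, can be illustrated with a minimal Python sketch. This is a software model only, not the disclosed circuitry: the names `Entry`, `TaggedRegisterFile`, and `holds_valid_data_for` are hypothetical, thread ids are plain integers, and the sketch deliberately leaves out what happens when one thread touches a register that currently holds another thread's data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    value: int = 0
    # Thread id whose data this register currently holds (assumed encoding);
    # None means the register holds no valid data.
    owner: Optional[int] = None


class TaggedRegisterFile:
    """Register file plus per-register 'which thread owns this' information."""

    def __init__(self, num_registers: int) -> None:
        self.entries = [Entry() for _ in range(num_registers)]

    def write(self, thread: int, reg: int, value: int) -> None:
        # Record the data together with the thread it belongs to.
        self.entries[reg] = Entry(value=value, owner=thread)

    def holds_valid_data_for(self, thread: int, reg: int) -> bool:
        return self.entries[reg].owner == thread


rf = TaggedRegisterFile(num_registers=4)
rf.write(thread=1, reg=0, value=111)  # first thread, first register
# Context switch to thread 2: nothing is saved or flushed here.
rf.write(thread=2, reg=1, value=222)  # second thread, second register
```

After the second write, register 0 still holds valid data for thread 1 while register 1 holds valid data for thread 2, which is the "point in time" the abstract describes.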

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a register file comprising a plurality of registers; and caching circuitry configured to store information that indicates threads that correspond to data stored in registers of the plurality of registers; wherein the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers.

2. The apparatus of claim 1, further comprising: a memory, wherein the apparatus is configured to store data from the register file for the first and second threads in respective first and second backing memory regions in the memory based on context switches between threads; wherein the caching circuitry is configured to store the second valid data in the second backing memory region in response to an access of the second register by the first thread.

3. The apparatus of claim 2, wherein the caching circuitry is configured to restore the second valid data to the second register from the second backing memory region in response to an access of the second register by the second thread.

4. The apparatus of claim 2, wherein, at the point in time, the apparatus is configured to store information specifying that the first valid data is not currently stored in the first backing memory region and that the second valid data is not currently stored in the second backing memory region.

5. The apparatus of claim 2, wherein the caching circuitry is configured to store tag information for the second valid data when the second valid data is in the second register, wherein the tag information corresponds to a location in the second backing memory region for the second register; and wherein the caching circuitry is configured to store the second valid data in the location in the second backing memory region in response to the access of the second register by the first thread, wherein the access by the first thread uses a tag that does not match the tag information.

6. The apparatus of claim 2, wherein the memory comprises an on-chip memory.

7. The apparatus of claim 2, further comprising: a data cache configured to cache data for the register file, wherein the first and second backing memory regions are accessible to the caching circuitry without using the data cache.

8. The apparatus of claim 1, wherein the apparatus is configured to execute instructions of the first thread and not instructions of the second thread during a time interval that includes the point in time.

9. The apparatus of claim 1, wherein the caching circuitry includes a valid field, a modified field, and a tag field for one or more registers of the plurality of registers.

10. The apparatus of claim 1, further comprising: a plurality of different processing elements configured to separately execute threads using a plurality of respective register files; wherein a first one of the plurality of different processing elements that includes the register file is configured, in response to a write of particular data to a register corresponding to the second register in a different register file of the plurality of respective register files by a second one of the plurality of different processing elements, to write the particular data to the second register or to invalidate the second valid data in the second register.

11. The apparatus of claim 1, further comprising: a plurality of different processing elements configured to separately execute threads using a plurality of respective register files; wherein a first one of the plurality of different processing elements that includes the register file is configured to store valid, modified data in the register file for a thread that is executing on a second one of the plurality of different processing elements.

12. A method, comprising: caching circuitry, in a computing system that includes a register file that comprises a plurality of registers, storing information indicating threads that correspond to data stored in respective ones of the plurality of registers; and the computing system storing, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers.

13. The method of claim 12, further comprising: storing the second valid data in a backing memory region in a memory in response to an instruction in the first thread that accesses the second register.

14. The method of claim 13, further comprising: refraining from storing the second valid data in the backing memory region at least until an instruction from a thread other than the second thread is determined to access the second register.

15. The method of claim 13, further comprising: restoring the second valid data to the second register in response to an access of the second register by the second thread.

16. The method of claim 13, wherein the storing the second valid data in the backing memory region comprises storing without using a low-level data cache.

17. The method of claim 12, further comprising: maintaining validity information, modified information, and tag information for the register file.

18. An apparatus, comprising: a register file comprising a plurality of registers; caching circuitry configured to store information specifying threads that correspond to data stored in respective ones of the plurality of registers; a memory, wherein the apparatus is configured to store data from the register file for first and second threads in respective first and second backing memory regions in the memory; wherein the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers; and wherein the apparatus is configured to store the second valid data in the second backing memory region in response to an access of the second register by the first thread.

19. The apparatus of claim 18, wherein the second valid data is modified data.

20. The apparatus of claim 18, wherein the apparatus is configured to retrieve the second valid data from the backing memory region in response to an access of the second register by the second thread, subsequent to the apparatus storing the second valid data.
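
The lazy save and restore behavior recited in claims 2, 3, 9, and 14 can be modeled in software roughly as follows. This is a hedged sketch rather than the patented hardware: the class and method names (`Line`, `CachedRegisterFile`, `_claim`) are hypothetical, the tag is simplified to the owning thread's id instead of a backing-memory location, per-thread backing regions are plain dictionaries, and a never-written register is assumed to read as 0.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Line:
    valid: bool = False
    modified: bool = False
    tag: Optional[int] = None  # assumed: owning thread id stands in for the tag
    data: int = 0


class CachedRegisterFile:
    def __init__(self, num_registers: int) -> None:
        self.lines: List[Line] = [Line() for _ in range(num_registers)]
        # backing[thread][reg]: one backing memory region per thread.
        self.backing: Dict[int, Dict[int, int]] = {}
        self.log: List[Tuple[str, int, int]] = []  # (event, thread, reg)

    def _claim(self, thread: int, reg: int, fill: bool) -> Line:
        line = self.lines[reg]
        if line.valid and line.tag == thread:
            return line  # tag match: the register already holds this thread's data
        # Tag mismatch: only now is the other thread's modified data written back.
        if line.valid and line.modified:
            self.backing.setdefault(line.tag, {})[reg] = line.data
            self.log.append(("spill", line.tag, reg))
        data = 0
        region = self.backing.get(thread, {})
        if fill and reg in region:
            data = region[reg]  # restore the accessing thread's saved value
            self.log.append(("restore", thread, reg))
        line.valid, line.modified, line.tag, line.data = True, False, thread, data
        return line

    def read(self, thread: int, reg: int) -> int:
        return self._claim(thread, reg, fill=True).data

    def write(self, thread: int, reg: int, value: int) -> None:
        line = self._claim(thread, reg, fill=False)  # a full overwrite needs no fill
        line.data = value
        line.modified = True


rf = CachedRegisterFile(num_registers=2)
rf.write(thread=1, reg=0, value=10)
rf.write(thread=2, reg=1, value=20)     # both threads now have valid data resident
nothing_spilled_yet = rf.backing == {}  # no backing-memory traffic so far
rf.write(thread=1, reg=1, value=11)     # thread 1 touches r1: thread 2's value spills
resumed = rf.read(thread=2, reg=1)      # thread 2 returns: its value is restored
```

In this model a context switch by itself moves no data. A register is written to its owner's backing region only when a different thread actually accesses it, so registers the running thread never touches stay resident for the thread that was switched out.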

Assignees

Apple Inc

Inventors

Classifications

CPC titles:

  • according to context, e.g. thread buffers
  • Instruction code
  • Register arrangements
  • with dedicated cache, e.g. instruction or stack
  • controlled by a single instruction for multiple threads [SIMT] in parallel

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/30123. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 25, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).