US9804666B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9804666-B2 |
| Application number | US-201514721304-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 26, 2015 |
| Priority date | May 26, 2015 |
| Publication date | Oct 31, 2017 |
| Grant date | Oct 31, 2017 |
Official abstract text for this publication.
Units of shader work, such as warps or wavefronts, are grouped into clusters. An individual vector register file of a processor is operated as segments, where a segment may be independently operated in an active mode or a reduced power data retention mode. The scheduling of the clusters is selected so that a cluster is allocated a segment of the vector register file. Additional sequencing may be performed for a cluster to reach a synchronization point. Individual segments are placed into the reduced power data retention mode during a latency period when the cluster is waiting for execution of a request, such as a sample request.
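The bookkeeping described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `Mode`, `Segment`, `Cluster`, `group_into_clusters`, `issue_sample_request`, and `complete_sample_request` are all hypothetical, and grouping consecutive warp ids is only an assumed stand-in for whatever locality criterion a real scheduler would use.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    RETENTION = "retention"  # reduced power; register contents are preserved


@dataclass
class Segment:
    """One independently power-managed slice of a vector register file."""
    index: int
    mode: Mode = Mode.ACTIVE


@dataclass
class Cluster:
    """A group of warps (units of shader work) backed by one segment."""
    warps: list  # hypothetical warp ids
    segment: Segment
    waiting: bool = False


def group_into_clusters(warp_ids, warps_per_cluster):
    # Assumption: consecutive warp ids approximate spatial/temporal locality.
    return [warp_ids[i:i + warps_per_cluster]
            for i in range(0, len(warp_ids), warps_per_cluster)]


def issue_sample_request(cluster):
    # The cluster stalls on the request, so its segment can drop to retention.
    cluster.waiting = True
    cluster.segment.mode = Mode.RETENTION


def complete_sample_request(cluster):
    # Wake the segment before the cluster resumes execution.
    cluster.segment.mode = Mode.ACTIVE
    cluster.waiting = False


groups = group_into_clusters(list(range(8)), 2)
segments = [Segment(i) for i in range(len(groups))]
clusters = [Cluster(g, s) for g, s in zip(groups, segments)]
issue_sample_request(clusters[0])
```

After the last line, only the segment of the first cluster is in retention mode; the other three segments stay active, which is the per-segment independence the abstract relies on.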
Opening claim text (preview).
What is claimed is:
1. A method of reducing power consumption in a shader of a graphics processing system, the method comprising: organizing a vector register file into a plurality of segments of physical memory, with each segment having an active mode and a reduced power data retention mode independently selectable from other segments of the vector register file; allocating each of the segments as a resource for a respective one of a plurality of clusters of multiple shader units of work assigned to a processor and having temporal locality and spatial locality; scheduling execution of the clusters in a sequence; and placing each of the segments that are respectively associated with the clusters that are in an inactive state into the reduced power data retention mode during at least a portion of a latency period for a texture load for the clusters.
2. The method of claim 1, wherein the clusters are placed into the inactive state in response to completion of sending texture sample or memory load store commands of the cluster to an external unit.
3. The method of claim 1, wherein the clusters are placed into the inactive state in response to completion of sending texture sample or memory load store commands of the cluster to a texture unit.
4. The method of claim 1, further comprising using the vector register file as a resource for units of shader work in which each unit of shader work comprises a group of shader threads to perform Single Instruction Multiple Thread (SIMT) processing.
5. The method of claim 1, further comprising prioritizing the shader units of work within each of the clusters to reach a synchronization point for loading a texture sample.
6. The method of claim 1, wherein the clusters are assigned to consecutive shader tasks of a shader stage.
7. The method of claim 1, wherein each shader unit of work is a unit of thread scheduling.
8. A method of reducing power consumption in a shader of a graphics processing system, the method comprising: scheduling clusters of shader work for a plurality of processors, each cluster including a plurality of shader units of work assigned to a processor and having temporal locality and spatial locality; for each cluster, allocating a respective segment of physical memory of a vector register file as a resource, each segment having an active mode and a reduced power data retention mode independently selectable from other segments; scheduling execution of the clusters in a sequence; rotating execution of the clusters; and placing segments of inactive clusters into the reduced power data retention mode during at least a portion of a latency period for a texture load for the inactive clusters.
9. The method of claim 8, further comprising placing segments of inactive clusters into the reduced power data retention mode during at least a latency for a data access.
10. The method of claim 1, further comprising placing the segments of each of the clusters awaiting a data load into the reduced power data retention mode.
11. The method of claim 8, further comprising using the vector register file as a resource for units of shader work in which each unit of shader work has a group of shader threads to perform Single Instruction Multiple Thread (SIMT) processing.
12. The method of claim 8, further comprising prioritizing the shader units of work within each cluster to reach a synchronization point for loading a texture sample.
13. The method of claim 8, further comprising assigning the clusters to consecutive shader tasks of a shader stage.
14. The method of claim 8, wherein each shader unit of work is a unit of thread scheduling.
15. A graphics processing unit, comprising: a plurality of programmable processors to perform Single Instruction Multiple Thread (SIMT) processing of shading instructions, each programmable processor including a vector register file having a plurality of data segments, each segment having an active mode and a reduced power data retention mode independently selectable from other segments; a scheduler to schedule clusters of shader work for the plurality of programmable processors, each cluster including a plurality of shader units of work assigned to an individual processor and having temporal locality and spatial locality, with each cluster supported by a segment of the vector register file of the assigned individual processor, the scheduler for selecting a schedule to rotate execution of the clusters to place segments of inactive clusters into the reduced power data retention mode during at least a portion of a latency period associated with an operation request by the cluster; and an external memory comprising a texture unit, wherein segments of inactive clusters are placed in the reduced power data retention mode during at least a portion of a latency period associated with accessing the external memory for a texture access of a cluster.
16. The graphics processing unit of claim 15, further comprising a sequencer to prioritize the shader units of work within each cluster to reach a synchronization point.
17. The graphics processing unit of claim 15, further comprising a load and store unit to access the external memory, wherein segments of inactive clusters are placed into the reduced power data retention mode during at least a portion of a latency period associated with accessing the external memory for a cluster.
18. A graphics processing unit, comprising: a shader including a programmable processing element; a vector register file used as a resource for units of shader work in which each unit of shader work has a group of shader threads to perform Single Instruction Multiple Thread (SIMT) processing and multiple groups of shader threads are formed into a cluster, the vector register file allocated as a plurality of individual segments; a scheduler to group clusters of units of shader work and select a schedule to assign an individual cluster to a segment of the vector register file and place the segment into a reduced power data retention mode during a latency period when the cluster is waiting for a result of a sample request during at least a portion of a latency period associated with an operation request by the cluster; and an external memory comprising a texture unit, wherein segments of inactive clusters are placed in the reduced power data retention mode during at least a portion of a latency period associated with accessing the external memory for a texture access of a cluster.
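To make the "rotating execution of the clusters" language in claims 8 and 15 more concrete, the following toy model rotates a processor through clusters and counts how long each cluster's segment could sit in retention mode. It is a sketch under stated assumptions, not the claimed scheduler: the function `simulate_rotation` and its parameters are hypothetical, every compute burst is assumed to end with one texture load of fixed latency, and the cost of switching power modes is ignored.

```python
def simulate_rotation(num_clusters, compute_cycles, load_latency, bursts):
    """Toy round-robin model of cluster rotation on one processor.

    Each cluster runs `bursts` compute bursts of `compute_cycles` cycles;
    each burst ends by issuing a texture load that returns `load_latency`
    cycles later. While a load is outstanding, the cluster's register file
    segment is counted as being in retention mode.

    Returns (total_cycles, retention_cycles_per_segment).
    """
    wake_at = [0] * num_clusters    # cycle at which each outstanding load returns
    left = [bursts] * num_clusters  # compute bursts still to run per cluster
    retention = [0] * num_clusters
    now, nxt = 0, 0
    while any(left):
        order = [(nxt + k) % num_clusters for k in range(num_clusters)]
        runnable = [c for c in order if left[c] and wake_at[c] <= now]
        if runnable:
            running, step = runnable[0], compute_cycles
            nxt = (running + 1) % num_clusters  # rotate to the next cluster
        else:
            # Every remaining cluster is stalled: idle until the first load returns.
            running = None
            step = min(wake_at[i] for i in order if left[i]) - now
        # Charge retention time to every segment whose load is still outstanding.
        for i in range(num_clusters):
            overlap = min(wake_at[i], now + step) - now
            if i != running and overlap > 0:
                retention[i] += overlap
        now += step
        if running is not None:
            left[running] -= 1
            wake_at[running] = now + load_latency
    # Drain the loads that are still in flight after the last burst.
    for i in range(num_clusters):
        retention[i] += max(0, wake_at[i] - now)
    return max(now, max(wake_at)), retention
```

For example, `simulate_rotation(2, 2, 4, 2)` returns `(14, [8, 8])`: each segment spends both of its 4-cycle load latencies in retention, while rotation overlaps one cluster's compute with the other's load so the run finishes in 14 cycles rather than the 24 that fully serial execution would take.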
by lowering the supply or operating voltage · CPC title
Power supply means, e.g. regulation thereof (for memories G11C) · CPC title
Memory management · CPC title
Power saving in memory, e.g. RAM, cache · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title