Optimizing shading process for mixed order-sensitive and order-insensitive shader operations
US-2016307365-A1 · Oct 20, 2016 · US
US10019776B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10019776-B2 |
| Application number | US-201514924624-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 27, 2015 |
| Priority date | Oct 27, 2015 |
| Publication date | Jul 10, 2018 |
| Grant date | Jul 10, 2018 |
Official abstract text for this publication.
A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
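The coalescing rule in the abstract (each XY position appears at most once per tile) can be illustrated with a minimal sketch. The abstract does not say when a tile is closed, so the flush-on-conflict policy here is an assumption, and the names `coalesce` and `Coverage` are hypothetical. Primitives are assumed to arrive in API order.

```python
from typing import Dict, Iterable, List, Set, Tuple

XY = Tuple[int, int]
# (primitive id in API order, set of XY positions that primitive covers)
Coverage = Tuple[int, Set[XY]]


def coalesce(coverage_stream: Iterable[Coverage]) -> List[Dict[XY, int]]:
    """Group coverage data into tiles so no tile holds an XY position twice.

    Assumed policy: when an incoming primitive overlaps a position already
    present in the open tile, the open tile is flushed and a new one started.
    """
    tiles: List[Dict[XY, int]] = []
    current: Dict[XY, int] = {}  # maps (x, y) -> covering primitive id
    for prim_id, positions in coverage_stream:
        # Overlap with the open tile would put one XY position in it twice.
        if any(xy in current for xy in positions):
            tiles.append(current)
            current = {}
        for xy in positions:
            current[xy] = prim_id
    if current:
        tiles.append(current)
    return tiles
```

Because a tile never contains two primitives at the same position, the only ordering that matters is the order between tiles, which is what the counters in the claims enforce.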
Opening claim text (preview).
What is claimed is:
1. A graphics subsystem configured to process tiles of coverage samples, the subsystem comprising: a first counter associated with a first screen space region to issue first data to a first thread associated with a first tile residing within the first screen space region, the first counter comprising a ticket ordering register; and a second counter associated with the first screen space region to output second data to the first thread, the second counter comprising a ticket dispenser register, wherein the first thread initiates processing operations on the first tile based on a comparison between the first data and the second data, wherein the first thread is executed by a parallel processor.
2. The graphics subsystem of claim 1, wherein the first data includes an identifier associated with the second counter, and the first thread parses the first data to extract the identifier and access the second counter to obtain the second data.
3. The graphics subsystem of claim 1, wherein the first data includes a first value, the second data includes a second value, and the first thread initiates processing operations on the first tile upon determining that the first value is equivalent to the second value.
4. The graphics subsystem of claim 1, wherein the first thread further causes a set of additional threads to perform the processing operations on the first tile, wherein the set of additional threads are executed by the parallel processor.
5. The graphics subsystem of claim 4, wherein each thread in a subset of threads in the set of additional threads adds a weight value to the second counter upon completion of a portion of the processing operations on the first tile.
6. The graphics subsystem of claim 5, wherein the second counter overflows and increments to a third value after each thread in the subset of threads adds the weight value to the second counter.
7. The graphics subsystem of claim 6, wherein a first additional thread included in the set of additional threads adds a first weight value to the second counter upon completing the portion of the processing operations associated with the first additional thread, and a second additional thread included in the set of additional threads adds a second value to the second counter upon completing the portion of the processing operations associated with the second additional thread, wherein the first additional thread and the second additional thread are executed by the parallel processor.
8. The graphics subsystem of claim 7, wherein the first weight value is equal to one, and the second weight value is equal to the difference between a size attribute of the second counter and a number of additional threads in the set of additional threads.
9. The graphics subsystem of claim 1, wherein the second counter increments to a third value when the processing operations being performed on the first tile are complete, and a second thread initiates processing operations on a second tile after the second counter increments to the third value, wherein the second thread is executed by the parallel processor.
10. The graphics subsystem of claim 1, wherein the first tile includes a first coverage sample, the second tile includes a second coverage sample, and the first coverage sample and the second coverage sample are associated with an application programming interface order, and wherein the first tile and the second tile are processed according to the application programming interface order associated with the first coverage sample and the second coverage sample.
11. The graphics subsystem of claim 10, wherein the first coverage sample and the second coverage sample are associated with a first X-Y position within the first screen space region.
12. The graphics subsystem of claim 1, wherein the processing operations performed on the first tile comprise programmable blending operations.
13. A computer-implemented method for processing tiles of coverage samples, the method comprising: causing a first counter associated with a first screen space region to issue first data to a first thread associated with a first tile residing within the first screen space region, the first counter comprising a ticket ordering register; and causing a second counter associated with the first screen space region to output second data to the first thread, the second counter comprising a ticket dispenser register, wherein the first thread is configured to initiate processing operations on the first tile based on a comparison between the first data and the second data.
14. The computer-implemented method of claim 13, wherein the first data includes a first value, the second data includes a second value, and the first thread is configured to initiate processing operations on the first tile upon determining that the first value is equivalent to the second value.
15. The computer-implemented method of claim 13, wherein the first thread is further configured to cause a set of additional threads to perform the processing operations on the first tile, and each thread in a subset of threads in the set of additional threads adds a weight value to the second counter upon completion of a portion of the processing operations on the first tile, thereby causing the second counter to overflow and increment to a third value after each thread in the subset of threads adds the weight value to the second counter.
16. The computer-implemented method of claim 15, wherein the first tile includes a first coverage sample, the second tile includes a second coverage sample, and the first coverage sample and the second coverage sample are associated with an application programming interface order, and wherein the first tile and the second tile are processed according to the application programming interface order associated with the first coverage sample and the second coverage sample.
17. The computer-implemented method of claim 15, wherein a first thread in the set of additional threads is configured to perform a first portion of the processing operations on a first multiprocessor, and a second thread in the set of additional threads is configured to perform a second portion of the processing operations on a second multiprocessor.
18. The computer-implemented method of claim 15, wherein a first thread in the set of additional threads is configured to perform a first portion of the processing operations on a first multiprocessor, and a second thread in the set of additional threads is configured to perform a second portion of the processing operations on the first multiprocessor.
19. A computing device, comprising: a first multiprocessor; and a thread management unit coupled to the first multiprocessor and including: a first counter associated with a first screen space region and configured to issue first data to a first thread associated with a first tile residing within the first screen space region, the first counter comprising a ticket ordering register, and a second counter associated with the first screen space region and configured to output second data to the first thread, the second counter comprising a ticket dispenser register, wherein the first thread is configured to cause the first multiprocessor to perform processing operations on the first tile based on a comparison between the first data and the second data.
20. The computing device of claim 19, wherein the first data includes a first value, the second data includes a second value, and the first thread is configured to initiate process
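The two-counter scheme in the claims resembles a ticket lock, and a small host-side simulation may help in reading claims 1, 3, and 5 through 9. This is a sketch under stated assumptions, not the patented hardware: the class `RegionCounters`, the 4-bit completion field, and the packing of the "now serving" value above that field are all hypothetical, and a Python lock stands in for the atomic operations a GPU would use. The weight rule is one possible reading of claim 8: every worker but the last adds 1, and the last adds whatever remains so that the weights total the assumed field size and the field overflows exactly once per tile.

```python
import threading
from typing import List, Tuple

LOW_BITS = 4               # assumed width of the completion field
LOW_SIZE = 1 << LOW_BITS   # assumed "size attribute" of the second counter


class RegionCounters:
    """Hypothetical pair of counters for one screen-space region."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._next_ticket = 0  # first counter: tickets issued in API order
        self._dispenser = 0    # second counter: (now_serving << LOW_BITS) | progress

    def issue_ticket(self) -> int:
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            return ticket

    def now_serving(self) -> int:
        with self._cond:
            return self._dispenser >> LOW_BITS

    def wait_for_turn(self, ticket: int) -> None:
        # Proceed only when the ticket equals the value now being served.
        with self._cond:
            self._cond.wait_for(lambda: (self._dispenser >> LOW_BITS) == ticket)

    def add_weight(self, weight: int) -> None:
        with self._cond:
            # Overflow of the low field carries into the now-serving value.
            self._dispenser += weight
            self._cond.notify_all()


def process_tile(counters: RegionCounters, ticket: int,
                 n_workers: int, log: List[int]) -> None:
    assert 1 <= n_workers <= LOW_SIZE
    counters.wait_for_turn(ticket)
    log.append(ticket)  # records the order in which tiles start processing

    def worker(index: int) -> None:
        # Shading or blending work for this portion of the tile would go here.
        last = index == n_workers - 1
        counters.add_weight(LOW_SIZE - (n_workers - 1) if last else 1)

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def run_demo() -> Tuple[List[int], int]:
    counters = RegionCounters()
    log: List[int] = []
    tickets = [counters.issue_ticket() for _ in range(3)]
    # Start the tile threads in reverse order; tickets still force API order.
    leaders = [threading.Thread(target=process_tile, args=(counters, t, 3, log))
               for t in reversed(tickets)]
    for t in leaders:
        t.start()
    for t in leaders:
        t.join()
    return log, counters.now_serving()
```

Calling `run_demo()` returns the order in which tiles began processing together with the final now-serving value. Because the weights for each tile sum to `LOW_SIZE`, the order in which individual workers finish does not matter: the carry into the now-serving value happens only after every worker has reported.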