Techniques for maintaining atomicity and ordering for pixel shader operations

US10019776B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10019776-B2
Application number: US-201514924624-A
Country: US
Kind code: B2
Filing date: Oct 27, 2015
Priority date: Oct 27, 2015
Publication date: Jul 10, 2018
Grant date: Jul 10, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
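The coalescing rule in the abstract (each XY position appears at most once per tile) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the patented implementation: the function name `coalesce`, the `(primitive_id, positions)` stream layout, and the choice to close the open tile whenever an incoming primitive overlaps it are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

XY = Tuple[int, int]
Tile = Dict[XY, int]  # maps an XY position to the primitive that covers it


def coalesce(coverage_stream: Iterable[Tuple[int, List[XY]]]) -> List[Tile]:
    """Group per-primitive coverage into tiles (hypothetical model).

    Primitives arrive in API order. A tile holds at most one entry per
    XY position, so when a primitive touches a position that the open
    tile already contains, the open tile is closed and a new one starts.
    """
    tiles: List[Tile] = []
    current: Tile = {}
    for prim_id, positions in coverage_stream:
        # Overlap with the open tile: emit it before accepting this primitive.
        if any(xy in current for xy in positions):
            tiles.append(current)
            current = {}
        for xy in positions:
            current[xy] = prim_id
    if current:
        tiles.append(current)
    return tiles


# Three primitives in API order; primitive 2 overlaps primitive 0 at (1, 0).
stream = [
    (0, [(0, 0), (1, 0)]),
    (1, [(2, 0)]),
    (2, [(1, 0), (3, 0)]),
]
tiles = coalesce(stream)
```

Because tiles are emitted in the order they are closed, the two primitives that cover (1, 0) land in different tiles, and processing the tiles in sequence handles them in API order at that position.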

First claim

Opening claim text (preview).

What is claimed is:
1. A graphics subsystem configured to process tiles of coverage samples, the subsystem comprising: a first counter associated with a first screen space region to issue first data to a first thread associated with a first tile residing within the first screen space region, the first counter comprising a ticket ordering register; and a second counter associated with the first screen space region to output second data to the first thread, the second counter comprising a ticket dispenser register, wherein the first thread initiates processing operations on the first tile based on a comparison between the first data and the second data, wherein the first thread is executed by a parallel processor.
2. The graphics subsystem of claim 1, wherein the first data includes an identifier associated with the second counter, and the first thread parses the first data to extract the identifier and access the second counter to obtain the second data.
3. The graphics subsystem of claim 1, wherein the first data includes a first value, the second data includes a second value, and the first thread initiates processing operations on the first tile upon determining that the first value is equivalent to the second value.
4. The graphics subsystem of claim 1, wherein the first thread further causes a set of additional threads to perform the processing operations on the first tile, wherein the set of additional threads are executed by the parallel processor.
5. The graphics subsystem of claim 4, wherein each thread in a subset of threads in the set of additional threads adds a weight value to the second counter upon completion of a portion of the processing operations on the first tile.
6. The graphics subsystem of claim 5, wherein the second counter overflows and increments to a third value after each thread in the subset of threads adds the weight value to the second counter.
7. The graphics subsystem of claim 6, wherein a first additional thread included in the set of additional threads adds a first weight value to the second counter upon completing the portion of the processing operations associated with the first additional thread, and a second additional thread included in the set of additional threads adds a second value to the second counter upon completing the portion of the processing operations associated with the second additional thread, wherein the first additional thread and the second additional thread are executed by the parallel processor.
8. The graphics subsystem of claim 7, wherein the first weight value is equal to one, and the second weight value is equal to the difference between a size attribute of the second counter and a number of additional threads in the set of additional threads.
9. The graphics subsystem of claim 1, wherein the second counter increments to a third value when the processing operations being performed on the first tile are complete, and a second thread initiates processing operations on a second tile after the second counter increments to the third value, wherein the second thread is executed by the parallel processor.
10. The graphics subsystem of claim 1, wherein the first tile includes a first coverage sample, the second tile includes a second coverage sample, and the first coverage sample and the second coverage sample are associated with an application programming interface order, and wherein the first tile and the second tile are processed according to the application programming interface order associated with the first coverage sample and the second coverage sample.
11. The graphics subsystem of claim 10, wherein the first coverage sample and the second coverage sample are associated with a first X-Y position within the first screen space region.
12. The graphics subsystem of claim 1, wherein the processing operations performed on the first tile comprise programmable blending operations.
13. A computer-implemented method for processing tiles of coverage samples, the method comprising: causing a first counter associated with a first screen space region to issue first data to a first thread associated with a first tile residing within the first screen space region, the first counter comprising a ticket ordering register; and causing a second counter associated with the first screen space region to output second data to the first thread, the second counter comprising a ticket dispenser register, wherein the first thread is configured to initiate processing operations on the first tile based on a comparison between the first data and the second data.
14. The computer-implemented method of claim 13, wherein the first data includes a first value, the second data includes a second value, and the first thread is configured to initiate processing operations on the first tile upon determining that the first value is equivalent to the second value.
15. The computer-implemented method of claim 13, wherein the first thread is further configured to cause a set of additional threads to perform the processing operations on the first tile, and each thread in a subset of threads in the set of additional threads adds a weight value to the second counter upon completion of a portion of the processing operations on the first tile, thereby causing the second counter to overflow and increment to a third value after each thread in the subset of threads adds the weight value to the second counter.
16. The computer-implemented method of claim 15, wherein the first tile includes a first coverage sample, the second tile includes a second coverage sample, and the first coverage sample and the second coverage sample are associated with an application programming interface order, and wherein the first tile and the second tile are processed according to the application programming interface order associated with the first coverage sample and the second coverage sample.
17. The computer-implemented method of claim 15, wherein a first thread in the set of additional threads is configured to perform a first portion of the processing operations on a first multiprocessor, and a second thread in the set of additional threads is configured to perform a second portion of the processing operations on a second multiprocessor.
18. The computer-implemented method of claim 15, wherein a first thread in the set of additional threads is configured to perform a first portion of the processing operations on a first multiprocessor, and a second thread in the set of additional threads is configured to perform a second portion of the processing operations on the first multiprocessor.
19. A computing device, comprising: a first multiprocessor; and a thread management unit coupled to the first multiprocessor and including: a first counter associated with a first screen space region and configured to issue first data to a first thread associated with a first tile residing within the first screen space region, the first counter comprising a ticket ordering register, and a second counter associated with the first screen space region and configured to output second data to the first thread, the second counter comprising a ticket dispenser register, wherein the first thread is configured to cause the first multiprocessor to perform processing operations on the first tile based on a comparison between the first data and the second data.
20. The computing device of claim 19, wherein the first data includes a first value, the second data includes a second value, and the first thread is configured to initiate process
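The two-counter scheme in the claims resembles a ticket lock, and a small host-side sketch may help in reading claims 1 to 9. Everything below is a hypothetical model rather than the patented hardware: the class `RegionCounters`, the constant `FRACTION_SIZE`, and the packing of the second counter as `ticket * FRACTION_SIZE + fraction` are assumptions. The weight rule is one possible reading of claims 5 to 8: every worker but one adds 1, and the last adds the remainder, so that the low field overflows exactly once per tile.

```python
import threading

FRACTION_SIZE = 16  # assumed size of the low "fraction" field of the second counter


class RegionCounters:
    """Hypothetical model of the two counters for one screen space region."""

    def __init__(self):
        self._next_ticket = 0  # first counter: issues tickets in API order
        self._dispenser = 0    # second counter: ticket * FRACTION_SIZE + fraction
        self._cond = threading.Condition()

    def take_ticket(self):
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            return ticket

    def now_serving(self):
        # The ticket field is whatever has carried out of the fraction field.
        return self._dispenser // FRACTION_SIZE

    def wait_for_turn(self, ticket):
        # Proceed only when the issued ticket equals the served value (claim 3).
        with self._cond:
            self._cond.wait_for(lambda: self.now_serving() == ticket)

    def add_weight(self, weight):
        with self._cond:
            self._dispenser += weight
            self._cond.notify_all()


def weights_for(num_workers):
    # All but one worker add 1; the last adds the rest so the sum is FRACTION_SIZE.
    return [1] * (num_workers - 1) + [FRACTION_SIZE - (num_workers - 1)]


def process_tile(region, ticket, name, num_workers, log):
    region.wait_for_turn(ticket)
    log.append(name)  # stands in for shading and blending work on the tile
    workers = [threading.Thread(target=region.add_weight, args=(w,))
               for w in weights_for(num_workers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


def run_demo():
    region = RegionCounters()
    log = []
    # Tickets are taken in API order, but the threads start in reverse order.
    jobs = [(region.take_ticket(), name, n)
            for name, n in [("tile A", 3), ("tile B", 1), ("tile C", 4)]]
    threads = [threading.Thread(target=process_tile, args=(region, t, name, n, log))
               for t, name, n in reversed(jobs)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return log, region.now_serving()
```

In this model the served value advances only after every worker for the current tile has added its weight, so a thread holding the next ticket cannot start early even if it was launched first.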

Assignees

Inventors

Classifications

  • Memory management · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Filling planar surfaces by adding surface attributes, e.g. adding colours or textures · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10019776B2 cover?
A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each X…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 10, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).