Graphics processing hardware for using compute shaders as front end for vertex shaders

US10134102B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10134102-B2
Application number  US-201414297290-A
Country             US
Kind code           B2
Filing date         Jun 5, 2014
Priority date       Jun 10, 2013
Publication date    Nov 20, 2018
Grant date          Nov 20, 2018

Abstract

Official abstract text for this publication.

A GPU is configured to read and process data produced by a compute shader via the one or more ring buffers and pass the resulting processed data to a vertex shader as input. The GPU is further configured to allow the compute shader and vertex shader to write through a cache. Each ring buffer is configured to synchronize the compute shader and the vertex shader to prevent processed data generated by the compute shader that is written to a particular ring buffer from being overwritten before the data is accessed by the vertex shader. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
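The synchronization described in the abstract can be illustrated with a toy model. The sketch below is not the patented hardware mechanism; it is a minimal single-producer, single-consumer ring buffer in Python, assuming two monotonically increasing counters stand in for the ordered allocations (producer, playing the compute shader) and ordered de-allocations (consumer, playing the vertex shader). The names `RingBuffer`, `try_write`, `try_read`, and `demo` are hypothetical and do not come from the patent.

```python
class RingBuffer:
    """Toy single-producer, single-consumer ring buffer (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_count = 0  # total slots allocated by the producer, in order
        self.read_count = 0   # total slots released by the consumer, in order

    def free_space(self):
        """Number of slots that can be written without touching unread data."""
        return self.capacity - (self.write_count - self.read_count)

    def try_write(self, item):
        """Producer side: allocate the next slot in order, or refuse if full."""
        if self.free_space() == 0:
            return False  # writing now would overwrite unread data
        self.slots[self.write_count % self.capacity] = item
        self.write_count += 1
        return True

    def try_read(self):
        """Consumer side: read and release the oldest slot in order."""
        if self.read_count == self.write_count:
            return None  # nothing has been produced yet
        item = self.slots[self.read_count % self.capacity]
        self.read_count += 1
        return item


def demo():
    """Exercise the buffer: the third write is refused until a slot is freed."""
    ring = RingBuffer(capacity=2)
    log = []
    log.append(ring.try_write("batch0"))
    log.append(ring.try_write("batch1"))
    log.append(ring.try_write("batch2"))  # refused: buffer is full
    log.append(ring.try_read())           # releases the slot holding batch0
    log.append(ring.try_write("batch2"))  # now succeeds
    log.append(ring.try_read())
    log.append(ring.try_read())
    return log
```

In this model a refused write corresponds to the producer waiting until the consumer has released space, which is the property the abstract describes: data written to the ring buffer is not overwritten before it has been read.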

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics processing system, comprising: a graphics processor unit (GPU); a cache implemented on the GPU; one or more ring buffers implemented by the GPU; a compute shader configured to run on the GPU; and a vertex shader configured to run on the GPU, a global data share implemented on the GPU that is accessible by the compute shader and the vertex shader; wherein the GPU is configured to read and process data produced by the compute shader via the one or more ring buffers and pass the resulting processed data to the vertex shader as input, wherein the GPU is further configured to allow the one or more compute shaders and the one or more vertex shaders to read and write through the cache, wherein the one or more ring buffers are configured to synchronize the compute shader and the vertex shader to prevent processed data generated by the compute shader that is written to a particular ring buffer of the one or more ring buffers from being overwritten in the particular ring buffer before the data is accessed by the vertex shader, wherein the one or more ring buffers are configured to allow the compute shader to perform ordered allocations on the one or more ring buffers and the vertex shader to perform ordered de-allocations on the one or more ring buffers wherein the compute shader writes to an atomic counter in the global data share to synchronize buffer access.

2. The system of claim 1 wherein the one or more ring buffers include an index ring buffer, wherein the index ring buffer stores one or more indices for one or more vertices of one or more polygons.

3. The system of claim 2, wherein the compute shader is configured to perform culling to determine whether one or more of the one or more vertices require processing by the vertex shader, wherein the processed data includes a culled index table identifying a subset of the one or more vertices that require further processing by the vertex shader, wherein culling removes indices of polygons which have no visible area and the culled index table does not include indices for polygons that have no visible area.

4. The system of claim 1, further comprising a command processor wherein the compute shader is configured to send a notification to the command processor that there is data in the ring buffer for the vertex shader and wherein the command processor is configured to convert the notification to commands for GPU hardware to read the data.

5. The system of claim 4 wherein the vertex shader is configured to notify the command processor that the vertex shader is done with data from the ring buffer when the vertex shader is done with the data from the ring buffer.

6. The system of claim 4, wherein the command processor is configured to pass index data to data reading hardware to generate vertex wavefronts.

7. The system of claim 4, wherein the command processor is configured to implement instructions to test an amount of space available in a particular ring buffer of the one or more ring buffers, wherein the command processor allocates space in the particular ring buffer when there is sufficient space in the particular ring buffer for particular processed data generated by the compute shader, and wherein the command processor is configured to stall until sufficient space is available in the particular ring buffer.

8. The system of claim 1, wherein the global data share is configured to implement the atomic counter that the compute shader and command processor use to implement one or more notifications from the compute shader that particular processed data is ready for the vertex shader in the one or more ring buffers.

9. The system of claim 1, wherein the global data share is configured to implement an atomic counter that the vertex shader uses to notify the compute shader that the vertex shader is done with particular processed data in the one or more ring buffers.

10. The system of claim 1 wherein the one or more ring buffers include a vertex ring buffer and the data includes location data for one or more vertices of one or more polygons.

11. The system of claim, 1, wherein the system is an embedded system, mobile phone, personal computer, tablet computer, portable game device, workstation, or game console.

12. A graphics processing method, comprising: running a compute shader running on a graphics processor unit (GPU) configured to write to an image or shader storage block in one or more ring buffers through a cache implemented on the GPU; and running a vertex shader on the GPU configured to access data in memory written by compute shader by reading the data written by the vertex shader through the cache; synchronizing the compute shader and the vertex shader to prevent processed data generated by the compute shader that is written to the one or more ring buffers from being overwritten in the one or more ring buffers before the data is accessed by the vertex shader, wherein the one or more ring buffers are configured to allow the compute shader to perform ordered allocations on the one or more ring buffers and the vertex shader to perform ordered de-allocations on the one or more ring buffers wherein the compute shader writes to an atomic counter in a global data share to synchronize buffer access, where the global data share is implemented on the GPU and is accessible by the compute shader and the vertex shader.

13. The method of claim 12 wherein the one or more ring buffers includes an index ring buffer, wherein the index ring buffer stores one or more indices for one or more vertices of one or more polygons.

14. The method of claim 13, wherein the compute shader performs culling to determine whether one or more of the one or more vertices require processing by the vertex shader, wherein the data written by the compute shader includes a culled index table identifying a subset of the one or more vertices that require further processing by the vertex shader.

15. The method of claim 12, further comprising sending a notification to a command processor on the GPU that there is data in the ring buffer for the vertex shader and converting the notification to commands for GPU hardware to read the data with the command processor.

16. The method of claim 15, wherein synchronizing the compute shader and the vertex shader includes notifying the command processor with the vertex shader that the vertex shader is done with data from the ring buffer when the vertex shader is done with the data from the ring buffer.

17. The method of claim 15, further comprising passing index data with the command processor to data reading hardware in the GPU to generate vertex wavefronts.

18. The method of claim 12, wherein the one or more ring buffers include a vertex ring buffer and the data includes location data for one or more vertices of one or more polygons.

19. The method of claim 12, wherein synchronizing the compute shader and the vertex shader includes using a global data share implemented on the GPU that is accessible by the compute shader and the vertex shader to implement an atomic counter that the compute shader and command processor use to implement one or more notifications from the compute shader that particular processed data is ready for the vertex shader in the one or more ring buffers or to notify the compute shader that the vertex shader is done with particular processed data in the one or more ring buffers.
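The culling step in claims 3 and 14 can be sketched in a few lines. This is an illustration only, under a strong simplifying assumption: a polygon is treated as having "no visible area" when it is a triangle whose 2D screen-space area is exactly zero (repeated or collinear vertices). The claims do not limit culling to this test, and the names `signed_area2`, `cull_indices`, `positions`, and `indices` are hypothetical. Integer coordinates are used so the zero-area comparison is exact.

```python
def signed_area2(a, b, c):
    """Twice the signed area of triangle abc in 2D screen space."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def cull_indices(positions, indices):
    """Return a culled index list that keeps only triangles with nonzero area.

    `indices` is a flat list read three entries at a time (one triangle each).
    """
    culled = []
    for i in range(0, len(indices) - 2, 3):
        tri = indices[i:i + 3]
        a, b, c = (positions[k] for k in tri)
        if signed_area2(a, b, c) != 0:
            culled.extend(tri)  # this triangle still needs vertex processing
    return culled


# Hypothetical input: four vertices and three triangles.
positions = [(0, 0), (4, 0), (0, 4), (2, 2)]
indices = [0, 1, 2,   # ordinary triangle: kept
           0, 3, 3,   # repeated vertex, zero area: culled
           1, 3, 2]   # collinear vertices, zero area: culled

culled_index_table = cull_indices(positions, indices)
```

Under this assumption the culled index table contains only the first triangle, so a downstream consumer would process three indices instead of nine.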

Assignees

Sony Interactive Entertainment Inc, Advanced Micro Devices Inc

Inventors

Classifications

  • Memory management · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • General purpose rendering architectures · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Sony Interactive Entertainment Inc, Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 20, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).