Automated graphics and compute tile interleave

US10089775B2 · US · B2

Patent metadata
Publication number: US-10089775-B2
Application number: US-201514981395-A
Country: US
Kind code: B2
Filing date: Dec 28, 2015
Priority date: Jun 4, 2015
Publication date: Oct 2, 2018
Grant date: Oct 2, 2018

Abstract

Official abstract text for this publication.

A graphics system interleaves a combination of graphics renderer operations and compute shader operations. A set of API calls is analyzed to determine dependencies and identify candidates for interleaving. A compute shader is adapted to have a tiled access pattern. The interleaving is scheduled to reduce a requirement to access an external memory to perform reads and writes of intermediate data.
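The effect of the scheduling described in the abstract can be illustrated with a toy model. The sketch below is not taken from the patent: the function names, the single-tile on-chip capacity, and the FIFO eviction policy are all assumptions made for illustration. It compares a pass-by-pass order (render every tile, then run the compute shader on every tile) with a tile-by-tile interleaved order, and counts how many intermediate tiles have to be spilled to and reloaded from external memory.

```python
def schedule(num_tiles, interleaved):
    """Return the ordered list of (operation, tile) steps.

    interleaved=False: all render steps, then all compute steps.
    interleaved=True:  render then compute for each tile in turn.
    """
    if interleaved:
        return [(op, t) for t in range(num_tiles) for op in ("render", "compute")]
    return [(op, t) for op in ("render", "compute") for t in range(num_tiles)]


def external_traffic(steps, on_chip_capacity=1):
    """Count intermediate-tile spills (writes) and reloads (reads).

    Hypothetical model: on-chip memory holds at most `on_chip_capacity`
    intermediate tiles and evicts the oldest one when full.
    Only intermediate data is counted, not final outputs.
    """
    on_chip = []  # intermediate tiles currently resident on-chip
    writes = reads = 0
    for op, tile in steps:
        if op == "render":
            if len(on_chip) == on_chip_capacity:
                on_chip.pop(0)  # spill the oldest tile to external memory
                writes += 1
            on_chip.append(tile)
        else:  # "compute" consumes the intermediate tile
            if tile in on_chip:
                on_chip.remove(tile)  # consumed directly from on-chip memory
            else:
                reads += 1  # must be reloaded from external memory
    return writes, reads


print(external_traffic(schedule(4, interleaved=False)))  # (3, 3)
print(external_traffic(schedule(4, interleaved=True)))   # (0, 0)
```

In this model the interleaved order never spills an intermediate tile, because each tile is consumed by the compute step immediately after it is rendered. Real hardware behaviour depends on tile sizes, cache policy, and the dependencies between passes.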

First claim

Opening claim text (preview).

What is claimed is:

1. A method to reduce external memory accesses in a graphics system having an on-chip memory and an external memory, comprising: processing a combination of tiled graphics rendering operations and tiled compute shader operations of interdependent render targets, each of the interdependent render targets corresponding to a different image frame; and interleaving the combination of the tiled graphics rendering operations and tiled compute shader operations automatically based on the processing, wherein the tiled graphics rendering operations comprise raster operations, pixel processing operations, vertex processing operations, patch processing operations, or primitive processing operations, and wherein the tiled graphics rendering operations are separate from the tiled compute shader operations.

2. The method of claim 1, wherein the interleaving comprises determining a schedule of a sequence of interleaved tiled operations selected such that at least one intermediate data result of a first operation of the sequence is directly consumed from the on-chip memory by a second operation of the sequence.

3. The method of claim 1, wherein the interleaving is scheduled to have a sequence of interleaved tiled operations selected to reduce traffic to the external memory associated with intermediate data results.

4. The method of claim 1, wherein a data access pattern of graphics rendering operations and compute shader operations is analyzed to determine whether a global memory barrier can be safely removed as a condition for interleaving and removing memory barriers.

5. The method of claim 1, wherein the interleaving is performed on a tile basis and a compute shader configured to perform the compute shader operations is configured for a tiled access pattern.

6. The method of claim 5, wherein a workgroup dimension of the compute shader is redefined to be an integer divisor of a width and a height of a tile.

7. The method of claim 6, wherein the processing of the combination comprises replacing, for the compute shader, image load instructions with tile buffer load instructions.

8. The method of claim 1, wherein the processing of the combination comprises analyzing API calls and grouping application programming interface (API) calls to build a sequence of interleaved execution of graphics rendering operations and compute shader operations.

9. The method of claim 1, wherein the processing of the combination comprises generating a directed acyclic graph (DAG) of interdependent render targets and interdependent tiles and utilize the DAG to schedule interleaving of graphics rendering operations and compute shader operations.

10. The method of claim 1, wherein the interleaving comprises grouping API calls to form a sequence of interleaved execution of graphics rendering operations and compute shader operations.

11. The method of claim 1, wherein the processing of the combination comprises analyzing a data access pattern of load and store operation to determine candidates for interleaving.

12. The method of claim 11, wherein the processing of the combination comprises identifying a data access pattern of a statically predetermined strided pattern as a candidate for interleaving.

13. In a graphics processing system, a method comprising: recompiling a compute shader to have a tiled access pattern; and interleaving processing of a graphics renderer and the recompiled compute shader for a set of interdependent images, each of the interdependent images corresponding to a different image frame, wherein the interleaving is performed on a tile-by-tile basis for the interdependent images, wherein the processing of the graphics renderer comprises raster operations, pixel processing operations, vertex processing operations, patch processing operations, or primitive processing operations, and wherein the processing of the graphics renderer is separate from the processing of the compute shader.

14. The method of claim 13, wherein the interleaving processing has a sequence of interleaved operations selected to maintain at least some results of tile processing computations of dependent render targets in an on-chip memory of the graphics processing system.

15. The method of claim 13, further comprising generating a directed acyclic graph (DAG) of interdependent render targets and interdependent tiles and utilizing the DAG to schedule interleaving of the processing of the graphics renderer and the recompiled compute shader.

16. The method of claim 13, wherein the recompiling comprises: recompiling the compute shader to access and output data in a tiled pattern compatible with the tiled access pattern of the graphics renderer.

17. The method of claim 16, wherein recompiling the compute shader to access and output data in a tiled pattern comprising redefining a workgroup dimension of the compute shader to be an integer divisor of a width and a height of a tile.

18. The method of claim 17, further comprising replacing, for the compute shader, image load instructions with tile buffer load instructions.

19. The method of claim 13, further comprising removing a memory barrier between the graphics renderer and the compute shader.

20. The method of claim 13, wherein the interleaving is scheduled to have a sequence of interleaved tiled operations selected to maintain at least one intermediate data result, of an interleaved combination resulting from the interleaving, in an on-chip memory.

21. The method of claim 13, wherein the interleaving is scheduled to have a sequence of interleaved tiled operations selected to reduce traffic to an external memory associated with intermediate data results.

22. A non-transitory computer readable medium having computer code, which when executed on a processor implements a method to: determine dependencies of graphics rendering operations and compute shader operations of a set of interdependent render target operations, each of the interdependent render target operations corresponding to a different image frame; and schedule an interleaved order of tile processing of interleaved graphics rendering and compute shader operations to reduce traffic to an external memory of a graphics system by maintaining at least a subset of intermediate tile processing computations of the interleaved graphics rendering and computer shader operations in an on-chip memory of a graphics processing unit, wherein the graphics rendering operations comprise raster operations, pixel processing operations, vertex processing operations, patch processing operations, or primitive processing operations, and wherein the graphics rendering operations are separate from the compute shader operations.
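Two of the claimed steps lend themselves to a short illustration: redefining a workgroup dimension as an integer divisor of the tile size (claims 6 and 17) and using a DAG of dependencies to order interleaved work (claims 9 and 15). The sketch below is one possible reading, not the patented implementation. The function names, the pass names, the "largest divisor not exceeding the requested size" policy, and the assumption that each pass depends only on the same tile of its predecessor are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def tiled_workgroup_dim(requested, tile_dim):
    """Pick a workgroup size along one axis that evenly divides the tile.

    Assumed policy: the largest divisor of `tile_dim` that does not
    exceed the size the application originally requested.
    """
    for d in range(min(requested, tile_dim), 0, -1):
        if tile_dim % d == 0:
            return d
    raise ValueError("requested and tile_dim must be positive")


def interleaved_order(pass_deps, num_tiles):
    """Order (pass, tile) work items tile by tile.

    `pass_deps` maps each pass to the set of passes it depends on.
    Assumes tile t of a pass only needs tile t of its predecessors,
    so a topological order of the passes can be replayed per tile.
    """
    pass_order = list(TopologicalSorter(pass_deps).static_order())
    return [(p, t) for t in range(num_tiles) for p in pass_order]


# Hypothetical frame: a render pass feeding a compute pass feeding a render pass.
deps = {"blur_cs": {"gbuffer"}, "composite": {"blur_cs"}}
print(tiled_workgroup_dim(24, 32))  # 16
print(interleaved_order(deps, 2))
```

With the dependency chain above, each tile runs `gbuffer`, then `blur_cs`, then `composite` before the next tile starts. A pass that reads neighbouring tiles (for example a wide blur) would need extra edges between tiles, which this sketch does not model.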

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • G06T15/005 · Primary

    General purpose rendering architectures · CPC title

  • General purpose image data processing · CPC title

  • Filling planar surfaces by adding surface attributes, e.g. adding colours or textures · CPC title

  • Memory management · CPC title

  • using colour registers, e.g. to control background, foreground, surface filling (G09G5/06 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10089775B2 cover?
A graphics system interleaves a combination of graphics renderer operations and compute shader operations. A set of API calls is analyzed to determine dependencies and identify candidates for interleaving. A compute shader is adapted to have a tiled access pattern. The interleaving is scheduled to reduce a requirement to access an external memory to perform reads and writes of intermediate data.
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T15/005. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 2, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).