US11670044B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11670044-B2 |
| Application number | US-202217723328-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 18, 2022 |
| Priority date | Apr 21, 2017 |
| Publication date | Jun 6, 2023 |
| Grant date | Jun 6, 2023 |
Official abstract text for this publication.
One embodiment provides for a graphics processing unit comprising a processing cluster to perform coarse pixel shading and output shaded coarse pixels for processing by a pixel processing pipeline and a render cache to store coarse pixel data for input to or output from a pixel processing pipeline.
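As a rough illustration of what coarse pixel shading means in the abstract above, the following minimal sketch runs a shader once per coarse pixel (a `scale` x `scale` block of pixels) and then replicates each result to full resolution. The names `coarse_shade`, `expand_to_full`, and the single-float color are assumptions made for this example; they do not come from the patent and do not model the claimed hardware.

```python
from typing import Callable, List

Color = float  # assumption: one float stands in for a full color value


def coarse_shade(width: int, height: int, scale: int,
                 shader: Callable[[float, float], Color]) -> List[List[Color]]:
    """Invoke the shader once per coarse pixel, sampled at the block center."""
    coarse_w = (width + scale - 1) // scale   # round up to cover partial blocks
    coarse_h = (height + scale - 1) // scale
    return [[shader((cx + 0.5) * scale, (cy + 0.5) * scale)
             for cx in range(coarse_w)]
            for cy in range(coarse_h)]


def expand_to_full(coarse: List[List[Color]], width: int, height: int,
                   scale: int) -> List[List[Color]]:
    """Replicate each coarse value to every pixel its block covers."""
    return [[coarse[y // scale][x // scale] for x in range(width)]
            for y in range(height)]


# Usage: a 4x4 target shaded at scale 2 needs 4 shader calls instead of 16.
coarse = coarse_shade(4, 4, 2, lambda x, y: x + y)
full = expand_to_full(coarse, 4, 4, 2)
```

Keeping the data in coarse form for as long as possible is the point of the render cache described in the abstract; the expansion step here is only shown to make the relationship between coarse and full-rate pixels concrete.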
Opening claim text (preview).
What is claimed is:

1. A graphics processor comprising: a processing cluster including a plurality of processing elements configured to perform coarse pixel shading and output shaded coarse pixels for processing by a post-shader pixel pipeline; a render cache to store coarse pixel data processed by and output from a pixel processing unit of the post-shader pixel pipeline; and a graphics processor cache to store coarse pixel data evicted from the render cache as a coarse pixel.
2. The graphics processor as in claim 1, wherein the render cache of the graphics processing unit is additionally to store coarse pixel data for input to the pixel processing unit of the post-shader pixel pipeline.
3. The graphics processor as in claim 1, wherein the pixel processing unit is configured to perform a post-shader pixel processing operation on the coarse pixel.
4. The graphics processor as in claim 3, wherein the post-shader pixel processing operation includes a stencil, depth, or blend operation.
5. The graphics processor as in claim 1, wherein the processing cluster is configurable to adjust a scale factor of a coarse pixel during the coarse pixel shading.
6. The graphics processor as in claim 1, wherein the pixel pipeline of the graphics processing unit includes a fragment compression unit to implement cacheline aware fragment compression.
7. The graphics processor as in claim 6, wherein the fragment compression unit is to configure a set of pixels associated with a single cacheline of the render cache to be rendered by the post-shader pixel pipeline as a coarse pixel.
8. The graphics processor as in claim 1, wherein the render cache of the graphics processing unit includes a cache allocation unit to perform cacheline aware fragment expansion of a set of coarse pixels.
9. The graphics processor as in claim 8, wherein the cache allocation unit is configured to expand a coarse pixel quad into a pixel quad based on a cache line status associated with the coarse pixel quad.
10. The graphics processor as in claim 1, wherein the post-shader pixel pipeline includes a cache read module to issue a read request to the render cache, the read request to read a coarse pixel quad from the render cache.
11. The graphics processor as in claim 1, wherein the post-shader pixel pipeline includes a cache write module to issue a write request to the render cache, the write request to write a coarse pixel quad to the render cache.
12. A method comprising: performing coarse pixel shading and outputting shaded coarse pixels for processing by a post-shader pixel pipeline via a processing cluster including a plurality of processing elements; storing coarse pixel data in a render cache, the coarse pixel processed by and output from a pixel processing unit of the post-shader pixel pipeline; and storing coarse pixel data evicted from the render cache to a graphics processor cache as a coarse pixel.
13. The method as in claim 12, further comprising storing coarse pixel data for input to the pixel processing unit of the post-shader pixel pipeline in the render cache and performing a post-shader pixel processing operation on the coarse pixel via the pixel processing unit of the post-shader pixel pipeline.
14. The method as in claim 13, wherein the post-shader pixel processing operation includes a stencil, depth, or blend operation.
15. The method as in claim 12, further comprising implementing cacheline aware fragment compression via a fragment compression unit of the post-shader pixel pipeline.
16. The method as in claim 15, wherein the cacheline aware fragment compression configures a set of pixels associated with a single cacheline of the render cache to be rendered by the post-shader pixel pipeline as a coarse pixel.
17. The method as in claim 12, wherein the render cache of the graphics processing unit includes a cache allocation unit to perform cacheline aware fragment expansion of a set of coarse pixels.
18. The method as in claim 17, wherein the cache allocation unit is configured to expand a coarse pixel quad into a pixel quad based on a cache line status associated with the coarse pixel quad.
19. The method as in claim 12, wherein the post-shader pixel pipeline includes a cache read module to issue a read request to the render cache, the read request to read a coarse pixel quad from the render cache.
20. The method as in claim 12, wherein the post-shader pixel pipeline includes a cache write module to issue a write request to the render cache, the write request to write a coarse pixel quad to the render cache.
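The cache behavior recited in the claims (coarse data held in a render cache, evicted to a larger cache still in coarse form, and expanded based on a line status) can be pictured with a toy software model. This is only a sketch under stated assumptions: the `CacheLine.coarse` status bit, the least-recently-used eviction policy, the four-value line layout, and every class and method name below are hypothetical and are not taken from the patent, which describes hardware units.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CacheLine:
    coarse: bool        # hypothetical status bit: True if the line holds one coarse value
    data: List[float]   # 1 value when coarse, 4 values (one per pixel) when expanded


class RenderCache:
    """Toy render cache that evicts least-recently-used lines to a backing store."""

    def __init__(self, capacity: int, backing: Dict[int, CacheLine]):
        self.capacity = capacity
        self.backing = backing  # stands in for the larger graphics processor cache
        self.lines: "OrderedDict[int, CacheLine]" = OrderedDict()

    def _store(self, tag: int, line: CacheLine) -> None:
        self.lines[tag] = line
        self.lines.move_to_end(tag)
        while len(self.lines) > self.capacity:
            old_tag, old_line = self.lines.popitem(last=False)
            # Evicted lines are written back unchanged, so coarse data stays coarse.
            self.backing[old_tag] = old_line

    def write_coarse(self, tag: int, value: float) -> None:
        self._store(tag, CacheLine(coarse=True, data=[value]))

    def write_pixel(self, tag: int, index: int, value: float) -> None:
        if tag in self.lines:
            line = self.lines[tag]
        elif tag in self.backing:
            line = self.backing.pop(tag)
        else:
            line = CacheLine(coarse=False, data=[0.0] * 4)
        if line.coarse:
            # Expand only when a full-rate write actually needs per-pixel storage.
            line = CacheLine(coarse=False, data=line.data * 4)
        line.data[index] = value
        self._store(tag, line)


# Usage: a one-line cache evicts line 0 in coarse form, then expands line 1 on demand.
backing: Dict[int, CacheLine] = {}
cache = RenderCache(capacity=1, backing=backing)
cache.write_coarse(0, 0.5)
cache.write_coarse(1, 0.25)
cache.write_pixel(1, 2, 1.0)
```

The design choice the sketch highlights is that expansion is deferred: a line is converted to per-pixel storage only when a write requires it, so lines that are never touched at full rate occupy a fraction of the space in both caches.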
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Parallel processing · CPC title
Memory management · CPC title
Shading · CPC title
General purpose rendering architectures · CPC title