Programmable graphics processor for multithreaded execution of programs

US9659339B2 · US · B2

Patent metadata
Publication number: US-9659339-B2
Application number: US-201313850175-A
Country: US
Kind code: B2
Filing date: Mar 25, 2013
Priority date: Oct 29, 2003
Publication date: May 23, 2017
Grant date: May 23, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
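The data flow in the abstract can be pictured with a small toy model. This is only a sketch under stated assumptions, not the patented hardware: the class `ExecutionPipeline`, the helpers `rasterize_points` and `render`, and the two lambda "programs" are hypothetical names introduced for illustration, and rasterization is reduced to snapping each vertex to a single pixel.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ExecutionPipeline:
    """Toy stand-in for one execution pipeline with two inputs and two outputs."""
    pixel_in: deque = field(default_factory=deque)    # first input section
    vertex_in: deque = field(default_factory=deque)   # second input section
    pixel_out: list = field(default_factory=list)     # first output section
    vertex_out: list = field(default_factory=list)    # second output section

    def process_vertices(self, program):
        # Drain the vertex input section into the vertex output section.
        while self.vertex_in:
            self.vertex_out.append(program(self.vertex_in.popleft()))

    def process_pixels(self, program):
        # Drain the pixel input section into the pixel output section.
        while self.pixel_in:
            self.pixel_out.append(program(self.pixel_in.popleft()))


def rasterize_points(processed_vertices):
    # Assumption: point rasterization only, one pixel per vertex, snapped to the grid.
    return [(round(x), round(y)) for x, y in processed_vertices]


def render(vertices):
    pipe = ExecutionPipeline()
    pipe.vertex_in.extend(vertices)
    # Hypothetical vertex program: translate every vertex by (1, 1).
    pipe.process_vertices(lambda v: (v[0] + 1.0, v[1] + 1.0))
    # Processed vertex data becomes the input data for pixel processing.
    pipe.pixel_in.extend(rasterize_points(pipe.vertex_out))
    # Hypothetical pixel program: give every pixel a constant colour.
    pipe.process_pixels(lambda p: (p, "white"))
    # A dict stands in for whatever the raster analyzer receives.
    return dict(pipe.pixel_out)


framebuffer = render([(0.2, 0.1), (2.9, 3.0)])
```

The point of the sketch is that the same pipeline object serves both kinds of work; only the input section it reads from and the program it runs differ.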

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor unit; and a graphics processing unit that includes at least one execution pipeline that is configured to execute a plurality of threads to process vertex data and to execute a plurality of threads to process fragment data, wherein the at least one execution pipeline includes at least one programmable computation unit that is programmable to process vertex data during a first pass and fragment data during a subsequent pass based on received program instructions.

2. The system of claim 1, further comprising a memory that stores vertex program instructions that are executed by two or more threads in the at least one execution pipeline.

3. The system of claim 2, wherein the memory further stores fragment program instructions that are executed by two or more threads in the at least one execution pipeline.

4. The system of claim 2, wherein the graphics processing unit further includes a texture unit configured to read texture maps from the memory.

5. The system of claim 1, wherein during the first pass through the at least one execution pipeline, a first plurality of threads is configured to perform vertex processing operations on vertex data.

6. The system of claim 1, wherein, during the subsequent pass through the at least one execution pipeline, a second plurality of threads is configured to perform fragment processing operations on fragment data.

7. The system of claim 1, wherein the graphics processing unit further includes a raster unit configured to output fragment data and to perform scan conversion operations.

8. The system of claim 1, wherein the graphics processing unit further includes a raster operations unit configured to perform blend operations based on processed fragment data received from the at least one execution pipeline.

9. The system of claim 1, further comprising a control unit configured to control whether, in a given pass through the at least one execution pipeline, the at least one execution pipeline is configured to execute the plurality of threads to process vertex data or the plurality of threads to process fragment data.

10. The system of claim 9, wherein the control unit is configured to control the at least one execution pipeline to execute the plurality of threads to process vertex data during the first pass and to execute the plurality of threads to process fragment data during the subsequent pass.

11. The system of claim 1, wherein during the first pass, the programmable computation unit processes the vertex data to produce processed vertex data, and during the subsequent pass, the programmable computation unit processes the fragment data in accordance with the processed vertex data.

12. A graphics processing unit, comprising: at least one execution pipeline configured to execute a plurality of threads to process vertex data and to execute a plurality of threads to process fragment data, wherein the at least one execution pipeline includes at least one programmable computation unit that is programmable to process vertex data during a first pass and fragment data during a subsequent pass based on received program instructions.

13. The graphics processing unit of claim 12, further comprising a memory that stores vertex program instructions that are executed by two or more threads in the at least one execution pipeline.

14. The graphics processing unit of claim 13, wherein the memory further stores fragment program instructions that are executed by two or more threads in the at least one execution pipeline.

15. The graphics processing unit of claim 13, wherein the graphics processing unit further includes a texture unit configured to read texture maps from the memory.

16. The graphics processing unit of claim 12, wherein during the first pass through the at least one execution pipeline, a first plurality of threads is configured to perform vertex processing operations on vertex data.

17. The graphics processing unit of claim 12, wherein, during the subsequent pass through the at least one execution pipeline, a second plurality of threads is configured to perform fragment processing operations on fragment data.

18. The graphics processing unit of claim 12, wherein the graphics processing unit further includes a raster unit configured to output fragment data and to perform scan conversion operations.

19. The graphics processing unit of claim 12, wherein the graphics processing unit further includes a raster operations unit configured to perform blend operations based on processed fragment data received from the at least one execution pipeline.

20. A method, comprising: in a first pass through at least one execution pipeline, executing a first plurality of threads to process vertex data, wherein each execution pipeline is configured to execute a plurality of threads to process vertex data and to execute a plurality of threads to process fragment data; and in a subsequent pass through the at least one execution pipeline, executing a second plurality of threads to process fragment data, wherein the at least one execution pipeline includes at least one programmable computation unit that is programmable to process vertex data during the first pass and fragment data during the subsequent pass based on received program instructions.

21. The method of claim 20, further comprising performing one or more scan conversion operations after the first pass through the at least one execution pipeline and before the subsequent pass through the at least one execution pipeline.

22. The method of claim 20, further comprising performing at least one blend operation based on processed fragment data output from the at least one execution pipeline.

23. The method of claim 20, further comprising storing a vertex program that includes vertex program instructions that are executed by two or more threads in the at least one execution pipeline, and storing a fragment program that includes fragment program instructions that are executed by two or more threads in the at least one execution pipeline.
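The two-pass idea that runs through the independent claims (one programmable computation unit, several threads sharing a stored program, and a control unit choosing vertex or fragment work for a given pass) can be illustrated with a short simulation. Everything here is an assumption made for illustration: the two-instruction mini-ISA in `OPS`, the `Thread` record, the round-robin issue order, and the names `run_pass`, `PROGRAM_MEMORY`, and `control_unit` do not come from the patent. The scan conversion step between passes is left out.

```python
from dataclasses import dataclass

# Hypothetical mini instruction set; each op combines the thread's value with a constant.
OPS = {
    "add": lambda acc, k: acc + k,
    "mul": lambda acc, k: acc * k,
}


@dataclass
class Thread:
    value: int   # the datum this thread is processing
    pc: int = 0  # per-thread program counter


def run_pass(program, inputs):
    """One pass through the computation unit: one thread per input, one shared program."""
    threads = [Thread(value=x) for x in inputs]
    active = [t for t in threads if t.pc < len(program)]
    while active:
        # Assumed round-robin scheduling: each active thread issues one instruction per turn.
        for t in active:
            op, k = program[t.pc]
            t.value = OPS[op](t.value, k)
            t.pc += 1
        active = [t for t in active if t.pc < len(program)]
    return [t.value for t in threads]


# Stand-in for the memory that stores vertex and fragment program instructions.
PROGRAM_MEMORY = {
    "vertex": [("mul", 2), ("add", 1)],
    "fragment": [("add", 10)],
}


def control_unit(pass_kind, data):
    # Selects which stored program the same computation unit runs in this pass.
    return run_pass(PROGRAM_MEMORY[pass_kind], data)


processed_vertices = control_unit("vertex", [1, 2])            # first pass
processed_fragments = control_unit("fragment", processed_vertices)  # subsequent pass
```

Feeding the first pass's output straight into the second pass is a simplification; in the claims, fragment data would come from a raster unit operating on the processed vertex data.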

Assignees

Nvidia Corp

Inventors

Classifications

  • from multiple instruction streams, e.g. multistreaming · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Concurrent instruction execution, e.g. pipeline or look ahead · CPC title

  • Instruction completion, e.g. retiring, committing or graduating · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9659339B2 cover?
A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and …
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date May 23, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).