Method and apparatus for a highly efficient graphics processing unit (GPU) execution model
US-10521874-B2 · Dec 31, 2019 · US
US2019213776A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2019213776-A1 |
| Application number | US-201815864833-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 8, 2018 |
| Priority date | Jan 8, 2018 |
| Publication date | Jul 11, 2019 |
| Grant date | — |
Official abstract text for this publication.
One disclosed embodiment includes a method of scheduling graphics commands for processing. A plurality of micro-commands is generated based on one or more graphics commands obtained from a central processing unit. The dependency between the one or more graphics commands is then determined and an execution graph is generated based on the determined dependency. Each micro-command in the execution graph is connected by an edge to the other micro-commands that it depends on. A wait count is defined for each micro-command of the execution graph, where the wait count indicates the number of micro-commands that the each particular micro-command depends on. One or more micro-commands with a wait count of zero are transmitted to a ready queue for processing.
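The wait-count idea in the abstract can be illustrated with a minimal sketch. This is not the patent's implementation. The function name `build_execution_graph`, the micro-command names `A` through `D`, and the dictionary-based input format are all hypothetical, chosen only to show how a wait count per node and an initial ready queue could be derived from a dependency listing. The sketch assumes every micro-command appears as a key and that no dependency is listed twice.

```python
from collections import defaultdict, deque


def build_execution_graph(dependencies):
    """Build wait counts and an initial ready queue.

    `dependencies` (hypothetical input format) maps each micro-command
    name to the list of micro-commands it depends on.
    """
    dependents = defaultdict(list)  # parent -> micro-commands waiting on it
    wait_count = {}                 # micro-command -> number of unmet dependencies

    for cmd, parents in dependencies.items():
        wait_count[cmd] = len(parents)
        for parent in parents:
            # One edge per dependency: parent must finish before cmd.
            dependents[parent].append(cmd)

    # Micro-commands that depend on nothing can be queued immediately.
    ready_queue = deque(cmd for cmd, n in wait_count.items() if n == 0)
    return dependents, wait_count, ready_queue


# Illustrative graph: C needs A and B; D needs C.
deps = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
dependents, wait_count, ready_queue = build_execution_graph(deps)
print(wait_count)         # {'A': 0, 'B': 0, 'C': 2, 'D': 1}
print(list(ready_queue))  # ['A', 'B']
```

Storing the reverse mapping (`dependents`) alongside the counts means that, when a micro-command finishes, only the nodes that were waiting on it need to be visited.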
What is claimed:
1. A method of low-latency graphics processing, comprising: obtaining, from a central processing unit (CPU), a plurality of graphics commands; generating, based on the plurality of graphics commands, a plurality of micro-commands; determining dependency between the plurality of micro-commands; creating an execution graph based on the determined dependencies, wherein each micro-command is connected by an edge to another micro-command that it depends on; defining a wait count for each micro-command within the execution graph, wherein the wait count for a particular micro-command is a number of micro-commands that the particular micro-command depends on; and transmitting one or more micro-commands with the wait count of zero to a ready queue for processing.
2. The method of claim 1, wherein determining comprises: registering the plurality of micro-commands in a data structure, wherein the data structure tracks the dependency between the plurality of micro-commands.
3. The method claim 2, wherein the data structure comprises a first hash table having an entry for each micro-command stored in the execution graph, each entry including a tuple identifying all of the micro-commands that depend on the entry's corresponding micro-command.
4. The method of claim 1, further comprising: determining, for the plurality of micro-commands, one or more priority categories, wherein each of the micro-commands of the plurality of micro-commands is associated with one of the one or more priority categories.
5. The method of claim 4, further comprising: receiving a priority policy to define priorities of the one or more priority categories, wherein the execution graph is further created based on the priority policy.
6. The method of claim 1, further comprising: updating the wait count for a first group of micro-commands within the execution graph after completion of processing of one or more transmitted micro-command that the first group of micro-commands depend upon.
7. The method of claim 1, wherein the execution graph comprises a directed acyclic graph.
8. A method of low-latency graphics processing, comprising: analyzing an execution graph comprising a plurality of micro-commands, wherein the execution graph represents dependency between the plurality of micro-commands; identifying, based on the analysis, at least one micro-command for processing; storing the at least one micro-command in a ready queue for processing; receiving a first signal indicative that the at least one micro-command has completed processing; determining, in response to the first signal, one or more other micro-commands in the execution graph that are dependent on the at least one micro-command; and updating each identified dependent micro-command to reflect the at least one micro-command has finished processing.
9. The method of claim 8, wherein identifying further comprises: selecting at least one micro-command having a wait count of zero, wherein the wait count represents a number of micro-commands the at least one micro-command depends on.
10. The method of claim 8, wherein updating comprises: decrementing the wait count of each identified dependent command.
11. The method of claim 8, wherein storing comprises: storing the at least one micro-command in one of a pre-processing queue, a kick queue, and a post-processing queue.
12. The method of claim 8, wherein the execution graph comprises a directed acyclic graph.
13. The method of claim 8, further comprising determining a priority of the at least one micro-command.
14. The method of claim 13, further comprising selecting a first micro-command from the ready queue for processing based on a priority of the first micro-command.
15. A non-transitory computer readable medium comprising instructions stored thereon to support graphics processing; the instructions when executed cause one or more processor to: obtain, from a central processing unit (CPU), a plurality of graphics commands; generate, based on the plurality of graphics commands, a plurality of micro-commands; determine dependency between the plurality of micro-commands; create an execution graph based on the determined dependencies, wherein each micro-command is connected by an edge to another micro-command that it depends on; define a wait count for each micro-command within the execution graph, wherein the wait count for a particular micro-command is a number of micro-commands that the particular micro-command depends on; and transmit one or more micro-commands with the wait count of zero to a ready queue for processing.
16. The non-transitory computer readable medium of claim 15, wherein the instructions to cause the one or more processers to determine, further comprises instructions to cause one or more processor to: register the plurality of micro-commands in a data structure, wherein the data structure tracks the dependency between the plurality of micro-commands.
17. The non-transitory computer readable medium of claim 16, wherein the data structure comprises a first hash table having an entry for each micro-command stored in the execution graph, each entry including a tuple identifying all of the micro-commands that depend on the entry's corresponding micro-command.
18. The non-transitory computer readable medium of claim 15, further comprises instructions to cause the one or more processers to: determine, for the plurality of micro-commands, one or more priority categories, wherein each of the micro-commands of the plurality of micro-commands is associated with one of the one or more priority categories.
19. The non-transitory computer readable medium of claim 18, further comprises instructions to cause the one or more processers to: receive a priority policy to define priorities of the one or more priority categories, wherein the execution graph is further created based on the priority policy.
20. The non-transitory computer readable medium of claim 15, further comprises instructions to cause the one or more processers to: update the wait count for a first group of micro-commands within the execution graph after completion of processing of one or more transmitted micro-commands that the first group of micro-commands depend upon.
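The completion-handling loop described in claims 8 to 10, combined with the priority-based selection of claims 13 and 14, might look roughly like the following sketch. It is an illustration under stated assumptions, not the claimed implementation. The function `run_schedule`, the stage names, and the numeric priorities (lower number means higher priority) are hypothetical. Processing is simulated by appending to a list, and the three-queue arrangement of claim 11 is not modeled.

```python
import heapq
from collections import defaultdict


def run_schedule(dependencies, priority):
    """Return the order in which micro-commands would be processed.

    `dependencies` maps each micro-command to the ones it depends on;
    `priority` maps each micro-command to a number (assumption: lower
    values are served first). Both formats are hypothetical.
    """
    dependents = defaultdict(list)  # hash table: command -> its dependents
    wait_count = {}
    for cmd, parents in dependencies.items():
        wait_count[cmd] = len(parents)
        for parent in parents:
            dependents[parent].append(cmd)

    # Ready queue holds only micro-commands whose wait count is zero.
    ready = [(priority[cmd], cmd) for cmd, n in wait_count.items() if n == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, cmd = heapq.heappop(ready)  # select by priority
        order.append(cmd)              # stand-in for processing + completion signal
        for child in dependents[cmd]:
            wait_count[child] -= 1     # decrement each dependent's wait count
            if wait_count[child] == 0:
                heapq.heappush(ready, (priority[child], child))
    return order


# Illustrative graph: pixel needs vertex; post needs pixel and compute.
deps = {"vertex": [], "compute": [], "pixel": ["vertex"], "post": ["pixel", "compute"]}
prio = {"vertex": 0, "pixel": 0, "compute": 1, "post": 1}
order = run_schedule(deps, prio)
print(order)  # ['vertex', 'pixel', 'compute', 'post']
```

In this example `pixel` overtakes `compute` because it becomes ready with a higher priority as soon as `vertex` completes, while `post` is held back until both of its dependencies have finished.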
CPC classification titles:
Precedence
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
Graphics controllers
Shading
General purpose rendering architectures