What technology area does this patent fall under?

Primary CPC classification G06T1/20. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 9, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Dynamic wave pairing

US11954758B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11954758-B2
Application number	US-202217652478-A
Country	US
Kind code	B2
Filing date	Feb 24, 2022
Priority date	Feb 24, 2022
Publication date	Apr 9, 2024
Grant date	Apr 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
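The flow in the abstract (allocate workloads to wave slots, select an execution slot by granularity, execute) can be pictured with a small toy model. This is only an illustrative sketch, not the patented implementation: the class names `WaveSlot` and `ExecutionSlot`, the functions `allocate`, `select_execution_slot`, and `execute`, the first-free-slot policy, and the granularity values 64 and 128 are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WaveSlot:
    index: int
    workload: Optional[str] = None  # name of the GPU workload held in this slot


@dataclass
class ExecutionSlot:
    index: int
    granularity: int  # assumed: number of fibers the slot processes per issue
    executed: List[str] = field(default_factory=list)


def allocate(workloads, wave_slots):
    """Place each workload in the first free wave slot; return the slots used."""
    used = []
    for name in workloads:
        slot = next(s for s in wave_slots if s.workload is None)
        slot.workload = name
        used.append(slot)
    return used


def select_execution_slot(execution_slots, granularity):
    """Pick the first execution slot that runs at the requested granularity."""
    return next(e for e in execution_slots if e.granularity == granularity)


def execute(exec_slot, used_wave_slots):
    """Run the allocated workloads on the chosen slot, then free the wave slots."""
    for slot in used_wave_slots:
        exec_slot.executed.append(slot.workload)
        slot.workload = None
    return exec_slot.executed


# Hypothetical configuration: four wave slots, two execution slots.
wave_slots = [WaveSlot(i) for i in range(4)]
execution_slots = [ExecutionSlot(0, granularity=64), ExecutionSlot(1, granularity=128)]

used = allocate(["wave_a", "wave_b"], wave_slots)
chosen = select_execution_slot(execution_slots, granularity=128)
result = execute(chosen, used)
print(chosen.index, result)
```

In this toy run, both workloads are issued together on the execution slot configured for the larger granularity, and their wave slots are released afterwards.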

First claim

Opening claim text (preview).

What is claimed is:
1. An apparatus for graphics processing, comprising: a memory; and at least one processor coupled to the memory and configured to: allocate one or more graphics processing unit (GPU) workloads to one or more wave slots of a plurality of wave slots; select one of a plurality of granularities based on whether the one or more GPU workloads are associated with or are unassociated with a texture block or a branching block: select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads, the selection of the first execution slot being based on the one of the plurality of granularities; and execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
2. The apparatus of claim 1, wherein the plurality of wave slots comprise a plurality of first wave slots and a plurality of second wave slots, the one or more GPU workloads are allocated based on the plurality of first wave slots and the plurality of second wave slots, and a first GPU workload of the one or more GPU workloads is allocated to a first wave slot of the plurality of first wave slots and a second GPU workload of the one or more GPU workloads is allocated to a second wave slot of the plurality of second wave slots.
3. The apparatus of claim 2, wherein the first GPU workload and the second GPU workload are consecutive in sequence.
4. The apparatus of claim 2, wherein the first GPU workload and the second GPU workload are associated with a same instruction.
5. The apparatus of claim 1, the at least one processor being further configured to: receive, from at least one GPU component at runtime, the one or more GPU workloads.
6. The apparatus of claim 1, wherein a first granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are associated with the texture block or the branching block.
7. The apparatus of claim 6, wherein a second granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are unassociated with the texture block or the branching block, the first granularity being smaller than the second granularity.
8. The apparatus of claim 1, the at least one processor being further configured to: copy a context register for the one or more GPU workloads to the first execution slot, wherein the context register for the one or more GPU workloads is associated with a context state of the one or more GPU workloads.
9. The apparatus of claim 1, the at least one processor being further configured to: copy a context register for the one or more GPU workloads to the one or more wave slots.
10. The apparatus of claim 1, wherein each GPU workload of the one or more GPU workloads includes a uniform number of fibers.
11. The apparatus of claim 1, wherein each wave slot of the plurality of wave slots is associated with one set of general purpose registers (GPRs) of a plurality of GPRs.
12. The apparatus of claim 1, wherein the apparatus is a wireless communication device.
13. A method of graphics processing, comprising: allocating one or more graphics processing unit (GPU) workloads to one or more wave slots of a plurality of wave slots; selecting one of a plurality of granularities based on whether the one or more GPU workloads are associated with or are unassociated with a texture block or a branching block; selecting a first execution slot of a plurality of execution slots for executing the one or more GPU workloads, the selection of the first execution slot being based on the one of the plurality of granularities; and executing, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
14. The method of claim 13, wherein the plurality of wave slots comprise a plurality of first wave slots and a plurality of second wave slots, the one or more GPU workloads are allocated based on the plurality of first wave slots and the plurality of second wave slots, and a first GPU workload of the one or more GPU workloads is allocated to a first wave slot of the plurality of first wave slots and a second GPU workload of the one or more GPU workloads is allocated to a second wave slot of the plurality of second wave slots.
15. The method of claim 14, wherein the first GPU workload and the second GPU workload are consecutive in sequence.
16. The method of claim 14, wherein the first GPU workload and the second GPU workload are associated with a same instruction.
17. The method of claim 13, further comprising: receiving, from at least one GPU component, the one or more GPU workloads.
18. The method of claim 13, wherein a first granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are associated with the texture block or the branching block.
19. The method of claim 18, wherein a second granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are unassociated with the texture block or the branching block, the first granularity being smaller than the second granularity.
20. The method of claim 13, further comprising: copying a context register for the one or more GPU workloads to the first execution slot, wherein the context register for the one or more GPU workloads is associated with a context state of the one or more GPU workloads.
21. The method of claim 13, further comprising: copying a context register for the one or more GPU workloads to the one or more wave slots.
22. The method of claim 13, wherein each GPU workload of the one or more GPU workloads includes a uniform number of fibers.
23. The method of claim 13, wherein each wave slot of the plurality of wave slots is associated with one set of general purpose registers (GPRs) of a plurality of GPRs.
24. A non-transitory computer-readable medium storing computer executable code, the code when executed by at least one processor, causes the at least one processor to: allocate one or more graphics processing unit (GPU) workloads to one or more wave slots of a plurality of wave slots; select one of a plurality of granularities based on whether the one or more GPU workloads are associated with or are unassociated with a texture block or a branching block; select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads, the selection of the first execution slot being based on the one of the plurality of granularities; and execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
25. The non-transitory computer-readable medium of claim 24, wherein the plurality of wave slots comprise a plurality of first wave slots and a plurality of second wave slots, the one or more GPU workloads are allocated based on the plurality of first wave slots and the plurality of second wave slots, and a first GPU workload of the one or more GPU workloads is allocated to a first wave slot of the plurality of first wave slots and a second GPU workload of the one or more GPU workloads is allocated to a second wave slot of the plurality of second wave slots.
26. The non-transitory computer-readable medium of claim 25, wherein the first GPU workload and the second GPU workload are consecutive in sequence.
27. The
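Two rules in the claims lend themselves to a short illustration: the granularity choice in claims 6 and 7 (a smaller granularity when the workloads are associated with a texture block or a branching block) and the pairing in claims 2 and 3 (consecutive workloads placed in a first and a second wave slot). The sketch below is a simplified reading, not the claimed apparatus: the constants `FIRST_GRANULARITY` and `SECOND_GRANULARITY`, their values, the function names, and the handling of a leftover unpaired workload are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed fiber counts; the claims only require the first to be smaller.
FIRST_GRANULARITY = 64
SECOND_GRANULARITY = 128


def select_granularity(in_texture_block: bool, in_branching_block: bool) -> int:
    """Claims 6-7 (simplified): smaller granularity for texture or branching blocks."""
    if in_texture_block or in_branching_block:
        return FIRST_GRANULARITY
    return SECOND_GRANULARITY


def pair_workloads(workloads: List[str]) -> List[Tuple[str, ...]]:
    """Claims 2-3 (simplified): consecutive workloads share a slot pair.

    Each tuple is (workload in a first wave slot, workload in a second wave slot).
    A trailing workload with no partner gets a one-element tuple (an assumption).
    """
    return [tuple(workloads[i:i + 2]) for i in range(0, len(workloads), 2)]


pairs = pair_workloads(["w0", "w1", "w2"])
print(pairs)
print(select_granularity(True, False), select_granularity(False, False))
```

Here `w0` and `w1` form a pair, `w2` is left unpaired, and the texture-block case selects the smaller of the two assumed granularities.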

Assignees

Qualcomm Inc

Inventors

Classifications

CPC code	CPC title
G06T15/005	General purpose rendering architectures
G06T2200/28	involving image processing hardware
G06F9/30127	Register windows
G06F9/3836	Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
G06F9/505	considering the load

Patent family

Related publications grouped by family.

View patent family 85685395

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11954758B2 cover?: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads…
Who is the assignee on this patent?: Qualcomm Inc

Related patents

User-configurable memory allocation

Deferred GPR allocation for texture/load instruction block

Efficient data sharing for graphics data processing operations

Methods and apparatus for wave slot management

General purpose register and wave slot allocation in graphics processing

Software-controlled variable wavefront size execution at GPU
