Dynamic wave pairing

US11954758B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11954758-B2
Application number: US-202217652478-A
Country: US
Kind code: B2
Filing date: Feb 24, 2022
Priority date: Feb 24, 2022
Publication date: Apr 9, 2024
Grant date: Apr 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
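The three steps named in the abstract (allocate to wave slots, select an execution slot based on a granularity, execute at that granularity) can be sketched in software as below. This is only an illustrative model, not the patented hardware design: the `WaveSlot` and `ExecutionSlot` classes, the `width` field, the "first idle slot that is wide enough" rule, and the idea of counting granularity in waves issued together are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WaveSlot:
    index: int
    workload: Optional[str] = None  # name of the workload held, if any


@dataclass
class ExecutionSlot:
    index: int
    width: int  # hypothetical: how many waves the slot can issue together
    busy: bool = False


def allocate(workloads: List[str], wave_slots: List[WaveSlot]) -> List[WaveSlot]:
    """Put each workload into the next free wave slot and return the slots used."""
    free = [s for s in wave_slots if s.workload is None]
    if len(free) < len(workloads):
        raise RuntimeError("not enough free wave slots")
    used = free[: len(workloads)]
    for slot, name in zip(used, workloads):
        slot.workload = name
    return used


def select_execution_slot(slots: List[ExecutionSlot], granularity: int) -> ExecutionSlot:
    """Pick the first idle execution slot wide enough for the granularity (assumed rule)."""
    for slot in slots:
        if not slot.busy and slot.width >= granularity:
            return slot
    raise RuntimeError("no suitable execution slot")


def execute(used: List[WaveSlot], slot: ExecutionSlot, granularity: int) -> List[List[str]]:
    """Issue the allocated workloads on the slot in groups of `granularity` waves."""
    slot.busy = True
    names = [s.workload for s in used]
    batches = [names[i : i + granularity] for i in range(0, len(names), granularity)]
    slot.busy = False
    return batches


wave_slots = [WaveSlot(i) for i in range(4)]
exec_slots = [ExecutionSlot(0, width=1), ExecutionSlot(1, width=2)]
used = allocate(["w0", "w1", "w2"], wave_slots)
chosen = select_execution_slot(exec_slots, granularity=2)
batches = execute(used, chosen, granularity=2)
```

With a granularity of 2, the narrower execution slot is skipped and the three workloads are issued as one pair followed by a single leftover workload.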

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus for graphics processing, comprising: a memory; and at least one processor coupled to the memory and configured to: allocate one or more graphics processing unit (GPU) workloads to one or more wave slots of a plurality of wave slots; select one of a plurality of granularities based on whether the one or more GPU workloads are associated with or are unassociated with a texture block or a branching block; select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads, the selection of the first execution slot being based on the one of the plurality of granularities; and execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.

2. The apparatus of claim 1, wherein the plurality of wave slots comprise a plurality of first wave slots and a plurality of second wave slots, the one or more GPU workloads are allocated based on the plurality of first wave slots and the plurality of second wave slots, and a first GPU workload of the one or more GPU workloads is allocated to a first wave slot of the plurality of first wave slots and a second GPU workload of the one or more GPU workloads is allocated to a second wave slot of the plurality of second wave slots.

3. The apparatus of claim 2, wherein the first GPU workload and the second GPU workload are consecutive in sequence.

4. The apparatus of claim 2, wherein the first GPU workload and the second GPU workload are associated with a same instruction.

5. The apparatus of claim 1, the at least one processor being further configured to: receive, from at least one GPU component at runtime, the one or more GPU workloads.

6. The apparatus of claim 1, wherein a first granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are associated with the texture block or the branching block.

7. The apparatus of claim 6, wherein a second granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are unassociated with the texture block or the branching block, the first granularity being smaller than the second granularity.

8. The apparatus of claim 1, the at least one processor being further configured to: copy a context register for the one or more GPU workloads to the first execution slot, wherein the context register for the one or more GPU workloads is associated with a context state of the one or more GPU workloads.

9. The apparatus of claim 1, the at least one processor being further configured to: copy a context register for the one or more GPU workloads to the one or more wave slots.

10. The apparatus of claim 1, wherein each GPU workload of the one or more GPU workloads includes a uniform number of fibers.

11. The apparatus of claim 1, wherein each wave slot of the plurality of wave slots is associated with one set of general purpose registers (GPRs) of a plurality of GPRs.

12. The apparatus of claim 1, wherein the apparatus is a wireless communication device.

13. A method of graphics processing, comprising: allocating one or more graphics processing unit (GPU) workloads to one or more wave slots of a plurality of wave slots; selecting one of a plurality of granularities based on whether the one or more GPU workloads are associated with or are unassociated with a texture block or a branching block; selecting a first execution slot of a plurality of execution slots for executing the one or more GPU workloads, the selection of the first execution slot being based on the one of the plurality of granularities; and executing, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.

14. The method of claim 13, wherein the plurality of wave slots comprise a plurality of first wave slots and a plurality of second wave slots, the one or more GPU workloads are allocated based on the plurality of first wave slots and the plurality of second wave slots, and a first GPU workload of the one or more GPU workloads is allocated to a first wave slot of the plurality of first wave slots and a second GPU workload of the one or more GPU workloads is allocated to a second wave slot of the plurality of second wave slots.

15. The method of claim 14, wherein the first GPU workload and the second GPU workload are consecutive in sequence.

16. The method of claim 14, wherein the first GPU workload and the second GPU workload are associated with a same instruction.

17. The method of claim 13, further comprising: receiving, from at least one GPU component, the one or more GPU workloads.

18. The method of claim 13, wherein a first granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are associated with the texture block or the branching block.

19. The method of claim 18, wherein a second granularity is selected as the one of the plurality of granularities when the one or more GPU workloads are unassociated with the texture block or the branching block, the first granularity being smaller than the second granularity.

20. The method of claim 13, further comprising: copying a context register for the one or more GPU workloads to the first execution slot, wherein the context register for the one or more GPU workloads is associated with a context state of the one or more GPU workloads.

21. The method of claim 13, further comprising: copying a context register for the one or more GPU workloads to the one or more wave slots.

22. The method of claim 13, wherein each GPU workload of the one or more GPU workloads includes a uniform number of fibers.

23. The method of claim 13, wherein each wave slot of the plurality of wave slots is associated with one set of general purpose registers (GPRs) of a plurality of GPRs.

24. A non-transitory computer-readable medium storing computer executable code, the code when executed by at least one processor, causes the at least one processor to: allocate one or more graphics processing unit (GPU) workloads to one or more wave slots of a plurality of wave slots; select one of a plurality of granularities based on whether the one or more GPU workloads are associated with or are unassociated with a texture block or a branching block; select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads, the selection of the first execution slot being based on the one of the plurality of granularities; and execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.

25. The non-transitory computer-readable medium of claim 24, wherein the plurality of wave slots comprise a plurality of first wave slots and a plurality of second wave slots, the one or more GPU workloads are allocated based on the plurality of first wave slots and the plurality of second wave slots, and a first GPU workload of the one or more GPU workloads is allocated to a first wave slot of the plurality of first wave slots and a second GPU workload of the one or more GPU workloads is allocated to a second wave slot of the plurality of second wave slots.

26. The non-transitory computer-readable medium of claim 25, wherein the first GPU workload and the second GPU workload are consecutive in sequence.

27. The
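The granularity rule in claims 6 and 7 and the pairing of consecutive workloads in claims 2 and 3 can be illustrated with a short sketch. The claims state only that the first granularity is smaller than the second; the numeric values 1 and 2, the `Granularity` enum, and both function names are hypothetical choices made for this example, and the pairing helper is one simple reading of "consecutive in sequence", not the claimed mechanism itself.

```python
from enum import IntEnum
from typing import List, Tuple


class Granularity(IntEnum):
    FINE = 1    # hypothetical value standing in for the "first" (smaller) granularity
    COARSE = 2  # hypothetical value standing in for the "second" (larger) granularity


def select_granularity(uses_texture_block: bool, uses_branching_block: bool) -> Granularity:
    """Choose the smaller granularity when a texture or branching block is involved."""
    if uses_texture_block or uses_branching_block:
        return Granularity.FINE
    return Granularity.COARSE


def pair_consecutive(workloads: List[str]) -> List[Tuple[str, str]]:
    """Pair consecutive workloads: the first of each pair would go to a 'first'
    wave slot and the next to a 'second' wave slot. A trailing unpaired
    workload is left out of the result."""
    return [(workloads[i], workloads[i + 1]) for i in range(0, len(workloads) - 1, 2)]


texture_case = select_granularity(uses_texture_block=True, uses_branching_block=False)
plain_case = select_granularity(uses_texture_block=False, uses_branching_block=False)
pairs = pair_consecutive(["w0", "w1", "w2", "w3"])
```

Using an `IntEnum` keeps the "first granularity being smaller than the second" relationship checkable with an ordinary comparison.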

Assignees

Qualcomm Inc

Inventors

Classifications

  • General purpose rendering architectures

  • involving image processing hardware

  • Register windows

  • Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution

  • considering the load

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11954758B2 cover?
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 9, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).