Control flow mechanism for execution of graphics processor instructions using active channel packing

US2021286626A1 · US · A1

Patent metadata
Publication number: US-2021286626-A1
Application number: US-202117213453-A
Country: US
Kind code: A1
Filing date: Mar 26, 2021
Priority date: Apr 21, 2017
Publication date: Sep 16, 2021
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes a plurality of execution units to execute single instruction, multiple data (SIMD) instructions and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.
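As a rough illustration of what the abstract describes, the sketch below models a per-channel branch mask in plain Python. The function names `is_divergent` and `active_subset` and the 8-channel width are assumptions made for this example; they do not come from the publication, which describes hardware logic rather than software.

```python
from typing import List


def is_divergent(branch_taken: List[bool]) -> bool:
    """Control flow diverges when the channels disagree on the branch."""
    return any(branch_taken) and not all(branch_taken)


def active_subset(branch_taken: List[bool]) -> List[int]:
    """Indices of the channels that take the branch."""
    return [i for i, taken in enumerate(branch_taken) if taken]


# Hypothetical 8-channel SIMD group in which only two channels take a branch.
mask = [False, True, False, False, False, False, True, False]
lanes: List[int] = []
if is_divergent(mask):
    # Only these channels need to execute the branch body.
    lanes = active_subset(mask)
```

In this model, a uniform mask (all channels taken or none taken) is not divergent, so no reduction to a subset is attempted.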

First claim

Opening claim text (preview).

1. An apparatus comprising: a graphics processor, including: a plurality of processing resources to execute pipelined instructions using a plurality of channels; and flow control circuitry to: detect that a number of active channels is below a predetermined threshold percentage of the plurality of channels; identify a code region impacted by diverging control flow; duplicate the code region with the number of active channels; pack the input of the active channels implemented by the code region; and unpack the output of the active channels produced by the code region.
2. (canceled)
3. (canceled)
4. The apparatus of claim 1, wherein the flow control circuitry is to pack input to the active channels into a subset of the plurality of channels responsive to a determination that the number of active channels is below the predetermined threshold percentage.
5. The apparatus of claim 4, wherein the flow control circuitry is to detect the active channels detects whether the active channels are spread over multiple sections of the plurality of processing resources.
6. The apparatus of claim 5, wherein the flow control circuitry prevents packing input to the active channels into the subset of the channels upon detecting that the active channels are not spread over the multiple sections.
7. The apparatus of claim 1, wherein the flow control circuitry is to duplicate the identified code region within the subset of the SAID channels.
8. The apparatus of claim 7, wherein the subset of the channels comprise half of the channels.
9. The apparatus of claim 4, further comprising a register file including: one or more bit registers; and one or more byte registers including a plurality of registers.
10. The apparatus of claim 9, wherein the flow control circuitry is to locate indices of register bits that are set in a bit register and write the indices to a bytes register.
11. The apparatus of claim 4, wherein the flow control circuitry is further to unpack output from the subset of the channels into the plurality of channels.
12. The apparatus of claim 1, wherein the flow control circuitry is further to perform consecutive channel execution upon detecting the diverging control flow.
13. The apparatus of claim 1, wherein the flow control circuitry is further to: detect shader branch instructions; and reconfigure hardware resources upon detection of the shader branch instructions.
14. The apparatus of claim 13, wherein the flow control circuitry is further to inject instructions into profile branch directions based on statistical sampling.
15. A method comprising: detecting that a number of active channels is below a predetermined threshold percentage of a plurality of channels of processing resources of a graphics processor; identifying a code region impacted by diverging control flow; duplicating the code region with the number of active channels; packing the input of the active channels implemented by the code region; and unpacking the output of the active channels produced by the code region.
16. (canceled)
17. The method of claim 15, further comprising: detecting whether the active channels are spread over multiple sections of the processing resources; and preventing packing input to the active channels into a subset of the channels responsive to detecting that the active channels are not spread over the multiple sections.
18. The method of claim 17, wherein unpacking the output further comprises unpacking the output from the subset of the channels into the plurality of channels.
19. A non-transitory computer readable medium having instructions, which when executed by one or more processors, cause the processors to: detect that a number of active channels is below a predetermined threshold percentage of a plurality of channels of processing resources of a graphics processor of the one or more processors; identify a code region impacted by diverging control flow; duplicate the code region with the number of active channels; pack the input of the active channels implemented by the code region; and unpack the output of the active channels produced by the code region.
20. The non-transitory computer readable medium of claim 19, having instructions, which when executed by the one or more processors, further causes the processors to: detect whether the active channels are spread over multiple sections of the processing resources; and prevent packing input to the active channels into a subset of the channels on responsive to detecting that the active SAID channels are not spread over the multiple sections.
21. The non-transitory computer readable medium of claim 19, wherein unpacking the output further comprises unpacking the output from the subset of the channels into the plurality of channels.
22. The non-transitory computer readable medium of claim 19, having instructions, which when executed by the one or more processors, further causes the processors to: detect shader branch instructions; and reconfigure hardware resources upon detection of the shader branch instructions.
23. The method of claim 15, further comprising: detecting shader branch instructions; and reconfiguring hardware resources upon detection of the shader branch instructions.
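The steps recited in claims 1, 5, 6, 8 and 10 can be loosely sketched in software. The Python below is only an interpretation under stated assumptions: the names `set_bit_indices`, `spread_over_sections` and `run_region` are hypothetical, a "section" is assumed to be one half of the channels, the default threshold of 50% is an arbitrary choice, and "duplicating the code region" is modeled simply as running the same function over the packed inputs. The publication claims circuitry, so this is not the claimed implementation.

```python
from typing import Callable, List, Sequence, Tuple


def set_bit_indices(bit_register: int, width: int) -> List[int]:
    """Claim 10 analogue: indices of the set bits in a bit register,
    i.e. the values that would be written to a byte register."""
    return [i for i in range(width) if (bit_register >> i) & 1]


def spread_over_sections(indices: Sequence[int], section_size: int) -> bool:
    """Claims 5-6 analogue: are the active channels in more than one section?"""
    return len({i // section_size for i in indices}) > 1


def run_region(
    region: Callable[[int], int],
    inputs: List[int],
    bit_register: int,
    threshold: float = 0.5,  # assumed value; the claims only say "predetermined"
) -> Tuple[List[int], bool]:
    width = len(inputs)
    section_size = max(1, width // 2)  # assumption: a section is half the channels
    indices = set_bit_indices(bit_register, width)

    # Pack only when few channels are active AND they span several sections.
    below_threshold = 0 < len(indices) < threshold * width
    packed = below_threshold and spread_over_sections(indices, section_size)

    outputs = list(inputs)  # assumption: inactive channels keep their values
    if packed:
        packed_inputs = [inputs[i] for i in indices]          # pack
        packed_outputs = [region(x) for x in packed_inputs]   # narrower region copy
        for i, value in zip(indices, packed_outputs):         # unpack
            outputs[i] = value
    else:
        for i in indices:  # ordinary masked execution at full width
            outputs[i] = region(inputs[i])
    return outputs, packed


inputs = list(range(8))
out_a, packed_a = run_region(lambda x: x * 10, inputs, 0b01000010)  # channels 1, 6
out_b, packed_b = run_region(lambda x: x * 10, inputs, 0b00000011)  # channels 0, 1
```

With channels 1 and 6 active, the active channels sit in different halves and the packed path is taken; with channels 0 and 1 active, they already share one half, so packing is skipped. Both paths write the same per-channel results; in this model the packed path differs only in how many channels the region copy is run over.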

Assignees

Intel Corp

Inventors

Classifications

  • G06T1/20 (Primary)

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • G06F9/3887 (Primary)

    controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Conditional branch instructions · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • for indirect branch instructions · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021286626A1 cover?
An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes a plurality of execution units to execute single instruction, multiple data (SIMD) instructions and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 16, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).