Systems and methods for synchronization of multi-thread lanes

US12223353B2 · US · B2

Patent metadata
Publication number: US-12223353-B2
Application number: US-202318481489-A
Country: US
Kind code: B2
Filing date: Oct 5, 2023
Priority date: Mar 15, 2019
Publication date: Feb 11, 2025
Grant date: Feb 11, 2025

Abstract

Official abstract text for this publication.

Apparatuses to synchronize lanes that diverge or threads that drift are disclosed. In one embodiment, a graphics multiprocessor includes a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types. A regroup engine (or regroup circuitry) regroups threads into a third group having threads of the first instruction type and a fourth group having threads of the second instruction type.
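The regrouping step described in the abstract can be illustrated with a small software sketch. This is not taken from the patent, which describes hardware circuitry; it only models the data movement. The function name `regroup_by_instruction_type`, the tuple layout, and the example instruction types ("fma", "load") are assumptions made for illustration.

```python
from collections import defaultdict


def regroup_by_instruction_type(groups):
    """Collect threads from mixed groups into one group per instruction type.

    `groups` is a list of groups; each group is a list of
    (thread_id, instruction_type) pairs. Hypothetical layout, for illustration.
    """
    by_type = defaultdict(list)
    for group in groups:
        for thread_id, instr_type in group:
            by_type[instr_type].append(thread_id)
    return dict(by_type)


# Initial state: two groups, each mixing both instruction types.
initial_queue = [
    [(0, "fma"), (1, "load"), (2, "fma"), (3, "load")],   # first group
    [(4, "load"), (5, "fma"), (6, "load"), (7, "fma")],   # second group
]

# Regrouped state: one group per instruction type (the "third" and "fourth" groups).
regrouped = regroup_by_instruction_type(initial_queue)
print(regrouped)
```

After regrouping, every thread in a group issues the same kind of instruction, so a scheduler could dispatch each group to a matching processing resource without idle lanes. The sketch does not model the order in which regrouped groups are re-inserted into the queue.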

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics multiprocessor, comprising: a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types; and a regroup circuitry to regroup threads from the initial state of groups of first and second groups into a regrouped state of groups including a third group having threads of the first instruction type and a fourth group having threads of the second instruction type based on an instruction type and to determine an order of inserting the third group and the fourth group into the queue to minimize divergence between threads.

2. The graphics multiprocessor of claim 1, wherein the regroup circuitry to select one or more threads from one or more groups that are set to execute an instruction and combine the one or more threads into a single group.

3. The graphics multiprocessor of claim 1, wherein each of the first instruction type and the second instruction type comprise one of a load/store instruction, an integer instruction, a floating point instruction, an integer mac instruction, an integer add instruction, a floating point add instruction, a floating point fused multiply-add (fma) instruction, a floating point sine instruction, or a floating point cosine instruction.

4. The graphics multiprocessor of claim 1, further comprising: a thread scheduler coupled to the queue; and a plurality of processing resources coupled to the thread scheduler.

5. The graphics multiprocessor of claim 4, wherein the thread scheduler is configured to schedule the first instruction type of the third group for execution on a first processing resource with full utilization of this first processing resource.

6. The graphics multiprocessor of claim 4, wherein the thread scheduler is configured to schedule the second instruction type of the fourth group for execution on a second processing resource with full utilization of this second processing resource.

7. The graphics multiprocessor of claim 1, wherein the regroup circuitry utilizes regrouping policies and an order that a new regrouped group is inserted in the queue is optimized depending on latencies.

8. A graphics processor, comprising: one or more processing resources to process groupings of threads; and thread control circuitry coupled to the one or more processing resources, the thread control circuitry is configured to determine groupings of instantiated threads, to determine progress of the threads for executing a task on the one or more processing resources, and to determine drift between threads.

9. The graphics processor of claim 8, wherein the thread control circuitry is further configured to determine whether a drift between threads exceeds a threshold drift.

10. The graphics processor of claim 8, wherein the thread control circuitry is further configured to accelerate at least one thread that lags other threads by at least the threshold drift.

11. The graphics processor of claim 8, wherein the thread control circuitry accelerates the at least one thread by applying a higher priority to this at least one thread than other threads.

12. The graphics processor of claim 8, wherein the processing resources to process threads for a single instruction multiple data (SIMD) execution model.

13. A method for scheduling optimization of threads of a graphics processing unit, a graphics multiprocessor, or a graphic processor, the method comprising: starting processing of groupings of threads on one or more processing resources; monitoring, with a thread control circuitry of the graphics processing unit, the graphics multiprocessor, or the graphic processor, progress of each thread for a group; and determining, with the thread control circuitry, drift between threads.

14. The method of claim 13, further comprising: determining, with the graphics processing unit, the graphics multiprocessor, or the graphic processor, the groupings of threads.

15. The method of claim 13, further comprising: determining, with the thread control circuitry, whether a drift between threads exceeds a threshold drift.

16. The method of claim 13, further comprising: rescheduling, with the thread control circuitry, at least one thread that lags other threads.

17. The method of claim 13, wherein the thread control circuitry provides a higher priority level for the at least one thread that lags other threads to reschedule the at least one thread.
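The drift handling in claims 8 to 17 (monitor per-thread progress, compare drift against a threshold, raise the priority of lagging threads) can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the claimed circuitry: progress is modeled as a count of completed instructions, drift as the distance behind the most advanced thread, and the names `find_lagging_threads` and `assign_priorities`, the threshold of 32, and the two priority levels are all hypothetical.

```python
def find_lagging_threads(progress, threshold):
    """Return ids of threads that trail the leading thread by at least `threshold`.

    `progress` maps thread_id -> completed-instruction count (an assumed metric).
    """
    lead = max(progress.values())
    return sorted(t for t, done in progress.items() if lead - done >= threshold)


def assign_priorities(progress, threshold, base=0, boosted=1):
    """Give lagging threads a higher (hypothetical) priority level than the rest."""
    lagging = set(find_lagging_threads(progress, threshold))
    return {t: (boosted if t in lagging else base) for t in progress}


# Thread 2 has fallen well behind the other threads in its group.
progress = {0: 120, 1: 118, 2: 75, 3: 121}

lagging = find_lagging_threads(progress, threshold=32)
priorities = assign_priorities(progress, threshold=32)
print(lagging, priorities)
```

The sketch uses "at least the threshold" as the trigger, following the wording of claim 10; a strict "exceeds" comparison, as in claims 9 and 15, would replace `>=` with `>`.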

Assignees

Intel Corp

Inventors

Classifications

  • G06F9/3851 (Primary)

    from multiple instruction streams, e.g. multistreaming · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • Concurrent instruction execution, e.g. pipeline or look ahead · CPC title

  • G06T1/20 (Primary)

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • single instruction multiple data [SIMD] multiprocessors · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12223353B2 cover?
Apparatuses to synchronize lanes that diverge or threads that drift are disclosed. In one embodiment, a graphics multiprocessor includes a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types. A regroup engine (or regroup circuitry) regroups threads into a third…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/3851. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 11, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).