Efficient thread group scheduling

US2018293102A1 · US · A1

Patent metadata
Publication number: US-2018293102-A1
Application number: US-201715482801-A
Country: US
Kind code: A1
Filing date: Apr 9, 2017
Priority date: Apr 9, 2017
Publication date: Oct 11, 2018
Grant date: (not shown)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.
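The three steps named in the abstract (detect dependency information, build a tree of thread groups, schedule groups that share a similar dependency) can be illustrated with a small sketch. This is not the patented implementation: the names `Thread`, `GroupNode`, `build_group_tree`, and `schedule` are hypothetical, and treating "similar dependency" as "identical set of dependencies", with nesting by set inclusion, is an assumption made only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Thread:
    """Hypothetical thread record: an id plus the resources it depends on."""
    tid: int
    depends_on: frozenset = frozenset()


@dataclass
class GroupNode:
    """Tree node holding one thread group (assumed structure)."""
    key: frozenset
    threads: list = field(default_factory=list)
    children: list = field(default_factory=list)


def build_group_tree(threads):
    """Group threads by dependency set, then nest groups by set inclusion."""
    buckets = defaultdict(list)
    for t in threads:
        buckets[t.depends_on].append(t)

    root = GroupNode(key=frozenset(), threads=buckets.pop(frozenset(), []))
    nodes = [root]
    # Visit smaller dependency sets first so a parent exists before its children.
    for key in sorted(buckets, key=lambda k: (len(k), sorted(k))):
        node = GroupNode(key=key, threads=buckets[key])
        # Attach under the existing node with the largest key that is a subset.
        parent = max((n for n in nodes if n.key <= key), key=lambda n: len(n.key))
        parent.children.append(node)
        nodes.append(node)
    return root


def schedule(root):
    """Return thread-id batches, one per group, parents before children."""
    order, queue = [], [root]
    while queue:
        node = queue.pop(0)
        if node.threads:
            order.append([t.tid for t in node.threads])
        queue.extend(node.children)
    return order


demo = [
    Thread(0), Thread(1),
    Thread(2, frozenset({"A"})), Thread(3, frozenset({"A"})),
    Thread(4, frozenset({"A", "B"})), Thread(5, frozenset({"A", "B"})),
    Thread(6, frozenset({"C"})), Thread(7, frozenset({"C"})),
]
print(schedule(build_group_tree(demo)))  # [[0, 1], [2, 3], [6, 7], [4, 5]]
```

Threads with the same dependency set are dispatched as one batch, and a group that depends on more resources is only reached after the groups it builds on, which is one simple way to avoid scheduling conflicting dependencies together.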

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: detection/observation logic, as facilitated by or at least partially implemented in a processor, to detect dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to the processor including a graphics processor; and thread dependency logic, as facilitated by or at least partially implemented in a processor, to generate a tree of thread groups based on the dependency information, wherein each thread group includes multiple threads; and a scheduler, as facilitated by or at least partially implemented in a processor, to schedule one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.

2. The apparatus of claim 1, wherein the tree comprises multiple nodes, wherein each node represents a thread group or a thread.

3. The apparatus of claim 1, further comprising partial application preemption logic, as facilitated by or at least partially implemented in a processor, to suspend one or more thread groups upon encountering a condition, wherein the one or more threads to store one or more sets of context information relating to the condition, wherein the partial preemption logic is further to facilitate dispatching of another thread group while the one or more thread groups remain suspended.

4. The apparatus of claim 3, wherein the partial application preemption logic is further to resume processing of the one or more thread groups upon satisfying the condition and using the one or more sets of context information.

5. The apparatus of claim 1, further comprising multi-layer processing logic, as facilitated by or at least partially implemented in a processor, to facilitate processing of the plurality of thread groups using multiple processing layers of the graphics processor, wherein each processing layer includes one or more streaming multiprocessors.

6. The apparatus of claim 5, further comprising prioritization logic, as facilitated by or at least partially implemented in a processor, to prioritize a first thread group of the plurality of thread groups over a second thread group of the plurality of thread groups based on priority of a first task associated with the first thread group being superior to a second task associated with the second thread group, wherein the tasks include the first and second tasks.

7. The apparatus of claim 1, wherein the graphics processor is co-located with an application processor on a common semiconductor package.

8. A method comprising: detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor; generating a tree of thread groups based on the dependency information, wherein each thread group includes multiple threads; and scheduling one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.

9. The method of claim 8, wherein the tree comprises multiple nodes, wherein each node represents a thread group or a thread.

10. The method of claim 8, further comprising: suspending one or more thread groups upon encountering a condition, wherein the one or more threads to store one or more sets of context information relating to the condition; and facilitating dispatching of another thread group while the one or more thread groups remain suspended.

11. The method of claim 10, further comprising resuming processing of the one or more thread groups upon satisfying the condition and using the one or more sets of context information.

12. The method of claim 8, further comprising facilitating processing of the plurality of thread groups using multiple processing layers of the graphics processor, wherein each processing layer includes one or more streaming multiprocessors.

13. The method of claim 12, further comprising prioritizing a first thread group of the plurality of thread groups over a second thread group of the plurality of thread groups based on priority of a first task associated with the first thread group being superior to a second task associated with the second thread group, wherein the tasks include the first and second tasks.

14. The method of claim 8, wherein the graphics processor is co-located with an application processor on a common semiconductor package.

15. At least one machine-readable medium comprising instructions that when executed by a computing device, cause the computing device to perform operations comprising: detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor; generating a tree of thread groups based on the dependency information, wherein each thread group includes multiple threads; and scheduling one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.

16. The machine-readable medium of claim 15, wherein the tree comprises multiple nodes, wherein each node represents a thread group or a thread.

17. The machine-readable medium of claim 15, wherein the operations further comprise: suspending one or more thread groups upon encountering a condition, wherein the one or more threads to store one or more sets of context information relating to the condition; and facilitating dispatching of another thread group while the one or more thread groups remain suspended.

18. The machine-readable medium of claim 17, wherein the operations further comprise resuming processing of the one or more thread groups upon satisfying the condition and using the one or more sets of context information.

19. The machine-readable medium of claim 15, wherein the operations further comprise facilitating processing of the plurality of thread groups using multiple processing layers of the graphics processor, wherein each processing layer includes one or more streaming multiprocessors.

20. The machine-readable medium of claim 19, wherein the operations further comprise prioritizing a first thread group of the plurality of thread groups over a second thread group of the plurality of thread groups based on priority of a first task associated with the first thread group being superior to a second task associated with the second thread group, wherein the tasks include the first and second tasks, wherein the graphics processor is co-located with an application processor on a common semiconductor package.
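The dependent claims on partial preemption and prioritization describe suspending a thread group on a condition while storing context, dispatching another group in the meantime, resuming once the condition is satisfied, and preferring the group whose task has higher priority. A minimal sketch of that behavior follows. Everything here is an assumption for illustration and not the claimed logic itself: the `ThreadGroup` and `Scheduler` classes, the `steps_left` counter standing in for saved context, and the use of a min-heap where a lower number means higher priority.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ThreadGroup:
    """Hypothetical thread group; a lower `priority` value runs first."""
    priority: int
    name: str = field(compare=False)
    steps_left: int = field(compare=False, default=1)
    context: dict = field(compare=False, default_factory=dict)


class Scheduler:
    """Toy scheduler with suspend/resume and priority dispatch (assumed design)."""

    def __init__(self):
        self.ready = []        # min-heap of runnable groups
        self.suspended = {}    # condition name -> suspended groups
        self.log = []

    def submit(self, group):
        heapq.heappush(self.ready, group)

    def suspend(self, group, condition):
        """Store context and park the group until `condition` is satisfied."""
        group.context = {"condition": condition, "steps_left": group.steps_left}
        self.suspended.setdefault(condition, []).append(group)
        self.log.append(f"suspend {group.name}")

    def satisfy(self, condition):
        """Condition met: restore stored context and requeue the groups."""
        for group in self.suspended.pop(condition, []):
            group.steps_left = group.context["steps_left"]
            self.log.append(f"resume {group.name}")
            heapq.heappush(self.ready, group)

    def run_one(self, blocks_on=None):
        """Dispatch the highest-priority ready group for one step."""
        group = heapq.heappop(self.ready)
        if blocks_on is not None:
            # The group hit a condition, so it is suspended instead of run.
            self.suspend(group, blocks_on)
            return
        group.steps_left -= 1
        self.log.append(f"run {group.name}")
        if group.steps_left > 0:
            heapq.heappush(self.ready, group)


sched = Scheduler()
sched.submit(ThreadGroup(1, "low"))
sched.submit(ThreadGroup(0, "high"))
sched.run_one(blocks_on="data_ready")  # "high" is dispatched first, then suspended
sched.run_one()                        # "low" runs while "high" stays suspended
sched.satisfy("data_ready")            # condition satisfied, "high" is requeued
sched.run_one()                        # "high" resumes from its stored context
print(sched.log)
```

The point of the sketch is that only the blocked group is parked; the rest of the workload keeps flowing, which is the sense in which the preemption is partial.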

Assignees

Intel Corp

Classifications

  • Considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups)

  • Program synchronisation; Mutual exclusion, e.g. by means of semaphores

  • Task life-cycle, e.g. stopping, restarting, resuming execution (G06F9/4881 takes precedence)

  • Processor architectures; Processor configuration, e.g. pipelining

  • G06F9/4881 (Primary): Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 11, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).