Systems, methods, and computer program products for preemption of threads at a synchronization barrier

US9652300B2 · US · B2

Patent metadata
Publication number: US-9652300-B2
Application number: US-201213535976-A
Country: US
Kind code: B2
Filing date: Jun 28, 2012
Priority date: Jun 28, 2012
Publication date: May 16, 2017
Grant date: May 16, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for the processing of EU threads (also known as warps) in a thread group. The status of each EU thread in the group may be monitored, to determine if it is executing or if it is halted and waiting at a synchronization barrier. If certain threshold conditions are met, the waiting EU threads may be preempted to allow execution of threads from another thread group. The threshold conditions may include a minimum number of EUs in use, a minimum number of EU threads in the first thread group that are waiting at the synchronization barrier and/or a maximum number of EU threads that are still executing, and a minimum wait time for one or more of the EU threads waiting at the barrier.
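As a rough illustration of the threshold test the abstract describes, the following Python sketch checks the four listed conditions for one thread group. The names `ThreadStatus`, `PreemptionThresholds`, and `should_preempt`, the time units, and the rule that all four conditions must hold together are assumptions made for illustration; the abstract says only that the threshold conditions "may include" these quantities.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThreadStatus:
    """Monitored state of one EU thread (hypothetical structure)."""
    waiting_at_barrier: bool
    wait_time: float = 0.0  # time halted at the barrier, arbitrary units


@dataclass
class PreemptionThresholds:
    """Illustrative threshold values; none are taken from the patent."""
    min_eus_in_use: int
    min_waiting: int
    max_executing: int
    min_wait_time: float


def should_preempt(threads: List[ThreadStatus],
                   eus_in_use: int,
                   t: PreemptionThresholds) -> bool:
    """Return True if the waiting EU threads may be preempted.

    Assumption: every condition must be satisfied at once. The source
    also allows "and/or" combinations, which this sketch does not model.
    """
    waiting = [s for s in threads if s.waiting_at_barrier]
    executing = len(threads) - len(waiting)
    # "One or more" waiting threads must have waited long enough,
    # so only the longest wait needs to be compared.
    longest_wait = max((s.wait_time for s in waiting), default=0.0)
    return (
        eus_in_use >= t.min_eus_in_use
        and len(waiting) >= t.min_waiting
        and executing <= t.max_executing
        and longest_wait >= t.min_wait_time
    )
```

With thresholds of at least 4 EUs in use, at least 3 waiting threads, at most 1 executing thread, and a minimum wait of 10 units, a group with three threads at the barrier (one of which has waited 12 units) and one thread still running would qualify for preemption in this sketch.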

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: processing a first thread group over a set of single instruction multiple data (SIMD) execution units of a SIMD processor; idling threads of the first thread group as the respective threads reach a synchronization barrier; and selectively processing a second thread group on a first subset of the set of SIMD execution units for which threads of the first group are idled while one or more non-idle threads of the first thread group continue executing on a second subset of the set of SIMD execution units, based on a threshold related to one or more of demand for SIMD execution units of the SIMD processor and availability of the SIMD execution units of the SIMD processor; wherein the selectively processing includes, setting the threshold dynamically based on a context of the first thread group, and processing the second thread group on the first subset of the set of SIMD execution units if a difference between a number of SIMD execution units of the SIMD processor and a number of threads of the first thread group exceeds a minimum threshold difference.

2. The method of claim 1, wherein the selectively processing further includes: processing the second thread group on the first subset of the set of SIMD execution units if a number of threads of the first thread group exceeds a minimum threshold.

3. The method of claim 1, wherein the selectively processing further includes: processing the second thread group on the first subset of the subset of SIMD execution units if a number of non-idle threads of the first thread group exceeds a minimum threshold.

4. The method of claim 1, wherein the selectively processing further includes: processing the second thread group on the first subset of the set of SIMD execution units if a duration for which one or more threads of the first thread group is idle exceeds a minimum threshold duration.

5. The method of claim 4, wherein the selectively processing further includes: processing the second thread group on the first subset of the set of SIMD execution units if an average duration for which multiple threads of the first thread group are idle exceeds the minimum threshold duration.

6. The method of claim 1, wherein the selectively processing further includes one or more of: setting the threshold based further on user input; and setting the threshold with scheduling logic.

7. An apparatus, comprising, a single instruction multiple data (SIMD) processor and a thread manager, wherein the thread manager is configured to: assign a first thread group over a set of SIMD execution units of the SIMD processor; idle threads of the first thread group as the respective threads reach a synchronization barrier; and selectively assign a second thread group to a first subset of the set of SIMD execution units at which threads of the first group are idled while one or more non-idle threads of the first thread group continue executing on a second subset of the set of SIMD execution units, based on a threshold related to one or more of demand for SIMD execution units of the SIMD processor and availability of the SIMD execution units of the SIMD processor, including to, set the threshold dynamically based on a context of the first thread group; and process the second thread group on the first subset of the set of SIMD execution units if a difference between a number of SIMD execution units of the SIMD processor and a number of threads of the first thread group exceeds a minimum threshold difference.

8. The apparatus of claim 7, wherein the thread manager is further configured to: assign the second thread group to the first subset of the set of SIMD execution units if a number of threads of the first thread group exceeds a minimum threshold.

9. The apparatus of claim 7, wherein the thread manager is further configured to: assign the second thread group to the first subset of the set of SIMD execution units if a number of non-idle threads of the first thread group exceeds a minimum threshold.

10. The apparatus of claim 7, wherein the thread manager is further configured to: assign the second thread group to the first subset of the set of SIMD execution units if a duration for which one or more threads of the first thread group is idle exceeds a minimum threshold duration.

11. The apparatus of claim 10, wherein the thread manager is further configured to: assign the second thread group to the first subset of the set of SIMD execution units if an average duration for which multiple threads of the first thread group are idled exceeds the minimum threshold duration.

12. The apparatus of claim 7, wherein the thread manager is further configured to perform one or more of: set the threshold based further on user input; and set the threshold with scheduling logic.

13. The apparatus of claim 7, wherein the thread manager is further configured to determine when threads of the first thread group reach the synchronization barrier based on status information provided by the respective threads.

14. A non-transitory computer readable medium encoded with a computer program that includes instructions to cause a single instruction multiple data (SIMD) processor to: process a first thread group over a set of SIMD execution units of the SIMD processor; idle threads of the first thread group as the respective [threads] reach a synchronization barrier; and selectively process a second thread group on a first subset of the set of SIMD execution units for which threads of the first group are idled while one or more non-idle threads of the first thread group continue executing on a second subset of the set of SIMD execution units, based on a threshold related to one or more of demand for SIMD execution units of the SIMD processor and availability of the SIMD execution units of the SIMD processor, including to, set the threshold dynamically based on a context of the first thread group, and process the second thread group on the first subset of the set of SIMD execution units if a difference between a number of SIMD execution units of the SIMD processor and a number of threads of the first thread group exceeds a minimum threshold difference.

15. The non-transitory computer readable medium of claim 14, further including instructions to cause the SIMD processor to: process the second thread group on the first subset of the set of SIMD execution units [if] a number of threads of the first thread group exceeds a minimum threshold.

16. The non-transitory computer readable medium of claim 14, further including instructions to cause the SIMD processor to: process the second thread group on the first subset of the set of SIMD execution units if a number of non-idle threads of the first thread group exceeds a minimum threshold.

17. The non-transitory computer readable medium of claim 14, further including instructions to cause the SIMD processor to: process the second thread group on the first subset of the set of SIMD execution units if a duration for which one or more threads of the first thread group is idled exceeds a minimum threshold duration.

18. The non-transitory computer readable medium of claim 17, further including instructions to cause the SIMD processor to: process the second thread group on the first subset of the set of SIMD execution units if an average duration for which multiple threads of the first thread group are idle exceeds the minimum threshold duration.

19. The non-transitory computer readable medium of claim 14, further including instructi
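The comparisons recited in the claims can be written out as small predicates. The Python sketch below is one informal reading of the execution-unit difference test in claim 1 and the idle-duration tests in claims 4 and 5; the function names, argument names, and the interpretation of "multiple threads" as two or more are assumptions for illustration, not claim language, and the sketch has no bearing on how the claims would be construed legally.

```python
from statistics import mean
from typing import Sequence


def difference_test(num_execution_units: int,
                    num_group_threads: int,
                    min_difference: int) -> bool:
    """Claim 1 reading: the number of SIMD execution units minus the
    number of threads in the first group exceeds a minimum difference."""
    return (num_execution_units - num_group_threads) > min_difference


def any_idle_duration_test(idle_durations: Sequence[float],
                           min_duration: float) -> bool:
    """Claim 4 reading: one or more threads have been idle for longer
    than the minimum threshold duration."""
    return any(d > min_duration for d in idle_durations)


def average_idle_duration_test(idle_durations: Sequence[float],
                               min_duration: float) -> bool:
    """Claim 5 reading: the average idle duration over multiple threads
    (assumed here to mean at least two) exceeds the minimum duration."""
    return len(idle_durations) > 1 and mean(idle_durations) > min_duration
```

For example, with 16 execution units, 8 threads in the first group, and a minimum difference of 4, the difference is 8 and the first predicate holds; with 12 threads the difference is exactly 4, which does not "exceed" the threshold, so it does not.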

Assignees

Inventors

Classifications

  • G06F9/522 (Primary)

    Barrier synchronisation · CPC title

  • from multiple instruction streams, e.g. multistreaming · CPC title

  • considering the load · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9652300B2 cover?
Systems and methods for the processing of EU threads (also known as warps) in a thread group. The status of each EU thread in the group may be monitored, to determine if it is executing or if it is halted and waiting at a synchronization barrier. If certain threshold conditions are met, the waiting EU threads may be preempted to allow execution of threads from another thread group. The threshol…
Who is the assignee on this patent?
Targowski Marek, Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/522. Mapped technology areas include Physics.
When was this patent published?
Publication date May 16, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).