Compute cluster preemption within a general-purpose graphics processing unit

US10043232B1 · US · B1

Patent metadata
Field · Value
Publication number · US-10043232-B1
Application number · US-201715482809-A
Country · US
Kind code · B1
Filing date · Apr 9, 2017
Priority date · Apr 9, 2017
Publication date · Aug 7, 2018
Grant date · Aug 7, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides for a general-purpose graphics processing unit comprising a compute cluster including multiple compute units, a stall notification module to detect that one or more compute units in the compute cluster are stalled and send stall notification, and a rebalance module to receive the stall notification, the rebalance module to migrate a first workload from one or more stalled compute units in response to the stall notification.
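
As an illustration only, the stall-detection and rebalance flow in the abstract can be modeled in software. The sketch below is a hypothetical Python model, not the patented hardware. The class names `StallNotificationModule` and `RebalanceModule`, the `Status` values, and the choice to notify only when every compute unit is blocked (one option the claims describe) are assumptions made for this example.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    BLOCKED = "blocked"


class RebalanceModule:
    """Hypothetical stand-in that records which workloads were migrated."""

    def __init__(self):
        self.migrated = []

    def on_stall(self, cluster_id, workload):
        # Model "migrate the workload off the stalled cluster" as a log entry.
        self.migrated.append((cluster_id, workload))


class StallNotificationModule:
    """Keeps an active/blocked status per compute unit (an activity scoreboard)."""

    def __init__(self, cluster_id, num_units, rebalancer):
        self.cluster_id = cluster_id
        self.scoreboard = [Status.ACTIVE] * num_units
        self.rebalancer = rebalancer
        self.notified = False

    def update(self, unit, status, workload):
        self.scoreboard[unit] = status
        if status is Status.ACTIVE:
            self.notified = False  # the cluster is making progress again
        # Assumption: send one notification once all units are blocked.
        all_blocked = all(s is Status.BLOCKED for s in self.scoreboard)
        if all_blocked and not self.notified:
            self.notified = True
            self.rebalancer.on_stall(self.cluster_id, workload)


rebalancer = RebalanceModule()
monitor = StallNotificationModule(cluster_id=0, num_units=2, rebalancer=rebalancer)
monitor.update(0, Status.BLOCKED, "workload-A")  # one unit is still active
monitor.update(1, Status.BLOCKED, "workload-A")  # now every unit is blocked
print(rebalancer.migrated)  # [(0, 'workload-A')]
```

The `notified` flag keeps the model from sending repeated notifications for the same stall. Whether real hardware debounces notifications this way is not stated in the source.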

First claim

Opening claim text (preview).

What is claimed is:

1. A general-purpose graphics processing unit comprising: a compute cluster including multiple compute units, the multiple compute units having a single instruction, multiple thread architecture; a stall notification module to detect that one or more compute units in the compute cluster are stalled and send a stall notification; and a rebalance module to receive the stall notification, the rebalance module to migrate a first workload from one or more stalled compute units in response to the stall notification, wherein a compute unit of the one or more stalled compute units is to determine, for a pending pipeline event for the first workload, whether to halt the pending pipeline event or allow the pending pipeline event to complete, the compute unit to allow the pending pipeline event to complete in response to a determination that the pending pipeline event has cleared a pipeline threshold when a migration command is received, the pipeline threshold specific to a type of event.

2. The general-purpose graphics processing unit as in claim 1, wherein the rebalance module is to determine if a second workload is pending execution and migrate the second workload to the compute cluster when the second workload is pending execution.

3. The general-purpose graphics processing unit as in claim 2, additionally including a power module to power gate an idle compute unit within the compute cluster.

4. The general-purpose graphics processing unit as in claim 3, wherein the rebalance module is to request the power module to power gate the compute cluster when the second workload is not pending execution.

5. The general-purpose graphics processing unit as in claim 1, wherein the stall notification module is to maintain an activity scoreboard, the activity scoreboard to maintain an active or blocked status for each of the multiple compute units in the compute cluster.

6. The general-purpose graphics processing unit as in claim 1, wherein to migrate the first workload from the compute unit of the one or more stalled compute units, the rebalance module is to issue a migration command to the compute unit.

7. The general-purpose graphics processing unit as in claim 6, wherein the compute unit is to halt the pending pipeline event in response to the determination that the pending pipeline event has not cleared the pipeline threshold and flag the pending pipeline event for replay.

8. The general-purpose graphics processing unit as in claim 7, wherein the compute unit is to allow a first pending pipeline event to complete the pipeline and halt a second pipeline event.

9. The general-purpose graphics processing unit as in claim 8, wherein the compute unit is to save state information associated with the second pipeline event.

10. A method of compute cluster preemption on a general-purpose graphics processing unit, the method comprising: monitoring a workload executing on a compute cluster via a compute unit scoreboard, the compute cluster having multiple compute units having a single instruction, multiple thread architecture; detecting a blocked workload on the compute cluster via the compute unit scoreboard; and notifying a rebalance module that the blocked workload is blocked, the rebalance module to migrate the blocked workload from the compute cluster; and migrating the blocked workload from the compute cluster, wherein migrating the blocked workload includes evaluating a pending pipeline event associated with the blocked workload and allowing the pending pipeline event to complete before migrating the blocked workload in response to determining that the pending pipeline event has cleared a pipeline threshold, the pipeline threshold specific to a type of event.

11. The method as in claim 10, wherein detecting the blocked workload on the compute cluster includes detecting that all compute units of the compute cluster are stalled.

12. The method as in claim 10, additionally comprising migrating the blocked workload from the compute cluster includes halting the pending pipeline event and flagging the pending pipeline event for replay after migrating the blocked workload in response to determining that the pending pipeline event has not cleared the pipeline threshold.

13. The method as in claim 12, wherein determining whether the pending pipeline event has cleared a pipeline threshold includes determining that the pipeline event has cleared the pipeline threshold before receiving the notification of the blocked workload.

14. The method as in claim 10, additionally comprising querying a scheduler to determine if a pending workload is pending execution and requesting the rebalance module to launch the pending workload on the compute cluster when the pending workload is pending execution.

15. The method as in claim 14, additionally comprising requesting a power module to power gate the compute cluster when the pending workload is not pending execution.

16. A non-transitory machine-readable medium storing instructions to cause one or more processors to perform operations comprising: monitoring a workload executing on a compute cluster via a compute unit scoreboard, the compute cluster having multiple compute units having a single instruction, multiple thread architecture; detecting a blocked workload on the compute cluster via the compute unit scoreboard; and notifying a rebalance module that the blocked workload is blocked, the rebalance module to migrate the blocked workload from the compute cluster; and migrating the blocked workload from the compute cluster, wherein migrating the blocked workload includes evaluating a pending pipeline event associated with the blocked workload and allowing the pending pipeline event to complete before migrating the blocked workload in response to determining that the pending pipeline event has cleared a pipeline threshold, the pipeline threshold specific to a type of event.

17. The non-transitory machine-readable medium as in claim 16, wherein detecting the blocked workload on the compute cluster includes detecting that all compute units of the compute cluster are stalled.

18. The non-transitory machine-readable medium as in claim 16, the operations additionally comprising migrating the blocked workload from the compute cluster includes halting the pending pipeline event and flagging the pending pipeline event for replay after migrating the blocked workload in response to determining that the pending pipeline event has not cleared the pipeline threshold.

19. The non-transitory machine-readable medium as in claim 18, wherein determining whether the pending pipeline event has cleared a pipeline threshold includes determining that the pipeline event has cleared the pipeline threshold before receiving the notification of the blocked workload.

20. The non-transitory machine-readable medium as in claim 16, the operations additionally comprising querying a scheduler to determine if a pending workload is pending execution, requesting the rebalance module to launch the pending workload on the compute cluster when the pending workload is pending execution, and requesting a power module to power gate the compute cluster when the pending workload is not pending execution.
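
The halt-or-complete decision in claims 1 and 7 through 9, together with the follow-up choice in claims 2 through 4, can be sketched as below. This is a hypothetical Python model for illustration only. The event types, the numeric stage values in `PIPELINE_THRESHOLDS`, the reading of "cleared" as "reached or passed a stage index", and the function names are all assumptions. The claims state only that the pipeline threshold is specific to a type of event.

```python
from dataclasses import dataclass

# Assumed per-event-type thresholds, expressed as a pipeline stage index.
# The values are invented for this example.
PIPELINE_THRESHOLDS = {"memory_load": 2, "alu_op": 4}


@dataclass
class PipelineEvent:
    kind: str
    stage: int            # stage reached when the migration command arrives
    replay: bool = False  # set when the event is halted and must be replayed


def handle_migration_command(events):
    """Split pending events into those allowed to complete and those halted."""
    completed, halted = [], []
    for event in events:
        if event.stage >= PIPELINE_THRESHOLDS[event.kind]:
            completed.append(event)  # cleared its threshold: let it finish
        else:
            event.replay = True      # not cleared: halt and flag for replay
            halted.append(event)     # the kept object stands in for saved state
    return completed, halted


def after_migration(pending_workloads):
    """Launch the next pending workload if any, otherwise request power gating."""
    if pending_workloads:
        return ("launch", pending_workloads.pop(0))
    return ("power_gate", None)


events = [PipelineEvent("memory_load", stage=3), PipelineEvent("alu_op", stage=1)]
completed, halted = handle_migration_command(events)
print([e.kind for e in completed])           # ['memory_load']
print([(e.kind, e.replay) for e in halted])  # [('alu_op', True)]
print(after_migration(["workload-B"]))       # ('launch', 'workload-B')
print(after_migration([]))                   # ('power_gate', None)
```

Keeping the thresholds in a table keyed by event type mirrors the claim language that the threshold varies by type of event, so one event can drain while another in the same compute unit is halted.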

Assignees

Intel Corp

Inventors

Classifications

  • Techniques for rebalancing the load in a distributed system · CPC title

  • by interrupt, e.g. masked · CPC title

  • resumption being on a different machine, e.g. task migration, virtual machine migration (G06F9/5088 takes precedence) · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • involving task migration · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10043232B1 cover?
One embodiment provides for a general-purpose graphics processing unit comprising a compute cluster including multiple compute units, a stall notification module to detect that one or more compute units in the compute cluster are stalled and send stall notification, and a rebalance module to receive the stall notification, the rebalance module to migrate a first workload from one or more stalle…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 7, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).