Hang detection for virtualized accelerated processing device

US11182186B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11182186-B2
Application number  US-201715663499-A
Country             US
Kind code           B2
Filing date         Jul 28, 2017
Priority date       Jul 12, 2017
Publication date    Nov 23, 2021
Grant date          Nov 23, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique for recovering from a hang in a virtualized accelerated processing device (“APD”) is provided. In the virtualization scheme, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD stops operations for a current VM and starts operations for another VM. To stop operations on the APD, a virtualization scheduler sends a request to idle the APD. The APD responds by completing work and idling. If one or more portions of the APD do not complete this idling process before a timeout expires, then a hang occurs. In response to the hang, the virtualization scheduler informs the hypervisor that a hang has occurred. The hypervisor performs a function level reset on the APD and informs the VM that the hang has occurred. The VM responds by stopping command issue to the APD and re-initializing the APD for the function.
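
The flow described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `VirtualFunction`, `request_idle`, `recover_from_hang`, `end_time_slice`, and the tick-based `IDLE_TIMEOUT` are hypothetical, and time is modeled as integer ticks rather than real hardware timers.

```python
from dataclasses import dataclass, field
from typing import List, Optional

IDLE_TIMEOUT = 5  # hypothetical timeout, measured in simulated ticks


@dataclass
class VirtualFunction:
    """Hypothetical stand-in for one VM's share of the APD."""
    vm_name: str
    ticks_to_idle: Optional[int]  # None models work that never completes
    accepting_commands: bool = True
    log: List[str] = field(default_factory=list)


def request_idle(fn: VirtualFunction, timeout: int = IDLE_TIMEOUT) -> bool:
    """Return True if the function idles before the timeout expires."""
    return fn.ticks_to_idle is not None and fn.ticks_to_idle < timeout


def recover_from_hang(fn: VirtualFunction) -> None:
    """Walk through the recovery steps named in the abstract, in order."""
    fn.log.append("scheduler: hang reported to hypervisor")
    fn.log.append("hypervisor: function level reset")
    fn.ticks_to_idle = 0              # the reset clears the stuck work
    fn.log.append("hypervisor: VM notified of hang")
    fn.accepting_commands = False     # the VM stops issuing commands
    fn.log.append("vm: function re-initialized")
    fn.accepting_commands = True      # command issue resumes after re-init


def end_time_slice(fn: VirtualFunction) -> str:
    """Called when the time-slice expires; returns 'idle' or 'reset'."""
    if request_idle(fn):
        fn.log.append("apd: idled before timeout")
        return "idle"
    recover_from_hang(fn)
    return "reset"


healthy = VirtualFunction("vm0", ticks_to_idle=2)
stuck = VirtualFunction("vm1", ticks_to_idle=None)
print(end_time_slice(healthy))
print(end_time_slice(stuck))
```

In this sketch a well-behaved function simply idles and the next VM can be scheduled, while a function that never idles goes through the report, reset, notify, and re-initialize sequence.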

First claim

Opening claim text (preview).

What is claimed is:

1. A method for recovering from a hang in a virtualized accelerated processing device (“APD”), the method comprising: after a first draw call has ended and before a second draw call has begun, issuing, by a virtualization scheduler that is external to a hypervisor configured to support a first virtual machine associated with a current function, a first request to stop operations on the APD for the current function of the APD; determining, by the virtualization scheduler, that operations of the APD have not stopped after a timeout period has elapsed since issuing the first request to stop; responsive to the determining, issuing a first hang interrupt signal to the hypervisor; receiving, from a handler, an instruction to reset the current function, wherein the hypervisor forwards the first hang interrupt signal to a virtualization driver, wherein the handler is executed by the virtualization driver in response to receiving the first hang interrupt signal; and in response to the instruction to reset the current function, resetting, by the APD, the current function.

2. The method of claim 1, further comprising: after resetting the APD for the current function, initializing the current function at the direction of the first virtual machine.

3. The method of claim 1, wherein: issuing the first request to stop operations on the APD for the current function is performed in response to determining that a virtualization context switch is to occur.

4. The method of claim 3, wherein determining that the virtualization context switch is to occur comprises determining that a time-slice assigned to the current function has elapsed.

5. The method of claim 1, wherein: the first request to stop operations on the APD for the current function comprises a request to complete work for the current function and to idle processing elements of the APD after the work is completed.

6. The method of claim 1, wherein resetting the APD for the current function comprises: placing the current function into a state in which the current function is ready to be initialized.

7. The method of claim 6, wherein resetting the APD for the current function further comprises: forcing operations for the current function on the APD to stop, and clearing state for the current function.

8. The method of claim 1, further comprising: responsive to receiving the first hang interrupt signal, issuing, by the hypervisor, a hang notification to the first virtual machine.

9. The method of claim 8, further comprising: stopping issuing commands to the APD, by the first virtual machine, responsive to receiving the hang notification.

10. A device, comprising: a processor configured to execute a plurality of virtual machines and a hypervisor configured to support a first virtual machine of the plurality of virtual machines associated with a current function; and a virtualized accelerated processing device (“APD”) in communication with the processor, the virtualized APD configured to: support one or more functions, the functions corresponding to different virtual machines of the plurality of virtual machines executed on the processor; after a first draw call has ended and before a second draw call has begun, issue, by a virtualization scheduler of the APD, a first request to stop operations on the APD for the current function of the APD; determine, by the virtualization scheduler, that operations of the APD have not stopped after a timeout period has elapsed since issuing the first request to stop; and responsive to the determining, issue a first hang interrupt signal to the hypervisor; receive, from a handler, an instruction to reset the current function, wherein the hypervisor forwards the first hang interrupt signal to a virtualization driver, wherein the handler is executed by the virtualization driver in response to receiving the first hang interrupt signal; and in response to the instruction to reset the current function, reset, by the APD, the current function.

11. The device of claim 10, wherein the APD is further configured to: initialize the current function at the direction of the first virtual machine after resetting the APD for the current function.

12. The device of claim 10, wherein: the APD is configured to issue the first request to stop operations on the APD for the current function in response to determining that a virtualization context switch is to occur.

13. The device of claim 12, wherein: the APD is configured to determine that the virtualization context switch is to occur by determining that a time-slice assigned to the current function has elapsed.

14. The device of claim 10, wherein: the first request to stop operations on the APD for the current function comprises a request to complete work for the current function and to idle processing elements of the APD after the work is completed.

15. The device of claim 10, wherein the APD is configured to reset the APD for the current function by: placing the current function into a state in which the current function is ready to be initialized.

16. The device of claim 15, wherein the APD is further configured to reset the APD for the current function by: forcing operations for the current function on the APD to stop, and clearing state for the current function.

17. The device of claim 10, wherein: the processor is configured to execute the hypervisor that is configured to, responsive to receiving the first hang interrupt signal, issue a hang notification to the first virtual machine.

18. The device of claim 17, wherein the first virtual machine is further configured to: stop issuing commands to the APD responsive to receiving the hang notification.
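
The signal path recited in claims 1, 2, 8, and 9 (scheduler to hypervisor to virtualization driver handler, then reset and VM notification) can be pictured as a chain of method calls. The following is a rough sketch only: every class and method name is hypothetical, the timeout is assumed to have already elapsed when the scheduler checks the APD state, and the order in which the hypervisor forwards the interrupt and notifies the VM is an assumption, since the claims do not fix it.

```python
class APD:
    """Hypothetical model of the APD state for the current function."""
    def __init__(self, stops_on_request: bool):
        self.stops_on_request = stops_on_request
        self.function_state = "running"

    def request_stop(self) -> None:
        # A healthy APD completes its work and idles (claim 5).
        if self.stops_on_request:
            self.function_state = "idle"

    def reset_function(self) -> None:
        # Force stop, clear state, leave ready to initialize (claims 6-7).
        self.function_state = "ready_for_init"


class VirtualMachine:
    def __init__(self):
        self.issuing_commands = True

    def on_hang_notification(self) -> None:
        self.issuing_commands = False      # claim 9

    def initialize(self, apd: APD) -> None:
        apd.function_state = "running"     # claim 2
        self.issuing_commands = True


class VirtualizationDriver:
    def handle_hang(self, apd: APD) -> None:
        # The handler instructs the APD to reset the current function.
        apd.reset_function()


class Hypervisor:
    def __init__(self, driver: VirtualizationDriver, vm: VirtualMachine):
        self.driver, self.vm = driver, vm

    def on_hang_interrupt(self, apd: APD) -> None:
        self.driver.handle_hang(apd)       # forward the interrupt to the driver
        self.vm.on_hang_notification()     # claim 8


class VirtualizationScheduler:
    def __init__(self, hypervisor: Hypervisor):
        self.hypervisor = hypervisor

    def stop_current_function(self, apd: APD) -> bool:
        """Return True if a hang was detected and reported."""
        apd.request_stop()
        # Assumption: the timeout period has elapsed by this point.
        if apd.function_state != "idle":
            self.hypervisor.on_hang_interrupt(apd)
            return True
        return False


vm = VirtualMachine()
scheduler = VirtualizationScheduler(Hypervisor(VirtualizationDriver(), vm))
hung_apd = APD(stops_on_request=False)
print(scheduler.stop_current_function(hung_apd), hung_apd.function_state)
```

Keeping the scheduler, hypervisor, and driver as separate objects mirrors the claim language, in which the scheduler is external to the hypervisor and the reset instruction comes from a handler run by the virtualization driver.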

Assignees

Inventors

Classifications

  • Task life-cycle, e.g. stopping, restarting, resuming execution (G06F9/4881 takes precedence)

  • Hypervisor-specific management and integration aspects

  • Hypervisors; Virtual machine monitors

  • Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators

  • by interrupt, e.g. masked

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11182186B2 cover?
A technique for recovering from a hang in a virtualized accelerated processing device (“APD”) is provided. In the virtualization scheme, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD stops operations for a current VM and starts operations for another VM. To stop operations on the APD, a virtualization scheduler sends …
Who is the assignee on this patent?
Advanced Micro Devices Inc, Ati Technologies Ulc
What technology area does this patent fall under?
Primary CPC classification G06F9/45558. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 23, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).