Intelligent GPU memory pre-fetching and GPU translation lookaside buffer management

US9563571B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9563571-B2
Application number: US-201414262500-A
Country: US
Kind code: B2
Filing date: Apr 25, 2014
Priority date: Apr 25, 2014
Publication date: Feb 7, 2017
Grant date: Feb 7, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus of a device that manages virtual memory for a graphics processing unit is described. In an exemplary embodiment, the device performs translation lookaside buffer coherency for a translation lookaside buffer of the graphics processing unit of the device. In this embodiment, the device receives a request to remove an entry of the translation lookaside buffer of the graphics processing unit, where the device includes a central processing unit and the graphics processing unit. In addition, the entry includes a translation of virtual memory address of a process to a physical memory address of system memory of a central processing unit and the graphics processing unit is executing a compute task of the process. The device locates the entry in the translation lookaside buffer and removes the entry.
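
The entry removal described in the abstract can be illustrated with a toy model. This is a minimal sketch, not the patented implementation: the `GpuTlb` class, its method names, the dictionary-backed storage, and the 4 KiB page size are all assumptions made for illustration. It shows only the two steps the abstract names, locating the entry for a virtual address and removing it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumed page size; the abstract does not specify one


@dataclass
class GpuTlb:
    """Hypothetical GPU TLB model: virtual page number -> physical page number."""
    entries: Dict[int, int] = field(default_factory=dict)

    def insert(self, vaddr: int, paddr: int) -> None:
        # Cache the translation for the page that contains vaddr.
        self.entries[vaddr // PAGE_SIZE] = paddr // PAGE_SIZE

    def translate(self, vaddr: int) -> Optional[int]:
        # Return the physical address, or None on a TLB miss.
        ppn = self.entries.get(vaddr // PAGE_SIZE)
        if ppn is None:
            return None
        return ppn * PAGE_SIZE + vaddr % PAGE_SIZE

    def invalidate(self, vaddr: int) -> bool:
        # Locate the entry covering vaddr and remove it.
        # Returns True if an entry was present and removed.
        return self.entries.pop(vaddr // PAGE_SIZE, None) is not None
```

After `invalidate`, a later `translate` for the same page misses, so the translation would have to be re-read from the page table. In this sketch that is what keeps the GPU from using a stale mapping.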

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory machine-readable medium having executable instructions to cause one or more processing units to perform a method to process a graphics processing unit page fault, the method comprising: detecting a page fault of a process associated with a first page that stores content of a memory object; determining if the page fault is associated with a graphics processing unit operation; and in response to determining that the page fault is associated with a graphics processing unit operation, analyzing the memory object for domain information of the memory object, identifying a second page that is stored in persistent storage using the domain information, pre-fetching the second page associated with the memory object into physical memory, and mapping the second page to virtual memory of the process.

2. The non-transitory machine-readable medium of claim 1, wherein the graphics processing unit operation is selected from the group consisting of a read of a virtual memory address corresponding to a page that is not stored in physical memory and a write to a virtual memory address corresponding to a page that is not stored in physical memory.

3. The non-transitory machine-readable medium of claim 1, wherein the domain information is structural information of the memory object.

4. The non-transitory machine-readable medium of claim 1, wherein if the page fault is associated with a graphics processing unit operation, analyzing the memory object for historical use.

5. The non-transitory machine-readable medium of claim 1, wherein the memory object is selected from the group consisting of an array, an image, and a texture.

6. The non-transitory machine-readable medium of claim 1, wherein the pre-fetching comprises: allocating memory for the second page in physical memory; and loading the second page into physical memory.

7. The non-transitory machine-readable medium of claim 1, wherein the mapping comprises: adding a page table entry for the second page to a shared page table that maps a virtual address of the second page to a physical address of the second page.

8. The non-transitory machine-readable medium of claim 1, further comprising: analyzing a historical access information of the memory object, wherein the memory object spans a plurality of pages; and identifying the second page that is stored in persistent storage using the domain information and the historical access information.

9. A method to process a graphics processing unit page fault, the method comprising: detecting a page fault of a process associated with a first page that stores content of a memory object; determining if the page fault is associated with a graphics processing unit operation; and in response to determining that the page fault is associated with a graphics processing unit operation, analyzing the memory object for domain information of the memory object, identifying a second page that is stored in persistent storage using the domain information, pre-fetching the second page associated with the memory object into physical memory, and mapping the second page to virtual memory of the process.

10. The method of claim 9, wherein the domain information is structural information of the memory object.

11. The method of claim 9, wherein if the page fault is associated with a graphics processing unit operation, analyzing the memory object for historical use.

12. The method of claim 9, wherein the pre-fetching comprises: allocating memory for the second page in physical memory; and loading the second page into physical memory.

13. The method of claim 9, wherein the domain information is structural information of the memory object.

14. The method of claim 9, wherein the memory object is selected from the group consisting of an array, an image, and a texture.

15. The method of claim 9, further comprising: analyzing a historical access information of the memory object, wherein the memory object spans a plurality of pages; and identifying the second page that is stored in persistent storage using the domain information and the historical access information.

16. The device of claim 9, wherein the process further causes the processor to analyze a historical access information of the memory object, wherein the memory object spans a plurality of pages and identify the second page that is stored in persistent storage using the domain information and the historical access information.

17. A device that tracks virtual memory access by a graphics processing unit of the device, the device comprising: a processor; a memory coupled to the processor though a bus; and a process executed from the memory by the processor that causes the processor to detect a page fault of a process associated with a first page that stores content of a memory object, determine if the page fault is associated with a graphics processing unit operation; and in response to determining that the page fault is associated with a graphics processing unit operation, analyze the memory object for domain information of the memory object, identify a second page that is stored in persistent storage using the domain information, pre-fetch the second page associated with the memory object into physical memory, and map the second page to virtual memory of the process.

18. The device of claim 17, wherein the graphics processing unit operation is selected from the group consisting of a read of a virtual memory address corresponding to a page that is not stored in physical memory and a write to a virtual memory address corresponding to a page that is not stored in physical memory.

19. The device of claim 17, wherein the domain information is structural information of the memory object.

20. The device of claim 17, wherein if the page fault is associated with a graphics processing unit operation, analyzing the memory object for historical use.

21. The device of claim 17, wherein the memory object is selected from the group consisting of an array, an image, and a texture.

22. The device of claim 17, wherein the process further causes the processor to pre-fetch by allocating memory for the second page in physical memory and loading the second page into physical memory.

23. The device of claim 17, wherein the process further causes the processor to map by adding a page table entry for the second page to a shared page table that maps a virtual address of the second page to a physical address of the second page.
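
The flow in the independent claims (detect a fault, check whether it came from a GPU operation, use the memory object's structure to pick a second page, pre-fetch it, and map it) can be sketched as follows. This is an illustration under stated assumptions, not the method the patent discloses in detail: `ImageObject`, `Vm`, `pick_second_page`, and `handle_page_fault` are hypothetical names, the page size is assumed to be 4 KiB, loading from persistent storage is simulated, and the "row directly below" heuristic is only one possible reading of "structural information" for a row-major image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

PAGE_SIZE = 4096  # assumed page size


@dataclass
class ImageObject:
    """Hypothetical memory object: a row-major image mapped at base_vaddr."""
    base_vaddr: int
    width: int
    height: int
    bytes_per_pixel: int

    @property
    def row_bytes(self) -> int:
        return self.width * self.bytes_per_pixel

    def contains(self, vaddr: int) -> bool:
        size = self.row_bytes * self.height
        return self.base_vaddr <= vaddr < self.base_vaddr + size


@dataclass
class Vm:
    """Hypothetical VM state with a shared page table: page number -> frame."""
    page_table: Dict[int, int] = field(default_factory=dict)
    next_frame: int = 0

    def page_in(self, vpn: int) -> None:
        if vpn in self.page_table:
            return  # already resident and mapped
        frame = self.next_frame  # allocate a physical frame
        self.next_frame += 1
        # Loading the page contents from persistent storage is simulated here.
        self.page_table[vpn] = frame  # add the page table entry


def pick_second_page(obj: ImageObject, fault_vaddr: int) -> Optional[int]:
    # Assumed heuristic: the pixel one row below is likely to be touched next.
    below = fault_vaddr + obj.row_bytes
    if not obj.contains(below):
        return None
    vpn = below // PAGE_SIZE
    return vpn if vpn != fault_vaddr // PAGE_SIZE else None


def handle_page_fault(vm: Vm, obj: ImageObject, fault_vaddr: int,
                      from_gpu: bool) -> List[int]:
    """Resolve the fault and return the page numbers mapped by this call."""
    first = fault_vaddr // PAGE_SIZE
    vm.page_in(first)
    mapped = [first]
    if from_gpu and obj.contains(fault_vaddr):
        second = pick_second_page(obj, fault_vaddr)
        if second is not None and second not in vm.page_table:
            vm.page_in(second)  # pre-fetch and map the second page
            mapped.append(second)
    return mapped
```

In this sketch a CPU fault maps only the faulting page, while a GPU fault inside the image also brings in the page holding the pixel one row down. A history-based variant, as in the dependent claims on historical access information, would replace or extend `pick_second_page`.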

Assignees

Apple Inc

Inventors

Classifications

  • Virtual address space management

  • using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]

  • using page tables, e.g. page table structures

  • Look-ahead translation

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9563571B2 cover?
A method and apparatus of a device that manages virtual memory for a graphics processing unit is described. In an exemplary embodiment, the device performs translation lookaside buffer coherency for a translation lookaside buffer of the graphics processing unit of the device. In this embodiment, the device receives a request to remove an entry of the translation lookaside buffer of the graphics…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/1027. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 7, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).