Replaying memory transactions while resolving memory access faults

US9575892B2 · US · B2

Patent metadata
Publication number: US-9575892-B2
Application number: US-201314109678-A
Country: US
Kind code: B2
Filing date: Dec 17, 2013
Priority date: Mar 15, 2013
Publication date: Feb 21, 2017
Grant date: Feb 21, 2017

Abstract

Official abstract text for this publication.

One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit then stores the faulting memory transaction and any faulting in-flight memory transaction in a replay buffer. As page faults are resolved, the replay unit replays the memory transactions in the replay buffer—removing successful memory transactions from the replay buffer—until all of the stored memory transactions have successfully executed. Advantageously, the overall performance of the PPU is improved compared to conventional PPUs that, upon detecting a page fault, stop performing memory transactions across all SMs included in the PPU until the fault is resolved.
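The per-SM behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `ReplayUnit`, `PPU`, `issue`, `resolve`, and `replay` are hypothetical, a "transaction" is reduced to a page number, and a fault is detected at issue time, so in-flight transactions are not modeled separately.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReplayUnit:
    """Hypothetical per-SM replay unit: one replay buffer and one stall flag."""
    sm_id: int
    stalled: bool = False
    replay_buffer: deque = field(default_factory=deque)


class PPU:
    """Toy PPU model; `mapped_pages` stands in for pages that do not fault."""

    def __init__(self, num_sms, mapped_pages):
        self.units = [ReplayUnit(i) for i in range(num_sms)]
        self.mapped_pages = set(mapped_pages)
        self.completed = []  # (sm_id, page) pairs that executed successfully

    def issue(self, sm_id, page):
        """Issue a memory transaction; return False if the SM is stalled."""
        unit = self.units[sm_id]
        if unit.stalled:
            return False
        if page in self.mapped_pages:
            self.completed.append((sm_id, page))
        else:
            # Page fault: buffer the transaction and stall only this SM.
            unit.replay_buffer.append(page)
            unit.stalled = True
        return True

    def resolve(self, page):
        """Model fault resolution by making the page available."""
        self.mapped_pages.add(page)

    def replay(self, sm_id):
        """Replay each buffered transaction once; keep those that still fault."""
        unit = self.units[sm_id]
        for _ in range(len(unit.replay_buffer)):
            page = unit.replay_buffer.popleft()
            if page in self.mapped_pages:
                self.completed.append((sm_id, page))
            else:
                unit.replay_buffer.append(page)
        # The SM resumes issuing only once its replay buffer has drained.
        unit.stalled = bool(unit.replay_buffer)


ppu = PPU(num_sms=2, mapped_pages={0x10})
ppu.issue(0, 0x20)  # faults: SM 0 stalls
ppu.issue(1, 0x10)  # SM 1 is unaffected and completes
ppu.resolve(0x20)
ppu.replay(0)       # SM 0 drains its buffer and resumes
```

Keeping the stall flag and buffer inside each `ReplayUnit`, rather than on the `PPU`, is what lets a fault on one SM leave the others running.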

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for processing virtual memory transactions associated with a multithreaded processing unit, the method comprising: receiving a first virtual memory transaction from a first unit; attempting to execute the first virtual memory transaction; detecting a first page fault related to the first virtual memory transaction; storing the first virtual memory transaction in a replay buffer; causing a stall condition that inhibits the first unit from generating subsequent virtual memory transactions until the first page fault has been resolved; and once the first page fault has been resolved, re-executing the first virtual memory transaction as well as at least one other virtual memory transaction stored in the replay buffer.

2. The method of claim 1, further comprising determining that the replay buffer is empty and enabling the first unit to generate subsequent virtual memory transactions.

3. The method of claim 1, further comprising receiving a second virtual memory transaction from a second unit while the first page fault is unresolved, and successfully executing the second virtual memory transaction.

4. The method of claim 1, further comprising: receiving a second virtual memory transaction from the first unit prior to detecting the first page fault; detecting a second page fault related to the second virtual memory transaction; and storing the second virtual memory transaction in the replay buffer.

5. The method of claim 1, further comprising invalidating a translation lookaside buffer prior to re-executing the first virtual memory transaction.

6. The method of claim 1, wherein re-executing the first virtual memory transaction comprises: determining whether there is an entry in a translation lookaside buffer corresponding to the first virtual memory transaction; and if the entry exists, then completing the first virtual memory translation, or if the entry does not exist, then re-storing the first virtual memory transaction in the replay buffer.

7. The method of claim 1, wherein resolving the first page fault comprises: locating a memory page related to the first virtual memory transaction within a first memory subsystem based on a global translation table; and adding a virtual mapping for the memory page to a translation lookaside buffer.

8. The method of claim 7, wherein resolving the first page fault further comprises copying the memory page from the first memory subsystem to a second memory subsystem.

9. The method of claim 8, wherein the first memory subsystem comprises memory coupled to a central processing unit, and the second memory subsystem comprises memory coupled to the multithreaded processing unit.

10. A non-transitory computer-readable storage medium including instructions that, when executed by a multithreaded processing unit, cause the multithreaded processing unit to process virtual memory transactions by performing the steps of: receiving a first virtual memory transaction from a first unit; attempting to execute the first virtual memory transaction; detecting a first page fault related to the first virtual memory transaction; storing the first virtual memory transaction in a replay buffer; causing a stall condition that inhibits the first unit from generating subsequent virtual memory transactions until the first page fault has been resolved; and once the first page fault has been resolved, re-executing the first virtual memory transaction as well as at least one other virtual memory transaction stored in the replay buffer.

11. The non-transitory computer-readable storage medium of claim 10, further comprising determining that the replay buffer is empty and enabling the first unit to generate subsequent virtual memory transactions.

12. The non-transitory computer-readable storage medium of claim 10, further comprising receiving a second virtual memory transaction from a second unit while the first page fault is unresolved, and successfully executing the second virtual memory transaction.

13. The non-transitory computer-readable storage medium of claim 10, further comprising: receiving a second virtual memory transaction from the first unit prior to detecting the first page fault; detecting a second page fault related to the second virtual memory transaction; and storing the second virtual memory transaction in the replay buffer.

14. The non-transitory computer-readable storage medium of claim 10, further comprising invalidating a translation lookaside buffer prior to re-executing the first virtual memory transaction.

15. The non-transitory computer-readable storage medium of claim 10, wherein re-executing the first virtual memory transaction comprises: determining whether there is an entry in a translation lookaside buffer corresponding to the first virtual memory transaction; and if the entry exists, then completing the first virtual memory translation, or if the entry does not exist, then re-storing the first virtual memory transaction in the replay buffer.

16. The non-transitory computer-readable storage medium of claim 10, wherein resolving the first page fault comprises: locating a memory page related to the first virtual memory transaction within a first memory subsystem based on a global translation table; and adding a virtual mapping for the memory page to a translation lookaside buffer.

17. The non-transitory computer-readable storage medium of claim 16, wherein resolving the first page fault further comprises copying the memory page from the first memory subsystem to a second memory subsystem.

18. The non-transitory computer-readable storage medium of claim 17, wherein the first memory subsystem comprises memory coupled to a central processing unit, and the second memory subsystem comprises memory coupled to the multithreaded processing unit.

19. A system configured to process virtual memory transactions, the system comprising: a memory; and a multithreaded processing unit coupled to the memory and configured to: receive a first virtual memory transaction from a first unit; attempt to execute the first virtual memory transaction on the memory; detect a first page fault related to the first virtual memory transaction; store the first virtual memory transaction in a replay buffer; cause a stall condition that inhibits the first unit from generating subsequent virtual memory transactions until the first page fault has been resolved; and once the first page fault has been resolved, re-execute the first virtual memory transaction as well as at least one other virtual memory transaction stored in the replay buffer.

20. The system of claim 19, wherein the multithreaded processing unit is further configured to receive a second virtual memory transaction from a second unit while the first page fault is unresolved, and successfully execute the second virtual memory transaction.
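The fault-resolution and re-execution steps recited in the dependent claims (TLB lookup on replay, lookup in a global translation table, copying the page between memory subsystems) can be sketched in software as follows. Everything here is an illustrative assumption rather than the claimed implementation: `resolve_fault` and `replay_all` are hypothetical names, the tables and memories are plain dictionaries, "cpu" and "ppu" label the first and second memory subsystems, and each transaction is a read of one virtual page.

```python
from collections import deque


def resolve_fault(vpage, global_table, tlb, cpu_mem, ppu_mem):
    """Locate the page, copy it to PPU memory if needed, then map it in the TLB."""
    # Hypothetical global translation table: vpage -> (subsystem, frame).
    subsystem, frame = global_table[vpage]
    if subsystem == "cpu":
        # Copy the page from CPU-coupled memory into the next free PPU frame.
        new_frame = max(ppu_mem, default=-1) + 1
        ppu_mem[new_frame] = cpu_mem[frame]
        global_table[vpage] = ("ppu", new_frame)
        frame = new_frame
    # Add the virtual mapping so a replayed transaction can translate.
    tlb[vpage] = frame


def replay_all(replay_buffer, tlb, ppu_mem):
    """Re-execute each buffered read once; re-store any that still miss the TLB."""
    results = []
    for _ in range(len(replay_buffer)):
        vpage = replay_buffer.popleft()
        if vpage in tlb:
            results.append(ppu_mem[tlb[vpage]])  # translation exists: complete
        else:
            replay_buffer.append(vpage)  # still unresolved: back into the buffer
    return results


global_table = {7: ("cpu", 0), 9: ("cpu", 1)}
cpu_mem = {0: "page-7", 1: "page-9"}
ppu_mem, tlb = {}, {}
buffer = deque([7, 9])  # two faulting reads awaiting replay

resolve_fault(7, global_table, tlb, cpu_mem, ppu_mem)
first_pass = replay_all(buffer, tlb, ppu_mem)   # only page 7 can complete
resolve_fault(9, global_table, tlb, cpu_mem, ppu_mem)
second_pass = replay_all(buffer, tlb, ppu_mem)  # buffer drains
```

In this sketch a transaction that still misses the TLB simply returns to the buffer, so replay can be attempted after each resolved fault without tracking which fault belonged to which transaction.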

Assignees

Nvidia Corp

Inventors

Classifications

  • In special purpose processing node, e.g. vector processor · CPC title

  • using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title

  • G06F12/08 · Primary

    in hierarchically structured memory systems, e.g. virtual memory systems · CPC title

  • TLB miss handling · CPC title

  • Transactional memory (G06F9/528 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9575892B2 cover?
A parallel processing unit (PPU) with one or more streaming multiprocessors (SMs) and a replay unit per SM. When a memory transaction issued by an SM causes a page fault, that SM's replay unit stalls only that SM, stores the faulting transactions in a replay buffer, and replays them as faults are resolved until all have executed successfully.
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/1027. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 21, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).