Replaying memory transactions while resolving memory access faults

US9830276B2 · US · B2

Patent metadata
Publication number: US-9830276-B2
Application number: US-201715437400-A
Country: US
Kind code: B2
Filing date: Feb 20, 2017
Priority date: Mar 15, 2013
Publication date: Nov 28, 2017
Grant date: Nov 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit then stores the faulting memory transaction and any faulting in-flight memory transaction in a replay buffer. As page faults are resolved, the replay unit replays the memory transactions in the replay buffer—removing successful memory transactions from the replay buffer—until all of the stored memory transactions have successfully executed. Advantageously, the overall performance of the PPU is improved compared to conventional PPUs that, upon detecting a page fault, stop performing memory transactions across all SMs included in the PPU until the fault is resolved.
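The per-SM stall-and-replay behavior described in the abstract can be sketched as a small simulation. This is a toy model, not the patented hardware design: the class `ReplayUnit`, its methods, and the use of a shared set of resident pages to stand in for page-table state are all illustrative assumptions.

```python
from collections import deque


class ReplayUnit:
    """Toy model of one per-SM replay unit (all names are hypothetical)."""

    def __init__(self, resident_pages):
        self.resident_pages = resident_pages  # shared set of currently mapped pages
        self.replay_buffer = deque()          # faulting transactions awaiting replay
        self.stalled = False                  # True while this SM may not issue

    def issue(self, page):
        """Attempt one memory transaction; return True if it completed."""
        if self.stalled:
            raise RuntimeError("SM is stalled and cannot issue new transactions")
        if page in self.resident_pages:
            return True
        # Page fault: stall only this SM and remember the transaction.
        self.stalled = True
        self.replay_buffer.append(page)
        return False

    def drain_in_flight(self, pages):
        """Buffer transactions issued before the stall that also fault."""
        for page in pages:
            if page not in self.resident_pages:
                self.replay_buffer.append(page)

    def replay(self):
        """Replay buffered transactions, dropping those that now succeed."""
        for _ in range(len(self.replay_buffer)):
            page = self.replay_buffer.popleft()
            if page not in self.resident_pages:
                self.replay_buffer.append(page)  # still faulting, keep for later
        if not self.replay_buffer:
            self.stalled = False  # buffer drained, the SM may issue again


resident = {0x10}
sm0, sm1 = ReplayUnit(resident), ReplayUnit(resident)

sm0.issue(0x20)                    # faults, so only sm0 stalls
sm0.drain_in_flight([0x10, 0x30])  # 0x30 was in flight and also faults
sm1_ok = sm1.issue(0x10)           # the unaffected SM keeps working

resident.add(0x20)                 # first fault resolved
sm0.replay()                       # 0x30 still faults, sm0 stays stalled
resident.add(0x30)                 # second fault resolved
sm0.replay()                       # buffer empties, sm0 resumes
```

In this sketch each SM owns its own `ReplayUnit`, so a fault on `sm0` never touches the `stalled` flag of `sm1`, which is the contrast the abstract draws with designs that halt memory transactions across every SM.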

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving a first virtual memory transaction from a first processor; attempting to execute the first virtual memory transaction; detecting a first page fault related to the first virtual memory transaction; and causing a stall condition that inhibits the first processor from generating subsequent virtual memory transactions until the first page fault has been resolved.

2. The method of claim 1, further comprising re-executing the first virtual memory transaction once the stall condition has been resolved.

3. The method of claim 2, further comprising invalidating a translation lookaside buffer prior to re-executing the first virtual memory transaction.

4. The method of claim 2, wherein re-executing the first virtual memory transaction comprises: determining whether a translation lookaside buffer includes an entry corresponding to the first virtual memory transaction; and if the translation lookaside buffer includes the entry, then completing a virtual memory translation for the first virtual memory transaction, or if the translation lookaside buffer does not include the entry, then storing the first virtual memory transaction in a replay buffer.

5. The method of claim 2, wherein the first virtual memory transaction is re-executed along with at least one other virtual memory transaction stored in the replay buffer.

6. The method of claim 1, further comprising determining that the replay buffer is empty and enabling the first processor to generate subsequent virtual memory transactions.

7. The method of claim 1, further comprising receiving a second virtual memory transaction from a second processor while the first page fault remains unresolved, and successfully executing the second virtual memory transaction.

8. The method of claim 1, further comprising: receiving a second virtual memory transaction from the first processor prior to detecting the first page fault; detecting a second page fault related to the second virtual memory transaction; and storing the second virtual memory transaction in the replay buffer.

9. The method of claim 1, wherein resolving the first page fault comprises: locating a memory page related to the first virtual memory transaction within a first memory based on a global translation table; and adding a virtual mapping for the memory page to a translation lookaside buffer.

10. The method of claim 9, wherein resolving the first page fault further comprises copying the memory page from the first memory to a second memory.

11. The method of claim 10, wherein the first memory comprises a system memory coupled to a central processing unit, and the second memory comprises a memory coupled to a multithreaded processing unit.

12. A non-transitory computer-readable storage medium including instructions that, when executed by a multithreaded processing unit, cause the multithreaded processing unit to perform the steps of: receiving a first virtual memory transaction from a first processor; attempting to execute the first virtual memory transaction; detecting a first page fault related to the first virtual memory transaction; and causing a stall condition that inhibits the first processor from generating subsequent virtual memory transactions until the first page fault has been resolved.

13. A system, comprising: a memory; and a multithreaded processing unit coupled to the memory and configured to: receive a first virtual memory transaction from a first processor; attempt to execute the first virtual memory transaction; detect a first page fault related to the first virtual memory transaction; and cause a stall condition that inhibits the first processor from generating subsequent virtual memory transactions until the first page fault has been resolved.

14. The system of claim 13, wherein the multithreaded processor is further configured to re-execute the first virtual memory transaction once the stall condition has been resolved.
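Claims 4 and 9 to 11 outline how a fault is resolved and how a transaction is re-executed against the translation lookaside buffer. The following is a minimal sketch of those steps under stated assumptions: the class `FaultResolver`, the dictionary-based memories, and the trivial frame allocator are hypothetical stand-ins chosen for illustration and do not come from the patent.

```python
class FaultResolver:
    """Toy model of fault resolution and TLB-checked replay (names are hypothetical)."""

    def __init__(self, global_table, system_memory):
        self.global_table = global_table    # virtual page -> location in the first memory
        self.system_memory = system_memory  # first memory (CPU side)
        self.ppu_memory = {}                # second memory (processing-unit side)
        self.tlb = {}                       # virtual page -> frame in the second memory
        self.replay_buffer = []             # transactions that still miss in the TLB

    def resolve(self, vpage):
        """Resolve a fault: locate the page, copy it, then map it in the TLB."""
        location = self.global_table[vpage]                     # locate (claim 9)
        frame = len(self.ppu_memory)                            # simplistic allocator, no eviction
        self.ppu_memory[frame] = self.system_memory[location]   # copy (claim 10)
        self.tlb[vpage] = frame                                 # add mapping (claim 9)

    def re_execute(self, vpage):
        """Re-execute a transaction: translate on a TLB hit, else buffer it (claim 4)."""
        if vpage in self.tlb:
            return self.ppu_memory[self.tlb[vpage]]
        self.replay_buffer.append(vpage)
        return None


resolver = FaultResolver(global_table={7: "sys_a"}, system_memory={"sys_a": b"data"})
first_try = resolver.re_execute(7)   # TLB miss, transaction goes to the replay buffer
resolver.resolve(7)                  # page copied and mapped
second_try = resolver.re_execute(7)  # TLB hit, translation completes
```

The sketch keeps the replay buffer as a plain list and never evicts frames; a real design would need both capacity limits and TLB invalidation (claim 3), which are left out here for brevity.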

Assignees

Nvidia Corp

Inventors

Classifications

  • Transactional memory (G06F9/528 takes precedence) · CPC title

  • In special purpose processing node, e.g. vector processor · CPC title

  • G06F12/08 (Primary): in hierarchically structured memory systems, e.g. virtual memory systems · CPC title

  • TLB miss handling · CPC title

  • using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9830276B2 cover?
One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit the…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 28, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).