US9575892B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9575892-B2 |
| Application number | US-201314109678-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 17, 2013 |
| Priority date | Mar 15, 2013 |
| Publication date | Feb 21, 2017 |
| Grant date | Feb 21, 2017 |
Official abstract text for this publication.
One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit then stores the faulting memory transaction and any faulting in-flight memory transaction in a replay buffer. As page faults are resolved, the replay unit replays the memory transactions in the replay buffer—removing successful memory transactions from the replay buffer—until all of the stored memory transactions have successfully executed. Advantageously, the overall performance of the PPU is improved compared to conventional PPUs that, upon detecting a page fault, stop performing memory transactions across all SMs included in the PPU until the fault is resolved.
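The per-SM stall behavior described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented hardware design: `ToyPPU`, `ReplayUnit`, the `in_flight` flag, and the use of page names as transactions are all hypothetical names and simplifications introduced here for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReplayUnit:
    """Hypothetical per-SM replay unit: a stall flag plus a replay buffer."""
    stalled: bool = False
    buffer: deque = field(default_factory=deque)


class ToyPPU:
    """Toy model: each SM owns a replay unit; resident pages are shared."""

    def __init__(self, num_sms, resident_pages):
        self.resident = set(resident_pages)
        self.units = [ReplayUnit() for _ in range(num_sms)]
        self.completed = []  # (sm, page) pairs that executed successfully

    def issue(self, sm, page, in_flight=False):
        """Issue a transaction. New transactions from a stalled SM are refused;
        in_flight=True models one that left the SM before the stall began."""
        unit = self.units[sm]
        if unit.stalled and not in_flight:
            return "stalled"
        return "ok" if self._execute(sm, page) else "fault"

    def _execute(self, sm, page):
        unit = self.units[sm]
        if page in self.resident:
            self.completed.append((sm, page))
            return True
        # Page fault: keep the transaction for replay and stall this SM only.
        unit.buffer.append(page)
        unit.stalled = True
        return False

    def resolve_fault(self, page):
        """Stand-in for the fault handler making a page resident."""
        self.resident.add(page)

    def replay(self, sm):
        """Replay each buffered transaction once; unstall when the buffer is empty."""
        unit = self.units[sm]
        for _ in range(len(unit.buffer)):
            page = unit.buffer.popleft()
            self._execute(sm, page)  # re-queued automatically if it still faults
        unit.stalled = bool(unit.buffer)
```

In this sketch, a fault on SM 0 leaves SM 1 free to keep issuing transactions, and SM 0 resumes only after every transaction in its own buffer has replayed successfully.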
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for processing virtual memory transactions associated with a multithreaded processing unit, the method comprising: receiving a first virtual memory transaction from a first unit; attempting to execute the first virtual memory transaction; detecting a first page fault related to the first virtual memory transaction; storing the first virtual memory transaction in a replay buffer; causing a stall condition that inhibits the first unit from generating subsequent virtual memory transactions until the first page fault has been resolved; and once the first page fault has been resolved, re-executing the first virtual memory transaction as well as at least one other virtual memory transaction stored in the replay buffer.
2. The method of claim 1, further comprising determining that the replay buffer is empty and enabling the first unit to generate subsequent virtual memory transactions.
3. The method of claim 1, further comprising receiving a second virtual memory transaction from a second unit while the first page fault is unresolved, and successfully executing the second virtual memory transaction.
4. The method of claim 1, further comprising: receiving a second virtual memory transaction from the first unit prior to detecting the first page fault; detecting a second page fault related to the second virtual memory transaction; and storing the second virtual memory transaction in the replay buffer.
5. The method of claim 1, further comprising invalidating a translation lookaside buffer prior to re-executing the first virtual memory transaction.
6. The method of claim 1, wherein re-executing the first virtual memory transaction comprises: determining whether there is an entry in a translation lookaside buffer corresponding to the first virtual memory transaction; and if the entry exists, then completing the first virtual memory translation, or if the entry does not exist, then re-storing the first virtual memory transaction in the replay buffer.
7. The method of claim 1, wherein resolving the first page fault comprises: locating a memory page related to the first virtual memory transaction within a first memory subsystem based on a global translation table; and adding a virtual mapping for the memory page to a translation lookaside buffer.
8. The method of claim 7, wherein resolving the first page fault further comprises copying the memory page from the first memory subsystem to a second memory subsystem.
9. The method of claim 8, wherein the first memory subsystem comprises memory coupled to a central processing unit, and the second memory subsystem comprises memory coupled to the multithreaded processing unit.
10. A non-transitory computer-readable storage medium including instructions that, when executed by a multithreaded processing unit, cause the multithreaded processing unit to process virtual memory transactions by performing the steps of: receiving a first virtual memory transaction from a first unit; attempting to execute the first virtual memory transaction; detecting a first page fault related to the first virtual memory transaction; storing the first virtual memory transaction in a replay buffer; causing a stall condition that inhibits the first unit from generating subsequent virtual memory transactions until the first page fault has been resolved; and once the first page fault has been resolved, re-executing the first virtual memory transaction as well as at least one other virtual memory transaction stored in the replay buffer.
11. The non-transitory computer-readable storage medium of claim 10, further comprising determining that the replay buffer is empty and enabling the first unit to generate subsequent virtual memory transactions.
12. The non-transitory computer-readable storage medium of claim 10, further comprising receiving a second virtual memory transaction from a second unit while the first page fault is unresolved, and successfully executing the second virtual memory transaction.
13. The non-transitory computer-readable storage medium of claim 10, further comprising: receiving a second virtual memory transaction from the first unit prior to detecting the first page fault; detecting a second page fault related to the second virtual memory transaction; and storing the second virtual memory transaction in the replay buffer.
14. The non-transitory computer-readable storage medium of claim 10, further comprising invalidating a translation lookaside buffer prior to re-executing the first virtual memory transaction.
15. The non-transitory computer-readable storage medium of claim 10, wherein re-executing the first virtual memory transaction comprises: determining whether there is an entry in a translation lookaside buffer corresponding to the first virtual memory transaction; and if the entry exists, then completing the first virtual memory translation, or if the entry does not exist, then re-storing the first virtual memory transaction in the replay buffer.
16. The non-transitory computer-readable storage medium of claim 10, wherein resolving the first page fault comprises: locating a memory page related to the first virtual memory transaction within a first memory subsystem based on a global translation table; and adding a virtual mapping for the memory page to a translation lookaside buffer.
17. The non-transitory computer-readable storage medium of claim 16, wherein resolving the first page fault further comprises copying the memory page from the first memory subsystem to a second memory subsystem.
18. The non-transitory computer-readable storage medium of claim 17, wherein the first memory subsystem comprises memory coupled to a central processing unit, and the second memory subsystem comprises memory coupled to the multithreaded processing unit.
19. A system configured to process virtual memory transactions, the system comprising: a memory; and a multithreaded processing unit coupled to the memory and configured to: receive a first virtual memory transaction from a first unit; attempt to execute the first virtual memory transaction on the memory; detect a first page fault related to the first virtual memory transaction; store the first virtual memory transaction in a replay buffer; cause a stall condition that inhibits the first unit from generating subsequent virtual memory transactions until the first page fault has been resolved; and once the first page fault has been resolved, re-execute the first virtual memory transaction as well as at least one other virtual memory transaction stored in the replay buffer.
20. The system of claim 19, wherein the multithreaded processing unit is further configured to receive a second virtual memory transaction from a second unit while the first page fault is unresolved, and successfully execute the second virtual memory transaction.
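The fault-resolution and replay steps recited in claims 6 through 9 can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed implementation: `ToyMMU`, the dictionary-based memories, and the frame allocation scheme are hypothetical, and the sketch simplifies by treating any missing TLB entry as a fault.

```python
class ToyMMU:
    """Toy model of fault resolution and replay (all names are hypothetical)."""

    def __init__(self, global_table, cpu_memory):
        self.global_table = global_table  # virtual page -> (subsystem, frame)
        self.cpu_memory = cpu_memory      # frame -> data, memory coupled to the CPU
        self.ppu_memory = {}              # frame -> data, memory coupled to the PPU
        self.tlb = {}                     # virtual page -> PPU frame
        self.replay_buffer = []

    def access(self, vpage):
        """Complete the transaction if a TLB entry exists; otherwise buffer it."""
        if vpage in self.tlb:
            return self.ppu_memory[self.tlb[vpage]]
        self.replay_buffer.append(vpage)  # simplification: TLB miss == fault
        return None

    def resolve_fault(self, vpage):
        """Locate the page via the global table, copy it over, then map it."""
        subsystem, frame = self.global_table[vpage]
        if subsystem == "cpu":
            new_frame = len(self.ppu_memory)  # naive allocator for the sketch
            self.ppu_memory[new_frame] = self.cpu_memory[frame]  # copy the page
            self.global_table[vpage] = ("ppu", new_frame)
            frame = new_frame
        self.tlb[vpage] = frame  # add the virtual mapping to the TLB

    def replay(self):
        """Re-execute buffered transactions; any that still lack a TLB entry
        are re-stored in the replay buffer by access()."""
        pending, self.replay_buffer = self.replay_buffer, []
        results = []
        for vpage in pending:
            data = self.access(vpage)
            if data is not None:
                results.append(data)
        return results
```

Calling `replay()` after resolving only some faults completes the resolved transactions and leaves the rest in the buffer, mirroring the re-storing branch of claim 6.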
In special purpose processing node, e.g. vector processor · CPC title
using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title
in hierarchically structured memory systems, e.g. virtual memory systems · CPC title
TLB miss handling · CPC title
Transactional memory (G06F9/528 takes precedence) · CPC title