Unified memory systems and methods
US-2019266695-A1 · Aug 29, 2019 · US
US12373912B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12373912-B2 |
| Application number | US-202318511074-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 16, 2023 |
| Priority date | Mar 15, 2019 |
| Publication date | Jul 29, 2025 |
| Grant date | Jul 29, 2025 |
Official abstract text for this publication.
Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.
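The ownership rule in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract describes hardware prefetchers, and the names `Gpu`, `HOST`, `owners`, and `memory` below are hypothetical and do not come from the patent. The model lets a GPU prefetch a page only when the page is owned by that GPU or by the host processor.

```python
from dataclasses import dataclass, field

HOST = "host"  # hypothetical owner id standing in for the host processor


@dataclass
class Gpu:
    name: str
    cache: dict = field(default_factory=dict)  # page number -> cached data

    def prefetch(self, page_no, owners, memory):
        """Copy a page into this GPU's cache only if ownership allows it."""
        owner = owners[page_no]
        if owner != self.name and owner != HOST:
            return False  # page is owned by another GPU: prefetch prohibited
        self.cache[page_no] = memory[page_no]
        return True


# Assumed toy state: three pages with three different owners.
memory = {0: "A", 1: "B", 2: "C"}
owners = {0: "gpu0", 1: "gpu1", 2: HOST}

gpu0 = Gpu("gpu0")
results = [gpu0.prefetch(p, owners, memory) for p in (0, 1, 2)]
```

In this model, `gpu0` caches its own page and the host-owned page, and the request for the page owned by `gpu1` is refused.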
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: one or more processors including a graphics processing unit (GPU) to process data, the GPU including: one or more processing cores, including a first core, a prefetcher, and one or more caches; and a memory for storage of data; wherein the prefetcher is to prefetch data from the memory to a cache of the one or more caches; wherein, upon completion of a prefetch operation by the GPU to prefetch data for a first thread running on the first core, the prefetcher is to issue a prefetch status notification to the first thread.
2. The apparatus of claim 1, wherein issuance of the prefetch status notification indicates that the data prefetched for the first thread has been loaded into the cache.
3. The apparatus of claim 1, wherein the GPU is to synchronize execution of the first thread with one or more other threads based at least in part on the prefetch status notification.
4. The apparatus of claim 1, wherein the GPU is to throttle one or more prefetches for the first thread based at least in part on the prefetch status notification.
5. The apparatus of claim 1, wherein the prefetch status notification is a one-bit flag.
6. The apparatus of claim 1, wherein the first core is a shader core.
7. One or more non-transitory computer-readable storage mediums having stored thereon executable computer program instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: performing, by a prefetcher of a graphics processing unit (GPU) in a computing system, a prefetch operation to prefetch data for a first thread running on a first core of the GPU, the data being prefetched from a computer memory to a cache of the GPU; and upon completion of the prefetch operation, issuing, by the prefetcher, a prefetch status notification to the first thread.
8. The one or more computer-readable storage mediums of claim 7, wherein issuance of the prefetch status notification indicates that the data prefetched for the first thread has been loaded into the cache.
9. The one or more computer-readable storage mediums of claim 7, wherein the instructions further include instructions for: synchronizing execution of the first thread with one or more other threads based at least in part on the prefetch status notification.
10. The one or more computer-readable storage mediums of claim 7, wherein the instructions further include instructions for: throttling one or more prefetches for the first thread based at least in part on the prefetch status notification.
11. The one or more computer-readable storage mediums of claim 7, wherein the prefetch status notification is a one-bit flag.
12. The one or more computer-readable storage mediums of claim 7, wherein the first core is a shader core.
13. A method comprising: performing, by a prefetcher of a graphics processing unit (GPU) in a computing system, a prefetch operation to prefetch data for a first thread running on a first core of the GPU, the data being prefetched from a computer memory to a cache of the GPU; and upon completion of the prefetch operation, issuing, by the prefetcher, a prefetch status notification to the first thread.
14. The method of claim 13, wherein issuance of the prefetch status notification indicates that the data prefetched for the first thread has been loaded into the cache.
15. The method of claim 13, further comprising: synchronizing execution of the first thread with one or more other threads based at least in part on the prefetch status notification.
16. The method of claim 13, further comprising: throttling one or more prefetches for the first thread based at least in part on the prefetch status notification.
17. The method of claim 13, wherein the prefetch status notification is a one-bit flag.
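The notification flow recited in the claims (prefetch, load into cache, then a one-bit status notification that the thread can synchronize on) can be sketched as a software analogy. This is not the claimed hardware: Python's `threading.Event` stands in for the one-bit flag, and the names `Prefetcher`, `worker`, `memory`, and the address `0x10` are hypothetical, chosen only for illustration.

```python
import threading


class Prefetcher:
    """Toy model: loads data into a cache, then sets a one-bit flag."""

    def __init__(self, memory):
        self.memory = memory
        self.cache = {}

    def prefetch(self, address, done_flag):
        self.cache[address] = self.memory[address]  # load data into the cache
        done_flag.set()  # issue the prefetch status notification


def worker(prefetcher, address, done_flag, out):
    # The thread blocks until the notification arrives, so it only reads
    # the cache after the prefetched data has been loaded.
    done_flag.wait()
    out.append(prefetcher.cache[address])


memory = {0x10: "texel"}          # assumed backing memory contents
prefetcher = Prefetcher(memory)
flag = threading.Event()          # one-bit flag: either set or clear
out = []

t = threading.Thread(target=worker, args=(prefetcher, 0x10, flag, out))
t.start()
prefetcher.prefetch(0x10, flag)   # completes the prefetch and notifies
t.join()
```

The same flag could also gate further prefetch requests for that thread, which is one way to picture the throttling recited in the dependent claims, although the claims do not specify how throttling is implemented.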
General purpose rendering architectures · CPC title
Memory management · CPC title
using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title
Instruction prefetching · CPC title
using page tables, e.g. page table structures · CPC title