Apparatus and method for determining a sector division ratio of a shared cache memory
US-2015339229-A1 · Nov 26, 2015 · US
US12079155B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12079155-B2 |
| Application number | US-202017428216-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 14, 2020 |
| Priority date | Mar 15, 2019 |
| Publication date | Sep 3, 2024 |
| Grant date | Sep 3, 2024 |
Official abstract text for this publication.
Embodiments described herein include software, firmware, and hardware that provides techniques to enable deterministic scheduling across multiple general-purpose graphics processing units. One embodiment provides a multi-GPU architecture with uniform latency. One embodiment provides techniques to distribute memory output based on memory chip thermals. One embodiment provides techniques to enable thermally aware workload scheduling. One embodiment provides techniques to enable end to end contracts for workload scheduling on multiple GPUs.
Opening claim text (preview).
What is claimed is:
1. A general-purpose graphics processor comprising: a memory access pipeline configured to access a memory system having physically interleaved memory addressing, the memory system including: a first memory device that is local to the general-purpose graphics processor; and a second memory device that is remote to the general-purpose graphics processor and local to a first remote general-purpose graphics processor, wherein the memory access pipeline includes hardware to facilitate access to physically interleaved memory pages of the memory system, the physically interleaved memory pages include a first physical memory page and a second physical memory page, the first physical memory page to be stored on the first memory device and the second physical memory page to be stored on the second memory device, and wherein the first physical memory page is to be contiguous with the second physical memory page and the hardware of the memory access pipeline is to satisfy memory access requests for contiguous physical memory pages from the first memory device and the second memory device to cause memory access latency for the memory access requests to be an average of the memory access latency of the first memory device and the second memory device.
2. The general-purpose graphics processor as in claim 1, wherein the memory system includes a third memory device that is local to a second remote general-purpose graphics processor and remote to the general-purpose graphics processor and the first remote general-purpose graphics processor, the third memory device to include a third physical memory page configured to be contiguous with the first physical memory page and the second physical memory page.
3. The general-purpose graphics processor as in claim 2, wherein the memory access pipeline includes hardware to: determine that physical memory addresses are to be interleaved across multiple memory devices including the first memory device, the second memory device, and the third memory device; map multiple contiguous physical memory pages to the multiple memory devices; and satisfy memory access requests for the multiple contiguous physical memory pages from the multiple memory devices to average memory access latency for the access requests.
4. The general-purpose graphics processor as in claim 3, wherein the hardware of the memory access pipeline includes a memory management and mapping unit to map the multiple contiguous physical memory pages to multiple memory devices.
5. The general-purpose graphics processor as in claim 4, wherein hardware of the memory management and mapping unit is included in one or more of a memory controller, memory management unit, or address generation unit.
6. The general-purpose graphics processor as in claim 1, additionally comprising a point-to-point interconnect to the first remote general-purpose graphics processor.
7. The general-purpose graphics processor as in claim 6, wherein the access to the second physical memory page stored on the second memory device is to traverse the point-to-point interconnect.
8. The general-purpose graphics processor as in claim 7, wherein access latency to the first memory device via the memory access pipeline is lower than the access latency to the second memory device.
9. A method comprising: on graphics processing system having multiple general-purpose graphics processing units (GPGPUs): initializing a memory management system for two or more of the multiple GPGPUs; determining that physical memory addresses for the two or more of the multiple GPGPUs are to be interleaved across multiple memory devices; mapping physical memory pages for the physical memory addresses across the multiple memory devices; and satisfying an access request for multiple contiguous physical memory pages from the multiple memory devices to cause memory access latency for the access request to be an average of the memory access latencies of the multiple memory devices.
10. The method as in claim 9, wherein determining that physical memory addresses for the two or more of the multiple GPGPUs are to be interleaved across multiple memory devices includes reading a set of configuration registers associated with the multiple memory devices.
11. The method as in claim 9, wherein determining that physical memory addresses for the two or more of the multiple GPGPUs are to be interleaved across multiple memory devices includes determining that physical memory addresses for a first GPGPU are to be interleaved across a first memory device coupled with the first GPGPU and a second memory device coupled with a second GPGPU.
12. The method as in claim 11, wherein mapping physical memory pages for the physical memory addresses across the multiple memory devices includes mapping a first memory page for the first GPGPU to the first memory device and mapping a second contiguous physical memory page for the first GPGPU to the second memory device.
13. A graphics processing system comprising: a first memory device; and a first general-purpose graphics processor coupled with the first memory device, the first general-purpose graphics processor comprising a memory access pipeline configured to access the first memory device and an interconnect to couple with a second general-purpose graphics processor, the second general-purpose graphics processor coupled with a second memory device, wherein: the memory access pipeline of the first general-purpose graphics processor includes hardware to facilitate access to a first physical memory page and a second physical memory page, the first physical memory page is to be stored on the first memory device, the second physical memory page is to be stored on the second memory device, the first physical memory page is to be contiguous with the second physical memory page and; the hardware of the memory access pipeline is to satisfy memory access requests for contiguous physical memory pages from the first memory device and the second memory device to cause memory access latency for the memory access requests to be an average of the memory access latency of the first memory device and the second memory device.
14. The graphics processing system as in claim 13, comprising a third memory device coupled with a third general-purpose graphics processor, the third memory device to include a third physical memory page to be contiguous with the first physical memory page and the second physical memory page.
15. The graphics processing system as in claim 14, wherein the memory access pipeline includes hardware to: determine that physical memory addresses are to be interleaved across multiple memory devices including the first memory device the second memory device, and the third memory device; map multiple contiguous physical memory pages to the multiple memory devices; and satisfy memory access requests for the multiple contiguous physical memory pages from the multiple memory devices to average memory access latency for the access requests.
16. The graphics processing system as in claim 15, wherein the hardware of the memory access pipeline includes a memory management and mapping unit to map the multiple contiguous physical memory pages to multiple memory devices.
17. The graphics processing system as in claim 16, wherein hardware of the memory management and mapping unit is included in one or more of a memory controller, a memory management unit, or an address generation unit.
18. The graphics processing system as in claim 13, wherein the interconnect is a point-t
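Illustrative sketch (not part of the patent text): the claims describe contiguous physical memory pages being placed on different memory devices so that a request spanning those pages sees a latency that averages the per-device latencies. The Python below is a minimal software model of that idea, assuming a round-robin page-to-device mapping, a 4 KiB page size, and made-up latency figures. The claims specify none of these, and the names `MemoryDevice`, `device_for_address`, and `mean_latency_for_range` are hypothetical.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes; the claims do not fix one


@dataclass(frozen=True)
class MemoryDevice:
    name: str
    latency_ns: float  # hypothetical latency seen by the requesting GPU


def device_for_address(phys_addr: int, devices: list[MemoryDevice]) -> MemoryDevice:
    """Map a physical address to a device by round-robin page interleaving.

    Round-robin is an assumption made for illustration; the claims only
    require that contiguous pages reside on different devices.
    """
    page_number = phys_addr // PAGE_SIZE
    return devices[page_number % len(devices)]


def mean_latency_for_range(start_addr: int, num_pages: int,
                           devices: list[MemoryDevice]) -> float:
    """Average per-page latency over a run of contiguous physical pages."""
    total = 0.0
    for i in range(num_pages):
        total += device_for_address(start_addr + i * PAGE_SIZE, devices).latency_ns
    return total / num_pages


# One device local to the requesting GPU, one reached over an interconnect.
devices = [MemoryDevice("local", 100.0), MemoryDevice("remote", 300.0)]

print(device_for_address(0, devices).name)          # first page -> local
print(device_for_address(PAGE_SIZE, devices).name)  # next contiguous page -> remote
print(mean_latency_for_range(0, 4, devices))        # 200.0 with these made-up values
```

In this model, a request covering an even number of contiguous pages sees the mean of the two device latencies, whichever GPU issues it. This is one reading of the uniform-latency behaviour named in the abstract.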
Page size control · CPC title
Details relating to cache mapping · CPC title
Prefetching based on hints or prefetch instructions · CPC title
Prefetching based on access pattern detection, e.g. stride based prefetch · CPC title
Reconfiguration of cache memory · CPC title