Providing extended cache replacement state information
US-9170955-B2 · Oct 27, 2015 · US
US12386779B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12386779-B2 |
| Application number | US-202418432859-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 5, 2024 |
| Priority date | Mar 15, 2019 |
| Publication date | Aug 12, 2025 |
| Grant date | Aug 12, 2025 |
Official abstract text for this publication.
Embodiments described herein provide techniques to enable the dynamic reconfiguration of memory on a general-purpose graphics processing unit. One embodiment described herein enables dynamic reconfiguration of cache memory bank assignments based on hardware statistics. One embodiment enables for virtual memory address translation using mixed four kilobyte and sixty-four kilobyte pages within the same page table hierarchy and under the same page directory. One embodiment provides for a graphics processor and associated heterogenous processing system having near and far regions of the same level of a cache hierarchy.
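The mixed-page-size embodiment rests on simple address arithmetic: a 4 kilobyte page uses the low 12 bits of a virtual address as the in-page offset, a 64 kilobyte page uses the low 16 bits, and one 64 kilobyte page spans sixteen 4 kilobyte pages. The following is a minimal sketch of that arithmetic only; `split_address` and the constant names are hypothetical helpers chosen for illustration and do not come from the patent.

```python
# Illustrative only: names below are assumptions, not taken from the patent.
KIB = 1024
PAGE_4K = 4 * KIB     # 4 kilobyte page
PAGE_64K = 64 * KIB   # 64 kilobyte page


def split_address(vaddr: int, page_size: int) -> tuple[int, int]:
    """Split a virtual address into (virtual page number, in-page offset).

    page_size must be a power of two, so the offset is just the low bits.
    """
    offset_bits = page_size.bit_length() - 1  # 12 for 4 KB, 16 for 64 KB
    return vaddr >> offset_bits, vaddr & (page_size - 1)


# The same address decomposes differently depending on the page size.
vpn_4k, off_4k = split_address(0x12345, PAGE_4K)     # (0x12, 0x345)
vpn_64k, off_64k = split_address(0x12345, PAGE_64K)  # (0x1, 0x2345)
```

Because the 64 kilobyte page number is the 4 kilobyte page number with its low four bits dropped, entries for both sizes can plausibly be indexed from the same virtual address, which is what allows them to coexist under one page directory.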
Opening claim text (preview).
What is claimed is:

1. A general-purpose graphics processor comprising: an interface to a host processor; a memory interface; a processing array including a plurality of graphics processing resources, the processing array coupled with a memory via the memory interface; and memory management circuitry configured to: receive a request to map to a 4 kilobyte page of virtual memory associated with the host processor; create a mapping in a page table hierarchy for the 4 kilobyte page; receive a request to map to a 64 kilobyte page of virtual memory associated with processing array; and create a mapping in the page table hierarchy for the 64 kilobyte page, wherein the page table hierarchy is to concurrently store entries for the 4 kilobyte page and the 64 kilobyte page.
2. The general-purpose graphics processor of claim 1, wherein the memory management circuitry is configured to facilitate a unified virtual memory system that includes memory of the host processor and memory coupled with the processing array via the memory interface.
3. The general-purpose graphics processor of claim 2, wherein the page table hierarchy includes a page table that is shared between the processing array and the host processor.
4. The general-purpose graphics processor of claim 3, wherein the page table that is shared between the processing array and the host processor is configured to support a 49-bit address space for the processing array and a 48-bit address space for the host processor.
5. The general-purpose graphics processor of claim 1, wherein the page table hierarchy includes a page directory pointer table, a page directory table, a first page table configure to store page table entries for 64 kilobyte pages, and a second page table configured to store page table entries for 64 kilobyte pages and 4 kilobyte pages.
6. The general-purpose graphics processor of claim 5, comprising a translation lookaside buffer to cache entries within the page table hierarchy and a page table walker to walk the page table hierarchy in response to a cache miss at the translation lookaside buffer.
7. The general-purpose graphics processor of claim 6, wherein the translation lookaside buffer is configurable to cache sixteen contiguous 4 kilobyte page table entries of the second page table as a 64 kilobyte page table entry.
8. A method comprising: interfacing with a host processor at a general-purpose graphics processor; receiving a request at memory management circuitry of the general-purpose graphics processor to map a 4 kilobyte page of virtual memory associated with the host processor; creating a mapping in a page table hierarchy for the 4 kilobyte page; receiving a request to map to a 64 kilobyte page of virtual memory associated with processing array; and creating a mapping in the page table hierarchy for the 64 kilobyte page, wherein the page table hierarchy is to concurrently store entries for the 4 kilobyte page and the 64 kilobyte page.
9. The method of claim 8, comprising facilitating, via the memory management circuitry of the general-purpose graphics processor, a unified virtual memory system that includes memory of the host processor and memory of the general-purpose graphics processor.
10. The method of claim 9, comprising sharing the page table hierarchy between the general-purpose graphics processor and the host processor.
11. The method of claim 10, comprising: receiving a request at the general-purpose graphics processor to translate a virtual address of a first memory allocation within a 4 kilobyte page; in response to a miss in a translation lookaside buffer of the general-purpose graphics processor for the virtual address for the first memory allocation, initiating a page walk into the page table hierarchy to determine a physical address of the first memory allocation; receiving a request at the general-purpose graphics processor to translate the virtual address of a second memory allocation within a 64 kilobyte page; and in response to a miss in a translation lookaside buffer of the general-purpose graphics processor for the virtual address for the second memory allocation, initiating a page walk into the page table hierarchy to determine a physical address of the second memory allocation, wherein the page table hierarchy stores a first page table entry for the 4 kilobyte page and a second page table entry for the 64 kilobyte page.
12. The method of claim 11, comprising: storing the first page table entry in a first page table of the page table hierarchy, the first page table configured to store page table entries for 4 kilobyte pages; and storing the second page table entry in a second page table of the page table hierarchy, the second page table configured to store page table entries for 4 kilobyte pages and 64 kilobyte pages.
13. The method of claim 11, comprising storing the first page table entry and the second page table entry in a page table configured to store page table entries for 4 kilobyte pages and 64 kilobyte pages.
14. The method of claim 13, comprising marking sixteen consecutive page table entries for 4 kilobyte pages as cacheable within a translation lookaside buffer as a single 64 kilobyte page.
15. A data processing system comprising: a host processor; and a general-purpose graphics processor coupled with the host processor via a host interface, the general-purpose graphics processor including: a memory device; a processing array including a plurality of graphics processing resources, the processing array coupled with a memory device via a memory interface; and memory management circuitry configured to: receive a request to map to a 4 kilobyte page of virtual memory associated with the host processor; create a mapping in a page table hierarchy for the 4 kilobyte page; receive a request to map to a 64 kilobyte page of virtual memory associated with processing array; and create a mapping in the page table hierarchy for the 64 kilobyte page, wherein the page table hierarchy is to concurrently store entries for the 4 kilobyte page and the 64 kilobyte page.
16. The data processing system of claim 15, wherein the memory management circuitry is configured to facilitate a unified virtual memory system that includes the memory device and memory of the host processor.
17. The data processing system of claim 16, wherein the page table hierarchy includes a page table that is shared between the processing array and the host processor.
18. The data processing system of claim 17, wherein the page table that is shared between the processing array and the host processor is configured to support a 49-bit address space for the processing array and a 48-bit address space for the host processor.
19. The data processing system of claim 15, wherein the page table hierarchy includes a page directory pointer table, a page directory table, a first page table configure to store page table entries for 64 kilobyte pages, and a second page table configured to store page table entries for 64 kilobyte pages and 4 kilobyte pages.
20. The data processing system of claim 19, comprising: a translation lookaside buffer to cache entries within the page table hierarchy; and a page table walker to walk the page table hierarchy in response to a cache miss at the translation lookaside buffer, wherein the translation lookaside buffer is configurable to cache sixteen contiguous 4 kilobyte page table entries of the second page table as a 64 kilobyte page table entry.
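The claims describe a page table hierarchy that holds 4 kilobyte and 64 kilobyte entries at the same time, plus a translation lookaside buffer (TLB) that can cache sixteen contiguous 4 kilobyte entries as one 64 kilobyte entry. The toy model below is one way to picture that behavior in software. It is a sketch under loud assumptions: the flat dictionary standing in for the multi-level hierarchy, the class names `MixedPageTable` and `Tlb`, and the rule that a run is coalesced only when it is physically contiguous and 64 kilobyte aligned are all hypothetical and are not taken from the patent.

```python
# Toy model only. The flat dict, the class names, and the coalescing rule
# are illustrative assumptions, not the patented hardware design.
PAGE_4K = 4 * 1024
PAGE_64K = 64 * 1024
PTES_PER_64K = PAGE_64K // PAGE_4K  # sixteen 4 KB entries span one 64 KB page


class MixedPageTable:
    """Flat stand-in for a page table that stores both page sizes."""

    def __init__(self):
        # Key: 4 KB virtual page number of the mapping's base address.
        # Value: (physical base address, page size).
        self.entries = {}

    def map_page(self, vaddr, paddr, page_size):
        if page_size not in (PAGE_4K, PAGE_64K):
            raise ValueError("unsupported page size")
        if vaddr % page_size or paddr % page_size:
            raise ValueError("addresses must be aligned to the page size")
        # Note: overlap between 4 KB and 64 KB mappings is not checked here.
        self.entries[vaddr // PAGE_4K] = (paddr, page_size)

    def walk(self, vaddr):
        """Return (physical page base, page size) for vaddr, or None."""
        for size in (PAGE_64K, PAGE_4K):
            vbase = vaddr - vaddr % size
            entry = self.entries.get(vbase // PAGE_4K)
            if entry is not None and entry[1] == size:
                return entry
        return None

    def coalescible_run(self, vaddr):
        """Return the physical base if the sixteen 4 KB entries covering
        vaddr's 64 KB-aligned region are contiguous and aligned, else None."""
        base_vpn = (vaddr // PAGE_64K) * PTES_PER_64K
        first = self.entries.get(base_vpn)
        if first is None or first[1] != PAGE_4K or first[0] % PAGE_64K:
            return None
        for i in range(1, PTES_PER_64K):
            entry = self.entries.get(base_vpn + i)
            if entry is None or entry != (first[0] + i * PAGE_4K, PAGE_4K):
                return None
        return first[0]


class Tlb:
    """Caches translations; a miss triggers a walk of the page table."""

    def __init__(self, table):
        self.table = table
        self.cache = {}  # (virtual base, page size) -> physical base
        self.misses = 0

    def translate(self, vaddr):
        for size in (PAGE_64K, PAGE_4K):
            pbase = self.cache.get((vaddr - vaddr % size, size))
            if pbase is not None:
                return pbase + vaddr % size
        self.misses += 1
        entry = self.table.walk(vaddr)
        if entry is None:
            raise KeyError(hex(vaddr))
        pbase, size = entry
        if size == PAGE_4K:
            run_base = self.table.coalescible_run(vaddr)
            if run_base is not None:
                # Cache the whole run as a single 64 KB TLB entry.
                pbase, size = run_base, PAGE_64K
        self.cache[(vaddr - vaddr % size, size)] = pbase
        return pbase + vaddr % size
```

In this sketch, mapping one 4 kilobyte page, one 64 kilobyte page, and a run of sixteen contiguous 4 kilobyte pages into the same `MixedPageTable` means a first access to the run costs one miss, after which any address in that 64 kilobyte region hits the single coalesced TLB entry. That reduction in TLB entries per region is the practical benefit the coalescing claims point at.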
Page size control · CPC title
Details relating to cache mapping · CPC title
Prefetching based on hints or prefetch instructions · CPC title
Prefetching based on access pattern detection, e.g. stride based prefetch · CPC title
Reconfiguration of cache memory · CPC title