US11995029B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11995029-B2 |
| Application number | US-202017428527-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 14, 2020 |
| Priority date | Mar 15, 2019 |
| Publication date | May 28, 2024 |
| Grant date | May 28, 2024 |
Official abstract text for this publication.
Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU.
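The detection step in the abstract (count cross tile accesses, decide when they are "frequent", then send a message to start a data transfer) can be illustrated with a small model. This is a minimal sketch, not the patented implementation: the class name `CrossTileMonitor`, the per-page counting granularity, the `threshold` value, and the message format are all assumptions made for illustration, since the abstract does not specify how "frequent" is decided.

```python
from collections import Counter


class CrossTileMonitor:
    """Toy model of a memory controller that counts remote page accesses.

    Hypothetical sketch: names, threshold, and message layout are assumptions.
    """

    def __init__(self, local_gpu, threshold=4):
        self.local_gpu = local_gpu
        self.threshold = threshold  # assumed cutoff for "frequent" accesses
        self.counts = Counter()     # stands in for a hardware access counter
        self.messages = []          # stands in for messages sent to a driver

    def access(self, owner_gpu, page):
        """Record one access; return True if it triggered a transfer message."""
        if owner_gpu == self.local_gpu:
            return False  # local access, nothing to count
        key = (owner_gpu, page)
        self.counts[key] += 1
        # Send the message exactly once, when the count reaches the threshold.
        if self.counts[key] == self.threshold:
            self.messages.append(
                {"src": owner_gpu, "dst": self.local_gpu, "page": page}
            )
            return True
        return False


monitor = CrossTileMonitor(local_gpu=0, threshold=3)
results = [monitor.access(owner_gpu=1, page=7) for _ in range(3)]
```

In this model the third access to page 7 on GPU 1 crosses the assumed threshold and queues a single transfer request toward GPU 0; later accesses to the same page do not queue it again.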
Opening claim text (preview).
What is claimed is:
1. A graphics processor having a multi-tile architecture, comprising: a first graphics processing unit (GPU) having a memory and a memory controller; a second graphics processing unit (GPU) having a memory; and a cross-GPU fabric to communicatively couple the first and second GPUs, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the first GPU and the second GPU in the multi-GPU configuration and to cause initiation of a data transfer between the memory of the first GPU and the memory of the second GPU when frequent cross tile memory accesses occur between the first GPU and the second GPU, wherein the memory controller is configured to detect transfer patterns automatically including accesses to page N of the memory of the second GPU and to start transferring pages N+1 and N+2 prior to requests for pages N+1 and N+2.
2. The graphics processor of claim 1, further comprising: a hardware counter to count cross tile memory accesses between the first GPU and the second GPU.
3. The graphics processor of claim 2, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the first GPU and the second GPU in the multi-GPU configuration using data from the hardware counter.
4. The graphics processor of claim 3, wherein the memory controller is configured to cause data that is being accessed frequently by the second GPU to be transferred or copied to the memory of the second GPU.
5. The graphics processor of claim 1, wherein the memory controller is configured to cause data that is being accessed frequently by the first GPU to be transferred or copied to the memory of the first GPU.
6. The graphics processor of claim 1, wherein the memory controller is configured to detect transfer patterns automatically including accesses between the first and second GPUs.
7. A graphics processing unit (GPU) of a multi-GPU architecture, comprising: processing resources to perform graphics operations; a memory; and a memory controller, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the GPU and a remote memory of a remote GPU in the multi-GPU configuration and to cause initiation of a data transfer between the memory of the GPU and the remote memory of the remote GPU when frequent cross tile memory accesses occur between the GPU and the remote memory of the remote GPU, wherein the memory controller is configured to detect transfer patterns automatically including accesses to page N of the remote memory and to start transferring pages N+1 and N+2 prior to requests for pages N+1 and N+2.
8. The GPU of claim 7, further comprising: a hardware counter to count cross tile memory accesses from the GPU to the remote memory of the remote GPU.
9. The GPU of claim 8, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the GPU and the remote memory of the remote GPU in the multi-GPU configuration using data from the hardware counter.
10. The GPU of claim 9, wherein the memory controller is configured to cause data that is being accessed frequently by the remote GPU to be transferred or copied to the remote memory.
11. The GPU of claim 7, wherein the memory controller is configured to cause data that is being accessed frequently by the GPU to be transferred or copied to the memory of the GPU.
12. The GPU of claim 7, wherein the memory controller is configured to detect transfer patterns automatically between the GPU and the remote GPU.
13. A computer-implemented method to provide a data transfer mechanism for a multiple GPU configuration, the computer-implemented method comprises: monitoring cross tile memory accesses from a local GPU to one or more remote GPUs in the multi-GPU configuration; determining, with a memory controller, whether frequent cross tile memory accesses occur from a local GPU to one or more remote GPUs in the multi-GPU configuration; and sending a message to initiate the data transfer mechanism between a memory of the local GPU and a remote memory of a remote GPU when frequent cross tile memory accesses occur from the local GPU to the remote memory of the remote GPU in the multi-GPU configuration, wherein the data transfer mechanism to transfer or copy the data that is being accessed frequently by the local GPU to the memory of the local GPU and to local memory of at least one other GPU.
14. The computer-implemented method of claim 13, further comprising: receiving, with a graphics driver, the message from the memory controller and to provide the data transfer mechanism in response to receiving the message.
15. The computer-implemented method of claim 13, wherein the data transfer mechanism accesses a page table to provide a translation of virtual addresses to physical addresses.
16. The computer-implemented method of claim 13, wherein the data transfer mechanism to transfer or copy the data that is being accessed frequently by the local GPU to multiple tiles or GPUs to enable split frame rendering with a first GPU handling rendering for a first portion of a display and a second GPU handling rendering for a second different portion of the display.
17. The computer-implemented method of claim 13, further comprising: performing a page allocation to the memory of the local GPU when a first access to a page in a remote GPU memory occurs.
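Claims 1 and 7 recite a look-ahead behavior: on an access to page N of remote memory, transfers of pages N+1 and N+2 start before those pages are requested. The following is a minimal sketch of that behavior under stated assumptions. The class name `PagePrefetcher`, the `lookahead` parameter, and the use of a set to model "pages already transferred to local memory" are hypothetical; the claims do not describe the controller's data structures, and whether page N itself is migrated is left out of the model.

```python
class PagePrefetcher:
    """Toy model of look-ahead page transfer from a remote GPU memory.

    Hypothetical sketch: names and data structures are assumptions.
    """

    def __init__(self, lookahead=2):
        self.lookahead = lookahead  # 2 matches "pages N+1 and N+2" in the claims
        self.local_pages = set()    # pages already transferred to local memory
        self.transfers = []         # order in which transfers were started

    def access(self, page):
        """Access page N; return True if it was already local (a prefetch hit)."""
        hit = page in self.local_pages
        # Start transferring the following pages before they are requested.
        for nxt in range(page + 1, page + self.lookahead + 1):
            if nxt not in self.local_pages:
                self.local_pages.add(nxt)
                self.transfers.append(nxt)
        return hit


prefetcher = PagePrefetcher()
first = prefetcher.access(10)   # page 10 is remote; pages 11 and 12 start moving
second = prefetcher.access(11)  # page 11 is already local; page 13 starts moving
```

For a sequential scan, only the first access in this model goes to remote memory; every later page has already been queued by an earlier access.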
Page size control · CPC title
Details relating to cache mapping · CPC title
Prefetching based on hints or prefetch instructions · CPC title
Prefetching based on access pattern detection, e.g. stride based prefetch · CPC title
Reconfiguration of cache memory · CPC title