Multi-tile memory management for detecting cross tile access providing multi-tile inference scaling and providing page migration

US11995029B2 · US · B2

Patent metadata
Publication number: US-11995029-B2
Application number: US-202017428527-A
Country: US
Kind code: B2
Filing date: Mar 14, 2020
Priority date: Mar 15, 2019
Publication date: May 28, 2024
Grant date: May 28, 2024

Abstract

Official abstract text for this publication.

Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU.
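
The detection step in the abstract (count cross tile accesses, decide when they are frequent, then send a message to initiate a data transfer) can be sketched as a small software model. This is only an illustration under stated assumptions: the class name `CrossTileMonitor`, the per-page counting granularity, the `threshold` value, and the message fields are all hypothetical, and the patent does not specify any of them.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CrossTileMonitor:
    """Toy model of one GPU's counter for accesses to remote memory.

    Hypothetical sketch: names, threshold, and message layout are
    assumptions made for illustration, not taken from the patent.
    """
    local_gpu: int
    threshold: int = 4  # assumed cutoff for "frequent"; no value is disclosed
    counts: Counter = field(default_factory=Counter)
    messages: list = field(default_factory=list)

    def record_access(self, owner_gpu: int, page: int) -> None:
        """Count one access; queue a transfer message once it becomes frequent."""
        if owner_gpu == self.local_gpu:
            return  # local accesses are not cross tile accesses
        key = (owner_gpu, page)
        self.counts[key] += 1
        # Emit the message exactly once, when the count first reaches the cutoff.
        if self.counts[key] == self.threshold:
            self.messages.append({
                "type": "initiate_transfer",
                "src_gpu": owner_gpu,
                "dst_gpu": self.local_gpu,
                "page": page,
            })


monitor = CrossTileMonitor(local_gpu=0, threshold=3)
for _ in range(5):
    monitor.record_access(owner_gpu=1, page=42)  # remote page, accessed often
monitor.record_access(owner_gpu=0, page=7)       # local page, ignored
```

In this model the consumer of `messages` stands in for whatever component performs the transfer; claim 14 describes a graphics driver receiving the message from the memory controller.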

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics processor having a multi-tile architecture, comprising: a first graphics processing unit (GPU) having a memory and a memory controller; a second graphics processing unit (GPU) having a memory; and a cross-GPU fabric to communicatively couple the first and second GPUs, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the first GPU and the second GPU in the multi-GPU configuration and to cause initiation of a data transfer between the memory of the first GPU and the memory of the second GPU when frequent cross tile memory accesses occur between the first GPU and the second GPU, wherein the memory controller is configured to detect transfer patterns automatically including accesses to page N of the memory of the second GPU and to start transferring pages N+1 and N+2 prior to requests for pages N+1 and N+2.

2. The graphics processor of claim 1, further comprising: a hardware counter to count cross tile memory accesses between the first GPU and the second GPU.

3. The graphics processor of claim 2, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the first GPU and the second GPU in the multi-GPU configuration using data from the hardware counter.

4. The graphics processor of claim 3, wherein the memory controller is configured to cause data that is being accessed frequently by the second GPU to be transferred or copied to the memory of the second GPU.

5. The graphics processor of claim 1, wherein the memory controller is configured to cause data that is being accessed frequently by the first GPU to be transferred or copied to the memory of the first GPU.

6. The graphics processor of claim 1, wherein the memory controller is configured to detect transfer patterns automatically including accesses between the first and second GPUs.

7. A graphics processing unit (GPU) of a multi-GPU architecture, comprising: processing resources to perform graphics operations; a memory; and a memory controller, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the GPU and a remote memory of a remote GPU in the multi-GPU configuration and to cause initiation of a data transfer between the memory of the GPU and the remote memory of the remote GPU when frequent cross tile memory accesses occur between the GPU and the remote memory of the remote GPU, wherein the memory controller is configured to detect transfer patterns automatically including accesses to page N of the remote memory and to start transferring pages N+1 and N+2 prior to requests for pages N+1 and N+2.

8. The GPU of claim 7, further comprising: a hardware counter to count cross tile memory accesses from the GPU to the remote memory of the remote GPU.

9. The GPU of claim 8, wherein the memory controller is configured to determine whether frequent cross tile memory accesses occur between the GPU and the remote memory of the remote GPU in the multi-GPU configuration using data from the hardware counter.

10. The GPU of claim 9, wherein the memory controller is configured to cause data that is being accessed frequently by the remote GPU to be transferred or copied to the remote memory.

11. The GPU of claim 7, wherein the memory controller is configured to cause data that is being accessed frequently by the GPU to be transferred or copied to the memory of the GPU.

12. The GPU of claim 7, wherein the memory controller is configured to detect transfer patterns automatically between the GPU and the remote GPU.

13. A computer-implemented method to provide a data transfer mechanism for a multiple GPU configuration, the computer-implemented method comprises: monitoring cross tile memory accesses from a local GPU to one or more remote GPUs in the multi-GPU configuration; determining, with a memory controller, whether frequent cross tile memory accesses occur from a local GPU to one or more remote GPUs in the multi-GPU configuration; and sending a message to initiate the data transfer mechanism between a memory of the local GPU and a remote memory of a remote GPU when frequent cross tile memory accesses occur from the local GPU to the remote memory of the remote GPU in the multi-GPU configuration, wherein the data transfer mechanism to transfer or copy the data that is being accessed frequently by the local GPU to the memory of the local GPU and to local memory of at least one other GPU.

14. The computer-implemented method of claim 13, further comprising: receiving, with a graphics driver, the message from the memory controller and to provide the data transfer mechanism in response to receiving the message.

15. The computer-implemented method of claim 13, wherein the data transfer mechanism accesses a page table to provide a translation of virtual addresses to physical addresses.

16. The computer-implemented method of claim 13, wherein the data transfer mechanism to transfer or copy the data that is being accessed frequently by the local GPU to multiple tiles or GPUs to enable split frame rendering with a first GPU handling rendering for a first portion of a display and a second GPU handling rendering for a second different portion of the display.

17. The computer-implemented method of claim 13, further comprising: performing a page allocation to the memory of the local GPU when a first access to a page in a remote GPU memory occurs.
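
Claims 1 and 7 recite starting the transfer of pages N+1 and N+2 before they are requested, once accesses to page N of the remote memory are detected. The following is a minimal sketch of that look-ahead behavior, assuming a simplification in which every access to page N triggers the prefetch. The class `SequentialPrefetcher`, its `lookahead` parameter, and the `transfers` log are hypothetical names introduced here; the patent does not describe an implementation at this level.

```python
class SequentialPrefetcher:
    """Toy model of look-ahead page transfer from remote to local memory.

    Hypothetical sketch: a real memory controller would first detect the
    access pattern; here every access to page N triggers the look-ahead.
    """

    def __init__(self, lookahead: int = 2):
        self.lookahead = lookahead      # 2 matches pages N+1 and N+2 in the claims
        self.local_pages = set()        # pages already resident in local memory
        self.transfers = []             # ordered log of pages pulled from remote memory

    def _transfer(self, page: int) -> None:
        """Record a remote-to-local page transfer."""
        self.local_pages.add(page)
        self.transfers.append(page)

    def access(self, page: int) -> bool:
        """Access a page; return True if it was already local (a hit)."""
        hit = page in self.local_pages
        if not hit:
            self._transfer(page)        # demand transfer of page N
        # Start moving the following pages before anyone asks for them.
        for ahead in range(1, self.lookahead + 1):
            nxt = page + ahead
            if nxt not in self.local_pages:
                self._transfer(nxt)
        return hit


prefetcher = SequentialPrefetcher()
first = prefetcher.access(10)   # miss: pages 10, 11, 12 are transferred
second = prefetcher.access(11)  # hit: page 11 was prefetched; page 13 is fetched next
```

With a sequential access stream, only the first access misses in this model; each later access finds its page already local, which is the latency benefit the look-ahead is meant to provide.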

Assignees

Intel Corp

Inventors

Classifications

  • Page size control

  • Details relating to cache mapping

  • Prefetching based on hints or prefetch instructions

  • Prefetching based on access pattern detection, e.g. stride based prefetch

  • Reconfiguration of cache memory

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11995029B2 cover?
Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) …
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date May 28, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).