Method and system for work scheduling in a multi-chip system
US-2015254104-A1 · Sep 10, 2015 · US
US12561276B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12561276-B2 |
| Application number | US-202017428534-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 14, 2020 |
| Priority date | Mar 15, 2019 |
| Publication date | Feb 24, 2026 |
| Grant date | Feb 24, 2026 |
Official abstract text for this publication.
Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a first memory, a first memory side cache memory, a first communication fabric, and a first memory management unit (MMU). The graphics processor includes a second graphics processing unit (GPU) having a second memory, a second memory side cache memory, a second memory management unit (MMU), and a second communication fabric that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.
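The ownership check described in the abstract can be illustrated with a small model. This is a minimal sketch, not the patented implementation: the names `Gpu`, `owner_of`, and `mmu_write` are hypothetical, and the page-interleaved ownership rule and the 4 KiB granularity are assumptions chosen only to make the example concrete. The abstract does not say how ownership is assigned.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed interleave granularity; not stated in the abstract


@dataclass
class Gpu:
    """Toy model of one GPU with its memory and memory side cache."""
    gpu_id: int
    memory: dict = field(default_factory=dict)             # physical address -> content
    memory_side_cache: dict = field(default_factory=dict)  # physical address -> content


def owner_of(phys_addr: int, num_gpus: int) -> int:
    """Hypothetical ownership rule: pages are interleaved round-robin across GPUs."""
    return (phys_addr // PAGE_SIZE) % num_gpus


def mmu_write(gpus: list, requester_id: int, phys_addr: int, content) -> str:
    """Model the requesting GPU's MMU handling a write.

    If the requester owns the address, update its own memory and its own
    memory side cache. Otherwise update the owning GPU's memory side cache,
    standing in for a transfer over the cross-GPU fabric.
    """
    owner_id = owner_of(phys_addr, len(gpus))
    owner = gpus[owner_id]
    if owner_id == requester_id:
        owner.memory[phys_addr] = content
        owner.memory_side_cache[phys_addr] = content
        return "local"
    # Remote case: only the remote memory side cache is touched here;
    # writing back to the remote memory is left to a later synchronization.
    owner.memory_side_cache[phys_addr] = content
    return "remote"


gpus = [Gpu(0), Gpu(1)]
print(mmu_write(gpus, 0, 0x0000, "a"))  # page 0 is owned by GPU 0 under the assumed rule
print(mmu_write(gpus, 0, 0x1000, "b"))  # page 1 is owned by GPU 1 under the assumed rule
```

In this model the requesting MMU makes the local-versus-remote decision before any data moves, which mirrors the abstract's statement that the first MMU determines whether to update the second memory side cache.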
Opening claim text (preview).
What is claimed is:

1. A graphics processing unit (GPU) in a multi-GPU architecture, the GPU comprising: processing resources to perform graphics operations; a first memory side cache that is adjacent to a memory associated with a first side of a cross-GPU communication fabric; and memory management unit (MMU) circuitry associated with a second side of the cross-GPU communication fabric, wherein the MMU circuitry is configured to control memory requests for the memory, to request content from an address in the memory, to update the content in the first memory side cache, and to determine whether to update the content in a second memory side cache associated with another GPU, wherein the MMU circuitry is further to determine ownership of physical addresses based on a distributed ownership scheme across multiple GPUs, wherein the cross-GPU communication fabric includes a cross-GPU switch for facilitating communications between the GPU and another GPU, wherein the cross-GPU communication fabric further includes dedicated GPU-to-GPU links that bypass system memory for inter-GPU cache coherency operations, wherein the content is updated in the first memory side cache when the MMU circuitry determines the GPU owns a requested physical address that is associated with the content.

2. The graphics processing unit (GPU) of claim 1, wherein the update associated with the first memory side cache or the second memory side cache is tracked using a fence.

3. The graphics processing unit (GPU) of claim 1, wherein the MMU circuitry is further configured to determine when to synchronize the second memory side cache with the memory based on an ordering hint.

4. The graphics processing unit (GPU) of claim 1, wherein the content is updated in the second memory side cache when the MMU circuitry determines that the another GPU owns the requested physical address that is associated with the content.

5. A system having multiple graphics processing units (GPUs), the system comprising: a first GPU in communication with a second GPU, the first GPU comprising: processing resources to perform graphics operations; a first memory side cache that is adjacent to a memory associated with a first side of a cross-GPU communication fabric; and memory management unit (MMU) circuitry associated with a second side of the cross-GPU communication fabric, wherein the MMU circuitry is configured to control memory requests for the memory, to request content from an address in the memory, to update the content in the first memory side cache, and to determine whether to update the content in a second memory side cache associated with the second GPU, wherein the MMU circuitry is further to determine ownership of physical addresses based on a distributed ownership scheme across multiple GPUs, wherein the cross-GPU communication fabric includes a cross-GPU switch for facilitating communications between the first GPU and the second GPU, wherein the cross-GPU communication fabric further includes dedicated GPU-to-GPU links that bypass system memory for inter-GPU cache coherency operations, wherein the content is updated in the first memory side cache when the MMU circuitry determines the first GPU owns a requested physical address that is associated with the content.

6. The system of claim 5, wherein the update associated with the first memory side cache or a second memory side cache is tracked using a fence.

7. The system of claim 5, wherein the MMU circuitry is further configured to determine when to synchronize the second memory side cache with the memory based on an ordering hint.

8. The system of claim 5, wherein the content is updated in the second memory side cache when the MMU circuitry determines the second GPU owns the requested physical address that is associated with the content.

9. A method comprising: a first memory side cache that is adjacent to a memory associated with a first side of a cross-graphics processing unit (GPU) communication fabric associated with a first GPU; and controlling memory requests for a memory, and requesting content from an address in the memory, and updating the content in a first memory side cache, and determining whether to update the content in a second memory side cache associated with a second GPU, wherein the MMU circuitry is further to determine ownership of physical addresses based on a distributed ownership scheme across multiple GPUs, wherein the cross-GPU communication fabric includes a cross-GPU switch for facilitating communications between the first GPU and the second GPU, wherein the cross-GPU communication fabric further includes dedicated GPU-to-GPU links that bypass system memory for inter-GPU cache coherency operations, wherein the content is updated in the first memory side cache when the MMU circuitry determines the GPU owns a requested physical address associated with the content.

10. The method of claim 9, wherein the update associated with the first memory side cache or the second memory side cache is tracked using a fence.

11. The method of claim 9, further comprising determining when to synchronize the second memory side cache with the memory based on an ordering hint.
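The dependent claims mention tracking cache updates with a fence and using an ordering hint to decide when to synchronize a memory side cache with its memory. The sketch below is one possible reading, offered only as an illustration: the class `RemoteCacheTracker`, the counter-based fence, and the hint values `"relaxed"` and `"strict"` are all hypothetical and do not come from the claim text, which does not define how the fence or the hint is encoded.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteCacheTracker:
    """Hypothetical tracker for updates sent to another GPU's memory side cache."""
    cache: dict = field(default_factory=dict)   # remote memory side cache contents
    memory: dict = field(default_factory=dict)  # memory behind that cache
    dirty: set = field(default_factory=set)     # addresses not yet written back
    issued: int = 0                             # updates sent so far
    acked: int = 0                              # updates acknowledged so far

    def update(self, addr: int, content, ordering_hint: str = "relaxed") -> None:
        """Record an update; an assumed "strict" hint forces an immediate write-back."""
        self.cache[addr] = content
        self.dirty.add(addr)
        self.issued += 1
        if ordering_hint == "strict":
            self.synchronize()

    def acknowledge(self, count: int = 1) -> None:
        """Model acknowledgements arriving from the remote GPU."""
        self.acked = min(self.issued, self.acked + count)

    def insert_fence(self) -> int:
        """Return a fence value covering every update issued so far."""
        return self.issued

    def fence_done(self, fence: int) -> bool:
        """A fence is complete once all updates issued before it are acknowledged."""
        return self.acked >= fence

    def synchronize(self) -> None:
        """Write every dirty cache entry back to the memory behind the cache."""
        for addr in sorted(self.dirty):
            self.memory[addr] = self.cache[addr]
        self.dirty.clear()


tracker = RemoteCacheTracker()
tracker.update(0x10, "a")
tracker.update(0x20, "b")
fence = tracker.insert_fence()
tracker.acknowledge(2)
print(tracker.fence_done(fence))
```

The monotonic counter keeps the fence check to a single comparison. A real design would also have to handle counter wraparound and out-of-order acknowledgements, which this model ignores.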
by reordering requests · CPC title
Ray-tracing · CPC title
Loop control instructions; iterative instructions, e.g. LOOP, REPEAT · CPC title
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title
LOAD or STORE instructions; Clear instruction · CPC title