Data distribution fabric in scalable GPUs

US10346946B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10346946-B2
Application number  US-201816039509-A
Country             US
Kind code           B2
Filing date         Jul 19, 2018
Priority date       Jun 30, 2014
Publication date    Jul 9, 2019
Grant date          Jul 9, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a hybrid fabric interconnects multiple graphics processor cores within a processor. The hybrid fabric interconnect includes multiple data channels, including programmable virtual data channels. The virtual data channels carry multiple traffic classes of packet-based messages. The virtual data channels and multiple traffic classes may be assigned one of multiple priorities. The virtual data channels may be arbitrated independently. The hybrid fabric is scalable and can support multiple topologies, including multiple stacked integrated circuit topologies.
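
The channel model in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented design: the names `Packet`, `VirtualChannel`, and `FabricPort` are hypothetical, and the strict "highest priority first" arbitration rule is an assumption chosen for brevity. The abstract only states that channels and traffic classes may be assigned priorities and that channels may be arbitrated independently.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    traffic_class: int  # hypothetical traffic-class ID carried by the message
    payload: str


@dataclass
class VirtualChannel:
    priority: int  # assumption: a larger number means higher priority
    queue: deque = field(default_factory=deque)


class FabricPort:
    """Toy model: several traffic classes share each virtual channel."""

    def __init__(self, class_to_channel, channel_priorities):
        # class_to_channel: traffic-class ID -> virtual-channel ID (programmable map)
        # channel_priorities: virtual-channel ID -> priority
        self.class_to_channel = dict(class_to_channel)
        self.channels = {
            cid: VirtualChannel(priority=p) for cid, p in channel_priorities.items()
        }

    def send(self, packet):
        # Place the packet on the virtual channel its traffic class maps to.
        cid = self.class_to_channel[packet.traffic_class]
        self.channels[cid].queue.append(packet)

    def arbitrate(self):
        # Pick the non-empty channel with the highest priority (assumed policy)
        # and return its oldest packet, or None if every channel is empty.
        ready = [(vc.priority, cid) for cid, vc in self.channels.items() if vc.queue]
        if not ready:
            return None
        _, cid = max(ready)
        return self.channels[cid].queue.popleft()


port = FabricPort(class_to_channel={0: 0, 1: 0, 2: 1}, channel_priorities={0: 1, 1: 5})
port.send(Packet(0, "a"))
port.send(Packet(2, "b"))
port.send(Packet(1, "c"))
order = [port.arbitrate().payload for _ in range(3)]
```

In this sketch, traffic classes 0 and 1 share virtual channel 0, while class 2 travels on the higher-priority channel 1, so its packet is delivered first even though it was sent second. A real fabric would likely use a fairer policy (for example weighted round-robin) to avoid starving low-priority channels.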

First claim

Opening claim text (preview).

What is claimed is:

1. A heterogeneous three-dimensional circuit stack comprising: a central processing unit (CPU); a graphics processing unit (GPU) stacked with the CPU, the GPU communicatively coupled with the CPU through one or more through-silicon-vias (TSVs); and a fabric interconnect including interconnect logic to communicatively couple the CPU and the GPU with a shared resource, wherein the interconnect logic is to enable coherent access to the shared resource for execution threads of the GPU, wherein to enable coherent access to the shared resource, the interconnect logic is to route traffic that originates from a single execution thread of the GPU within a single traffic class.

2. The heterogeneous three-dimensional circuit stack as in claim 1, wherein the fabric interconnect additionally includes bandwidth sharing logic to adjust bandwidth to the shared resource.

3. The heterogeneous three-dimensional circuit stack as in claim 2, the bandwidth sharing logic to adjust bandwidth to the shared resource over a virtual channel including multiple programmatically pre-assigned traffic classifications.

4. The heterogeneous three-dimensional circuit stack as in claim 1, wherein the shared resource includes memory to cache data to be received via the fabric interconnect.

5. The heterogeneous three-dimensional circuit stack as in claim 1, wherein the shared resource is a shared memory resource including dynamic random-access memory.

6. The heterogeneous three-dimensional circuit stack as in claim 5, wherein the fabric interconnect is a multi-channel fabric interconnect.

7. The heterogeneous three-dimensional circuit stack as in claim 5, wherein the shared memory resource includes non-volatile memory.

8. The heterogeneous three-dimensional circuit stack as in claim 1, wherein the interconnect logic is to operate at a higher frequency than one or more of the CPU and the GPU.

9. A method of interconnecting a heterogeneous three-dimensional circuit stack, the method comprising: communicatively coupling a central processing unit (CPU) to a graphics processing unit (GPU) through one or more through-silicon-vias (TSVs), the CPU vertically stacked with the GPU in the heterogeneous three-dimensional circuit stack, the CPU and the GPU communicatively coupled to a shared resource via a fabric interconnect, the CPU and the GPU coupled with the fabric interconnect via corresponding on-chip interconnects in each of the CPU and the GPU, and the fabric interconnect includes interconnect logic to enable coherent access to the shared resource; and enabling coherent access to the shared resource for execution threads of the GPU via programmatically pre-assigned traffic classifications, wherein coherent access to the shared resource is enabled by routing traffic originating from a single execution thread of the GPU within a single traffic class.

10. The method as in claim 9, additionally comprising configuring bandwidth sharing logic to adjust bandwidth to the shared resource.

11. The method as in claim 10, additionally comprising configuring memory to cache data received via the interconnect logic.

12. The method as in claim 9, additionally comprising communicatively coupling an additional processor to the interconnect logic, the additional processor including an accelerator or an additional GPU, wherein the additional processor is vertically stacked with the GPU.

13. A system comprising: a heterogeneous three-dimensional circuit stack including a central processing unit (CPU) vertically stacked and communicatively coupled with a graphics processing unit (GPU) through one or more through-silicon-vias (TSVs); and a fabric interconnect including interconnect logic to communicatively couple the CPU and the GPU with a shared resource, wherein the interconnect logic is to enable coherent access to the shared resource for execution threads of the GPU, wherein to enable coherent access to the shared resource, the interconnect logic is to route traffic that originates from a single execution thread of the GPU within a single traffic class.

14. The system as in claim 13, wherein the fabric interconnect additionally includes bandwidth sharing logic to adjust bandwidth to the shared resource.

15. The system as in claim 14, the bandwidth sharing logic to adjust bandwidth to the shared resource over a virtual channel including multiple programmatically pre-assigned traffic classifications.

16. The system as in claim 13, wherein the shared resource includes memory to cache data to be received via the fabric interconnect.

17. The system as in claim 13, wherein the shared resource is a shared memory resource including dynamic random-access memory.

18. The system as in claim 17, wherein the fabric interconnect is a multi-channel fabric interconnect.

19. The system as in claim 17, wherein the shared memory resource includes non-volatile memory.

20. The system as in claim 13, wherein the interconnect logic is to operate at a higher frequency than one or more of the CPU and the GPU.
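
The routing rule shared by the independent claims (1, 9, and 13), that traffic from a single GPU execution thread stays within a single traffic class, can be sketched as follows. This is an illustration under stated assumptions, not the claimed interconnect logic: `ThreadClassRouter` and its methods are hypothetical names, the modulo fallback for unassigned threads is invented for the example, and modeling each class as a FIFO queue (so that one thread's messages keep their issue order) is one possible reading of why the rule helps coherent access.

```python
from collections import defaultdict, deque


class ThreadClassRouter:
    """Toy router: every message from a given thread uses one traffic class."""

    def __init__(self, num_classes):
        if num_classes < 1:
            raise ValueError("num_classes must be at least 1")
        self.num_classes = num_classes
        self.assigned = {}                 # thread ID -> pre-assigned traffic class
        self.queues = defaultdict(deque)   # traffic class -> FIFO of (thread, message)

    def assign(self, thread_id, traffic_class):
        # Programmatic pre-assignment of a thread to a traffic class.
        if not 0 <= traffic_class < self.num_classes:
            raise ValueError("traffic_class out of range")
        self.assigned[thread_id] = traffic_class

    def class_for(self, thread_id):
        # Assumption: threads without an explicit assignment fall back to a
        # fixed modulo mapping, so the class never changes between messages.
        return self.assigned.setdefault(thread_id, thread_id % self.num_classes)

    def route(self, thread_id, message):
        tc = self.class_for(thread_id)
        self.queues[tc].append((thread_id, message))
        return tc

    def drain(self, traffic_class):
        # Deliver the queued messages of one class in arrival order.
        out = list(self.queues[traffic_class])
        self.queues[traffic_class].clear()
        return out


router = ThreadClassRouter(num_classes=2)
router.assign(7, 0)
classes_for_thread_7 = {router.route(7, "write A"), router.route(7, "write B")}
router.route(9, "read C")
```

Because thread 7 always maps to class 0, its two writes sit in the same FIFO and are drained in the order they were issued, while thread 9's read travels in class 1 and cannot be interleaved between them within that queue.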

Assignees

Inventors

Classifications

  • Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title

  • General purpose rendering architectures · CPC title

  • involving image processing hardware · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Memory management · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10346946B2 cover?
In one embodiment, a hybrid fabric interconnects multiple graphics processor cores within a processor. The hybrid fabric interconnect includes multiple data channels, including programmable virtual data channels. The virtual data channels carry multiple traffic classes of packet-based messages. The virtual data channels and multiple traffic classes may be assigned one of multiple priorities. The…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 9, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).