Data distribution fabric in scalable GPUs

US10580109B2 · US · B2

Patent metadata
Publication number: US-10580109-B2
Application number: US-201916417899-A
Country: US
Kind code: B2
Filing date: May 21, 2019
Priority date: Jun 30, 2014
Publication date: Mar 3, 2020
Grant date: Mar 3, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides for a processor comprising a three-dimensional (3D) integrated circuit stack including multiple graphics processor cores and interconnect logic to interconnect the graphics processor cores of the 3D integrated circuit stack to enable data distribution between the graphics processor cores over a virtual channel including multiple programmatically pre-assigned traffic classifications.
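The idea of a virtual channel that carries several pre-assigned traffic classifications can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the `VirtualChannel` class, the class names `"display"` and `"compute"`, and the convention that a lower number means higher priority are all hypothetical. The per-class priority follows the wording of claims 7 and 8, where priority is relative to other classes in the same virtual channel.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VirtualChannel:
    """Toy model of one virtual channel with pre-assigned traffic classes."""
    name: str
    # traffic class -> priority, configured before any traffic flows.
    # Assumption of this sketch: a lower number wins arbitration.
    class_priority: dict
    queues: dict = field(init=False)

    def __post_init__(self):
        # One FIFO per pre-assigned traffic class.
        self.queues = {tc: deque() for tc in self.class_priority}

    def enqueue(self, traffic_class, payload):
        # Only classes assigned to this channel up front are accepted.
        if traffic_class not in self.class_priority:
            raise ValueError(f"{traffic_class!r} is not assigned to {self.name}")
        self.queues[traffic_class].append(payload)

    def dequeue(self):
        """Return (class, payload) from the highest-priority non-empty class."""
        for tc in sorted(self.class_priority, key=self.class_priority.get):
            if self.queues[tc]:
                return tc, self.queues[tc].popleft()
        return None


# Hypothetical configuration: two classes share the channel "vc0".
vc = VirtualChannel("vc0", {"display": 0, "compute": 1})
vc.enqueue("compute", "c0")
vc.enqueue("display", "d0")
vc.enqueue("compute", "c1")
order = [vc.dequeue() for _ in range(3)]
print(order)
```

Messages within one class keep their arrival order, which loosely mirrors claim 4's requirement that traffic from a single execution thread stays within a single traffic classification.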

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: a three-dimensional (3D) integrated circuit stack including multiple graphics processor cores; and interconnect logic to interconnect the graphics processor cores of the 3D integrated circuit stack to enable data distribution between the graphics processor cores over a virtual channel including multiple programmatically pre-assigned traffic classifications.
2. The processor of claim 1, wherein the virtual channel is transmitted over at least one physical data channel, the interconnect logic includes multiple data channels, and each of the multiple data channels is a separately clock gated bus.
3. The processor of claim 2, wherein each bus is to use early indications to signal incoming activity.
4. The processor of claim 1, wherein the interconnect logic is to couple the graphics processor cores to a shared resource and enable coherent access to the shared resource for execution threads of the graphics processor cores, wherein to enable coherent access to the shared resource the interconnect logic is to route traffic that originates from a single execution thread of the graphics processor cores within a single traffic classification.
5. The processor of claim 4, wherein the shared resource is a shared memory or a shared cache.
6. The processor of claim 5, wherein the interconnect logic is to enable the data distribution over multiple virtual channels, the multiple virtual channels including the virtual channel and one or more additional channels.
7. The processor of claim 6, wherein the multiple virtual channels are to be arbitrated based on a programmable priority system, at least one virtual channel is to be assigned multiple traffic classifications, and each of the multiple traffic classifications has a programmable priority.
8. The processor of claim 7, wherein the programmable priority is relative to traffic classifications within a same virtual channel of the multiple virtual channels.
9. The processor of claim 1, wherein the interconnect logic operates at a higher frequency than the graphics processor cores.
10. The processor of claim 1, additionally including a set of interconnect nodes to couple the graphics processor cores with the interconnect logic, wherein the interconnect logic includes multiple data channels and the set of interconnect nodes is configured to switch data between the multiple data channels when transiting one of the graphics processor cores.
11. A graphics processor device comprising: a system interface bus; a processor including a three-dimensional (3D) circuit stack including a plurality of graphics processor cores coupled via interconnect logic having at least one clock gated physical data channel and a set of virtual channels including one or more virtual channels, the one or more virtual channels having multiple programmatically pre-assigned traffic classifications; and memory coupled to the interconnect logic and at least one graphics processor core of the plurality of graphics processor cores, the memory to store data for the at least one graphics processor core before transmission via the interconnect logic.
12. The graphics processor device as in claim 11, wherein the plurality of graphics processor cores couple with a shared resource on the processor via the interconnect logic, wherein the shared resource on the processor is a shared memory resource.
13. The graphics processor device as in claim 12, wherein the shared memory resource includes a shared cache memory.
14. The graphics processor device of claim 11, wherein the set of virtual channels includes multiple virtual channels, the multiple virtual channels in the set of virtual channels are to be arbitrated based on a programmable priority system, and each of the multiple programmable traffic classifications is prioritized relative to other traffic classes assigned to a same virtual channel of the multiple virtual channels.
15. The graphics processor device of claim 11, wherein the graphics processor device is a graphics processor card.
16. A method comprising: determining a channel access status on a multiple node shared bus for a message from a source node to a target node, wherein at least one node of the multiple node shared bus couples with a graphics processor core of an integrated circuit and at least one node of the multiple node shared bus couples with a shared resource on the integrated circuit; transmitting a message from the source node to the target node over a first data channel, wherein the message is associated with a first traffic classification having a first priority; receiving the message at a first data bus connector coupled with the graphics processor core; and based on at least the source node and the target node, switching the message from the first data channel to a second data channel.
17. The method of claim 16, additionally including determining that the message is associated with the first traffic classification and switching the message from the first data channel to the second data channel based at least on the first traffic classification.
18. The method of claim 16, wherein determining the channel access status comprises: determining, using a channel access protocol, if a third data channel is available to transmit the message from the source node to the target node; and after determining that transmission over the third data channel is blocked, transmitting the message over the first data channel.
19. The method of claim 18, wherein the first, second, and third data channel are virtual data channels.
20. The method of claim 19, wherein the channel access protocol is a time division multiple access protocol or a carrier sense multiple access protocol.
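The method of claims 16 and 18 can be walked through with a short software sketch. It is a simplified illustration under stated assumptions, not the claimed hardware: the `Message` type, the functions `choose_channel` and `switch_at_node`, the channel names, and the `(source, target)` switch table are all hypothetical. A plain set of busy channels stands in for the channel access protocol, which claim 20 says may be time division or carrier sense multiple access.

```python
from dataclasses import dataclass


@dataclass
class Message:
    source: int          # source node id
    target: int          # target node id
    traffic_class: str   # traffic classification tag
    channel: str = ""    # data channel currently carrying the message


def choose_channel(busy, preferred, fallback):
    """Pick the preferred channel unless it is blocked (claim 18 pattern).

    `busy` is a set of blocked channel names; it is a stand-in for a real
    channel access protocol and is an assumption of this sketch.
    """
    return fallback if preferred in busy else preferred


def switch_at_node(msg, switch_table):
    """Switch channels at a transit node based on (source, target).

    `switch_table` maps (source, target) pairs to a new channel name.
    Messages without an entry stay on their current channel.
    """
    new_channel = switch_table.get((msg.source, msg.target))
    if new_channel is not None:
        msg.channel = new_channel
    return msg


# The "third" channel is blocked, so the message goes out on the "first".
msg = Message(source=0, target=2, traffic_class="compute")
msg.channel = choose_channel(busy={"ch3"}, preferred="ch3", fallback="ch1")
sent_on = msg.channel

# At the data bus connector of a transit core, move it to the "second".
switch_at_node(msg, {(0, 2): "ch2"})
print(sent_on, "->", msg.channel)
```

Claim 17 additionally conditions the switch on the traffic classification; in this model that would amount to keying the table on `(source, target, traffic_class)` instead.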

Assignees

Intel Corp

Inventors

Classifications

  • using a plugboard for programming · CPC title

  • Memory management · CPC title

  • Volume rendering · CPC title

  • G06T1/20 (Primary)

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Integrated on microchip, e.g. switch-on-chip · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10580109B2 cover?
One embodiment provides for a processor comprising a three-dimensional (3D) integrated circuit stack including multiple graphics processor cores and interconnect logic to interconnect the graphics processor cores of the 3D integrated circuit stack to enable data distribution between the graphics processor cores over a virtual channel including multiple programmatically pre-assigned traffic clas…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 3, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).