United States · Graphics processor techniques with split between workload distribution control data on shared control bus and corresponding graphics data on memory interfaces

US11847489B2 · US · B2

Patent metadata
Field · Value
Publication number · US-11847489-B2
Application number · US-202117158943-A
Country · US
Kind code · B2
Filing date · Jan 26, 2021
Priority date · Jan 26, 2021
Publication date · Dec 19, 2023
Grant date · Dec 19, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed relating to a shared control bus for communicating between primary control circuitry and multiple distributed graphics processor units. In some embodiments, a set of multiple processor units includes first and second graphics processors, where the first and second graphics processors are coupled to access graphics data via respective memory interfaces. A shared workload distribution bus is used to transmit control data that specifies graphics work distribution to the multiple graphics processing units. The shared workload distribution bus may be arranged in a chain topology, e.g., to connect the workload distribution circuitry to the first graphics processor and connect the first graphics processor to the second graphics processor such that the workload distribution circuitry communicates with the second graphics processor via the shared workload distribution bus connection to the first graphics processor. Disclosed techniques may facilitate graphics work distribution for a scalable number of processors.
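The chain topology described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the names `ControlPacket`, `BusNode`, `accept`, and `build_chain` are hypothetical, and the model only shows the routing idea, namely that control data for the second graphics processor passes through the first processor's bus connection.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ControlPacket:
    """Hypothetical control packet: target GPU plus the workgroups assigned to it."""
    target_gpu: int
    workgroups: List[int]


@dataclass
class BusNode:
    """One graphics processor's node on the chained bus (illustrative only)."""
    gpu_id: int
    next_node: Optional["BusNode"] = None
    received: List[ControlPacket] = field(default_factory=list)
    forwarded: int = 0

    def accept(self, packet: ControlPacket) -> None:
        # Keep packets addressed to this GPU; pass everything else down the chain.
        if packet.target_gpu == self.gpu_id:
            self.received.append(packet)
        elif self.next_node is not None:
            self.forwarded += 1
            self.next_node.accept(packet)
        else:
            raise ValueError(f"no GPU {packet.target_gpu} on this chain")


def build_chain(num_gpus: int) -> BusNode:
    """Link nodes 0..num_gpus-1 in series and return the node nearest the distributor."""
    nodes = [BusNode(gpu_id=i) for i in range(num_gpus)]
    for upstream, downstream in zip(nodes, nodes[1:]):
        upstream.next_node = downstream
    return nodes[0]


# The workload distributor only talks to the head of the chain.
head = build_chain(2)
head.accept(ControlPacket(target_gpu=1, workgroups=[4, 5, 6]))
second = head.next_node
```

After this runs, the packet sits in `second.received` and `head.forwarded` is 1, reflecting that the first processor relayed it. Graphics data is deliberately absent from the sketch: per the abstract, each processor fetches it over its own memory interface, not over this bus.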

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a set of multiple graphics processor units including at least first and second graphics processors, wherein the first and second graphics processors are coupled to access graphics data via respective memory interfaces; a shared workload distribution bus; and workload distribution circuitry configured to transmit, via the shared workload distribution bus, control data that specifies graphics work distribution to the multiple graphics processor units, wherein the graphics processor units are also configured to transmit control information to the workload distribution circuitry via the shared workload distribution bus; wherein one or more of the graphics processor units are configured to retrieve first graphics data via one or more of the respective memory interfaces based on the control data received via the shared workload distribution bus, wherein the control data specifies respective sets of one or more workgroups of a compute kernel, wherein the one or more workgroups include instructions that the graphics processor units are configured to execute to operate on the first graphics data; and wherein the shared workload distribution bus: connects the workload distribution circuitry to the first graphics processor and connects the first graphics processor to the second graphics processor such that the workload distribution circuitry is configured to transmit the control data to the second graphics processor via the shared workload distribution bus connection to the first graphics processor, and is configured to implement flow control using a credit management system, wherein packets that communicate credit information are distinct from packets that communicate control data.

2. The apparatus of claim 1, wherein the first and the second graphics processors include respective arbitration circuitry configured to arbitrate between locally generated control data and control data generated by another processor unit connected to the shared workload distribution bus.

3. The apparatus of claim 2, wherein the arbitration circuitry is configured to prioritize the control data from processor units that are further from the workload distribution circuitry on the shared workload distribution bus over the locally generated control data.

4. The apparatus of claim 1, wherein the shared workload distribution bus provides point-to-point communications between the workload distribution circuitry and clients of the graphics processor units and is configured to aggregate multiple distinct requests from a client into a single packet of control data.

5. The apparatus of claim 4, wherein the shared workload distribution bus is configured to arbitrate both among requests to be aggregated into a packet and among clients submitting requests to communicate with the workload distribution circuitry.

6. The apparatus of claim 1, wherein circuitry of the shared workload distribution bus between the first and second graphics processors includes both source synchronous and synchronous communications circuitry and wherein the apparatus is configured to use one of the source synchronous and synchronous communications circuitry based on a strap signal.

7. The apparatus of claim 1, wherein the first and the second graphics processors are located in different semiconductor substrates.

8. The apparatus of claim 1, wherein the first and the second graphics processors are included in different power and clock domains.

9. The apparatus of claim 1, wherein: the packets that communicate credit information include: a credit count field, a client identifier, and a graphics processor unit identifier; and nodes of the shared workload distribution bus include respective credit distribution circuitry configured to route packets that communicate credit information to clients and to other graphics processor units.

10. The apparatus of claim 1, wherein the shared workload distribution bus supports both packets that target a single processor unit and packets that target multiple processor units.

11. The apparatus of claim 1, wherein the multiple graphics processor units are arranged along the shared workload distribution bus according to a serial topology such that each processor unit connected to the shared workload distribution bus is directly connected to at most two other processor units via the shared workload distribution bus.

12. The apparatus of claim 1, wherein a first portion of the shared workload distribution bus includes a first number of wires configured to transmit data in parallel and a second portion of the shared workload distribution bus includes a second number of wires configured to transmit data in parallel; wherein the apparatus further comprises downsize circuitry configured to split a packet transmitted by the first portion of the shared workload distribution bus into multiple packets for transmission by the second portion of the shared workload distribution bus.

13. The apparatus of claim 1, where a node of the shared workload distribution bus for one of the graphics processor units includes: an input switch for control data from a corresponding graphics processor unit; an output switch for control data for the corresponding graphics processor unit; packet switches configured to receive packets from other processor units on the shared workload distribution bus; and a direction register configured to store an indication of a direction to the workload distribution circuitry via the shared workload distribution bus.

14. The apparatus of claim 1, wherein the shared workload distribution bus provides ordering of packets between pairs of processor units connected to the shared workload distribution bus.

15. The apparatus of claim 1, wherein the first and second graphics processors include separate respective: fragment generator circuitry; shader core circuitry; memory system circuitry that includes a data cache and a memory management unit; geometry processing circuitry; and distributed workload distribution circuitry.

16. The apparatus of claim 1, wherein the multiple graphics processor units are arranged along the shared workload distribution bus according to at least two groups, connected by distribution center circuitry located between graphics processor units in ones of the at least two groups.

17. The apparatus of claim 16, wherein the distribution center circuitry included in at least two different groups is configured to communicate between the two different groups via a communications fabric that is shared with non-control data.

18. A method comprising: transmitting, by workload distribution circuitry via a shared workload distribution bus to a first graphics processor unit of multiple graphics processor units, control data that specifies graphics work distribution, wherein the control data specifies respective sets of one or more workgroups of a compute kernel; transmitting, from ones of the multiple graphics processor units to the workload distribution circuitry, control information via the shared workload distribution bus; retrieving, by one or more of the graphics processor units, via respective memory interfaces based on the control data, first graphics data; executing, by one or more of the graphics processor units, instructions one or more of the workgroups to operate on the first graphics data; and providing, by the shared workload distribution bus, a credit management system, wherein packets that communicate credit information are distinct from packets that communicate control data wherei
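The credit management system recited in claims 1 and 9 can be pictured with a rough sketch of credit-based flow control. Only two things come from the claim text: credit packets are a separate packet type from control packets, and a credit packet carries a credit count, a client identifier, and a graphics processor unit identifier. Everything else is an assumption made for illustration: the class names, the one-credit-per-packet cost, and the strict in-order queue.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple


@dataclass(frozen=True)
class ControlPacket:
    """Hypothetical control packet addressed to one client of one GPU."""
    gpu_id: int
    client_id: int
    payload: str


@dataclass(frozen=True)
class CreditPacket:
    """Separate packet type carrying the three fields listed in claim 9."""
    gpu_id: int
    client_id: int
    credit_count: int


class Sender:
    """Sends control packets only while the destination has credits (assumed policy)."""

    def __init__(self) -> None:
        self.credits: Dict[Tuple[int, int], int] = {}
        self.pending: Deque[ControlPacket] = deque()
        self.sent: List[ControlPacket] = []

    def on_credit(self, packet: CreditPacket) -> None:
        # A credit packet tops up the balance for one (GPU, client) destination.
        key = (packet.gpu_id, packet.client_id)
        self.credits[key] = self.credits.get(key, 0) + packet.credit_count
        self._drain()

    def submit(self, packet: ControlPacket) -> None:
        self.pending.append(packet)
        self._drain()

    def _drain(self) -> None:
        # Send in order; stop at the first packet whose destination has no credit.
        while self.pending:
            head = self.pending[0]
            key = (head.gpu_id, head.client_id)
            if self.credits.get(key, 0) == 0:
                break
            self.credits[key] -= 1
            self.sent.append(self.pending.popleft())


sender = Sender()
sender.on_credit(CreditPacket(gpu_id=1, client_id=0, credit_count=2))
for i in range(3):
    sender.submit(ControlPacket(gpu_id=1, client_id=0, payload=f"wg{i}"))
```

With two credits granted and three packets submitted, two packets are sent and the third waits in `sender.pending` until another `CreditPacket` arrives. Keeping credits in their own packet type means a stalled control queue cannot block the credit returns that would unstall it, which is one plausible reason for the separation the claim describes.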

Assignees

Inventors

Classifications

  • G06F9/4881 · Primary

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title

  • considering the load · CPC title

  • Buffers; Shared memory; Pipes · CPC title

  • where tasks reside in different layers, e.g. user- and kernel-space · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11847489B2 cover?
Techniques are disclosed relating to a shared control bus for communicating between primary control circuitry and multiple distributed graphics processor units. In some embodiments, a set of multiple processor units includes first and second graphics processors, where the first and second graphics processors are coupled to access graphics data via respective memory interfaces. A shared workload…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 19, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).