Sparse convolutional neural network accelerator
US-10891538-B2 · Jan 12, 2021 · US
| Field | Value |
|---|---|
| Publication number | US-11748298-B2 |
| Application number | US-202217826674-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 27, 2022 |
| Priority date | Apr 9, 2017 |
| Publication date | Sep 5, 2023 |
| Grant date | Sep 5, 2023 |
Official abstract text for this publication.
An integrated circuit (IC) package apparatus is disclosed. The IC package includes one or more processing units and a bridge, mounted below the one or more processing units, including one or more arithmetic logic units (ALUs) to perform atomic operations.
Opening claim text (preview).
What is claimed is:
1. An integrated circuit (IC) package comprising: a plurality of graphics processing units (GPUs) including at least a first set of one or more GPUs and a second set of one or more GPUs; a fixed function GPU including shared function circuitry for the plurality of GPUs, the fixed function GPU being a separate processor unit from the plurality of GPUs; and a plurality of channels, the fixed function GPU being coupled to each of the first set of one or more GPUs and the second set of one or more GPUs by a respective channel of the plurality of channels; wherein the shared function circuitry of the fixed function GPU enables coupling between at least the first set of one or more GPUs and the second set of one or more GPUs via channels of the plurality of channels.
2. The IC package of claim 1, wherein the shared function circuitry of the fixed function GPU includes one or more of: shared memory; shared cache; and shared fixed functions.
3. The IC package of claim 1, further comprising memory, wherein each of the first set of one or more GPUs and the second set of one or more GPUs is coupled with a respective memory for storage of data.
4. The IC package of claim 1, wherein the plurality of channels includes a plurality of virtual channels.
5. The IC package of claim 1, wherein each of the plurality of channels includes a plurality of separate physical connections between one or more GPUs of the plurality of GPUs and the fixed function GPU.
6. The IC package of claim 1, wherein the plurality of channels further includes a channel between the first set of one or more GPUs and the second set of one or more GPUs.
7. The IC package of claim 1, wherein the plurality of GPUs comprises an off-die compute cluster.
8. The IC package of claim 1, wherein each of the first set of one or more GPUs and the second set of one or more GPUs is defined as an independent block.
9. The IC package of claim 1, wherein a configuration of the IC package may be adjusted based on changing demand for the plurality of GPUs.
10. A system comprising: a processor integrated circuit (IC) including one or more processors; a memory IC including memory for storage of data; and a graphics processing unit (GPU) package including: a plurality of GPUs including at least a first set of one or more GPUs and a second set of one or more GPUs, a fixed function GPU including shared function circuitry for the plurality of GPUs, the fixed function GPU being a separate processor unit from the plurality of GPUs, and a plurality of channels, the fixed function GPU being coupled to each of the first set of one or more GPUs and the second set of one or more GPUs by a respective channel of the plurality of channels; wherein the shared function circuitry of the fixed function GPU enables coupling between at least the first set of one or more GPUs and the second set of one or more GPUs via channels of the plurality of channels.
11. The system of claim 10, wherein the shared function circuitry of the fixed function GPU includes one or more of: shared memory; shared cache; and shared fixed functions.
12. The system of claim 10, wherein the GPU package further includes memory, wherein each of the first set of one or more GPUs and the second set of one or more GPUs is coupled with a respective memory for storage of data.
13. The system of claim 10, wherein the plurality of channels includes a plurality of virtual channels.
14. The system of claim 10, wherein each of the plurality of channels includes a plurality of separate physical connections between one or more GPUs of the plurality of GPUs and the fixed function GPU.
15. The system of claim 10, wherein the plurality of channels further includes a channel between the first set of one or more GPUs and the second set of one or more GPUs.
16. One or more non-transitory computer-readable storage mediums having stored thereon executable computer program instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: receiving a request including transfer of data between a first graphics processing unit (GPU) and a second GPU of a plurality of GPUs of a compute cluster, the compute cluster being included in an integrated circuit (IC) package; and transferring the data from the first GPU via a first channel of a plurality of channels to a fixed function GPU and from the fixed function GPU via a second channel of the plurality of channels to the second GPU, the fixed function GPU including shared function circuitry, the fixed function GPU being a separate processor unit from the plurality of GPUs; wherein the shared function circuitry of the fixed function GPU enables coupling between at least the first GPU and the second GPU in the compute cluster via channels of the plurality of channels.
17. The one or more non-transitory storage mediums of claim 16, wherein the shared function circuitry of the fixed function GPU includes one or more of: shared memory; shared cache; and shared fixed functions.
18. The one or more non-transitory storage mediums of claim 16, wherein the IC package further includes memory, the first GPU being computed to a first memory and the second GPU being coupled with a second memory.
19. The one or more non-transitory storage mediums of claim 16, further comprising executable computer program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving a request including transfer of a second data between the first graphics GPU and the second GPU; and transferring the second data from the first GPU via a third channel of the plurality of channels directly to the second GPU.
20. The one or more non-transitory storage mediums of claim 16, further comprising executable computer program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: determining that a change in demand for the plurality of GPUs has occurred; and adjusting a configuration of the IC package in response to the change in demand for the plurality of GPUs.
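As a reading aid only, the two transfer paths recited in claims 16 and 19 can be sketched as a small routing model: one path goes through the fixed function GPU over two channels, the other uses a direct channel between the two GPUs. This is a minimal illustration, not the patented implementation. The names `Package`, `HUB`, `route`, `gpu0`, `gpu1`, and `ch1` to `ch3` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field

# Hypothetical label for the fixed function GPU recited in claim 16.
HUB = "fixed_function_gpu"


@dataclass
class Package:
    """Toy model of an IC package: endpoints joined by named channels."""

    # Maps an unordered pair of endpoints to a channel name.
    channels: dict = field(default_factory=dict)

    def add_channel(self, name, a, b):
        self.channels[frozenset((a, b))] = name

    def channel_between(self, a, b):
        return self.channels.get(frozenset((a, b)))

    def route(self, src, dst, direct=False):
        """Return the hops (channel, from, to) for a GPU-to-GPU transfer."""
        if direct:
            # Claim 19 style: a channel straight from one GPU to the other.
            ch = self.channel_between(src, dst)
            if ch is None:
                raise ValueError(f"no direct channel between {src} and {dst}")
            return [(ch, src, dst)]
        # Claim 16 style: first channel to the hub, second channel onward.
        first = self.channel_between(src, HUB)
        second = self.channel_between(HUB, dst)
        if first is None or second is None:
            raise ValueError("both GPUs need a channel to the fixed function GPU")
        return [(first, src, HUB), (second, HUB, dst)]


pkg = Package()
pkg.add_channel("ch1", "gpu0", HUB)
pkg.add_channel("ch2", "gpu1", HUB)
pkg.add_channel("ch3", "gpu0", "gpu1")

hub_route = pkg.route("gpu0", "gpu1")
direct_route = pkg.route("gpu0", "gpu1", direct=True)
print(hub_route)
print(direct_route)
```

In this sketch the hub path always has exactly two hops and the direct path has one. The claims do not say how a real package would choose between them, so the `direct` flag is an assumption made for illustration.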
Recurrent networks, e.g. Hopfield networks · CPC title
Combinations of networks · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title