US2023083345A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023083345-A1 |
| Application number | US-202117468128-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 7, 2021 |
| Priority date | Sep 7, 2021 |
| Publication date | Mar 16, 2023 |
| Grant date | — |
Official abstract text for this publication.
Apparatuses, systems, and techniques to perform multi-architecture execution graphs. In at least one embodiment, a parallel processing platform, such as compute uniform device architecture (CUDA) generates multi-architecture execution graphs comprising a plurality of software kernels to be performed by one or more processor cores having one or more processor architectures.
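As a rough illustration of the idea in the abstract, the sketch below models an execution graph whose kernels are each tagged with the kind of core that should run them. It is a plain-Python toy, not the CUDA graph API and not the method disclosed in the application: `KernelNode`, `run_graph`, the `"GPU"`/`"DLA"` labels, and the three example kernels are all hypothetical names chosen for this example, and the "dispatch" is simulated by recording which core type each kernel was assigned to.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple


@dataclass
class KernelNode:
    """One software kernel in the graph (hypothetical structure)."""
    name: str
    core_type: str                         # assumed label, e.g. "GPU" or "DLA"
    fn: Callable[[List[float]], float]     # stand-in for the kernel body
    deps: List[str] = field(default_factory=list)


def run_graph(nodes: List[KernelNode], x: float) -> Tuple[Dict[str, float], List[Tuple[str, str]]]:
    """Run kernels in dependency order and record the core type of each."""
    by_name = {n.name: n for n in nodes}
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    order = TopologicalSorter({n.name: set(n.deps) for n in nodes}).static_order()
    results: Dict[str, float] = {}
    trace: List[Tuple[str, str]] = []
    for name in order:
        node = by_name[name]
        # Root kernels read the graph input; others read their dependencies' outputs.
        inputs = [results[d] for d in node.deps] or [x]
        results[name] = node.fn(inputs)
        trace.append((name, node.core_type))  # simulated dispatch, no real device
    return results, trace


# A three-kernel chain that alternates between two core types.
example_nodes = [
    KernelNode("scale", "GPU", lambda ins: ins[0] * 2.0),
    KernelNode("bias", "DLA", lambda ins: ins[0] + 1.0, deps=["scale"]),
    KernelNode("relu", "GPU", lambda ins: max(0.0, ins[0]), deps=["bias"]),
]

if __name__ == "__main__":
    results, trace = run_graph(example_nodes, 3.0)
    print(results["relu"])  # 7.0
    print(trace)
```

In this toy, the graph itself carries the per-kernel core assignment, so a single traversal can hand consecutive kernels to different kinds of cores while still respecting their data dependencies.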
Opening claim text (preview).
What is claimed is:
1. A processor comprising: one or more circuits to cause two or more different types of processing cores to perform an inferencing operation using one or more neural networks.
2. The processor of claim 1, wherein the two or more different types of processing cores comprise one or more deep learning accelerators (DLAs) and one or more parallel processing unit (PPU) cores.
3. The processor of claim 2, wherein the one or more PPU cores are graphics processing unit (GPU) cores.
4. The processor of claim 1, wherein one or more software programs comprise instructions to cause the two or more different types of processing cores to perform the inferencing operation, the one or more software programs comprising a first set of instructions to be performed by a first of the two or more different types of processing cores and a second set of instructions to be performed by a second of the two or more different types of processing cores.
5. The processor of claim 1, wherein the inferencing operation is to be performed as a result of one or more function calls to a parallel processing library, the parallel processing library comprising instructions to perform a first portion of the inferencing operation on a first of the two or more different types of processing cores and a second portion of the inferencing operation on a second of the two or more different types of processing cores.
6. The processor of claim 1, wherein the inferencing operation is to be performed as a result of one or more function calls to a parallel processing library to at least indicate the one or more neural networks, the parallel processing library providing shared pointer addressing to the two or more different types of processing cores to perform the inferencing operation.
7. A processor comprising: one or more circuits to use graph code to cause a software program to be performed by two or more different types of processing cores.
8. The processor of claim 7, wherein the two or more different types of processing cores comprise one or more deep learning accelerators (DLAs) and one or more parallel processing unit (PPU) cores.
9. The processor of claim 7, wherein the graph code indicates an execution graph generated by a parallel processing library.
10. The processor of claim 7, wherein the graph code is to cause the software program to perform one or more inferencing operations.
11. The processor of claim 7, wherein the software program comprises a set of instructions and the graph code comprises a first subset of the set of instructions and a second subset of the set of instructions, the first subset to be performed by a first of the two or more different types of processing cores and a second subset to be performed by a second of the two or more different types of processing cores.
12. The processor of claim 7, wherein a parallel processing library generates the graph code as a result of one or more function calls to an interface provided by said parallel processing library and the parallel processing library comprises a first set of instructions to generate one or more software kernels for a first of the two or more different types of processor cores and a second set of instructions to generate one or more software kernels for a second of the two or more different types of processor cores.
13. The processor of claim 7, wherein the graph code comprises at least a first software kernel to be executed by a first of the two or more different types of processor cores and a second kernel to be executed by a second of the two or more different types of processor cores.
14. A machine-readable medium having stored thereon one or more instructions, which if performed by one or more processors, cause the one or more processors to at least: cause two or more different types of processing cores to perform an inferencing operation using one or more neural networks.
15. The machine-readable medium of claim 14, wherein the two or more different types of processing cores comprise at least one or more parallel processing unit (PPU) cores and one or more deep learning accelerators (DLAs).
16. The machine-readable medium of claim 15, wherein the one or more PPU cores are graphics processing unit (GPU) cores.
17. The machine-readable medium of claim 14, wherein the two or more different types of processing cores are to perform an execution graph, the execution graph comprising a first kernel to perform a first part of the inferencing operation and a second kernel to perform a second part of the inferencing operation.
18. The machine-readable medium of claim 14, further comprising instructions that, when performed by the one or more processors, cause the one or more processors to perform a software program comprising instructions to cause the two or more different types of processing cores to perform the inferencing operation, the software program comprising a first set of instructions to be performed by a first of the two or more different types of processing cores and a second set of instructions to be performed by a second of the two or more different types of processing cores.
19. The machine-readable medium of claim 14, further comprising instructions that, when performed by the one or more processors, cause the one or more processors to receive the one or more neural networks as a result of one or more function calls to a parallel processing library, the parallel processing library comprising a first set of instructions to cause a first part of the inferencing operation to be performed by a first of the two or more different types of processing cores and a second set of instructions to cause a second part of the inferencing operation to be performed by a second of the two or more different types of processing cores.
20. The machine-readable medium of claim 14, further comprising instructions that, when performed by the one or more processors, cause the two or more different types of processing cores to perform the inferencing operation as a result of one or more function calls to a parallel processing library.
21. The machine-readable medium of claim 20, wherein one or more function calls to an application programming interface (API) provided by the parallel processing library are to indicate the one or more neural networks.
22. A machine-readable medium having stored thereon one or more instructions, which if performed by one or more processors, cause the one or more processors to at least: use graph code to cause a software program to be performed by two or more different types of processing cores.
23. The machine-readable medium of claim 22, wherein the two or more different types of processing cores comprise one or more deep learning accelerators (DLAs) and one or more graphics processing unit (GPU) cores.
24. The machine-readable medium of claim 22, wherein the graph code is to cause the software program to perform one or more inferencing operations using one or more neural networks.
25. The machine-readable medium of claim 22, wherein the graph code indicates an execution graph generated by a parallel processing library as a result of one or more function calls to the parallel processing library indicating the software program to be performed by the two or more different types of processing cores.
26. The machine-readable medium of claim 2
Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title
using electronic means · CPC title
Combinations of networks · CPC title
Array of vector units · CPC title
Interprogram communication · CPC title