Multi-architecture execution graphs

US2023083345A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2023083345-A1
Application number  US-202117468128-A
Country             US
Kind code           A1
Filing date         Sep 7, 2021
Priority date       Sep 7, 2021
Publication date    Mar 16, 2023
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatuses, systems, and techniques to perform multi-architecture execution graphs. In at least one embodiment, a parallel processing platform, such as compute uniform device architecture (CUDA) generates multi-architecture execution graphs comprising a plurality of software kernels to be performed by one or more processor cores having one or more processor architectures.
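The abstract describes an execution graph whose kernels may target different processor architectures. As a rough illustration only, the idea can be sketched in plain Python: each node carries a kernel, a tag naming the core type it targets, and its dependencies, and the graph is walked in dependency order. Everything named here (`KernelNode`, `run_graph`, the `"gpu"` and `"dla"` tags) is hypothetical and is not the CUDA API or the method disclosed in the patent; no real device dispatch happens, the sketch only records where each kernel would be sent.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class KernelNode:
    """One graph node: a kernel plus the core type it targets (hypothetical)."""
    name: str
    arch: str                                        # assumed tag, e.g. "gpu" or "dla"
    fn: Callable[[List[List[float]]], List[float]]   # plain function standing in for a kernel
    deps: Tuple[str, ...] = ()                       # kernels that must finish first


def run_graph(nodes: Sequence[KernelNode], source: List[float]):
    """Run kernels in dependency order; return all outputs and a dispatch trace."""
    by_name = {n.name: n for n in nodes}
    order = TopologicalSorter({n.name: set(n.deps) for n in nodes}).static_order()
    results: Dict[str, List[float]] = {}
    trace: List[Tuple[str, str]] = []
    for name in order:
        node = by_name[name]
        # Root kernels read the graph input; others read their dependencies' outputs.
        inputs = [results[d] for d in node.deps] or [source]
        results[name] = node.fn(inputs)
        trace.append((name, node.arch))  # a real runtime would dispatch to this core type
    return results, trace


# A three-kernel chain that alternates between two core types.
nodes = [
    KernelNode("scale", "gpu", lambda ins: [2 * x for x in ins[0]]),
    KernelNode("bias", "dla", lambda ins: [x + 1 for x in ins[0]], deps=("scale",)),
    KernelNode("relu", "gpu", lambda ins: [max(0.0, x) for x in ins[0]], deps=("bias",)),
]
results, trace = run_graph(nodes, [-2.0, 0.5, 3.0])
```

Keeping the architecture tag on the node, rather than in the scheduler, mirrors the abstract's framing of a single graph that spans several architectures.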

First claim

Opening claim text (preview).

What is claimed is:
1. A processor comprising: one or more circuits to cause two or more different types of processing cores to perform an inferencing operation using one or more neural networks.
2. The processor of claim 1, wherein the two or more different types of processing cores comprise one or more deep learning accelerators (DLAs) and one or more parallel processing unit (PPU) cores.
3. The processor of claim 2, wherein the one or more PPU cores are graphics processing unit (GPU) cores.
4. The processor of claim 1, wherein one or more software programs comprise instructions to cause the two or more different types of processing cores to perform the inferencing operation, the one or more software programs comprising a first set of instructions to be performed by a first of the two or more different types of processing cores and a second set of instructions to be performed by a second of the two or more different types of processing cores.
5. The processor of claim 1, wherein the inferencing operation is to be performed as a result of one or more function calls to a parallel processing library, the parallel processing library comprising instructions to perform a first portion of the inferencing operation on a first of the two or more different types of processing cores and a second portion of the inferencing operation on a second of the two or more different types of processing cores.
6. The processor of claim 1, wherein the inferencing operation is to be performed as a result of one or more function calls to a parallel processing library to at least indicate the one or more neural networks, the parallel processing library providing shared pointer addressing to the two or more different types of processing cores to perform the inferencing operation.
7. A processor comprising: one or more circuits to use graph code to cause a software program to be performed by two or more different types of processing cores.
8. The processor of claim 7, wherein the two or more different types of processing cores comprise one or more deep learning accelerators (DLAs) and one or more parallel processing unit (PPU) cores.
9. The processor of claim 7, wherein the graph code indicates an execution graph generated by a parallel processing library.
10. The processor of claim 7, wherein the graph code is to cause the software program to perform one or more inferencing operations.
11. The processor of claim 7, wherein the software program comprises a set of instructions and the graph code comprises a first subset of the set of instructions and a second subset of the set of instructions, the first subset to be performed by a first of the two or more different types of processing cores and a second subset to be performed by a second of the two or more different types of processing cores.
12. The processor of claim 7, wherein a parallel processing library generates the graph code as a result of one or more function calls to an interface provided by said parallel processing library and the parallel processing library comprises a first set of instructions to generate one or more software kernels for a first of the two or more different types of processor cores and a second set of instructions to generate one or more software kernels for a second of the two or more different types of processor cores.
13. The processor of claim 7, wherein the graph code comprises at least a first software kernel to be executed by a first of the two or more different types of processor cores and a second kernel to be executed by a second of the two or more different types of processor cores.
14. A machine-readable medium having stored thereon one or more instructions, which if performed by one or more processors, cause the one or more processors to at least: cause two or more different types of processing cores to perform an inferencing operation using one or more neural networks.
15. The machine-readable medium of claim 14, wherein the two or more different types of processing cores comprise at least one or more parallel processing unit (PPU) cores and one or more deep learning accelerators (DLAs).
16. The machine-readable medium of claim 15, wherein the one or more PPU cores are graphics processing unit (GPU) cores.
17. The machine-readable medium of claim 14, wherein the two or more different types of processing cores are to perform an execution graph, the execution graph comprising a first kernel to perform a first part of the inferencing operation and a second kernel to perform a second part of the inferencing operation.
18. The machine-readable medium of claim 14, further comprising instructions that, when performed by the one or more processors, cause the one or more processors to perform a software program comprising instructions to cause the two or more different types of processing cores to perform the inferencing operation, the software program comprising a first set of instructions to be performed by a first of the two or more different types of processing cores and a second set of instructions to be performed by a second of the two or more different types of processing cores.
19. The machine-readable medium of claim 14, further comprising instructions that, when performed by the one or more processors, cause the one or more processors to receive the one or more neural networks as a result of one or more function calls to a parallel processing library, the parallel processing library comprising a first set of instructions to cause a first part of the inferencing operation to be performed by a first of the two or more different types of processing cores and a second set of instructions to cause a second part of the inferencing operation to be performed by a second of the two or more different types of processing cores.
20. The machine-readable medium of claim 14, further comprising instructions that, when performed by the one or more processors, cause the two or more different types of processing cores to perform the inferencing operation as a result of one or more function calls to a parallel processing library.
21. The machine-readable medium of claim 20, wherein one or more function calls to an application programming interface (API) provided by the parallel processing library are to indicate the one or more neural networks.
22. A machine-readable medium having stored thereon one or more instructions, which if performed by one or more processors, cause the one or more processors to at least: use graph code to cause a software program to be performed by two or more different types of processing cores.
23. The machine-readable medium of claim 22, wherein the two or more different types of processing cores comprise one or more deep learning accelerators (DLAs) and one or more graphics processing unit (GPU) cores.
24. The machine-readable medium of claim 22, wherein the graph code is to cause the software program to perform one or more inferencing operations using one or more neural networks.
25. The machine-readable medium of claim 22, wherein the graph code indicates an execution graph generated by a parallel processing library as a result of one or more function calls to the parallel processing library indicating the software program to be performed by the two or more different types of processing cores.
26. The machine-readable medium of claim 2
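Several claims describe a first portion of an inferencing operation running on one type of core and a second portion on another. The claims as previewed here do not say how that split is chosen, so the following is only one possible sketch under stated assumptions: a network is a flat list of layer names, a hypothetical set `DLA_SUPPORTED` lists the layer types an accelerator could run, and consecutive layers with the same target are grouped into one segment. The names `DLA_SUPPORTED`, `target_for`, and `partition` are invented for illustration.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical set of layer types the accelerator can run; not taken from the patent.
DLA_SUPPORTED = frozenset({"conv", "relu", "pool"})


def target_for(layer: str) -> str:
    """Pick a core type for one layer: the accelerator if supported, else the GPU."""
    return "dla" if layer in DLA_SUPPORTED else "gpu"


def partition(layers: List[str]) -> List[Tuple[str, List[str]]]:
    """Group consecutive layers that share a target into (core_type, layers) segments."""
    return [(arch, list(group)) for arch, group in groupby(layers, key=target_for)]


# Layer names are made up; the result alternates between the two core types.
segments = partition(["conv", "relu", "pool", "attention", "conv", "softmax"])
```

Each resulting segment could then become one kernel node in an execution graph, with the segment's core type as the node's target.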

Assignees

Nvidia Corp

Inventors

Classifications

  • Interfaces, programming languages or software development kits, e.g. for simulating neural networks

  • using electronic means

  • Combinations of networks

  • Array of vector units

  • Interprogram communication

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023083345A1 cover?
Apparatuses, systems, and techniques to perform multi-architecture execution graphs. In at least one embodiment, a parallel processing platform, such as compute uniform device architecture (CUDA) generates multi-architecture execution graphs comprising a plurality of software kernels to be performed by one or more processor cores having one or more processor architectures.
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?
Published Mar 16, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).