Translating text encodings of machine learning models to executable code
US-11210073-B1 · Dec 28, 2021 · US
US12056494B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12056494-B2 |
| Application number | US-202117239376-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 23, 2021 |
| Priority date | Apr 23, 2021 |
| Publication date | Aug 6, 2024 |
| Grant date | Aug 6, 2024 |
Official abstract text for this publication.
Apparatuses, systems, and techniques to identify instructions for advanced execution. In at least one embodiment, a processor performs one or more instructions that have been identified by a compiler to be speculatively performed in parallel.
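As a rough illustration of what speculative performance means here, the following Python sketch simulates a host-driven loop in which the next step is launched before the branch condition is known. It is not the patented implementation. It assumes, loosely following the while-loop and copy-operation language in the claims, that the condition value reaches the host through a device-to-host copy. The names `speculative_while`, `step`, and `keep_going` are hypothetical, and ordinary Python functions stand in for GPU kernels.

```python
def speculative_while(step, keep_going, state):
    """Simulate `while keep_going(state): state = step(state)` with speculation.

    step       -- hypothetical stand-in for GPU code; must have no side effects
    keep_going -- hypothetical stand-in for the branch condition that the host
                  would learn through a device-to-host copy
    Returns (final_state, launched, discarded).
    """
    launched = 0
    discarded = 0
    while True:
        # Launch the next iteration before the branch condition is known.
        speculative_result = step(state)
        launched += 1

        # The condition value "arrives" only after the launch above.
        if keep_going(state):
            # Speculation was correct: commit the already-computed result.
            state = speculative_result
        else:
            # Misspeculation: drop the result. The old state is still
            # available because it was kept live across the launch.
            discarded += 1
            return state, launched, discarded


final_state, launched, discarded = speculative_while(
    step=lambda x: x * 2,
    keep_going=lambda x: x < 10,
    state=1,
)
print(final_state, launched, discarded)
```

The result matches a plain `while` loop; the only cost is one discarded launch at loop exit. Keeping the old `state` available until the condition is known is a toy analogue of the extended live ranges mentioned in the claims.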
Opening claim text (preview).
What is claimed is:

1. A processor comprising: one or more circuits to perform a compiler, wherein the compiler is to cause graphics processing unit (GPU) code to be speculatively performed based, at least in part, on whether two or more central processing unit (CPU) code branch conditions are dependent on one another.
2. The processor of claim 1, wherein one or more instructions of the GPU code have been identified to be speculatively performed in parallel by the compiler based, at least in part, on identifying copy operations, and the one or more circuits are to cause one or more GPUs to perform the identified one or more instructions based, at least in part, on receiving a command from another processor.
3. The processor of claim 1, wherein one or more instructions of the GPU code have been identified to be speculatively performed in parallel by the compiler based, at least in part, on identifying copy operations between a parallel processing unit and a host computer system, and labeling safe operations following one or more identified copy operations.
4. The processor of claim 1, wherein one or more instructions of the GPU code include extended live ranges for variables used by operations associated with instructions identified by the compiler to be speculatively performed in parallel.
5. The processor of claim 1, wherein the one or more circuits are to cause one or more GPUs to perform the GPU code after receiving one or more kernel launch commands from a host computer system.
6. The processor of claim 1, wherein the GPU code is part of a while loop.
7. The processor of claim 1, wherein the GPU code is in a set of instructions that follows a first value of a branch condition, and that does not follow a second value of the branch condition.
8. A system, comprising: one or more processors to perform a compiler, wherein the compiler is to cause graphics processing unit (GPU) code to be speculatively performed based, at least in part, on whether two or more central processing unit (CPU) code branch conditions are dependent on one another; and one or more memories to store the GPU code.
9. The system of claim 8, wherein one or more instructions of the GPU code have been identified to be speculatively performed in parallel by the compiler based, at least in part, on identifying copy operations to a host computer system.
10. The system of claim 8, wherein one or more instructions of the GPU code have been identified to be speculatively performed in parallel by the compiler based, at least in part, on finding one or more conditional branches in a representation of a computer program that uses a neural network.
11. The system of claim 8, wherein the one or more processors are to launch the GPU code for performance by one or more GPUs.
12. The system of claim 8, wherein the one or more processors are a first one or more processors, the system further comprises a second one or more processors to speculatively launch one or more instructions of the GPU code for performance by one or more GPUs, and the second one or more processors are to stop launching instructions speculatively in response to receiving a value via a copy operation that satisfies a condition preceding the one or more instructions in a representation of a computer program.
13. The system of claim 8, wherein instructions have been identified to be speculatively performed in parallel by the compiler based, at least in part, on labeling operations that are safe to be speculatively performed.
14. The system of claim 8, wherein instructions have been identified to be speculatively performed in parallel by the compiler based, at least in part, on searching a representation of a computer program for copy operations, and identifying operations that follow the copy operations that are safe to be speculatively performed.
15. The system of claim 8, wherein the GPU code is part of a while loop that implements a portion of an inferencing operation using a neural network.
16. A method, comprising: performing, by one or more processors, a compiler, wherein the compiler is to cause graphics processing unit (GPU) code to be speculatively performed based, at least in part, on whether two or more central processing unit (CPU) code branch conditions are dependent on one another.
17. The method of claim 16, wherein the GPU code has been identified to be speculatively performed by the compiler based, at least in part, on identifying operations that do not change a random state, overwrite outputs, use a signal instruction, or use a wait instruction.
18. The method of claim 16, wherein the GPU code has been identified to be speculatively performed by the compiler based, at least in part, on identifying a conditional branch and selecting a path from a plurality of paths following the conditional branch.
19. The method of claim 16, wherein one or more of the GPU code has been identified to be speculatively performed by the compiler based, at least in part, on identifying copy operations.
20. The method of claim 16, wherein the GPU code includes extended live ranges for variables used in speculatively performed operations.
21. The method of claim 16, wherein the compiler is to identify the GPU code to be speculatively performed based, at least in part, on identifying copy operations, and wherein the GPU code implements a portion of an inferencing operation using a neural network.
22. A non-transitory machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: perform a compiler, wherein the compiler is to cause graphics processing unit (GPU) code to be speculatively performed based, at least in part, on whether two or more central processing unit (CPU) code branch conditions are dependent on one another.
23. The non-transitory machine-readable medium of claim 22, wherein the set of instructions, which if performed by the one or more processors, further cause the one or more processors to perform the compiler to at least identify one or more of instructions of the GPU code to be speculatively performed in parallel based, at least in part, on identifying copy operations between a parallel processing unit and a host computer system in a representation of a computer program.
24. The non-transitory machine-readable medium of claim 22, wherein the set of instructions, which if performed by the one or more processors, further cause the one or more processors to perform the compiler to at least identify operations following a copy operation that are safe to execute.
25. The non-transitory machine-readable medium of claim 22, wherein the set of instructions, which if performed by the one or more processors, further cause the one or more processors to at least perform the compiler to label operations that are safe to be speculatively performed.
26. The non-transitory machine-readable medium of claim 22, wherein the set of instructions, which if performed by the one or more processors, further cause the one or more processors to perform the compiler to at least label operations that are safe to be speculatively performed and extend a live range of variables associated with operations labeled safe to be speculatively performed.
27. The non-transitory machine-readable medium of claim 22, wherein the set of instructions, which if performed by the one or
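
Several of the claims above describe a compiler that searches a program representation for copy operations and labels the operations that follow as safe to perform speculatively, where safe operations do not change a random state, overwrite outputs, or use signal or wait instructions. The Python sketch below is one possible reading of that labeling step, offered only as an illustration. The `Op` record, its field names, the `kind` strings, and the choice to stop labeling at the first unsafe operation are all assumptions that do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class Op:
    # Hypothetical program-representation node; field names are assumptions.
    name: str
    kind: str = "compute"  # "compute", "copy_to_host", "signal", or "wait"
    changes_random_state: bool = False
    overwrites_outputs: bool = False


def is_safe(op):
    """True if the op avoids every hazard listed in the safety criteria."""
    return (
        op.kind == "compute"
        and not op.changes_random_state
        and not op.overwrites_outputs
    )


def label_speculative_ops(ops):
    """Return names of ops that follow a copy to the host and are safe.

    Assumed policy: labeling starts after each copy_to_host op and stops
    at the first unsafe op, until another copy is found.
    """
    labeled = []
    after_copy = False
    for op in ops:
        if op.kind == "copy_to_host":
            after_copy = True
            continue
        if after_copy:
            if is_safe(op):
                labeled.append(op.name)
            else:
                after_copy = False
    return labeled


program = [
    Op("matmul"),
    Op("copy_cond", kind="copy_to_host"),
    Op("add"),
    Op("relu"),
    Op("dropout", changes_random_state=True),
    Op("mul"),
]
print(label_speculative_ops(program))  # ['add', 'relu']
```

Here `matmul` precedes the copy and is left unlabeled, while `dropout` ends the speculative region because it changes a random state. A real compiler pass would work on a richer representation and would also need the dependence analysis between branch conditions that the independent claims recite.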
Supervised learning · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
from multiple instruction streams, e.g. multistreaming · CPC title
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title