Data-parallel probabilistic inference
US-2015095277-A1 · Apr 2, 2015 · US
US9632761B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9632761-B2 |
| Application number | US-201414153239-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 13, 2014 |
| Priority date | Jan 13, 2014 |
| Publication date | Apr 25, 2017 |
| Grant date | Apr 25, 2017 |
Official abstract text for this publication.
Systems, methods, and techniques of distributing a workload of an application to a GPU are provided. An example method includes obtaining an intermediate representation of a source code portion of an application and compiling the intermediate representation into a set of instructions that is native to the GPU. The set of instructions includes a binary representation of the source code portion executable on the GPU, and execution of the set of instructions on the GPU includes processing a workload of the application. The method also includes transforming data associated with the source code portion into one or more data types native to the GPU and sending to the GPU a communication including the set of instructions executable on the GPU and the one or more data types native to the GPU.
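The four steps in the abstract (obtain an intermediate representation, compile it to GPU-native instructions, transform the data, send both to the GPU) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `OffloadSketch`, `GpuMessage`, and every method name are hypothetical, the "compiler" only prefixes a marker byte, and no real GPU is contacted. The one concrete piece is the data transformation, which packs a Java `float[]` into a direct, native-order `ByteBuffer`, a common layout for handing data to native code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative only: every name here is hypothetical and no real GPU is contacted.
final class OffloadSketch {

    // Stand-in for the "communication" sent to the GPU.
    static final class GpuMessage {
        final byte[] nativeInstructions;
        final ByteBuffer nativeData;

        GpuMessage(byte[] nativeInstructions, ByteBuffer nativeData) {
            this.nativeInstructions = nativeInstructions;
            this.nativeData = nativeData;
        }
    }

    // Step 1: obtain an intermediate representation of the source code portion.
    // Placeholder: the portion's name encoded as bytes stands in for real bytecode.
    static byte[] obtainIntermediateRepresentation(String portionName) {
        return portionName.getBytes(StandardCharsets.UTF_8);
    }

    // Step 2: compile the IR into GPU-native instructions.
    // Placeholder: prefix a marker byte instead of invoking a real GPU compiler.
    static byte[] compileForGpu(byte[] ir) {
        byte[] binary = new byte[ir.length + 1];
        binary[0] = 0x47; // arbitrary marker meaning "compiled" in this sketch
        System.arraycopy(ir, 0, binary, 1, ir.length);
        return binary;
    }

    // Step 3: transform Java data into a device-friendly layout:
    // a direct buffer of packed 32-bit floats in the platform's byte order.
    static ByteBuffer toNativeData(float[] values) {
        ByteBuffer buffer = ByteBuffer
                .allocateDirect(values.length * Float.BYTES)
                .order(ByteOrder.nativeOrder());
        buffer.asFloatBuffer().put(values);
        return buffer;
    }

    // Step 4: bundle instructions and data into one communication.
    static GpuMessage buildMessage(String portionName, float[] values) {
        byte[] ir = obtainIntermediateRepresentation(portionName);
        return new GpuMessage(compileForGpu(ir), toNativeData(values));
    }

    public static void main(String[] args) {
        GpuMessage message = buildMessage("VectorScale", new float[] {1.0f, 2.5f, 4.0f});
        System.out.println("instruction bytes: " + message.nativeInstructions.length);
        System.out.println("data bytes: " + message.nativeData.capacity());
    }
}
```

In a real system, steps 2 and 4 would be handled by a GPU toolchain and driver API; the sketch only shows how the pieces named in the abstract relate to each other.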
Opening claim text (preview).
We claim:
1. A method of distributing a workload of an application to a graphics processing unit (GPU), comprising: identifying an indication that an instantiation of an object-oriented class included in an application is executable on a GPU, wherein a source code portion of the application includes the class and is executable on the GPU, and wherein the indication is included in the application; obtaining an intermediate representation of the source code portion of the application, wherein the intermediate representation is machine-independent binary code; compiling the intermediate representation into a set of instructions that is native to the GPU, wherein a node provides a runtime environment for the intermediate representation and is executable on a heterogeneous plurality of platforms, the set of instructions includes a binary representation of the source code portion executable on the GPU, and execution of the set of instructions on the GPU includes processing a workload of the application; transforming data associated with the source code portion into one or more data types native to the GPU; and sending to the GPU a communication including the set of instructions executable on the GPU and the one or more data types native to the GPU, wherein the communication causes the GPU to execute the set of instructions using the one or more data types native to the GPU.
2. The method of claim 1, wherein the application is written in a high-level programming language, and wherein the data associated with the source code portion includes first data including one or more data types specific to the high-level programming language.
3. The method of claim 2, wherein the transforming data includes passing the first data to a library and responsive to the passing, obtaining second data including one or more data types native to the GPU, and wherein a data type in the first data corresponds to a data type in the second data.
4. The method of claim 2, further including: responsive to the communication, receiving a response including a result of the GPU-processed workload, wherein the result includes third data including one or more data types native to the GPU; transforming the third data into fourth data including one or more data types specific to the high-level programming language, wherein a data type in the third data corresponds to a data type in the fourth data; and sending the fourth data to the application.
5. The method of claim 1, wherein the compiling the intermediate representation into a set of instructions includes generating a stream of commands for execution on the GPU, and wherein the sending includes sending to the GPU the stream of commands.
6. The method of claim 1, wherein the application executes on a computing device that is coupled to a central processing unit (CPU) and the GPU.
7. The method of claim 6, wherein the compiling includes before the application executes on the computing device, compiling the intermediate representation into the binary representation of the source code portion executable on the GPU.
8. The method of claim 6, wherein the compiling includes while the application is executing on the computing device, compiling the intermediate representation into the binary representation of the source code portion executable on the GPU.
9. The method of claim 6, wherein the application is executing on the computing device in parallel with the set of instructions executing on the GPU.
10. The method of claim 9, wherein the data include a first set of data and a second set of data, wherein each of the first and second sets of data includes one or more data types specific to the application, wherein the transforming data includes transforming the one or more data types of the first set of data into one or more data types native to a first GPU and transforming the one or more data types of the second set of data into one or more data types native to a second GPU, and wherein the sending includes sending to the first GPU a first communication including the set of instructions executable on the first GPU and the one or more data types native to the first GPU and sending to the second GPU a second communication including the set of instructions executable on the second GPU and the one or more data types native to the second GPU.
11. The method of claim 10, wherein the first GPU executes the set of instructions using the one or more transformed data types native to the first GPU in parallel with the second GPU executing the set of instructions using the one or more transformed data types native to the second GPU.
12. The method of claim 9, wherein the GPU includes a first GPU core and a second GPU core and the data include a first set of data and a second set of data, wherein each of the first and second sets of data includes one or more data types specific to the application, wherein the transforming data includes transforming the one or more data types of the first data into one or more data types native to the GPU and transforming the one or more data types of the second data into one or more data types native to the GPU, wherein the sending includes sending to the first GPU core a communication including the one or more data types native to the first GPU and sending to the second GPU core a communication including the one or more data types native to the second GPU, and wherein the first GPU core executes a set of instructions using the one or more data types native to the GPU in parallel with the second GPU core executing a set of instructions using the one or more data types native to the GPU.
13. A system for distributing a workload of an application to a graphics processing unit (GPU), comprising: a computing device coupled to a plurality of processing units, wherein the plurality of processing units includes a central processing unit (CPU) and a GPU, wherein each of the CPU and GPU executes a workload of an application; an annotation processor that identifies an indication that an instantiation of an object-oriented class included in an application is executable on a GPU, wherein a source code portion of the application includes the class and is executable on the GPU, and wherein the indication is included in the application; a scheduler that obtains an intermediate representation of a source code portion of the application, wherein the scheduler transforms data associated with the source code portion into one or more data types native to the GPU; and a compiler that compiles the intermediate representation into a set of instructions that is native to the GPU, wherein the set of instructions includes a binary representation of the source code portion executable on the GPU, and wherein execution of the set of instructions on the GPU includes processing a workload of the application, wherein the CPU sends to the GPU the set of instructions executable on the GPU and the one or more data types native to the GPU, and wherein the communication causes the GPU to execute the set of instructions using the one or more data types native to the GPU.
14. The system of claim 13, wherein the application is written in a high-level programming language, and wherein the data associated with the source code portion includes first data including one or more data types specific to the high-level programming language.
15. The system of claim 14, wherein the high-level programming language is Java, a data type specific to the high-level programming language is a Java primitive type, and the intermediate representation of the source code portion is Java bytecode.
16. The system of claim 13, wherein before the application exe
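Claims 1 and 13 turn on "an indication" inside the application that a class instance is executable on a GPU, and claim 13 assigns its detection to an annotation processor. One way such an indication could look in Java is a marker annotation checked by reflection. This is a sketch under assumptions: the annotation name `GpuExecutable`, the example classes, and the use of a runtime reflection check (instead of a compile-time annotation processor) are all hypothetical choices that the claims do not prescribe.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotation: the "indication ... included in the application".
// RUNTIME retention keeps it visible to reflection after compilation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface GpuExecutable {}

// A class the application author has marked as suitable for GPU execution.
@GpuExecutable
final class VectorScale {
    float[] apply(float[] input, float factor) {
        float[] out = new float[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[i] * factor;
        }
        return out;
    }
}

// A class without the marker, which would stay on the CPU in this sketch.
final class CpuOnlyTask {}

final class IndicationCheck {

    // Returns true if the instance's class carries the hypothetical marker.
    static boolean isMarkedForGpu(Object instance) {
        return instance.getClass().isAnnotationPresent(GpuExecutable.class);
    }

    public static void main(String[] args) {
        System.out.println("VectorScale marked: " + isMarkedForGpu(new VectorScale()));
        System.out.println("CpuOnlyTask marked: " + isMarkedForGpu(new CpuOnlyTask()));
    }
}
```

A scheduler in the spirit of claim 13 could use a check like `isMarkedForGpu` to decide which instances to route through compilation and data transformation, leaving unmarked classes on the CPU.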
Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title
Offload · CPC title
considering hardware capabilities · CPC title