General purpose software parallel task engine

US9436451B2 · US · B2

Patent metadata
Publication number: US-9436451-B2
Application number: US-201514940350-A
Country: US
Kind code: B2
Filing date: Nov 13, 2015
Priority date: Mar 14, 2006
Publication date: Sep 6, 2016
Grant date: Sep 6, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A software engine for decomposing work to be done into tasks, and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation, and provide better opportunities for specialization and optimization of dynamically generated code. A specific application of this engine, a software three dimensional (3D) graphical image renderer, is described.
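As a rough illustration of what "dynamic code generation, with run-time specialization of variables" can mean, the sketch below builds the source of a small routine at run time, bakes a known value into it as a constant, and compiles it. This is a minimal Python sketch under stated assumptions, not the patented engine: the function name `specialize_blend`, the blend operation, and the use of Python's built-in `compile` and `exec` are all choices made for this example.

```python
def specialize_blend(alpha):
    """Generate, at run time, a blend routine with `alpha` baked in.

    Hypothetical example: the routine name and the blend formula are
    assumptions for illustration, not taken from the patent.
    """
    if alpha == 1.0:
        # The destination term vanishes, so the generated code skips it.
        body = "return src"
    elif alpha == 0.0:
        # The source term vanishes, so the generated code skips it.
        body = "return dst"
    else:
        # General case: both weights become literal constants in the code.
        body = f"return src * {alpha!r} + dst * {1.0 - alpha!r}"

    source = f"def blend(src, dst):\n    {body}\n"
    namespace = {}
    # compile() and exec() are Python built-ins; the generated function
    # ends up in `namespace` under the name used in the source text.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["blend"]


opaque = specialize_blend(1.0)   # generated body is just "return src"
half = specialize_blend(0.5)     # generated body multiplies by constants
```

The point of the sketch is that the value known only at run time decides which code is emitted, so the specialized routine carries no branch on `alpha` when it is later executed many times.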

First claim

Opening claim text (preview).

The invention claimed is:

1. In a computer system having a processor, the processor having multiple processing cores, a parallel task engine for performing tasks on data, the parallel task engine comprising: an input for receiving tasks; a scheduler for decomposing the tasks at run-time into one or more new tasks; and a run-time dynamic code generator for generating, for the new tasks, operation routines, the run-time dynamic code generator comprising a dynamic compiler, the dynamic compiler being adapted to output the operation routines for execution, wherein the scheduler further is for distributing and assigning the new tasks to multiple processing cores for performing in parallel the new tasks on at least a portion of the data by executing the dynamically compiled operation routines; and wherein at least a portion of the scheduler operations of decomposing the tasks and the distributing and assigning the new tasks are dependent on operating characteristics of the processor.

2. The parallel task engine of claim 1, wherein both the decomposing the tasks and the distributing and assigning the new tasks are dependent on operating characteristics of the processor.

3. The parallel task engine of claim 2, wherein the operating characteristic of the processor is the number of processing cores.

4. The parallel task engine of claim 1, wherein the scheduler makes run-time decomposition choices based on a quality of the operation routines generated by the dynamic compiler.

5. The parallel task engine of claim 4, wherein the quality of the operation routines is determined by performing one or more of: analysing the operation routines, measuring characteristics of the operation routines, and obtaining statistics about the operation routines from the dynamic compiler.

6. The parallel task engine of claim 1, wherein the processor is a CPU.

7. The parallel task engine of claim 1, wherein the decomposing is dependent on at least one policy selected from a given set of policies, wherein the scheduler makes the selection of the at least one policy as a function of characteristics of the operation routines.

8. The parallel task engine of claim 7, wherein the scheduler selects the policy for decomposition which yields the highest estimated performance, based on an estimated performance of the operation routines.

9. The parallel task engine of claim 7, wherein the given set of policies includes: decomposing a task into one or more new tasks by partitioning the data on which the task is to be performed into one or more subsets of that data, each new task being responsible for performing the same operation as the original task on a corresponding data subset; decomposing a task into one or more new tasks, each of which performs a different operation than the original task, but which performs this operation on the same data set as the original task; and decomposing a task into one or more new tasks, by partitioning an individual datum of the data on which the task is to be performed, into sub-components, each new task creating one sub-component of each resulting datum for all the data.

10. The parallel task engine of claim 1, wherein the run-time dynamic code generator further comprises an optimizer, the optimizer taking as input an operation routine from the operation routines, or a pointer to an operation routine from the operation routines, the optimizer producing as output an output operation routine, or a pointer to the output operation routine, which is semantically equivalent to the operation routine at the input.

11. In a computer system having a processor, the processor having multiple processing cores, a method for performing tasks on data, the method comprising: receiving tasks; decomposing the tasks at run-time into one or more new tasks; generating for the new tasks at run-time, operation routines, the generating comprising outputting the operation routines for execution using a dynamic compiler; distributing and assigning the new tasks to multiple processing cores; and the multiple processing cores performing the new tasks in parallel on at least part of the data by executing the operation routines; wherein at least one of the decomposing the tasks and the distributing and assigning the new tasks are dependent on operating characteristics of the processor.

12. The method of claim 11, wherein both the decomposing the tasks and the distributing and assigning the new tasks are dependent on operating characteristics of the processor.

13. The computer system of claim 11, wherein the operating characteristic of the processor is the number of processing cores.

14. The method of claim 11, wherein the processor is a CPU.

15. The method of claim 11, wherein the decomposing is dependent on at least one policy selected from a given set of policies, the method further comprising making the selection of the at least one policy as a function of characteristics of the code.

16. The method of claim 15, further comprising selecting the policy for decomposition which yields the highest estimated performance, based on an estimated performance of the operation routines.

17. The method of claim 11, wherein the decomposing the tasks is performed according to at least one of the following policies: decomposing a task into one or more new tasks by partitioning the data on which the task is to be performed into one or more subsets of that data, each new task being responsible for performing the same operation as the original task on a corresponding data subset; decomposing a task into one or more new tasks, each of which performs a different operation than the original task, but which performs this operation on the same data set as the original task; and decomposing a task into one or more new tasks, by partitioning an individual datum of the data on which the task is to be performed, into sub-components, each new task creating one sub-component of each resulting datum for all the data.

18. The method of claim 11, wherein the tasks comprise graphics processing tasks for 3D objects defined as a collection of geometric primitives, and wherein the decomposing comprises decomposing the graphics processing tasks into one or more new graphics processing tasks.

19. The method of claim 18, further comprising pixel processing tasks which draw the 3D objects to a rendered image, wherein the decomposing comprises decomposing the pixel processing tasks into one or more new pixel processing tasks whereby at least two of the new pixel processing tasks contain fragments of non-overlapping regions in the rendered image, and the new pixel processing tasks are assigned to at least two job loops.

20. In a computer system having multiple processing cores, a method for performing tasks on data, the method comprising: decomposing the tasks at run-time to create new tasks; dynamically compiling code for the new tasks at run-time using a dynamic code generator comprising a dynamic compiler; distributing and assigning the new tasks to two or more processing cores for executing the dynamically compiled code, in parallel, for performing the new tasks on at least a portion of the data; wherein at least one of the decomposing the tasks and the distributing and assigning the new tasks are dependent on operating characteristics of the processor.
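The three decomposition policies listed in claims 9 and 17, and the idea in claims 3 and 13 that the core count can drive scheduling, can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the claimed scheduler: the helper names (`split_by_data`, `split_by_operation`, `split_by_component`, `run_parallel`) and the representation of a task as a `(routine, data)` pair are hypothetical. In standard CPython a thread pool shows the scheduling structure rather than a real multi-core speedup for pure-Python work.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_by_data(routine, data, parts):
    """Policy 1: same operation, each new task gets a subset of the data."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [(routine, data[i:i + size]) for i in range(0, len(data), size)]


def split_by_operation(routines, data):
    """Policy 2: different operations, each over the same data set."""
    return [(routine, data) for routine in routines]


def split_by_component(component_fns, data):
    """Policy 3: each new task creates one sub-component of every datum."""
    def make(fn):
        return lambda items: [fn(item) for item in items]
    return [(make(fn), data) for fn in component_fns]


def run_parallel(tasks, workers=None):
    """Distribute (routine, data) tasks over a pool sized by core count."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(routine, data) for routine, data in tasks]
        # Results come back in submission order, one per task.
        return [future.result() for future in futures]


# Usage: partition a list across as many tasks as there are cores.
values = list(range(10))
cores = os.cpu_count() or 1
chunks = run_parallel(split_by_data(lambda c: [x * 2 for x in c], values, cores))
doubled = [x for chunk in chunks for x in chunk]
```

A scheduler in the spirit of claims 7 and 8 would choose among these helpers using some performance estimate of the generated routines; that selection logic is left out here because the claims preview does not describe how the estimate is computed.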

Assignees

Transgaming Inc, Google Inc

Inventors

Classifications

  • Request control · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

  • Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Loops · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9436451B2 cover?
A software engine for decomposing work to be done into tasks, and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation, and provide better opportunities for specialization…
Who is the assignee on this patent?
Transgaming Inc, Google Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 6, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).