General purpose software parallel task engine

US9875138B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9875138-B2
Application numberUS-201615225362-A
CountryUS
Kind codeB2
Filing dateAug 1, 2016
Priority dateMar 14, 2006
Publication dateJan 23, 2018
Grant dateJan 23, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A software engine for decomposing work to be done into tasks, and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation, and provide better opportunities for specialization and optimization of dynamically generated code. A specific application of this engine, a software three dimensional (3D) graphical image renderer, is described.

First claim

Opening claim text (preview).

What is claimed is: 1. A system comprising: a processor having multiple processing cores; and a parallel task engine for performing graphic processing tasks on data, the parallel task engine comprising: an input for receiving graphic processing tasks for rendering 3D objects; a scheduler for decomposing the tasks at run-time into one or more new tasks; and a run-time dynamic code generator for the new tasks, operation routines, the run-time dynamic code generator including a dynamic compiler, the dynamic compiler being adapted to output the operation routines for execution, wherein the scheduler further is for distributing and assigning the new tasks to multiple processing cores of the processor for performing in parallel the new tasks on at least a portion of the data by executing the dynamically compiled operation routines. 2. The system of claim 1 , wherein the 3D objects are defined as a collection of geometric primitives. 3. The system of claim 1 , wherein at least a portion of the scheduler operations of decomposing the tasks and the distributing and assigning the new tasks are dependent on an operating characteristic of the processor. 4. The system of claim 3 , wherein the operating characteristic of the processor is the number of processing cores of the processor. 5. The system of claim 1 , wherein the scheduler is configured to make run-time decomposition choices based on a quality of the operation routines generated by the dynamic compiler. 6. The system of claim 5 , wherein the quality of the operation routines is determined by performing one or more of: measuring characteristics of the operation routines, and obtaining statistics about the operation routines from the dynamic compiler. 7. The system of claim 1 , wherein the decomposing is dependent on at least one policy selected from a given set of policies, wherein the scheduler makes the selection of the at least one policy as a function of characteristics of the operation routines. 8. The system of claim 7 , wherein the scheduler selects the policy for decomposition which yields the highest estimated performance, based on an estimated performance of the operation routines. 9. The system of claim 7 , wherein the given set of policies includes: decomposing a task into one or more new tasks by partitioning the data on which the task is to be performed into one or more subsets of that data, each new task being responsible for performing the same operation as the original task on a corresponding data subset; decomposing a task into one or more new tasks, each of which performs a different operation than the original task, but which performs this operation on the same data set as the original task; and decomposing a task into one or more new tasks, by partitioning an individual datum of the data on which the task is to be performed, into sub-components, each new task creating one sub-component of each resulting datum for all the data. 10. The system of claim 1 , wherein the run-time dynamic code generator further comprises an optimizer, the optimizer taking as input an operation routine from the operation routines, or a pointer to an operation routine from the operation routines, the optimizer producing as output an output operation routine, or a pointer to the output operation routine, which is semantically equivalent to the operation routine at the input. 11. In a computer system having a processor, the processor having multiple processing cores, a method for performing graphics processing tasks on data, the method comprising: receiving graphics processing tasks for rendering 3D objects; decomposing the tasks at run-time into one or more new tasks; generating for the new tasks at run-time, operation routines, the generating including outputting the operation routines for execution using a dynamic compiler; distributing and assigning the new tasks to multiple processing cores of the processor; and the multiple processing cores performing the new tasks in parallel on at least part of the data by executing the operation routines. 12. The method of claim 11 , wherein the 3D objects are defined as a collection of geometric primitives. 13. The method of claim 11 , wherein the operating characteristic of the processor is the number of processing cores of the processor. 14. The method of claim 11 , wherein the decomposing is dependent on at least one policy selected from a given set of policies, the method further comprising making the selection of the at least one policy as a function of characteristics of an operating characteristic of the operation routines. 15. The method of claim 14 , further comprising selecting the policy for decomposition which yields the highest estimated performance, based on an estimated performance of the operation routines. 16. The method of claim 11 , wherein the decomposing the tasks is performed according to at least one of the following policies: decomposing a task into one or more new tasks by partitioning the data on which the task is to be performed into one or more subsets of that data, each new task being responsible for performing the same operation as the original task on a corresponding data subset; decomposing a task into one or more new tasks, each of which performs a different operation than the original task, but which performs this operation on the same data set as the original task; and decomposing a task into one or more new tasks, by partitioning an individual datum of the data on which the task is to be performed, into sub-components, each new task creating one sub-component of each resulting datum for all the data. 17. The method of claim 11 , wherein the 3D objects are defined as a collection of geometric primitives. 18. The method of claim 17 , wherein the graphics processing tasks include pixel processing tasks that draw the 3D objects to a rendered image, wherein the decomposing includes decomposing the pixel processing tasks into one or more new pixel processing tasks whereby at least two of the new pixel processing tasks contain fragments of non-overlapping regions in the rendered image, and the new pixel processing tasks are assigned to at least two job loops. 19. The method of claim 11 , wherein the 3D objects are defined as a collection of geometric primitives and wherein the decomposing is dependent on a policy of drawing primitives in a first-in, first-out order.

Assignees

Inventors

Classifications

  • General purpose rendering architectures · CPC title

  • Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes · CPC title

  • Program synchronisation; Mutual exclusion, e.g. by means of semaphores · CPC title

  • G06F9/4881Primary

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9875138B2 cover?
A software engine for decomposing work to be done into tasks, and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation, and provide better opportunities for specialization…
Who is the assignee on this patent?
Google Inc, Google Llc
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 23 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).