General purpose software parallel task engine

US9477452B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9477452-B2
Application numberUS-201514631618-A
CountryUS
Kind codeB2
Filing dateFeb 25, 2015
Priority dateMar 14, 2006
Publication dateOct 25, 2016
Grant dateOct 25, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A software engine for decomposing work to be done into tasks, and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation, and provide better opportunities for specialization and optimization of dynamically generated code. A specific application of this engine, a software three dimensional (3D) graphical image renderer, is described.

First claim

Opening claim text (preview).

The invention claimed is: 1. In a computer system, a parallel task engine for performing tasks on data, the parallel task engine comprising: an input for receiving tasks, each task for performing an operation; a scheduler for decomposing the tasks into one or more new tasks, the decomposing being dependent on at least one policy selected from a given set of policies; a run-time dynamic code generator for generating, for the new tasks, operation routines, the run-time dynamic code generator comprising a dynamic compiler, the dynamic compiler being adapted to output the operation routines for execution; a set of job loops, at least one job loop of the set of job loops for performing the new tasks on at least part of the data by executing the operation routines, the set of job loops running in parallel on two or more CPUs; the scheduler for distributing and assigning the new tasks to the at least one job loop of the set of job loops; and the scheduler for making the selection of the at least one policy based on general heuristics. 2. The parallel task engine of claim 1 , wherein the given set of policies include one or more of: by-domain policies, and by-component policies. 3. The parallel task engine of claim 2 , wherein the scheduler performs by-domain decomposition on a given task by modifying data pointers or parameters of the given task. 4. The parallel task engine of claim 2 , wherein the scheduler performs by-component decomposition on a given task having a full datum operation, by dividing the full datum operation into a plurality of component operations, and determines whether the component operations must be applied sequentially, in parallel or in an arbitrary order. 5. The parallel task engine of claim 1 , wherein the scheduler decomposes the tasks into new independent tasks to be performed in parallel on two or more CPUs. 6. The parallel task engine of claim 5 , wherein the new independent tasks associated with a given task have different parameters and data pointers than the given task and perform the same operation associated with the given task. 7. The parallel task engine of claim 1 , further comprising a cache for retaining and retrieving the operation routines, wherein the cache comprises a directory of cache entries, each entry comprising: an operation routine pointer, and a tag to be matched when searching the cache for operation routines, the tag consisting of an operation routine identifier and a context, the context having a collection of variables, or a pointer to a collection of variables. 8. The parallel task engine of claim 7 , wherein the run-time dynamic code generator further comprises an optimizer, the optimizer taking as input an operation routine from the operation routines, or a pointer to an operation routine from the operation routines, the optimizer producing as output an output operation routine, or a pointer to the output operation routine, which is semantically equivalent to the operation routine at the input. 9. The parallel task engine of claim 1 , wherein the tasks comprise graphics processing tasks for 3D objects defined as a collection of geometric primitives, and further wherein the scheduler is for decomposing the graphics processing tasks into one or more new graphics processing tasks. 10. The parallel task engine of claim 9 , further comprising a rasterization module for identifying pixel fragments covered by the primitives, the rasterization module being configured for: assembling vertices of each of the primitives; constructing a polygon covering the pixel fragments; scan converting the polygon to obtain coordinates of the scan converted polygon; and storing the coordinates in an outline buffer; the coordinates being used in the identifying of the pixel fragments covered by the primitives. 11. The parallel task engine of claim 10 , wherein the rasterization module is further configured for clipping the polygon against a visible region prior to the scan converting. 12. In a computer system, a method for performing tasks on data, the method comprising: receiving tasks; decomposing the tasks into one or more new tasks, the decomposing being dependent on at least one policy selected from a given set of policies; generating, for the new tasks, operation routines, the generating comprising outputting the operation routines for execution using a dynamic compiler; making the selection of the at least one policy based on general heuristics; providing a set of job loops; distributing and assigning the new tasks to at least one job loop of the set of job loops; running the set of job loops in parallel on two or more CPUs; and the at least one job loop of the set of job loops performing the new tasks on at least part of the data by executing the operation routines. 13. The method of claim 12 , wherein the generating of operation routines uses a context having a collection of variables, or a pointer to a collection of variables, specifying at least one of options, parameters, conditions, constant data, and other data apart from the data on which tasks are performed, the context for influencing the tasks performed on the data. 14. The method of claim 12 , wherein the tasks comprise a matrix multiplication of a vector by a matrix, the matrix comprised in the context. 15. The method of claim 12 , wherein tasks and new tasks comprise: a command, comprising: a name having a numeric or symbolic identifier, or a pointer to such an identifier, the name defining an abstract operation to be performed on data; zero, one, or more parameters; and one or more pointers or parameters, identifying the data on which the command is to be performed. 16. The method of claim 12 , further comprising, producing, from an input operation routine from the operation routines, or a pointer to an operation routine from the operation routines, an output operation routine, or a pointer to the output operation routine, which is semantically equivalent to the input operation routine. 17. The method of claim 16 , wherein the generating of operation routines uses a context and wherein the production of an output operation routine, or a pointer to the output operation routine, results in an output operation routine which is equivalent to the input operation routine under limitations or conditions described by the context. 18. The method of claim 12 , wherein the decomposing the tasks is performed according to at least one of the following policies: decomposing a task into one or more new tasks by partitioning the data on which the task is to be performed into one or more subsets of that data, each new task being responsible for performing the same operation as the original task on a corresponding data subset; decomposing a task into one or more new tasks, each of which performs a different operation than the original task, but which performs this operation on the same data as the original task; and decomposing a task into one or more new tasks, by partitioning an individual datum of the data on which the task is to be performed, into sub-components, each new task creating one sub-component of each resulting datum for all the data. 19. The method of claim 12 , further comprising estimating, for an operation routine from the operation routines, a performance, the operation routine comprising instruction code, the estimating comprising at least one of analyzing, inspecting and measuring characteristics of the operation routine instruction code. 20. The method of claim 19 , further c

Assignees

Inventors

Classifications

  • Indexing scheme for editing of 3D models · CPC title

  • Loops · CPC title

  • Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes · CPC title

  • considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

  • G06F9/4881Primary

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9477452B2 cover?
A software engine for decomposing work to be done into tasks, and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation, and provide better opportunities for specialization…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 25 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).