Parallel processing in plural processors with result register each performing associative operation on respective column data

US9760372B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9760372-B2
Application number: US-201113224090-A
Country: US
Kind code: B2
Filing date: Sep 1, 2011
Priority date: Sep 1, 2011
Publication date: Sep 12, 2017
Grant date: Sep 12, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for combining data values through associative operations. The method includes, with a processor, arranging any number of data values into a plurality of columns according to natural parallelism of the associative operations and reading each column to a register of an individual processor. The processors are directed to combine the data values in the columns in parallel using a first associative operation. The results of the first associative operation for each column are stored in a register of each processor.
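The column-wise combination described in the abstract can be sketched roughly as follows. This is an illustration only, not the patented implementation: worker threads stand in for the individual processors, the accumulator inside `reduce` stands in for a processor register, and the names `arrange_into_columns` and `reduce_columns` are hypothetical. Columns are taken as contiguous chunks so that the order of operands is preserved, which matters for associative operations that are not commutative. The sketch assumes a non-empty input.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def arrange_into_columns(values, num_columns):
    """Split values into at most num_columns contiguous columns (last may be shorter)."""
    rows = -(-len(values) // num_columns)  # ceiling division; assumes values is non-empty
    return [values[i:i + rows] for i in range(0, len(values), rows)]


def reduce_columns(columns, op):
    """Combine each column with op, one worker per column; returns per-column results."""
    # Threads are only a stand-in for the "individual processors" of the abstract.
    with ThreadPoolExecutor(max_workers=len(columns)) as pool:
        return list(pool.map(lambda col: reduce(op, col), columns))


values = list(range(1, 13))                      # 1, 2, ..., 12
columns = arrange_into_columns(values, 4)        # four columns of three values
partials = reduce_columns(columns, operator.add) # one partial result per column
total = reduce(operator.add, partials)           # combine the per-column results
```

Because the operation is associative, combining the per-column results in column order gives the same answer as a single left-to-right pass over the original values.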

First claim

Opening claim text (preview).

What is claimed is:
1. A method for combining data values through associative operations, the method comprising: arranging data values into a plurality of columns according to natural parallelism of the associative operations to form a first data matrix; reading each column to a register of a respective one of a number of processors; directing each of the processors to execute a respective first one of the associative operations, in parallel, on the data values in their respective columns to which the processors are assigned; storing the respective results of the first associative operation for each respective column to a respective register of each processor; arranging the plurality of columns into a plurality of two dimensional matrices; and arranging the two dimensional matrices into a three dimensional matrix.
2. The method of claim 1, in which arranging data values into a plurality of columns comprises transposing rows in a second data matrix into columns in the data matrix.
3. The method of claim 1, further comprising storing the results of the respective first associative operation to the memory of the computing device.
4. The method of claim 1, further comprising: arranging the results of the first associative operation into a number of primary results columns; reading the data values in each of the individual primary results columns into registers of one of the plurality of processors; and directing each of the processors to combine the data values in the primary results column to produce a primary final result in parallel.
5. The method of claim 4, further comprising arranging the primary results columns into a two dimensional results matrix.
6. A method for combining data values through associative operations, the method comprising: arranging data values into a plurality of columns according to natural parallelism of the associative operations to form a first data matrix; reading each column to a register of a respective one of a number of processors; directing each of the processors to execute a respective first one of the associative operations, in parallel, on the data values in their respective columns to which the processors are assigned; storing the respective results of the first associative operation for each respective column to a respective register of each processor; in which the number of columns is greater than the number of processors, the method further comprising: associating the results of the first associative operation with corresponding columns; identifying unprocessed columns which are not associated with a result; assigning one processor to each unprocessed column; and directing the processors to combine the data values in the unprocessed columns through the first associative operation.
7. A method for combining data values through associative operations, the method comprising: arranging data values into a plurality of columns according to natural parallelism of the associative operations to form a first data matrix; reading each column to a register of a respective one of a number of processors; directing each of the processors to execute a respective first one of the associative operations, in parallel, on the data values in their respective columns to which the processors are assigned; storing the respective results of the first associative operation for each respective column to a respective register of each processor; arranging the results of the first associative operation into a number of primary results columns; reading the data values in each of the individual primary results columns into registers of one of the plurality of processors; directing each of the processors to combine the data values in the primary results column to produce a primary final result in parallel; arranging the primary results columns into a two dimensional results matrix; wherein the number of columns in the two dimensional results matrix is greater than the number of processors, the method further comprising: assigning a processor to each unprocessed column; and directing the assigned processor to combine data values in the unprocessed column to produce a secondary result.
8. The method of claim 7, further comprising combining secondary results from the columns of the two dimensional results matrix by performing the respective first associative operation.
9. The method of claim 7, further comprising combining secondary results from the columns of the two dimensional results matrix by performing a prefix scan.
10. The method of claim 7, further comprising combining secondary results from the columns of the two dimensional results matrix by performing a parallel reduction.
11. A method for combining data values through associative operations, the method comprising: arranging data into a data matrix, having a number of columns, according to natural parallelism of the associative operations, the data matrix being stored in a memory device; assigning each of a plurality of processors to a subset of the data stored in a contiguous memory location; directing the plurality of processors to execute a number of respective associative operations on data values in the subsets to produce intermediate results; writing the intermediate respective results to registers in the processors; and producing a final result from the intermediate results, the final result being stored in a memory of a computing device; in which the number of columns is greater than a number of the processors, the method further comprising: associating the intermediate results with corresponding columns; identifying unprocessed columns which are not associated with an intermediate result; assigning one processor to each unprocessed column; and directing the processors to combine data values in the unprocessed columns through the number of associative operations.
12. The method of claim 11, in which the subset of data stored in a contiguous memory location is a column within the data matrix.
13. The method of claim 11, in which the associative operations are performed on data values in the column without writing to memory, reading from memory, or synchronization to produce the intermediate results.
14. The method of claim 11, in which the subset of data stored in a contiguous memory location is written to a register of the assigned processor.
15. The method of claim 11, in which the data matrix is a two dimensional matrix.
16. The method of claim 11, in which the data matrix is a three dimensional matrix.
17. The method of claim 16, in which the intermediate results comprise a two dimensional matrix.
18. The method of claim 17, in which the final result is a one dimensional matrix.
19. The method of claim 16, in which a number of columns in one of the data matrix or the intermediate results is greater than the number of processors, the method further comprising: assigning a processor to each unprocessed column; and directing the assigned processor to combine data values in the unprocessed column according to the associative operations.
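Two ideas in the claims lend themselves to a short sketch: handling more columns than processors by tracking which columns still lack a result (claims 6 and 11), and re-arranging the per-column results into new columns for a further round of combination (claims 4 and 7). The code below is a hedged illustration under stated assumptions, not the claimed method itself: threads stand in for processors, a dictionary stands in for the association between columns and results, and the names `reduce_all_columns` and `hierarchical_reduce` and the parameter `column_height` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def reduce_all_columns(columns, op, num_processors):
    """Reduce every column with op when there may be more columns than workers."""
    results = {}  # column index -> result; a missing index marks an unprocessed column
    with ThreadPoolExecutor(max_workers=num_processors) as pool:
        while len(results) < len(columns):
            # Identify unprocessed columns and assign at most one to each worker.
            pending = [i for i in range(len(columns)) if i not in results]
            batch = pending[:num_processors]
            batch_results = pool.map(lambda i: reduce(op, columns[i]), batch)
            results.update(zip(batch, batch_results))
    return [results[i] for i in range(len(columns))]


def hierarchical_reduce(values, op, column_height, num_processors):
    """Repeat the column reduction on its own results until one value remains."""
    if column_height < 2:
        raise ValueError("column_height must be at least 2 so each round shrinks the data")
    if not values:
        raise ValueError("values must be non-empty")
    while len(values) > 1:
        # Arrange the current values (or prior results) into contiguous columns.
        columns = [values[i:i + column_height]
                   for i in range(0, len(values), column_height)]
        values = reduce_all_columns(columns, op, num_processors)
    return values[0]


# 100 values, columns of height 4, only 3 workers: 25 -> 7 -> 2 -> 1 results.
total = hierarchical_reduce(list(range(1, 101)), operator.add, 4, 3)
```

Each round shrinks the data by roughly a factor of `column_height`, so the number of rounds grows logarithmically with the input size. Claims 9 and 10 name a prefix scan or a parallel reduction as alternatives for the final combination; neither is shown here.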

Assignees

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs (mapping at compile time, see G06F8/451) · CPC title

  • Arithmetic instructions · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9760372B2 cover?
A method for combining data values through associative operations. The method includes, with a processor, arranging any number of data values into a plurality of columns according to natural parallelism of the associative operations and reading each column to a register of an individual processor. The processors are directed to combine the data values in the columns in parallel using a first as…
Who is the assignee on this patent?
Wu Ren, Zhang Bin, Hsu Meichun, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F9/30036. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 12, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).