Merging and sorting arrays on an SIMD processor

US9740659B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9740659-B2
Application numberUS-201414219391-A
CountryUS
Kind codeB2
Filing dateMar 19, 2014
Priority dateMar 19, 2014
Publication dateAug 22, 2017
Grant dateAug 22, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and articles of manufacture for merging and sorting arrays on a processor are provided herein. A method includes splitting an input array into multiple sub-arrays across multiple processing elements; merging the multiple sub-arrays into multiple vectors; and sorting the multiple vectors by comparing and swapping one or more vector elements among the multiple vectors.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: splitting an input array into multiple sub-arrays across multiple processing elements; merging the multiple sub-arrays into multiple vectors, wherein each of the multiple vectors contains one element from each of the multiple sub-arrays; sorting the multiple vectors by comparing and swapping one or more vector elements among the multiple vectors, wherein said comparing and said swapping comprises performing a set of compare-swap instructions on a first set of one or more registers, and wherein the number of compare-swap instructions in the set of compare-swap instructions is represented as NlogN+1, wherein N is a predetermined value; and writing-back results of said sorting on a second set of one or more registers; wherein said splitting, said merging, said sorting, and said writing-back are carried out by at least one computing device. 2. The method of claim 1 , wherein said sorting comprises sorting the multiple vectors via an odd-even network. 3. The method of claim 2 , comprising: applying the odd-even network directly across the multiple vectors. 4. The method of claim 1 , wherein said sorting comprises sorting the multiple vectors via a bitonic network. 5. The method of claim 4 , comprising: applying the bitonic network directly across the multiple vectors. 6. The method of claim 1 , comprising: selecting a sub-array size. 7. The method of claim 6 , wherein said splitting comprises splitting the input array into multiple sub-arrays of the selected sub-array size. 8. The method of claim 6 , wherein said selecting comprises selecting the sub-array size based on a latency value. 9. The method of claim 6 , wherein said selecting comprises selecting the sub-array size based on a capacity to hide vector load latency. 10. The method of claim 1 , wherein each of said multiple processing elements comprises a given width. 11. The method of claim 1 , comprising: generating the set of compare-swap instructions for said sorting. 12. A computer program product, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to: split an input array into multiple sub-arrays across multiple processing elements; merge the multiple sub-arrays into multiple vectors, wherein each of the multiple vectors contains one element from each of the multiple sub-arrays; sort the multiple vectors by comparing and swapping one or more vector elements among the multiple vectors, wherein said comparing and said swapping comprises performing a set of compare-swap instructions on a first set of one or more registers, and wherein the number of compare-swap instructions in the set of compare-swap instructions is represented as N log N+1, wherein N is a predetermined value; and write-back results of said sorting on a second set of one or more registers. 13. The computer program product of claim 12 , wherein said sorting comprises sorting the multiple vectors via an odd-even network or a bitonic network. 14. The computer program product of claim 12 , wherein the program instructions further cause the computing device to: select a sub-array size. 15. A system comprising: a memory; and at least one processor coupled to the memory and configured for: splitting an input array into multiple sub-arrays across multiple processing elements; merging the multiple sub-arrays into multiple vectors, wherein each of the multiple vectors contains one element from each of the multiple sub-arrays; sorting the multiple vectors by comparing and swapping one or more vector elements among the multiple vectors, wherein said comparing and said swapping comprises performing a set of compare-swap instructions on a first set of one or more registers, and wherein the number of compare-swap instructions in the set of compare-swap instructions is represented as N log N+1, wherein N is a predetermined value; and writing-back results of said sorting on a second set of one or more registers. 16. A method comprising: selecting a desired sub-array size based on one or more parameters; splitting an input array into multiple sub-arrays of the selected size across multiple simple instruction, multiple data (SIMD) lanes; merging the multiple simple instruction, multiple data (SIMD) lanes into multiple vectors, wherein each of the multiple vectors contains one element from each of the multiple sub-arrays; sorting the multiple vectors by performing a set of compare-swap instructions, applied to the multiple vectors, on a first set of one or more registers, wherein the number of compare-swap instructions in the set of compare-swap instructions is represented as N log N+1, wherein N is a predetermined value; and writing-back results of said sorting on a second set of one or more registers; wherein said selecting, said splitting, said merging, said sorting, and said writing-back are carried out by at least one computing device. 17. The method of claim 16 , wherein said sorting comprises sorting the multiple vectors via an odd-even network. 18. The method of claim 17 , comprising: applying the odd-even network directly across the multiple vectors. 19. The method of claim 16 , wherein said sorting comprises sorting the multiple vectors via a bitonic network. 20. The method of claim 19 , comprising: applying the bitonic network directly across the multiple vectors.

Assignees

Inventors

Classifications

  • Vector processors · CPC title

  • One dimensional arrays, e.g. rings, linear arrays, buses · CPC title

  • Array of vector units · CPC title

  • Compare instructions, e.g. Greater-Than, Equal-To, MINMAX · CPC title

  • Combined merging and sorting · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9740659B2 cover?
Methods, systems, and articles of manufacture for merging and sorting arrays on a processor are provided herein. A method includes splitting an input array into multiple sub-arrays across multiple processing elements; merging the multiple sub-arrays into multiple vectors; and sorting the multiple vectors by comparing and swapping one or more vector elements among the multiple vectors.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F15/8015. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Aug 22 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).