Efficient tail calculation to exploit data correlation
US-2016253671-A1 · Sep 1, 2016 · US
US10108670B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10108670-B2 |
| Application number | US-201514830387-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 19, 2015 |
| Priority date | Aug 19, 2015 |
| Publication date | Oct 23, 2018 |
| Grant date | Oct 23, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Methods and systems for sorting a dataset include partitioning the dataset into 2npartitions, where n is a number of available processors. A first quicksort is performed in parallel across pairs of partitions based on a pivot using a plurality of processors. A second quicksort is performed in parallel on unsorted elements within each partition based on the pivot, where the unsorted elements were left unsorted by the first quicksort. Misplaced elements from a left side of the dataset are swapped with misplaced elements from a right side of the dataset to produce a left dataset that has elements equal to or lower than the pivot and a right dataset that has elements equal to or higher than the pivot.
Opening claim text (preview).
The invention claimed is: 1. A method for parallelizing a process of sorting a dataset to accelerate the process of sorting the dataset, comprising: partitioning the dataset into 2n partitions, where n is a number of available processors; assigning pairs of the partitions to respective ones of the available processors, wherein a given one of the pairs of the partitions includes a first partition corresponding to a first side of the dataset and a second partition corresponding to a second side of the dataset; performing a first quicksort in parallel across the pairs of partitions based on a pivot using the available processors, including swapping elements of the first partition larger than the pivot with elements of the second partition that are smaller than the pivot; performing a second quicksort in parallel across those partitions having unsorted elements based on the pivot, wherein the unsorted elements are those elements that were left unsorted by the first quicksort; adjusting the first and second sides of the dataset so that a number of first misplaced elements of the first side of the dataset is equal to a number of second misplaced element of the second side of the dataset; and swapping the first misplaced elements from the first side of the dataset with the second misplaced elements from the second side of the dataset to produce a first dataset that has elements equal to or lower than the pivot and a second dataset that has elements equal to or higher than the pivot. 2. The method of claim 1 , wherein the first quicksort performed on the given pair of partitions is halted when one partition of the given pair of partitions has been fully sorted. 3. The method of claim 1 , further comprising recursively repeating said steps of partitioning, performing a first quicksort, performing a second quicksort, and swapping on each of the first and second datasets. 4. The method of claim 3 , wherein the pairs partitions are assigned to the available processors in proportion to a number of data elements they include. 5. The method of claim 1 , wherein adjusting the first and second sides of the dataset further comprises changing a cutline between the first and second sides of the dataset without moving any data elements. 6. The method of claim 1 , wherein said steps of partitioning, performing the first quicksort, performing the second quicksort, and swapping misplaced elements are performed as part of an aggregate function in a parallel computing system. 7. The method of claim 1 , wherein adjusting the first and second sides of the dataset further comprises taking a difference between the first misplaced elements and the second misplaced elements, dividing the difference by two, and moving a number of misplaced elements resulting from the division to one of the first and second sides so that the number of first misplaced elements is equal to the number of second misplaced elements. 8. A computer readable storage medium comprising a computer readable program, wherein the computer readable program when executed on a computer causes the computer to perform a method for parallelizing a process of sorting a dataset to accelerate the process of sorting the dataset, the od comprising the steps of: partitioning the dataset into 2n partitions, where n is a number of available processors; assigning pairs of the partitions to respective ones of the available processors, wherein a given one of the pairs of the partitions includes a first partition corresponding to a first side of the dataset and a second partition corresponding to a second side of the dataset; performing a first quicksort in parallel across the pairs of partitions based on a pivot using the available processors, including swapping elements of the first partition larger than the pivot with elements of the second partition that are smaller than the pivot; performing a second quicksort in parallel across those partitions having unsorted elements based on the pivot, wherein the unsorted elements are those elements that were left unsorted by the first quicksort; adjusting the first and second sides of the dataset so that a number of first misplaced elements of the first side of the dataset is equal to a number of second misplaced elements of the second side of the dataset; and swapping the first misplaced elements from the first side of the dataset with the second misplaced elements from the second side of the dataset to produce a first dataset that has elements equal to or lower than the pivot and a second dataset that has elements equal to or higher than the pivot. 9. The computer program product of claim 8 , wherein the first quicksort performed on the given pair of partitions is halted when one partition of the given pair of partitions has been fully sorted. 10. The computer program product of claim 8 , further comprising recursively repeating said steps of partitioning, performing a first quicksort, performing a second quicksort, and swapping on each of the first and second datasets. 11. The computer program product of claim 8 , wherein the pairs of partitions are assigned to the available processors in proportion to a number of data elements they include. 12. The computer program product of claim 8 , wherein adjusting the first and second sides of the dataset further comprises changing a cutline between the first and second sides without moving any data elements. 13. The computer program product of claim 8 , wherein said steps of partitioning, performing the first quicksort, performing the second quicksort, and swapping misplaced elements are performed as part of an aggregate function in a parallel computing system. 14. A system for parallelizing a process of sorting a dataset to accelerating the process of sorting the dataset, comprising: a plurality of processors operatively coupled to a memory storing the dataset; and a sort control module configured to partition the dataset into 2n partitions, where n is a number of available processors, to assign pairs of the partitions to respective ones of the available processors, wherein a given one of the pairs of the partitions includes a first partition corresponding to a first side of the dataset and a second partition corresponding to a second side of the dataset, to trigger a performance of a first quicksort in parallel across the pairs of partitions based on a pivot, including swapping elements of the first partition larger than the pivot with elements of the second partition that are smaller than the pivot, to trigger a performance of a second quicksort in parallel across those partitions having unsorted elements based on the pivot, wherein the unsorted elements are those elements that were left unsorted by the first quicksort, to adjust the first and second sides of the dataset so that a number of first misplaced elements of the first side of the dataset is equal to a number of second misplaced elements of the second side of the dataset, and to trigger a swap of the first misplaced elements from the first side of the dataset with the second misplaced elements from the second side of the dataset to produce a first dataset that has elements equal to or lower than the pivot and a second dataset that has elements equal to or higher than the pivot. 15. The system of claim 14 , wherein the first quicksort performed on the given pair of partitions is halted when one partition of the given pair of partitions has been fully sorted. 16. The system of claim 14 , wherein the sort control module is further configured to recursively repeat the partitioning, the triggering of the first quicksort, the triggeri
Physics · mapped topic
Physics · mapped topic
Physics · mapped topic
Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers {sorting methods in general}(G06F7/36 takes precedence) · CPC title
Merging, i.e. combining data contained in ordered sequence on at least two record carriers to produce a single carrier or set of carriers having all the original data in the ordered sequence {merging methods in general}(G06F7/36 takes precedence) · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.