Tunable hardware sort engine for performing composite sorting algorithms
US-9710503-B2 · Jul 18, 2017 · US
US2016378801A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016378801-A1 |
| Application number | US-201514750385-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 25, 2015 |
| Priority date | Jun 25, 2015 |
| Publication date | Dec 29, 2016 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Techniques are disclosed for sorting an input data set. A sort tool determines a distribution of values of a data set that includes a plurality of data records. The sort tool partitions the data set into a plurality of subsets based on the distribution. Each of the data records is inserted into one of the subsets based on a corresponding sort value of the data record. The sort tool identifies one or more of the subsets that contain at least two distinct sort values. In each of the identified subsets, the data records are sorted by a corresponding sort value of the data record.
Opening claim text (preview).
1 - 7 . (canceled) 8 . A computer program product, comprising: a computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code configured to perform an operation comprising: determining a distribution of values of a data set that includes a plurality of data records, partitioning the data set into a plurality of subsets based on the distribution, wherein each of the data records is inserted into one of the subsets based on a corresponding sort value of the data record, identifying one or more of the subsets that contain at least two distinct sort values, and sorting, in each of the identified subsets, the data records by a corresponding sort value of the data record. 9 . The computer program product of claim 8 , wherein the operation further comprises: merging the plurality of subsets. 10 . The computer program product of claim 8 , wherein determining the distribution of values of the data set comprises: sampling the data set; and generating an equi-depth histogram of the plurality of data records, wherein the equi-depth histogram determines the distribution of the values of the data records, and wherein the equi-depth histogram determines an amount of the subsets. 11 . The computer program product of claim 8 , wherein partitioning the data records into the plurality of subsets based on the distribution comprises, for each subset: initializing a single value flag of the subset as true, wherein the single value flag set as true indicates that the corresponding sort values of the data records of the subset do not differ from one another; and for each data record inserted in the subset: evaluating the data record to determine whether a corresponding value of the data record is equal to a start key value of the subset, wherein the start key value is a corresponding value of an data record initially inserted into the subset. 12 . The computer program product of claim 11 , wherein identifying the one or more of the subsets comprises, for each data record inserted in the subset: upon determining that the corresponding value of the data record is not equal to the start key value, setting the single value flag of the subset as false. 13 . The computer program product of claim 8 , wherein the operation further comprises: partitioning the data records of the subset into a majority partition and a minority partition, wherein the majority partition contains the data records of the subset that correspond to a single sort value and wherein the minority partition contains the data records of the subset having a sort value that is not equal to the single sort value; and identifying the majority partition as having a single sort value. 14 . The computer program product of claim 8 , wherein a set for the sampling is determined based on database statistics. 15 . A system, comprising: a processor; and a memory storing a program, which, when executed on the processor, performs an operation comprising: determining a distribution of values of a data set that includes a plurality of data records, partitioning the data set into a plurality of subsets based on the distribution, wherein each of the data records is inserted into one of the subsets based on a corresponding sort value of the data record, identifying one or more of the subsets that contain at least two distinct sort values, and sorting, in each of the identified subsets, the data records by a corresponding sort value of the data record. 16 . The system of claim 15 , wherein the operation further comprises: merging the plurality of subsets. 17 . The system of claim 15 , wherein determining the distribution of values of the data set comprises: sampling the data set; and generating an equi-depth histogram of the plurality of data records, wherein the equi-depth histogram determines the distribution of the values of the data records, and wherein the equi-depth histogram determines an amount of the subsets. 18 . The computer program product of claim 8 , wherein partitioning the data records into the plurality of subsets based on the distribution comprises, for each subset: initializing a single value flag of the subset as true, wherein the single value flag set as true indicates that the corresponding sort values of the data records of the subset do not differ from one another; and for each data record inserted in the subset: evaluating the data record to determine whether a corresponding value of the data record is equal to a start key value of the subset, wherein the start key value is a corresponding value of an data record initially inserted into the subset. 19 . The system of claim 18 , wherein identifying the one or more of the subsets comprises, for each data record inserted in the subset: upon determining that the corresponding value of the data record is not equal to the start key value, setting the single value flag of the subset as false. 20 . The system of claim 15 , wherein the operation further comprises: partitioning the data records of the subset into a majority partition and a minority partition, wherein the majority partition contains the data records of the subset that correspond to a single sort value and wherein the minority partition contains the data records of the subset having a sort value that is not equal to the single sort value; and identifying the majority partition as having a single sort value.
Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc · CPC title
Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers {sorting methods in general}(G06F7/36 takes precedence) · CPC title
Database migration support · CPC title
using data annotations, e.g. user-defined metadata · CPC title
Large Object storage; Management thereof · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.