Dynamic fast percentiler
US-2018173785-A1 · Jun 21, 2018 · US
US10496731B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10496731-B2 |
| Application number | US-201916264581-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2019 |
| Priority date | Mar 1, 2011 |
| Publication date | Dec 3, 2019 |
| Grant date | Dec 3, 2019 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers—all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.
Opening claim text (preview).
What is claimed: 1. A method, comprising: calculating an ordinality for a real number in a set of real numbers; using the calculated ordinality for the real number to increment a count, indicating that the real number was encountered, for a particular bucket in a digest comprised of one or more buckets stored in memory and organized as a hierarchical data structure of buckets, wherein each bucket of the one or more buckets is associated with at least an ordinality, a range of numerical values, and a count of the real numbers encountered within the range of numerical values; determining that a sum of the counts for the particular bucket, any sibling buckets, and a parent bucket satisfies a compression criteria; based on the determination that the sum of the counts satisfies the compression criteria, compressing the particular bucket and any sibling buckets into the parent bucket; wherein the method is performed by one or more computing devices. 2. The method of claim 1 , wherein the count for the particular bucket is weighted more than another count for a sibling bucket. 3. The method of claim 1 , wherein the count for the particular bucket is weighted more than another count for a sibling bucket based on a historical analysis of one or more digits associated with the particular bucket. 4. The method of claim 1 , wherein the count for the particular bucket is weighted less than another count for a sibling bucket. 5. The method of claim 1 , wherein the count for the particular bucket is weighted less another count for a sibling bucket based on a historical analysis of one or more digits associated with the particular bucket. 6. The method of claim 1 , wherein the compressing the particular bucket and any sibling buckets further comprises: creating an overall count by adding the counts from the particular bucket, any sibling buckets, and the parent bucket. 7. The method of claim 1 , wherein the compressing the particular bucket and any sibling buckets further comprises: creating an overall count by adding the counts from the particular bucket, any sibling buckets, and the parent bucket; and deleting the particular bucket and any sibling buckets. 8. The method of claim 1 , wherein the compressing the particular bucket and any sibling buckets further comprises: creating an overall count by adding the counts from the particular bucket, any sibling buckets, and the parent bucket together; assigning the overall count to the parent bucket; and deleting the particular bucket and any sibling buckets. 9. The method of claim 1 , wherein the one or more buckets in the digest are hierarchically organized into one or more levels. 10. The method of claim 1 , wherein the one or more buckets in the digest are hierarchically organized into one or more levels, wherein each level in the digest contains all buckets of a given ordinality. 11. The method of claim 1 , wherein the calculating an ordinality further comprises: calculating a representation for the real number in scientific notation, wherein the representation includes a mantissa and an exponent; and computing an ordinality for the real number by subtracting from the exponent a count of significant digits, including significant zeros, in the mantissa that appear to the right of any decimal point in the mantissa. 12. The method of claim 1 , wherein each bucket in the digest is also associated with a certain real number and a number of digits of precision, and wherein a given bucket in the digest includes all numbers that when truncated to the number of digits of precision equal the certain real number. 13. One or more non-transitory computer-readable storage media, storing instructions, which when executed by one or more processors cause performance of: calculating an ordinality for a real number in a set of real numbers; using the calculated ordinality for the real number to increment a count, indicating that the real number was encountered, for a particular bucket in a digest comprised of one or more buckets stored in memory and organized as a hierarchical data structure of buckets, wherein each bucket of the one or more buckets is associated with at least an ordinality, a range of numerical values, and a count of the real numbers encountered within the range of numerical values; determining that a sum of the counts for the particular bucket, any sibling buckets, and a parent bucket satisfies a compression criteria; based on the determination that the sum of the counts satisfies the compression criteria, compressing the particular bucket and any sibling buckets into the parent bucket. 14. The one or more non-transitory computer-readable storage media of claim 13 , wherein the count for the particular bucket is weighted more than another count for a sibling bucket. 15. The one or more non-transitory computer-readable storage media of claim 13 , wherein the count for the particular bucket is weighted more than another count for a sibling bucket based on a historical analysis of one or more digits associated with the particular bucket. 16. The one or more non-transitory computer-readable storage media of claim 13 , wherein the count for the particular bucket is weighted less than another count for a sibling bucket. 17. The one or more non-transitory computer-readable storage media of claim 13 , wherein the count for the particular bucket is weighted less another count for a sibling bucket based on a historical analysis of one or more digits associated with the particular bucket. 18. The one or more non-transitory computer-readable storage media of claim 13 , wherein the compressing the particular bucket and any sibling buckets further comprises: creating an overall count by adding the counts from the particular bucket, any sibling buckets, and the parent bucket; and deleting the particular bucket and any sibling buckets. 19. The one or more non-transitory computer-readable storage media of claim 13 , wherein the compressing the particular bucket and any sibling buckets further comprises: creating an overall count by adding the counts from the particular bucket, any sibling buckets, and the parent bucket together; assigning the overall count to the parent bucket; and deleting the particular bucket and any sibling buckets. 20. An apparatus, comprising: one or more processors; and a memory storing instructions, which when executed by the one or more processors, causes the one or more processors to: calculate an ordinality for a real number in a set of real numbers; using the calculated ordinality for the real number to increment a count, indicating that the real number was encountered, for a particular bucket in a digest comprised of one or more buckets stored in memory and organized as a hierarchical data structure of buckets, wherein each bucket of the one or more buckets is associated with at least an ordinality, a range of numerical values, and a count of the real numbers encountered within the range of numerical values; determine that a sum of the counts for the particular bucket, any sibling buckets, and a parent bucket satisfies a compression criteria; based on the determination that the sum of the counts satisfies the compression criteria, compress the particular bucket and any sibling buckets into the parent bucket.
with adaptive number of clusters · CPC title
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title
Approximate or statistical queries · CPC title
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title
Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.