Efficient calculation and organization of approximate order statistics of real numbers

US10235345B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10235345-B2
Application number  US-201715476899-A
Country             US
Kind code           B2
Filing date         Mar 31, 2017
Priority date       Mar 1, 2011
Publication date    Mar 19, 2019
Grant date          Mar 19, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.
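The bucket scheme in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Digest` class, the `bucket_key` helper, the fixed precision `P = 2`, and the restriction to positive numbers are all assumptions made for illustration. Each value is truncated to P significant digits, counted in a bucket on the level given by its ordinality, and an approximate order statistic is read off by walking the levels.

```python
from collections import defaultdict
from decimal import Decimal

P = 2  # digits of precision kept per bucket (illustrative value, not from the patent)


def bucket_key(x, p=P):
    """Return (ordinality, mantissa) of the bucket holding a positive Decimal x."""
    ordinality = x.adjusted() - (p - 1)    # power of ten of the last kept digit
    mantissa = int(x.scaleb(-ordinality))  # int() truncates, it does not round
    return ordinality, mantissa


class Digest:
    """Hypothetical digest: one level per ordinality, one count per bucket."""

    def __init__(self):
        self.levels = defaultdict(lambda: defaultdict(int))
        self.total = 0

    def add(self, x):
        ordinality, mantissa = bucket_key(x)
        self.levels[ordinality][mantissa] += 1
        self.total += 1

    def fraction_at_or_below(self, q):
        """Approximate share of observed values in q's bucket or a lower one."""
        q_ord, q_man = bucket_key(q)
        seen = 0
        for ordinality, level in self.levels.items():
            if ordinality < q_ord:        # whole level lies below q's bucket
                seen += sum(level.values())
            elif ordinality == q_ord:     # same level: compare bucket values
                seen += sum(c for m, c in level.items() if m <= q_man)
        return seen / self.total


digest = Digest()
for text in ["12", "123", "129", "1300", "0.5"]:
    digest.add(Decimal(text))

print(digest.fraction_at_or_below(Decimal("125")))  # prints 0.8
```

With P = 2, the values 123 and 129 both truncate to N = 120 and share one bucket covering 120 up to (but not including) 130, so the answer for 125 is only accurate to the width of that bucket. Skipping whole levels works here because, for positive inputs at a fixed precision, every bucket on a lower ordinality lies below every bucket on a higher one.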

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: calculating an ordinality for each real number in a set of real numbers; using the calculated ordinality for each real number to increment a count indicating that the real number was encountered into a digest comprising one or more non-overlapping buckets stored in memory and organized as a hierarchical data structure of buckets, wherein each bucket is associated with at least an ordinality, a range of numerical values, and a count of the real numbers encountered within the range of numerical values; calculating an order statistic based on a particular real number by efficiently traversing the hierarchical structure of the digest and using one or more counts associated with one or more buckets in the digest; and performing one or more actions based on the calculated order statistic; wherein the method is performed by one or more computing devices.

2. The method of claim 1, wherein the one or more actions include turning on one or more devices.

3. The method of claim 1, wherein the one or more actions include throttling traffic.

4. The method of claim 1, wherein the one or more actions include alerting a user.

5. The method of claim 1, further comprising: causing a user interface to display a representation of information related to the calculated order statistic.

6. The method of claim 1, wherein the one or more buckets in the digest are hierarchically organized into one or more levels.

7. The method of claim 1, wherein the one or more buckets in the digest are hierarchically organized into one or more levels, wherein each level in the digest contains all buckets of a given ordinality.

8. The method of claim 1, further comprising: wherein the digest is hierarchically structured as a tree; and compressing the digest by collapsing one or more child buckets into an associated parent bucket if a sum of the counts of the one or more child buckets and the parent bucket falls below a threshold.

9. The method of claim 1, further comprising: wherein the digest is hierarchically structured as a tree; compressing the digest by collapsing one or more child buckets into an associated parent bucket if a sum of the counts of the one or more child buckets and the parent bucket falls below a threshold; adding counts of the one or more child buckets to the count of the parent bucket; and deleting the one or more child buckets.

10. The method of claim 1, further comprising: calculating a representation for the real number in scientific notation, wherein the representation includes as a mantissa and an exponent; and computing an ordinality for the real number by subtracting from the exponent a count of significant digits, including significant zeros, in the mantissa that appear to the right of any decimal point in the mantissa.

11. The method of claim 1, wherein each bucket in the digest is also associated with a certain real number and a number of digits of precision, and wherein a given bucket in the digest includes all numbers that when truncated to the number of digits of precision equal the certain real number.

12. The method of claim 1, wherein each bucket in the digest is represented by an ordinality and a range for that bucket.

13. The method of claim 1, wherein the order statistic includes an approximate percentile.

14. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for computing an order statistic for a particular number in a set of numbers, the method comprising: calculating an ordinality for each real number in a set of real numbers; using the calculated ordinality for each real number to increment a count indicating that the real number was encountered into a digest comprising one or more non-overlapping buckets stored in memory and organized as a hierarchical data structure of buckets, wherein each bucket is associated with at least an ordinality, a range of numerical values, and a count of the real numbers encountered within the range of numerical values; calculating an order statistic based on a particular real number by efficiently traversing the hierarchical structure of the digest and using one or more counts associated with one or more buckets in the digest; and performing one or more actions based on the calculated order statistic.

15. The non-transitory computer-readable storage medium of claim 14, wherein the one or more actions include turning on one or more devices.

16. The non-transitory computer-readable storage medium of claim 14, wherein the one or more actions include throttling traffic.

17. The non-transitory computer-readable storage medium of claim 14, wherein the one or more actions include alerting a user.

18. The non-transitory computer-readable storage medium of claim 14, further comprising: causing a user interface to display a representation of information related to the calculated order statistic.

19. An apparatus that computes an order statistic for a particular number in a set of numbers, comprising: a computing device comprising a processor and a memory, wherein the computing device is configured to: calculating an ordinality for each real number in a set of real numbers; using the calculated ordinality for each real number to increment a count indicating that the real number was encountered into a digest comprising one or more non-overlapping buckets stored in memory and organized as a hierarchical data structure of buckets, wherein each bucket is associated with at least an ordinality, a range of numerical values, and a count of the real numbers encountered within the range of numerical values; calculating an order statistic based on a particular real number by efficiently traversing the hierarchical structure of the digest and using one or more counts associated with one or more buckets in the digest; and performing one or more actions based on the calculated order statistic.

20. The apparatus of claim 19, wherein the one or more actions include any combination of: turning on one or more devices, throttling traffic, or alerting a user.
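Two of the dependent claims describe concrete computations: the ordinality rule of claim 10 and the compression step of claims 8 and 9. The Python sketch below illustrates both under stated assumptions. It relies on the standard `decimal` module to keep significant zeros, represents a bucket as a hypothetical `(ordinality, mantissa)` pair, and assumes that a bucket's parent is the same number truncated by one more digit. The function names are invented for this example and do not come from the patent.

```python
from decimal import Decimal


def ordinality(x):
    """Exponent minus the mantissa digits to the right of the decimal point."""
    exponent = x.adjusted()                           # x = d.ddd * 10**exponent
    fractional_digits = len(x.as_tuple().digits) - 1  # significant zeros included
    return exponent - fractional_digits


def parent_key(key):
    """Assumed parent bucket: the same number truncated by one more digit."""
    level, mantissa = key
    return (level + 1, mantissa // 10)


def collapse_children(buckets, parent, threshold):
    """Fold child counts into `parent` if the combined count is below threshold."""
    children = [k for k in buckets if parent_key(k) == parent]
    combined = buckets.get(parent, 0) + sum(buckets[k] for k in children)
    if not children or combined >= threshold:
        return False
    for k in children:          # delete the child buckets ...
        del buckets[k]
    buckets[parent] = combined  # ... after adding their counts to the parent
    return True


print(ordinality(Decimal("1.23E+2")))  # prints 0 (exponent 2, two fractional digits)
print(ordinality(Decimal("1.2E+3")))   # prints 2 (exponent 3, one fractional digit)

# Keys are (ordinality, mantissa): (0, 123) is the bucket for 123, (1, 12) for 120.
buckets = {(1, 12): 1, (0, 123): 2, (0, 127): 1, (1, 45): 9}
collapse_children(buckets, (1, 12), threshold=5)
print(buckets)  # prints {(1, 12): 4, (1, 45): 9}
```

The helper handles a single parent and does not check whether a child has children of its own, so a full compression pass would need to call it from the lowest ordinality upward. The threshold value of 5 is arbitrary.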

Assignees

Splunk Inc

Inventors

Classifications

  • with adaptive number of clusters · CPC title

  • Approximate or statistical queries · CPC title

  • Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title

  • G06F17/18 · Primary

    for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

  • Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10235345B2 cover?
A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/18. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 19, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).