US11748264B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11748264-B1 |
| Application number | US-202217962904-A |
| Country | US |
| Kind code | B1 |
| Filing date | Oct 10, 2022 |
| Priority date | Apr 6, 2021 |
| Publication date | Sep 5, 2023 |
| Grant date | Sep 5, 2023 |
Official abstract text for this publication.
Obtaining an approximate unique count for a column from a table from a database includes, generating, for a value from an unevaluated row, a hash value in a defined range of hash values, determining a cardinality of leading zeros in the hash value, identifying a bucket with respect to the hash value from a plurality of buckets corresponding to the defined range of hash values, wherein the buckets from the plurality of buckets correspond with respective non-overlapping portions of the defined range of hash values, such that the hash value is in the portion of the defined range of hash values corresponding to the bucket, and appending to an unsorted sparse representation a bucket identifier for the bucket and the cardinality of the leading zeros, and, in response to a determination that unevaluated rows are unavailable in the table, determining the approximate unique count using the unsorted sparse representation.
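The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the 64-bit hash built from SHA-256, the choice of 2^12 buckets, counting leading zeros in the bits left over after the bucket bits, and the standard HyperLogLog estimator with a linear-counting correction are all assumptions, since the abstract does not specify them. The names `hash64`, `bucket_and_zeros`, `build_sparse`, and `estimate` are hypothetical.

```python
import hashlib
import math

P = 12                      # assumed: 2**12 buckets
M = 1 << P
HASH_BITS = 64              # assumed width of the defined hash range
ALPHA = 0.7213 / (1 + 1.079 / M)   # standard HyperLogLog constant for M >= 128


def hash64(value):
    """Map a column value to a 64-bit hash (assumed hash function)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bucket_and_zeros(h):
    """Return (bucket identifier, count of leading zeros) for a hash value."""
    # The top P bits select the bucket, so each bucket covers a
    # non-overlapping portion of the hash range.
    bucket = h >> (HASH_BITS - P)
    rest = h & ((1 << (HASH_BITS - P)) - 1)
    # Leading zeros are counted within the remaining HASH_BITS - P bits.
    zeros = (HASH_BITS - P) - rest.bit_length()
    return bucket, zeros


def build_sparse(values):
    """Append one (bucket, zeros) pair per row; no sorting, no deduplication."""
    sparse = []
    for value in values:
        sparse.append(bucket_and_zeros(hash64(value)))
    return sparse


def estimate(sparse):
    """Estimate the unique count once no unevaluated rows remain."""
    registers = {}
    for bucket, zeros in sparse:
        rank = zeros + 1
        if rank > registers.get(bucket, 0):
            registers[bucket] = rank
    empty = M - len(registers)
    # Empty buckets contribute 2**0 = 1 each to the harmonic sum.
    harmonic = empty + sum(2.0 ** -rank for rank in registers.values())
    raw = ALPHA * M * M / harmonic
    if raw <= 2.5 * M and empty > 0:
        return M * math.log(M / empty)   # linear counting for small counts
    return raw


rows = list(range(100)) * 3              # 300 rows, 100 distinct values
print(round(estimate(build_sparse(rows))))
```

Because entries are only appended, the cost per row is a hash and a list append; resolving duplicates for the same bucket is deferred to the final estimate.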
Opening claim text (preview).
What is claimed is:
1. A method comprising: obtaining an approximate unique count with respect to a column from a table from a database, wherein obtaining the approximate unique count includes: for an unevaluated row from the column: generating, for a value from the unevaluated row, a hash value in a defined range of hash values; determining a cardinality of leading zeros in the hash value; identifying a bucket with respect to the hash value, wherein identifying the bucket includes identifying the bucket from a plurality of buckets corresponding to the defined range of hash values, wherein the buckets from the plurality of buckets correspond with respective non-overlapping portions of the defined range of hash values, such that the hash value is in the portion of the defined range of hash values corresponding to the bucket; and appending to an unsorted sparse representation: a bucket identifier for the bucket; and the cardinality of the leading zeros; and in response to a determination that unevaluated rows are unavailable in the table, determining the approximate unique count using the unsorted sparse representation; and outputting the approximate unique count.
2. The method of claim 1, wherein obtaining the approximate unique count includes: in response to a determination that utilization of a current memory allocation for the unsorted sparse representation is greater than a defined utilization threshold: obtaining an expanded memory allocation for the unsorted sparse representation such that the expanded memory allocation is a multiple of the current memory allocation; and storing the unsorted sparse representation in the expanded memory allocation.
3. The method of claim 1, wherein obtaining the approximate unique count includes: in response to a determination that a current memory allocation for the unsorted sparse representation is greater than or equal to a conversion threshold: converting the unsorted sparse representation to a dense representation; and determining the approximate unique count by determining the approximate unique count using the dense representation.
4. The method of claim 1, wherein the memory allocation for a bucket is one byte.
5. The method of claim 1, wherein obtaining the approximate unique count includes: determining whether a map of the unsorted sparse representation includes the bucket identifier.
6. The method of claim 5, wherein, in response to a determination that a map of the unsorted sparse representation includes the bucket identifier, obtaining the approximate unique count includes: omitting appending to the unsorted sparse representation; and updating a cardinality of leading zeros in the unsorted sparse representation in accordance with the map.
7. The method of claim 5, wherein, in response to a determination that the hash value is absent from a map of the unsorted sparse representation, appending to the unsorted sparse representation includes: adding the bucket identifier to the map.
8. The method of claim 5, wherein obtaining the approximate unique count includes: in response to a determination that a current memory allocation for the unsorted sparse representation and the map is greater than or equal to a conversion threshold: converting the unsorted sparse representation to a dense representation; and determining the approximate unique count by determining the approximate unique count using the dense representation.
9. The method of claim 1, wherein: the table is partitioned into regions, wherein a respective region includes a respective non-overlapping set of rows from the table and is associated with a respective database instance, wherein obtaining the approximate unique count includes: obtaining a plurality of unsorted sparse representations using parallel processing, wherein a respective database instance obtains a respective unsorted sparse representation for a respective region; and in response to the determination that the table omits unevaluated rows, merging the unsorted sparse representations.
10. An apparatus comprising: a memory that stores instructions for obtaining an approximate unique count with respect to a column from a table from a database; and a processor that executes the instructions, wherein, to obtain the approximate unique count, the processor executes the instructions to: for an unevaluated row from the column: generate, for a value from the unevaluated row, a hash value in a defined range of hash values; determine a cardinality of leading zeros in the hash value; identify a bucket with respect to the hash value, wherein to identify the bucket the processor executes the instructions to identify the bucket from a plurality of buckets corresponding to the defined range of hash values, wherein the buckets from the plurality of buckets correspond with respective non-overlapping portions of the defined range of hash values, such that the hash value is in the portion of the defined range of hash values corresponding to the bucket; and append to an unsorted sparse representation: a bucket identifier for the bucket; and the cardinality of the leading zeros; and in response to a determination that unevaluated rows are unavailable in the table, determine the approximate unique count using the unsorted sparse representation; and output the approximate unique count.
11. The apparatus of claim 10, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that utilization of a current memory allocation for the unsorted sparse representation is greater than a defined utilization threshold: obtain an expanded memory allocation for the unsorted sparse representation such that the expanded memory allocation is a multiple of the current memory allocation; and store the unsorted sparse representation in the expanded memory allocation.
12. The apparatus of claim 10, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that a current memory allocation for the unsorted sparse representation is greater than or equal to a conversion threshold: convert the unsorted sparse representation to a dense representation; and use the dense representation to determine the approximate unique count.
13. The apparatus of claim 10, wherein the memory allocation for a bucket is one byte.
14. The apparatus of claim 10, wherein, to obtain the approximate unique count, the processor executes the instructions to: determine whether a map of the unsorted sparse representation includes the bucket identifier.
15. The apparatus of claim 14, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that a map of the unsorted sparse representation includes the bucket identifier: omit appending to the unsorted sparse representation; and update a cardinality of leading zeros in the unsorted sparse representation in accordance with the map.
16. The apparatus of claim 14, wherein, to append to the unsorted sparse representation, the processor executes the instructions to: in response to a determination that the hash value is absent from a map of the unsorted sparse representation, add the bucket identifier to the map.
17. The apparatus of claim 14, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that a current memory allocation for the unsorted sparse representation and the m
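The dependent claims add a lookup map, growth of the memory allocation by a multiple, conversion to a dense representation with one byte per bucket, and merging of per-region representations. The Python sketch below illustrates one way these pieces could fit together. It is an assumption-laden illustration, not the claimed implementation: the class name `Sketch`, the function `merge`, the thresholds, the use of an entry count as a stand-in for the memory allocation, and storing `zeros + 1` in the dense array (so that 0 marks an empty bucket) are all hypothetical choices.

```python
P = 10                         # assumed: 2**10 buckets
M = 1 << P
UTILIZATION_THRESHOLD = 0.75   # assumed fraction of capacity that triggers growth
GROWTH_FACTOR = 2              # expanded allocation is a multiple of the current one
CONVERSION_THRESHOLD = 256     # assumed capacity (in entries) that triggers conversion


class Sketch:
    def __init__(self, capacity=8):
        self.capacity = capacity   # stand-in for the current memory allocation
        self.entries = []          # unsorted (bucket, zeros) pairs in arrival order
        self.index = {}            # map: bucket identifier -> position in entries
        self.dense = None          # bytearray, one byte per bucket, after conversion

    def add(self, bucket, zeros):
        if self.dense is not None:
            self.dense[bucket] = max(self.dense[bucket], zeros + 1)
            return
        pos = self.index.get(bucket)
        if pos is not None:
            # Bucket already in the map: skip the append, keep the larger count.
            if zeros > self.entries[pos][1]:
                self.entries[pos] = (bucket, zeros)
            return
        self.index[bucket] = len(self.entries)
        self.entries.append((bucket, zeros))
        if len(self.entries) > UTILIZATION_THRESHOLD * self.capacity:
            self.capacity *= GROWTH_FACTOR
            if self.capacity >= CONVERSION_THRESHOLD:
                self._to_dense()

    def _to_dense(self):
        dense = bytearray(M)       # 0 means the bucket has not been seen
        for bucket, zeros in self.entries:
            dense[bucket] = max(dense[bucket], zeros + 1)
        self.dense, self.entries, self.index = dense, [], {}

    def items(self):
        """Yield (bucket, zeros) pairs from whichever representation is active."""
        if self.dense is None:
            yield from self.entries
        else:
            for bucket, stored in enumerate(self.dense):
                if stored:
                    yield bucket, stored - 1


def merge(sketches):
    """Combine per-region sketches, keeping the largest count per bucket."""
    merged = Sketch()
    for sketch in sketches:
        for bucket, zeros in sketch.items():
            merged.add(bucket, zeros)
    return merged


region_a, region_b = Sketch(), Sketch()
region_a.add(1, 2)
region_a.add(2, 3)
region_b.add(2, 5)
region_b.add(9, 0)
print(merge([region_a, region_b]).entries)
```

Keeping the maximum per bucket makes the merge independent of the order in which regions are combined, which is what allows each database instance to build its representation in parallel.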
using pseudo-associative means, e.g. set-associative or hashing · CPC title
the resource being the memory · CPC title
Free address space management · CPC title
hash tables · CPC title
using directory or table look-up (use of a directory or look-up table in file systems G06F16/13) · CPC title