Approximate unique count

US11748264B1 · US · B1

Patent metadata
Field	Value
Publication number	US-11748264-B1
Application number	US-202217962904-A
Country	US
Kind code	B1
Filing date	Oct 10, 2022
Priority date	Apr 6, 2021
Publication date	Sep 5, 2023
Grant date	Sep 5, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Obtaining an approximate unique count for a column from a table from a database includes, generating, for a value from an unevaluated row, a hash value in a defined range of hash values, determining a cardinality of leading zeros in the hash value, identifying a bucket with respect to the hash value from a plurality of buckets corresponding to the defined range of hash values, wherein the buckets from the plurality of buckets correspond with respective non-overlapping portions of the defined range of hash values, such that the hash value is in the portion of the defined range of hash values corresponding to the bucket, and appending to an unsorted sparse representation a bucket identifier for the bucket and the cardinality of the leading zeros, and, in response to a determination that unevaluated rows are unavailable in the table, determining the approximate unique count using the unsorted sparse representation.
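The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the 64-bit BLAKE2b hash, the choice of 4096 buckets selected by the top hash bits, and the HyperLogLog-style estimator with a linear-counting correction are all assumptions, since the abstract does not name a hash function, a bucket count, or an estimation formula. The names `hash64`, `bucket_and_zeros`, `build_sparse`, and `estimate` are hypothetical.

```python
import hashlib
import math

P = 12                  # assumed: 2**12 buckets, chosen only for illustration
M = 1 << P
HASH_BITS = 64          # assumed width of the "defined range of hash values"


def hash64(value):
    """Map a column value to a 64-bit integer (BLAKE2b is an assumption)."""
    digest = hashlib.blake2b(repr(value).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bucket_and_zeros(h):
    """Return (bucket identifier, count of leading zeros) for one hash value."""
    # The top P bits select the bucket, so each bucket covers its own
    # non-overlapping portion of the hash range.
    bucket = h >> (HASH_BITS - P)
    rest = h & ((1 << (HASH_BITS - P)) - 1)
    # Leading zeros are counted within the remaining HASH_BITS - P bits.
    zeros = (HASH_BITS - P) - rest.bit_length()
    return bucket, zeros


def build_sparse(column_values):
    """Append one (bucket, zeros) pair per row, with no sorting."""
    sparse = []
    for value in column_values:
        sparse.append(bucket_and_zeros(hash64(value)))
    return sparse


def estimate(sparse):
    """Assumed HyperLogLog-style estimate from the unsorted pairs."""
    registers = {}
    for bucket, zeros in sparse:
        rank = zeros + 1                      # keep the largest rank per bucket
        if rank > registers.get(bucket, 0):
            registers[bucket] = rank
    empty = M - len(registers)
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / (empty + sum(2.0 ** -r for r in registers.values()))
    if raw <= 2.5 * M and empty > 0:
        return M * math.log(M / empty)        # linear counting for small counts
    return raw


if __name__ == "__main__":
    rows = [i % 1000 for i in range(5000)]    # 1000 distinct values
    print(round(estimate(build_sparse(rows))))
```

In this sketch the estimate is computed only once every row has been evaluated, mirroring the "unevaluated rows are unavailable" condition in the abstract.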

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining an approximate unique count with respect to a column from a table from a database, wherein obtaining the approximate unique count includes: for an unevaluated row from the column: generating, for a value from the unevaluated row, a hash value in a defined range of hash values; determining a cardinality of leading zeros in the hash value; identifying a bucket with respect to the hash value, wherein identifying the bucket includes identifying the bucket from a plurality of buckets corresponding to the defined range of hash values, wherein the buckets from the plurality of buckets correspond with respective non-overlapping portions of the defined range of hash values, such that the hash value is in the portion of the defined range of hash values corresponding to the bucket; and appending to an unsorted sparse representation: a bucket identifier for the bucket; and the cardinality of the leading zeros; and in response to a determination that unevaluated rows are unavailable in the table, determining the approximate unique count using the unsorted sparse representation; and outputting the approximate unique count.

2. The method of claim 1, wherein obtaining the approximate unique count includes: in response to a determination that utilization of a current memory allocation for the unsorted sparse representation is greater than a defined utilization threshold: obtaining an expanded memory allocation for the unsorted sparse representation such that the expanded memory allocation is a multiple of the current memory allocation; and storing the unsorted sparse representation in the expanded memory allocation.

3. The method of claim 1, wherein obtaining the approximate unique count includes: in response to a determination that a current memory allocation for the unsorted sparse representation is greater than or equal to a conversion threshold: converting the unsorted sparse representation to a dense representation; and determining the approximate unique count by determining the approximate unique count using the dense representation.

4. The method of claim 1, wherein the memory allocation for a bucket is one byte.

5. The method of claim 1, wherein obtaining the approximate unique count includes: determining whether a map of the unsorted sparse representation includes the bucket identifier.

6. The method of claim 5, wherein, in response to a determination that a map of the unsorted sparse representation includes the bucket identifier, obtaining the approximate unique count includes: omitting appending to the unsorted sparse representation; and updating a cardinality of leading zeros in the unsorted sparse representation in accordance with the map.

7. The method of claim 5, wherein, in response to a determination that the hash value is absent from a map of the unsorted sparse representation, appending to the unsorted sparse representation includes: adding the bucket identifier to the map.

8. The method of claim 5, wherein obtaining the approximate unique count includes: in response to a determination that a current memory allocation for the unsorted sparse representation and the map is greater than or equal to a conversion threshold: converting the unsorted sparse representation to a dense representation; and determining the approximate unique count by determining the approximate unique count using the dense representation.

9. The method of claim 1, wherein: the table is partitioned into regions, wherein a respective region includes a respective non-overlapping set of rows from the table and is associated with a respective database instance, wherein obtaining the approximate unique count includes: obtaining a plurality of unsorted sparse representations using parallel processing, wherein a respective database instance obtains a respective unsorted sparse representation for a respective region; and in response to the determination that the table omits unevaluated rows, merging the unsorted sparse representations.

10. An apparatus comprising: a memory that stores instructions for obtaining an approximate unique count with respect to a column from a table from a database; and a processor that executes the instructions, wherein, to obtain the approximate unique count, the processor executes the instructions to: for an unevaluated row from the column: generate, for a value from the unevaluated row, a hash value in a defined range of hash values; determine a cardinality of leading zeros in the hash value; identify a bucket with respect to the hash value, wherein to identify the bucket the processor executes the instructions to identify the bucket from a plurality of buckets corresponding to the defined range of hash values, wherein the buckets from the plurality of buckets correspond with respective non-overlapping portions of the defined range of hash values, such that the hash value is in the portion of the defined range of hash values corresponding to the bucket; and append to an unsorted sparse representation: a bucket identifier for the bucket; and the cardinality of the leading zeros; and in response to a determination that unevaluated rows are unavailable in the table, determine the approximate unique count using the unsorted sparse representation; and output the approximate unique count.

11. The apparatus of claim 10, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that utilization of a current memory allocation for the unsorted sparse representation is greater than a defined utilization threshold: obtain an expanded memory allocation for the unsorted sparse representation such that the expanded memory allocation is a multiple of the current memory allocation; and store the unsorted sparse representation in the expanded memory allocation.

12. The apparatus of claim 10, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that a current memory allocation for the unsorted sparse representation is greater than or equal to a conversion threshold: convert the unsorted sparse representation to a dense representation; and use the dense representation to determine the approximate unique count.

13. The apparatus of claim 10, wherein the memory allocation for a bucket is one byte.

14. The apparatus of claim 10, wherein, to obtain the approximate unique count, the processor executes the instructions to: determine whether a map of the unsorted sparse representation includes the bucket identifier.

15. The apparatus of claim 14, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that a map of the unsorted sparse representation includes the bucket identifier: omit appending to the unsorted sparse representation; and update a cardinality of leading zeros in the unsorted sparse representation in accordance with the map.

16. The apparatus of claim 14, wherein, to append to the unsorted sparse representation, the processor executes the instructions to: in response to a determination that the hash value is absent from a map of the unsorted sparse representation, add the bucket identifier to the map.

17. The apparatus of claim 14, wherein, to obtain the approximate unique count, the processor executes the instructions to: in response to a determination that a current memory allocation for the unsorted sparse representation and the m
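The dependent claims describe a map over the sparse representation, conversion to a dense representation, and merging of per-region representations. The sketch below is one possible reading of those features, not the claimed implementation. The class name `SparseSketch`, the function `merge`, and all parameters are hypothetical. It assumes that an existing bucket keeps the larger leading-zero count, that the conversion threshold can be approximated by an entry count rather than a measured memory allocation, and that the dense form stores the leading-zero count plus one in one byte per bucket so that zero can mean "empty". The memory expansion of claims 2 and 11 is not modeled, because Python lists manage their own allocation.

```python
class SparseSketch:
    """Unsorted sparse representation with a lookup map (illustrative only)."""

    def __init__(self, num_buckets=4096, conversion_threshold=1024):
        self.num_buckets = num_buckets
        self.conversion_threshold = conversion_threshold  # assumed: entry count
        self.entries = []   # unsorted list of [bucket_id, leading_zeros]
        self.index = {}     # map: bucket_id -> position in self.entries
        self.dense = None   # bytearray with one byte per bucket after conversion

    def add(self, bucket, zeros):
        if self.dense is not None:
            # Dense mode: update the one-byte register in place.
            self.dense[bucket] = max(self.dense[bucket], zeros + 1)
            return
        pos = self.index.get(bucket)
        if pos is not None:
            # Bucket already present: skip the append and update in place.
            if zeros > self.entries[pos][1]:
                self.entries[pos][1] = zeros
            return
        # New bucket: record its position in the map, then append.
        self.index[bucket] = len(self.entries)
        self.entries.append([bucket, zeros])
        if len(self.entries) >= self.conversion_threshold:
            self._to_dense()

    def _to_dense(self):
        dense = bytearray(self.num_buckets)
        for bucket, zeros in self.entries:
            dense[bucket] = max(dense[bucket], zeros + 1)
        self.dense = dense
        self.entries, self.index = [], {}

    def registers(self):
        """Return {bucket_id: leading_zeros + 1} for every non-empty bucket."""
        if self.dense is not None:
            return {b: r for b, r in enumerate(self.dense) if r}
        return {b: z + 1 for b, z in self.entries}


def merge(sketches):
    """Combine per-region sketches by keeping the largest register per bucket."""
    merged = {}
    for sketch in sketches:
        for bucket, rank in sketch.registers().items():
            if rank > merged.get(bucket, 0):
                merged[bucket] = rank
    return merged
```

Because each sketch only ever keeps a per-bucket maximum, merging in this reading is order-independent, which is what would let separate database instances process their regions in parallel before a final combine.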

Assignees

Thoughtspot Inc

Inventors

Classifications

G06F12/0864 · Primary
using pseudo-associative means, e.g. set-associative or hashing · CPC title
G06F9/5016
the resource being the memory · CPC title
G06F12/023
Free address space management · CPC title
G06F16/9014
hash tables · CPC title
G06F16/9017
using directory or table look-up (use of a directory or look-up table in file systems G06F16/13) · CPC title

Patent family

Related publications grouped by family.

View patent family 80628492

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11748264B1 cover?: Obtaining an approximate unique count for a column from a table from a database includes, generating, for a value from an unevaluated row, a hash value in a defined range of hash values, determining a cardinality of leading zeros in the hash value, identifying a bucket with respect to the hash value from a plurality of buckets corresponding to the defined range of hash values, wherein the bucke…
Who is the assignee on this patent?: Thoughtspot Inc
What technology area does this patent fall under?: Primary CPC classification G06F12/0864. Mapped technology areas include Physics.
When was this patent published?: Publication date Sep 5, 2023 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and methods for cross media reporting by fast merging of data sources

Distinct value estimation for query planning

Method And System To Estimate The Cardinality Of Sets And Set Operation Results From Single And Multiple HyperLogLog Sketches

Efficient determination of join paths via cardinality estimation
