Method and system for estimating the cardinality of information

US12436952B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-12436952-B2
Application numberUS-202418912386-A
CountryUS
Kind codeB2
Filing dateOct 10, 2024
Priority dateAug 30, 2021
Publication dateOct 7, 2025
Grant dateOct 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for efficiently estimating the number of unique elements in a collection of elements comprises generating, via hash logic, hash values associated with the elements. The hash values specify bit positions within an array of bits. Hash values output from the hash logic conform to a geometric distribution such that bit positions of the array of bits corresponding to lower orders bits are more likely to be generated than bit positions corresponding to higher-order bits. Bits of the array of bits corresponding to the bit positions are set. The number of bits of the array of bits that are set is counted. Estimation logic estimates the number of unique elements of the collection of elements as a function of the number of bits of the array of bits that are set.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method for efficiently estimating a number of unique elements in a first collection of elements, the method comprising: generating, via hash logic, hash values associated with the first collection of elements, wherein the hash values specify bit positions within an array of bits; setting bits of the array of bits corresponding to the bit positions; counting a number of bits of the array of bits that are set; and estimating, by estimation logic, a number of unique elements of the first collection of elements as a function of the number of bits of the array of bits that are set, wherein the estimation logic is configured to estimate the number of unique elements according to the function: n ^ = ( ce γ ⁢ ρ ⁡ ( 1 - ρ ) S - 1 2 ) - 1 , where S corresponds to the number of bits, γ is a Euler-Mascheroni constant of 0.577216, ρ specifies a geometric distribution of bits across the array of bits, and {circumflex over (n)} corresponds to an estimate of the number of unique elements of the first collection of elements. 2. The computer-implemented method according to claim 1 , wherein c=1. 3. The computer-implemented method according to claim 1 , wherein estimating the number of unique elements of the first collection of elements as a function of the number of bits of the array of bits that are set comprises estimating the number of unique elements of the first collection of elements as an unbiased function of the number of bits of the array of bits, and wherein: c = ( 1 - 1 2 ⁢ log ⁡ ( 2 ) ⁢ log ⁡ ( 1 - ρ ) ) . 4. The computer-implemented method according to claim 1 , wherein bit position L(x j ) associated with a particular element x j , is determined according to the function: L ⁡ ( x j ) = ⌈ log ⁡ ( 1 - h ⁡ ( x j ) ) log ⁡ ( 1 - ρ ) ⌉ where h corresponds to a hash function that hashes each x j uniformly at random to values in the unit interval. 5. The computer-implemented method according to claim 1 , wherein the array of bits corresponds to a first Bloom filter that is associated with the first collection of elements. 6. The computer-implemented method according to claim 5 , further comprising: receiving a second Bloom filter that is associated with a second collection of elements; merging bits of the first Bloom filter with corresponding bits of the second Bloom filter to obtain a merged Bloom filter; and estimating, by the estimation logic, a number of elements between the first collection of elements and the second collection that are unique as a function of the number of bits of the merged Bloom filter that are set. 7. The computer-implemented method according to claim 1 , wherein the method is performed by a server, wherein elements of the first collection of elements are associated with one or more individuals, wherein the computer-implemented method facilitates estimating a number of individuals that visited the server. 8. The computer-implemented method according to claim 7 , further comprising estimating a number of individuals that clicked on particular advertisements hosted by the server, and determining a cost-per-click associated with the particular advertisement as a function of the estimated number of individuals that clicked on the particular advertisements. 9. A computing system comprising: one or more processors; and a memory in communication with the one or more processors, wherein the memory stores instruction code that, when executed by the one or more processors, causes the computing system to perform operations comprising: generating, via hash logic, hash values associated with a first collection of elements, wherein the hash values specify bit positions within an array of bits; setting bits of the array of bits corresponding to the bit positions; counting a number of bits of the array of bits that are set; and estimating, by estimation logic, a number of unique elements of the first collection of elements as a function of the number of bits of the array of bits that are set, wherein the estimation logic is configured to estimate the number of unique elements according to the function: n ^ = ( ce γ

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12436952B2 cover?
A computer-implemented method for efficiently estimating the number of unique elements in a collection of elements comprises generating, via hash logic, hash values associated with the elements. The hash values specify bit positions within an array of bits. Hash values output from the hash logic conform to a geometric distribution such that bit positions of the array of bits corresponding to lo…
Who is the assignee on this patent?
Nielsen Co Us Llc
What technology area does this patent fall under?
Primary CPC classification G06Q30/0246. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 07 2025 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).