Secure multi-party reach and frequency estimation
US-2022376887-A1 · Nov 24, 2022 · US
US12436952B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12436952-B2 |
| Application number | US-202418912386-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 10, 2024 |
| Priority date | Aug 30, 2021 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A computer-implemented method for efficiently estimating the number of unique elements in a collection of elements comprises generating, via hash logic, hash values associated with the elements. The hash values specify bit positions within an array of bits. Hash values output from the hash logic conform to a geometric distribution such that bit positions of the array of bits corresponding to lower orders bits are more likely to be generated than bit positions corresponding to higher-order bits. Bits of the array of bits corresponding to the bit positions are set. The number of bits of the array of bits that are set is counted. Estimation logic estimates the number of unique elements of the collection of elements as a function of the number of bits of the array of bits that are set.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method for efficiently estimating a number of unique elements in a first collection of elements, the method comprising: generating, via hash logic, hash values associated with the first collection of elements, wherein the hash values specify bit positions within an array of bits; setting bits of the array of bits corresponding to the bit positions; counting a number of bits of the array of bits that are set; and estimating, by estimation logic, a number of unique elements of the first collection of elements as a function of the number of bits of the array of bits that are set, wherein the estimation logic is configured to estimate the number of unique elements according to the function: n ^ = ( ce γ ρ ( 1 - ρ ) S - 1 2 ) - 1 , where S corresponds to the number of bits, γ is a Euler-Mascheroni constant of 0.577216, ρ specifies a geometric distribution of bits across the array of bits, and {circumflex over (n)} corresponds to an estimate of the number of unique elements of the first collection of elements. 2. The computer-implemented method according to claim 1 , wherein c=1. 3. The computer-implemented method according to claim 1 , wherein estimating the number of unique elements of the first collection of elements as a function of the number of bits of the array of bits that are set comprises estimating the number of unique elements of the first collection of elements as an unbiased function of the number of bits of the array of bits, and wherein: c = ( 1 - 1 2 log ( 2 ) log ( 1 - ρ ) ) . 4. The computer-implemented method according to claim 1 , wherein bit position L(x j ) associated with a particular element x j , is determined according to the function: L ( x j ) = ⌈ log ( 1 - h ( x j ) ) log ( 1 - ρ ) ⌉ where h corresponds to a hash function that hashes each x j uniformly at random to values in the unit interval. 5. The computer-implemented method according to claim 1 , wherein the array of bits corresponds to a first Bloom filter that is associated with the first collection of elements. 6. The computer-implemented method according to claim 5 , further comprising: receiving a second Bloom filter that is associated with a second collection of elements; merging bits of the first Bloom filter with corresponding bits of the second Bloom filter to obtain a merged Bloom filter; and estimating, by the estimation logic, a number of elements between the first collection of elements and the second collection that are unique as a function of the number of bits of the merged Bloom filter that are set. 7. The computer-implemented method according to claim 1 , wherein the method is performed by a server, wherein elements of the first collection of elements are associated with one or more individuals, wherein the computer-implemented method facilitates estimating a number of individuals that visited the server. 8. The computer-implemented method according to claim 7 , further comprising estimating a number of individuals that clicked on particular advertisements hosted by the server, and determining a cost-per-click associated with the particular advertisement as a function of the estimated number of individuals that clicked on the particular advertisements. 9. A computing system comprising: one or more processors; and a memory in communication with the one or more processors, wherein the memory stores instruction code that, when executed by the one or more processors, causes the computing system to perform operations comprising: generating, via hash logic, hash values associated with a first collection of elements, wherein the hash values specify bit positions within an array of bits; setting bits of the array of bits corresponding to the bit positions; counting a number of bits of the array of bits that are set; and estimating, by estimation logic, a number of unique elements of the first collection of elements as a function of the number of bits of the array of bits that are set, wherein the estimation logic is configured to estimate the number of unique elements according to the function: n ^ = ( ce γ
Multidimensional index structures · CPC title
Indexing; Web crawling techniques · CPC title
Approximate or statistical queries · CPC title
Traffic · CPC title
Hash tables · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.