Methods and apparatus to estimate cardinality of users represented across multiple bloom filter arrays

US11741068B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11741068-B2
Application numberUS-202117362419-A
CountryUS
Kind codeB2
Filing dateJun 29, 2021
Priority dateJun 30, 2020
Publication dateAug 29, 2023
Grant dateAug 29, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus to estimate cardinality of users represented across multiple bloom filter arrays are disclosed. Examples includes processor circuitry to execute and/or instantiate instructions to generate a first composite Bloom filter array based on first and second Bloom filter arrays. The processor circuitry is to generate a final composite Bloom filter array based on the first composite Bloom filter array and a third Bloom filter array. Different ones of the first, second, and third Bloom filter arrays representative of different sets of users who accessed media. The first, second, and third Bloom filter arrays including differential privacy noise. The processor circuitry to estimate a cardinality of a union of the first, second, and third Bloom filter arrays based on the final composite Bloom filter array.

First claim

Opening claim text (preview).

What is claimed is: 1. An apparatus comprising: communication circuitry to: receive a first Bloom filter array via a first network communication from a first server of a first database proprietor; receive a second Bloom filter array via a second network communication from a second server of a second database proprietor; and receive a third Bloom filter array from at least one of the first server, the second server, or a third server of a third database proprietor; at least one memory; instructions in the apparatus; and programmable circuitry to execute and/or instantiate the instructions to: generate a first composite Bloom filter array based on the first and second Bloom filter arrays; generate noise arrays for the first composite Bloom filter array, the noise arrays distinct from the first Bloom filter array, distinct from the second Bloom filter array, and distinct from the first composite Bloom filter array; generate a final composite Bloom filter array based on the first composite Bloom filter array, the noise arrays, and the third Bloom filter array, different ones of the first, second, and third Bloom filter arrays representative of different sets of users who registered with respective ones of the first, second, and third database proprietors and who accessed media, the first, second, and third Bloom filter arrays including differential privacy noise, the first, second, and third Bloom filter arrays generated to maintain a privacy of the different sets of users such that an overlap between the different sets of users cannot be directly determined to enable determination of an audience size across the different sets of users; estimate a cardinality of a union of the first, second, and third Bloom filter arrays based on the final composite Bloom filter array; and cause the communication circuitry to transmit, via a third network communication, a report based on the cardinality to a third-party entity. 2. The apparatus of claim 1 , wherein the programmable circuitry is to: generate the first composite Bloom filter array based on a bit-wise union of the first and second Bloom filter arrays; and generate the final composite Bloom filter array based on a bit-wise union of the first composite Bloom filter array and the third Bloom filter array. 3. The apparatus of claim 1 , wherein the noise arrays define probabilities for bit-flipping values in individual elements of a latent Bloom filter array that would result in the first composite Bloom filter array, the latent Bloom filter array corresponding to a bit-wise union of the first and second Bloom filter arrays prior to addition of the differential privacy noise. 4. The apparatus of claim 1 , wherein the programmable circuitry is to: generate a binary tree having leaf nodes and non-leaf nodes, different ones of the leaf nodes to be associated with different ones of public Bloom filter arrays provided by corresponding database proprietors, the public Bloom filter arrays including the first, second, and third Bloom filter arrays, the corresponding database proprietors including at least two of the first, second, or third database proprietors, different ones of the non-leaf nodes to be associated with different ones of composite Bloom filter arrays, the different ones of the composite Bloom filter arrays including the first composite Bloom filter array and the final composite Bloom filter array; and generate the different ones of the composite Bloom filter arrays according to a topology of the binary tree. 5. The apparatus of claim 1 , wherein a number of the users is in at least the hundreds of thousands, and the first, second, and third Bloom filter arrays are hundreds of elements in length. 6. The apparatus of claim 3 , wherein the programmable circuitry is to generate the noise arrays based on first and second pairs of noise probability arrays associated with respective ones of the first and second Bloom filter arrays. 7. The apparatus of claim 4 , wherein the programmable circuitry is to dynamically construct the topology of the binary tree for successive ones of the composite Bloom filter arrays to be generated based on previously generated ones of the composite Bloom filter arrays. 8. The apparatus of claim 4 , wherein the programmable circuitry is to: determine effective noise errors for available Bloom filter arrays, the available Bloom filter arrays corresponding to ones of the public Bloom filter arrays and ones of the composite Bloom filter arrays that have not already been combined with other ones of the public Bloom filter arrays and have not already been combined with other ones of the composite Bloom filter arrays; and identify a pair of the available Bloom filter arrays to combine for a next one of the successive ones of the composite Bloom filter arrays based on the effective noise errors. 9. The apparatus of claim 4 , wherein the generation of the different ones of the composite Bloom filter arrays according to the topology of the binary tree is to facilitate analysis of the public Bloom filter arrays in a piecemeal fashion that avoids non-scalable complexity. 10. The apparatus of claim 6 , wherein the first and second Bloom filter arrays and respective ones of the first and second pairs of noise probability arrays are provided by respective ones of the first and second database proprietors. 11. The apparatus of claim 6 , wherein the first pair of noise probability arrays include first values, and the second pair of noise probability arrays include second values, the first values different than the second values. 12. The apparatus of claim 5 , wherein the programmable circuitry is to repeatedly estimate the cardinality of the union of the first, second, and third Bloom filter arrays at a frequency of at least once a day. 13. At least one non-transitory computer readable storage medium comprising instructions to cause programmable circuitry to at least: receive, via communication circuitry, a first Bloom filter array via a first network communication from a first server of a first database proprietor; receive, via the communication circuitry, a second Bloom filter array via a second network communication from a second server of a second database proprietor; and receive, via the communication circuitry, a third Bloom filter array from at least one of the first server, the second server, or a third server of a third database proprietor; generate a first composite Bloom filter array based on first and second Bloom filter arrays; generate noise arrays for the first composite Bloom filter array, the noise arrays distinct from the first Bloom filter array, distinct from the second Bloom filter array, and distinct from the first composite Bloom filter array; generate a final composite Bloom filter array based on the first composite Bloom filter array, the noise arrays, and the third Bloom filter array, different ones of the first, second, and third Bloom filter arrays representative of different sets of users who registered with respective ones of the first, second, and third database proprietors and who accessed media, the first, second, and third Bloom filter arrays including differential privacy noise, the first, second, and third Bloom filter arrays generated to maintain a privacy of the different sets of users such that an overlap between the different sets of users cannot be directly determined to enable determination of an audience size across the different sets of users; estimate a cardinality of a union of the first, second, and third Bloom filter arrays based on the final composite Bloom filter array; and cause the communication circuitry to transmit, via a third

Assignees

Inventors

Classifications

  • Hash tables · CPC title

  • comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title

  • Trees, e.g. B+trees · CPC title

  • Join operations · CPC title

  • ASIC · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11741068B2 cover?
Methods and apparatus to estimate cardinality of users represented across multiple bloom filter arrays are disclosed. Examples includes processor circuitry to execute and/or instantiate instructions to generate a first composite Bloom filter array based on first and second Bloom filter arrays. The processor circuitry is to generate a final composite Bloom filter array based on the first composi…
Who is the assignee on this patent?
Nielsen Co Us Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/2255. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Aug 29 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).