Method and system for classifying user identifiers into similar segments

US11061937B2 · US · B2

Patent metadata
Publication number: US-11061937-B2
Application number: US-201816144715-A
Country: US
Kind code: B2
Filing date: Sep 27, 2018
Priority date: Sep 27, 2018
Publication date: Jul 13, 2021
Grant date: Jul 13, 2021

Abstract

Official abstract text for this publication.

A database system performs lookalike analysis on a data set including a plurality of user identifiers, which are associated with one or more attribute records. The database system classifies the user identifiers into one or more segments of user identifiers based on the attribute records. The database system performs Linear Discriminant Analysis (LDA) to calculate a measure of importance of the attribute records relative to the one or more segments. The database system auto-correlates the attribute records based on the numbers of attribute records in the user identifier population and the one or more segments. The database system identifies a set of user identifiers relative to one or more segments using the measures of importance and the auto-correlated parameters.
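The last step of the abstract, identifying lookalike user identifiers from per-attribute measures of importance, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the additive score, the threshold used as the similarity condition, the exclusion of users already in the segment, and the names `similarity_score` and `find_lookalikes`. The abstract does not state a scoring formula.

```python
def similarity_score(user_attributes, adjusted_importance):
    """Sum the adjusted importance of every attribute the user has.

    Assumption: an additive score; attributes without a weight count as 0.
    """
    return sum(adjusted_importance.get(attr, 0.0) for attr in user_attributes)


def find_lookalikes(users, seed_segment, adjusted_importance, threshold):
    """Return user ids outside the seed segment whose score meets the threshold.

    Assumption: the similarity condition is a simple score threshold.
    """
    return sorted(
        uid
        for uid, attrs in users.items()
        if uid not in seed_segment
        and similarity_score(attrs, adjusted_importance) >= threshold
    )


# Hypothetical data: user id -> set of attribute records.
users = {
    "u1": {"sports", "news"},
    "u2": {"sports"},
    "u3": {"news"},
    "u4": {"sports", "travel"},
    "u5": {"sports", "news"},
}
seed_segment = {"u1", "u2"}
# Hypothetical adjusted importance weights, one per attribute record.
weights = {"sports": 0.6, "news": 0.1, "travel": -0.5}

lookalikes = find_lookalikes(users, seed_segment, weights, threshold=0.5)
print(lookalikes)
```

In this toy data only `u5` clears the threshold: it shares both positively weighted attributes with the seed segment, while `u4` is pulled down by the negatively weighted `travel` attribute.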

First claim

Opening claim text (preview).

What is claimed is:

1. A method for database processing, comprising: classifying a plurality of user identifiers into one or more segments based at least in part on attribute records associated with the plurality of user identifiers, the plurality of user identifiers being members of a user population; calculating a measure of importance for each of the attribute records relative to classification in the one or more segments of the plurality of user identifiers, the calculation being based at least in part on a probability that a particular user identifier is classified in each of the one or more segments given that either the particular user identifier is associated with each attribute record or that the particular user identifier is not associated with each attribute record; identifying, for a segment of the one or more segments, a degree of overlap for an attribute record with the segment, the degree of overlap based on a number of user identifiers in the segment associated with the attribute record relative to a total number of user identifiers in the user population that are associated with the attribute record; performing the identifying for each of the one or more segments and each of the attribute records relative to the one or more segments; determining an average of the degree of overlap the one or more segments; generating a correlation factor based on the average that reflects a correlation of attribute records in the user population relative to the one or more segments; adjusting the measure of importance for each of the attribute records relative to the one or more segments based on the correlation factor; and identifying a set of user identifiers that satisfies a similarity condition relative to the one or more segments, the set of user identifiers identified based at least in part on attribute records associated with the set of user identifiers and an adjusted measure of importance for each of the attribute records associated with the set of user identifiers.

2. The method of claim 1, further comprising: receiving additional attribute records associated with the plurality of user identifiers; and calculating the measure of importance for each of the additional attribute records relative to classification in the one or more segments.

3. The method of claim 1, further comprising: receiving additional user identifiers and additional attribute records associated with the additional user identifiers; modifying the user population based on using the additional attribute records associated with the additional user identifiers to generate a modified user population; and calculating the measure of importance for each of the additional attribute records based at least in part on a probability that a particular additional user identifier is classified in each of the one or more segments given that the particular additional user identifier is associated with each additional attribute record and given that the particular additional user identifier is not associated with each additional attribute record, wherein the set of user identifiers includes additional user identifiers and the identifying is based at least in part on the calculated measure of importance for each of the additional attribute records.

4. The method of claim 3, wherein modifying the user population further comprises: identifying user identifiers that are associated with attribute records overlapping with the additional attribute records associated with the additional user identifiers; and selecting the user identifiers, as the modified user population, that are associated with the attribute records overlapping with the attribute records associated with the additional user identifiers.

5. The method of claim 1, wherein generating the correlation factor further comprises: selecting correlation parameters based on a regression analysis; and calculating the correlation factor using the correlation parameters.

6. The method of claim 1, further comprising: receiving additional attribute records associated with one or more of the plurality of user identifiers; modifying the attribute records associated with the plurality of user identifiers based on the additional attribute records responsive to determining that a delta of a number of user identifiers associated with the one or more segments does not satisfy a reach condition; and preventing modification of the attribute records associated with the plurality of user identifiers responsive to determining that that the delta of the number of user identifiers associated with the one or more segments satisfies the reach condition.

7. The method of claim 1, wherein identifying the set of user identifiers that satisfy the similarity condition relative to the one or more segments further comprises: generating similarity scores for each of the plurality of user identifiers relative to the one or more segments.

8. The method of claim 1, wherein the set of user identifiers are identified relative to a base segment of the one or more segments, the base segment based on one or more of the attribute records in the user population.

9. The method of claim 1, wherein the similarity condition is selected by a user.

10. An apparatus for database processing, comprising: a processor, memory in electronic communication with the processor; and instructions stored in the memory and executable by the processor to cause the apparatus to: classify a plurality of user identifiers into one or more segments based at least in part on attribute records associated with the plurality of user identifiers, the plurality of user identifiers being members of a user population; calculate a measure of importance for each of the attribute records relative to classification in the one or more segments of the plurality of user identifiers, the calculation being based at least in part on a probability that a particular user identifier is classified in each of the one or more segments given that either the particular user identifier is associated with each attribute record or that the particular user identifier is not associated with each attribute record; identify, for a segment of the one or more segments, a degree of overlap for an attribute record with the segment, the degree of overlap based on a number of user identifiers in the segment associated with the attribute record relative to a total number of user identifiers in the user population that are associated with the attribute record; perform the identifying for each of the one or more segments and each of the attribute records relative to the one or more segments; determine an average of the degree of overlap the one or more segments; generate a correlation factor based on the average that reflects a correlation of attribute records in the user population relative to the one or more segments; adjust the measure of importance for each of the attribute records in the user population relative to the one or more segments based on the correlation factor; and identify a set of user identifiers that satisfies a similarity condition relative to the one or more segments, the set of user identifiers identified based at least in part on attribute records associated with the set of user identifiers and an adjusted measure of importance for each of the attribute records associated with the set of user identifiers.

11. The apparatus of claim 10, wherein the instructions are further executable by the processor to cause the apparatus to: receive additional user identifiers and additional attribute records associated with the additional user identifiers; modify the user population based on the additional attribute records associate
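The counting quantities named in claim 1 can be made concrete with a small sketch. It covers only what the claim spells out: the two conditional probabilities that feed the measure of importance, the degree of overlap (users in the segment with the attribute, relative to all users in the population with the attribute), and its average over segments and attributes. How the probabilities are combined into a measure of importance and how the correlation factor is derived from the average are not given in this preview, so the sketch deliberately stops before those steps. The function names, the dictionary layout, and the choice to return 0.0 when a denominator is empty are assumptions.

```python
def conditional_probabilities(users, segment, attribute):
    """Return (P(in segment | has attribute), P(in segment | lacks attribute)).

    Probabilities are estimated by counting; an empty group yields 0.0
    (an assumption, to avoid division by zero).
    """
    with_attr = {uid for uid, attrs in users.items() if attribute in attrs}
    without_attr = set(users) - with_attr
    p_with = len(with_attr & segment) / len(with_attr) if with_attr else 0.0
    p_without = (
        len(without_attr & segment) / len(without_attr) if without_attr else 0.0
    )
    return p_with, p_without


def degree_of_overlap(users, segment, attribute):
    """Users in the segment with the attribute / all users with the attribute."""
    with_attr = {uid for uid, attrs in users.items() if attribute in attrs}
    return len(with_attr & segment) / len(with_attr) if with_attr else 0.0


def average_overlap(users, segments, attributes):
    """Average the degree of overlap over every (segment, attribute) pair."""
    values = [
        degree_of_overlap(users, members, attr)
        for members in segments.values()
        for attr in attributes
    ]
    return sum(values) / len(values) if values else 0.0


# Hypothetical population: user id -> set of attribute records.
population = {
    "u1": {"sports", "news"},
    "u2": {"sports"},
    "u3": {"news"},
    "u4": {"sports", "travel"},
}
segments = {"seed": {"u1", "u2"}}

p_with, p_without = conditional_probabilities(population, segments["seed"], "sports")
overlap_news = degree_of_overlap(population, segments["seed"], "news")
avg = average_overlap(population, segments, ["sports", "news", "travel"])
print(p_with, p_without, overlap_news, avg)
```

With counts as the estimator, the degree of overlap for an attribute equals the first conditional probability for the same attribute and segment; the claim nonetheless recites them as separate steps.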

Assignees

Salesforce Com Inc

Classifications

  • Search customisation based on user profiles and personalisation

  • Market modelling; Market analysis; Collecting market data

  • Clustering or classification

  • Summarisation for human users

  • Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11061937B2 cover?
A database system performs lookalike analysis on a data set including a plurality of user identifiers, which are associated with one or more attribute records. The database system classifies the user identifiers into one or more segments of user identifiers based on the attribute records. The database system performs Linear Discriminant Analysis (LDA) to calculate a measure of importance of the…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06Q30/0201. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 13, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).