Semi-supervised identity aggregation of profiles using statistical methods
US-9654594-B2 · May 16, 2017 · US
US2016147758A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016147758-A1 |
| Application number | US-201414551365-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 24, 2014 |
| Priority date | Nov 24, 2014 |
| Publication date | May 26, 2016 |
| Grant date | — |
Official abstract text for this publication.
Techniques are disclosed for identifying the same online user across different communication networks, and further creating a unified profile for that user. The unified profile is an aggregation of publicly available user profile attributes across the different networks. In an embodiment, the techniques are implemented as a computer implemented methodology, including: (1) feature space analysis to identify relevant user features that allows for clusterization of the given target network(s), (2) unsupervised candidate selection to identify one or more candidate user profiles from each target network and that are likely belonging to a target user or so-called queried user, and (3) supervised user identification to identify a likely matching user profile for that target user from each target network. A unified user profile can then be built from data taken from all matched user profiles, and effectively allows a marketer to better understand that user and hence execute more informed targeting.
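The last two stages named in the abstract (scoring candidates, then merging the matched profiles) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the patent's implementation: the attribute names `username` and `location`, the hand-set `WEIGHTS` and `BIAS` that stand in for a classifier trained on labeled profile pairs, the use of `difflib` string similarity as the pairwise feature, and the conflict rule in `unify`.

```python
from difflib import SequenceMatcher
import math

# Hypothetical weights standing in for a trained supervised classifier;
# a real system would learn these from labeled matching profile pairs.
WEIGHTS = {"username": 4.0, "location": 2.0}
BIAS = -3.0


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]; 0 if a value is missing."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_probability(query, candidate):
    """Logistic score: probability-like value that both profiles are one user."""
    z = BIAS + sum(
        weight * similarity(query.get(field), candidate.get(field))
        for field, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def best_match(query, candidates):
    """Rank candidates by score and return the single highest-scoring one."""
    return max(candidates, key=lambda c: match_probability(query, c))


def unify(query, match):
    """Merge attributes of both profiles; query values win on conflict
    (an arbitrary choice made for this sketch)."""
    unified = dict(match)
    unified.update(query)
    return unified


query = {"username": "jane_doe", "location": "Austin"}
candidates = [
    {"username": "jdoe88", "location": "Boston", "employer": "Initech"},
    {"username": "jane.doe", "location": "Austin", "employer": "Acme"},
]
best = best_match(query, candidates)
unified = unify(query, best)
```

In this sketch the unified profile keeps the queried user's own attribute values and gains only the attributes that exist solely on the matched profile, such as `employer`.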
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: receiving a target user query including an online user profile on a network A, the user profile having a feature; using the feature to segment a clustered target network B, at least one cluster including multiple user profiles each including one or more features, thereby generating a set of candidate user profiles on the target network B; ranking, via a supervised classifier, the candidate user profiles included in the set, so as to identify a single match candidate user profile for the target user query; and generating a unified user profile that includes features from the user profile on network A and the single match candidate user profile on network B.
2. The method of claim 1, further comprising: repeating the using, ranking, and generating for each of one or more additional networks, thereby further supplementing the unified user profile to further include features from at least one of the additional networks.
3. The method of claim 1, further comprising: pre-processing network B, in advance of receiving the target user query, thereby generating a set of clusters including the at least one cluster including multiple user profiles.
4. The method of claim 3 wherein the pre-processing of network B is carried out on a periodic basis.
5. The method of claim 3, further comprising: pre-processing one or more additional networks, in advance of receiving the target user query, thereby generating a set of clusters for each of those networks.
6. The method of claim 1 wherein the candidate user profiles in the set from network B are included in a single cluster, or a single cluster and one or more sibling clusters, a sibling cluster being a cluster within a pre-established mathematical distance measure of the single cluster.
7. The method of claim 6 wherein each cluster of network B has a centroid having a mathematical distance from the user profile on network A, and the centroid of the single cluster is associated with the minimum of those mathematical distances.
8. The method of claim 7 wherein the centroid of a given cluster is defined as the average frequency distribution of characters of a feature in each user profile of that cluster.
9. The method of claim 7 wherein the mathematical distance is the square of the Euclidean distance between the frequency distribution of the user profile on network A and each cluster of network B.
10. The method of claim 1 wherein the ranking includes assigning match probabilities (scores) to each of the candidate user profiles, and identifying a best match based on the scores assigned.
11. A non-transient computer program product having instructions encoded thereon that when executed by one or more processors causes a process to be carried out, the process comprising: receiving a target user query including an online user profile on a network A, the user profile having a feature; using the feature to segment a clustered target network B, at least one cluster including multiple user profiles each including one or more features, thereby generating a set of candidate user profiles on the target network B; ranking, via a supervised classifier, the candidate user profiles included in the set, so as to identify a single match candidate user profile for the target user query; and generating a unified user profile that includes features from the user profile on network A and the single match candidate user profile on network B.
12. The computer program product of claim 11, the process further comprising: repeating the using, ranking, and generating for each of one or more additional networks, thereby further supplementing the unified user profile to further include features from at least one of the additional networks.
13. The computer program product of claim 11, further comprising: pre-processing network B, in advance of receiving the target user query, thereby generating a set of clusters including the at least one cluster including multiple user profiles.
14. The computer program product of claim 11 wherein the candidate user profiles in the set from network B are included in a single cluster, or a single cluster and one or more sibling clusters, a sibling cluster being a cluster within a pre-established mathematical distance measure of the single cluster.
15. The computer program product of claim 14 wherein each cluster of network B has a centroid having a mathematical distance from the user profile on network A, and the centroid of the single cluster is associated with the minimum of those mathematical distances.
16. The computer program product of claim 11 wherein the ranking includes assigning match probabilities (scores) to each of the candidate user profiles, and identifying a best match based on the scores assigned.
17. A computing system, comprising: an electronic memory for storing executable instructions; a processor configured to execute the instructions to: receive a target user query including an online user profile on a network A, the user profile having a feature; use the feature to segment a clustered target network B, at least one cluster including multiple user profiles each including one or more features, thereby generating a set of candidate user profiles on the target network B; rank, via a supervised classifier, the candidate user profiles included in the set, so as to identify a single match candidate user profile for the target user query; and generate a unified user profile that includes features from the user profile on network A and the single match candidate user profile on network B.
18. The system of claim 17 wherein the processor is further configured to execute the instructions to: pre-process network B, in advance of receiving the target user query, thereby generating a set of clusters including the at least one cluster including multiple user profiles; store the clusters in a cloud-based storage; and periodically repeat the pre-processing of network B to update the clusters in the cloud-based storage as needed.
19. The system of claim 17 wherein: the candidate user profiles in the set from network B are included in a single cluster, or a single cluster and one or more sibling clusters, a sibling cluster being a cluster within a pre-established mathematical distance measure of the single cluster; and each cluster of network B has a centroid having a mathematical distance from the user profile on network A, and the centroid of the single cluster is associated with the minimum of those mathematical distances.
20. The system of claim 17 wherein the ranking includes assigning match probabilities (scores) to each of the candidate user profiles, and identifying a best match based on the scores assigned.
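The candidate-selection geometry recited in claims 6 through 9 (a centroid defined as an average character frequency distribution, squared Euclidean distance, and sibling clusters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the feature is taken to be a username, `ALPHABET` is an assumed character set, clusters are assumed to be precomputed and non-empty, the names `select_candidates` and `sibling_radius` are hypothetical, and the sibling distance is assumed to be measured between cluster centroids.

```python
from collections import Counter
import string

# Assumed character set for the frequency distribution.
ALPHABET = string.ascii_lowercase + string.digits


def char_distribution(text):
    """Relative frequency of each ALPHABET character in text."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(ALPHABET)
    return [counts[ch] / total for ch in ALPHABET]


def centroid(usernames):
    """Average character frequency distribution over a non-empty cluster."""
    dists = [char_distribution(u) for u in usernames]
    return [sum(column) / len(dists) for column in zip(*dists)]


def squared_euclidean(p, q):
    """Square of the Euclidean distance between two distributions."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def select_candidates(query_username, clusters, sibling_radius):
    """Return usernames from the cluster nearest to the query, plus any
    sibling cluster whose centroid lies within sibling_radius of it.

    clusters maps a cluster id to a non-empty list of usernames.
    """
    q = char_distribution(query_username)
    centroids = {cid: centroid(members) for cid, members in clusters.items()}
    nearest = min(centroids, key=lambda cid: squared_euclidean(q, centroids[cid]))
    selected = [
        cid
        for cid, c in centroids.items()
        if cid == nearest
        or squared_euclidean(c, centroids[nearest]) <= sibling_radius
    ]
    return [u for cid in selected for u in clusters[cid]]


clusters = {"a": ["anna", "banana"], "z": ["zzz99", "z9z9"]}
narrow = select_candidates("ana", clusters, sibling_radius=0.0)
wide = select_candidates("ana", clusters, sibling_radius=10.0)
```

With a radius of zero only the nearest cluster contributes candidates; a larger radius pulls in sibling clusters, which trades a bigger candidate set for a lower chance of missing the true match.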
Business processes related to social networking or social networking services · CPC title
Query execution (filtering based on additional data G06F16/335) · CPC title
Search customisation based on social or collaborative filtering · CPC title
Search customisation based on user profiles and personalisation · CPC title
based on user profile or attribute · CPC title