Demographic and media preference prediction using media content data analysis

US9406072B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9406072-B2
Application number: US-201414208363-A
Country: US
Kind code: B2
Filing date: Mar 13, 2014
Priority date: Mar 29, 2012
Publication date: Aug 2, 2016
Grant date: Aug 2, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems and computer program products are provided for predicting data. A name or title is obtained from a taste profile. There is an index into a data set based on the name or title, and a set of terms and corresponding term weights associated with the name or title are retrieved. A sparse vector is constructed based on the set of terms and term weights. The sparse vector is input to a training model including target data. The target data includes a subset of test data which has a correspondence to a predetermined target metric of data. A respective binary value and confidence level is output for each term, corresponding to an association between the term and the target metric.
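The sparse-vector step in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the `TERM_INDEX` data, the artist names, the weights, and the choice to sum weights when several names share a term are assumptions made for illustration, not details taken from the patent.

```python
from collections import defaultdict

# Hypothetical data set: name or title -> {descriptive term: term weight}.
TERM_INDEX = {
    "Artist A": {"indie rock": 0.75, "lo-fi": 0.5},
    "Artist B": {"indie rock": 0.5, "folk": 0.25},
}


def build_sparse_vector(taste_profile, term_index):
    """Combine the term weights of every name or title in a taste profile.

    Only terms that actually occur are stored, which keeps the vector sparse.
    Names missing from the index are skipped. Summing the weights of a shared
    term is an assumption of this sketch.
    """
    vector = defaultdict(float)
    for name in taste_profile:
        for term, weight in term_index.get(name, {}).items():
            vector[term] += weight
    return dict(vector)


profile = ["Artist A", "Artist B", "Unknown Artist"]
vector = build_sparse_vector(profile, TERM_INDEX)
```

A dictionary keyed by term is one simple sparse representation; a production system would more likely map terms to integer indices and use a sparse matrix type before passing the vector to a model.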

First claim

Opening claim text (preview).

What is claimed is:

1. A system for predicting data, comprising: a processor configured to: obtain a name or title from a taste profile; index into a data set based on the name or the title, and retrieve a set of descriptive terms and corresponding term weights associated with the name or the title; construct a sparse vector based on the set of descriptive terms and term weights; identify a target brand or segment of interest; generate a first list of accounts who follow the brand or segment of interest by examining social media data, and a second list of additional entities followed by accounts in the first list; filter the second list through a space mapping that maps entities to names or titles from taste profiles, to generate a subset of test data having a correspondence to the target brand or segment of interest; input the sparse vector to a training model including target data, wherein the target data includes the subset of test data having a correspondence to the target brand or segment of interest and the training model is based on a machine learning from ground truths from a selection of the target data, and output a respective binary value and confidence level for each descriptive term above a threshold, corresponding to an association between the descriptive term and the target brand or segment of interest.

2. The system according to claim 1, wherein the first list of accounts is generated by selecting accounts who follow the target brand or segment of interest in a random sampling fashion.

3. The system according to claim 1, wherein the processor is further configured to identify and eliminate false or suspicious accounts from the first list.

4. The system according to claim 3, wherein an account is identified as false or suspicious when the account has less than a threshold level of activity.

5. The system according to claim 1, wherein the taste profile includes multiple names or titles, and wherein the sparse vector is constructed from terms in the data set corresponding to all names or titles in the taste profile.

6. The system according to claim 1, wherein the processor is further configured to filter the output to only those terms with a confidence level above a set threshold.

7. The system according to claim 1, wherein the test data comprises a set of data determined to be associated with the target metric, as a ground truth for the learning model.

8. The system according to claim 1, wherein the processor is further configured to resolve a name or title from multiple different textual representations.

9. The system according to claim 1, wherein one entity can have multiple taste profiles, or multiple entities can share a single taste profile.

10. A method for predicting data, comprising: obtaining a name or title from a taste profile; indexing into a data set based on the name or the title, and retrieve a set of descriptive terms and corresponding term weights associated with the name or the title; constructing a sparse vector based on the set of descriptive terms and term weights; identifying a target brand or segment of interest; generating a first list of accounts who follow the brand or segment of interest by examining social media data, and a second list of additional entities followed by accounts in the first list; filtering the second list through a space mapping that maps entities to names or titles from taste profiles, to generate a subset of test data having a correspondence to the target brand or segment of interest; inputting the sparse vector to a training model including target data, wherein the target data includes the subset of test data having a correspondence to the target brand or segment of interest and the training model is based on a machine learning from ground truths from a selection of the target data, and outputting a respective binary value and confidence level for each descriptive term above a threshold, corresponding to an association between the descriptive term and the target brand or segment of interest.

11. The method according to claim 10, wherein the first list of accounts is generated by selecting accounts who follow the target brand or segment of interest in a random sampling fashion.

12. The method according to claim 10, further comprising identifying and eliminating false or suspicious accounts from the first list.

13. The method according to claim 12, wherein an account is identified as false or suspicious when the account has less than a threshold level of activity.

14. The method according to claim 10, wherein the taste profile includes multiple names or titles, and wherein the sparse vector is constructed from terms in the data set corresponding to all names or titles in the taste profile.

15. The method according to claim 10, wherein the method further includes filtering the output to only those terms with a confidence level above a set threshold.

16. The method according to claim 10, wherein the test data comprises a set of data determined to be associated with the target metric, as a ground truth for the learning model.

17. The method according to claim 10, wherein the method further includes resolving a name or title from multiple different textual representations.

18. The method according to claim 10, wherein one entity can have multiple taste profiles, or multiple entities can share a single taste profile.

19. A non-transitory computer-readable medium having stored thereon sequences of instructions, the sequences of instructions including instructions which when executed by a computer system causes the computer system to perform a method for predicting data, the method comprising: obtaining a name or title from a taste profile; indexing into a data set based on the name or the title, and retrieve a set of descriptive terms and corresponding term weights associated with the name or the title; constructing a sparse vector based on the set of descriptive terms and term weights; identifying a target brand or segment of interest; generating a first list of accounts who follow the brand or segment of interest by examining social media data, and a second list of what additional entities are followed by accounts in the first list; filtering the second list through a space mapping that maps entities to names or titles from taste profiles, to generate a subset of test data having a correspondence to the target brand or segment of interest; inputting the sparse vector to a training model including target data, wherein the target data includes the subset of test data having a correspondence to the target brand or segment of interest and the training model is based on a machine learning from ground truths from a selection of the target data, and outputting a respective binary value and confidence level for each descriptive term above a threshold, corresponding to an association between the descriptive term and the target brand or segment of interest.

20. The computer-readable medium according to claim 19, wherein the first list of accounts is generated by selecting accounts who follow the target brand or segment of interest in a random sampling fashion.
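The test-data steps recited in the independent claims (first list of followers, removal of low-activity accounts, second list of followed entities, space mapping) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `FOLLOWS` graph, the `ACTIVITY` counts, the `SPACE_MAPPING` table, the account names, and the activity threshold are all hypothetical.

```python
import random

# Hypothetical social media data: account -> entities the account follows.
FOLLOWS = {
    "alice": {"BrandX", "Artist A", "Artist B"},
    "bob": {"BrandX", "Artist A"},
    "bot42": {"BrandX", "Artist C"},
    "carol": {"Artist B"},
}
# Hypothetical activity counts, used to flag false or suspicious accounts.
ACTIVITY = {"alice": 120, "bob": 45, "bot42": 0, "carol": 80}
# Hypothetical space mapping: social entity -> name or title in taste profiles.
SPACE_MAPPING = {"Artist A": "artist a", "Artist B": "artist b"}


def followers_of(target, follows):
    """First list: accounts that follow the target brand or segment."""
    return sorted(acct for acct, ents in follows.items() if target in ents)


def drop_suspicious(accounts, activity, min_activity):
    """Remove accounts whose activity is below the threshold (claims 3-4)."""
    return [a for a in accounts if activity.get(a, 0) >= min_activity]


def sample_accounts(accounts, k, seed=0):
    """Select up to k accounts by random sampling (claim 2)."""
    rng = random.Random(seed)
    return rng.sample(accounts, min(k, len(accounts)))


def followed_entities(accounts, follows, target):
    """Second list: other entities followed by accounts in the first list."""
    second = set()
    for acct in accounts:
        second |= follows[acct] - {target}
    return second


def map_to_taste_names(entities, mapping):
    """Keep only entities that the space mapping resolves to a name or title."""
    return sorted(mapping[e] for e in entities if e in mapping)


first = drop_suspicious(followers_of("BrandX", FOLLOWS), ACTIVITY, min_activity=1)
second = followed_entities(first, FOLLOWS, "BrandX")
test_names = map_to_taste_names(second, SPACE_MAPPING)
```

In this toy data the zero-activity account is dropped before the second list is built, so its follows never reach the test data. The resulting names would then serve as the subset of test data associated with the target brand.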

Assignees

The Echo Nest Corp, Spotify AB

Inventors

Classifications

  • Filtering based on additional data, e.g. user or group profiles · CPC title

  • G06Q30/02 · Primary

    Marketing; Price estimation or determination; Fundraising · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9406072B2 cover?
Methods, systems and computer program products are provided for predicting data. A name or title is obtained from a taste profile. There is an index into a data set based on the name or title, and a set of terms and corresponding term weights associated with the name or title are retrieved. A sparse vector is constructed based on the set of terms and term weights. The sparse vector is input to …
Who is the assignee on this patent?
The Echo Nest Corp, Spotify AB
What technology area does this patent fall under?
Primary CPC classification G06Q30/02. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 2, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).