System and method for generating recommendations
US-2016300144-A1 · Oct 13, 2016 · US
US9785890B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9785890-B2 |
| Application number | US-201213572528-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 10, 2012 |
| Priority date | Aug 10, 2012 |
| Publication date | Oct 10, 2017 |
| Grant date | Oct 10, 2017 |
Official abstract text for this publication.
Data for a plurality of entities that can be offered a plurality of products can be obtained. The data can include categorical data and numeric data. Based on business constraints, some or all of the data can be selected. The selected data can be converted to another set of numeric data, wherein the categorical values are converted to numeric values. Dimensions of the converted data can be reduced to generate another set of data. Based on this reduced set of data, clusters of entities can be formed. The products can be grouped by assigning a unique product identifier of each product to a corresponding cluster. This grouping of products can be used by a predictive model to predict a likelihood of an entity to purchase a particular product in a future time period. Related methods, apparatus, systems, techniques and articles are also described.
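The last two steps of the abstract (forming clusters of entities, then assigning each product identifier to a cluster) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy entities, the use of plain k-means with fixed starting centroids, and the majority-vote rule that places a product in the cluster holding most of its customers. The abstract does not specify any of these.

```python
import math
from collections import Counter, defaultdict

# Hypothetical reduced data: one 2-D point per entity plus the product it holds.
ENTITIES = {
    "e1": ((0.1, 0.2), "P1"),
    "e2": ((0.2, 0.1), "P1"),
    "e3": ((0.0, 0.0), "P2"),
    "e4": ((0.9, 1.0), "P2"),
    "e5": ((1.0, 0.9), "P2"),
    "e6": ((1.1, 1.0), "P3"),
}


def kmeans(points, start_centroids, iterations=10):
    """Plain k-means (an assumed stand-in); returns a cluster index per point."""
    centroids = list(start_centroids)
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def group_products(labels, products):
    """Assumed rule: a product joins the cluster that holds most of its customers."""
    votes = defaultdict(Counter)
    for lab, prod in zip(labels, products):
        votes[prod][lab] += 1
    groups = defaultdict(list)
    for prod in sorted(votes):
        groups[votes[prod].most_common(1)[0][0]].append(prod)
    return dict(groups)


POINTS = [point for point, _ in ENTITIES.values()]
PRODUCTS = [product for _, product in ENTITIES.values()]
LABELS = kmeans(POINTS, [(0.0, 0.0), (1.0, 1.0)])
GROUPS = group_products(LABELS, PRODUCTS)
print(LABELS, GROUPS)
```

A predictive model could then take the cluster-level product grouping as an input feature; that model is outside the scope of this sketch.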
Opening claim text (preview).
What is claimed is:

1. A non-transitory computer program product storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising:
obtaining data for a plurality of entities that are offered a plurality of products, the data comprising a first set of categorical data and a first set of numeric data;
converting the data to a first set of data, the first set of data comprising a second set of numeric data, the converting of the data to the first set of data comprising:
normalizing the first set of numeric data;
determining, from the first set of categorical data, a base categorical attribute that is associated with a number of categorical values that is more than a number of categorical values associated with other categorical attributes;
determining, from the first set of numeric data, a base numeric attribute that is associated with numeric values that have a sum of associated variances that is less than a sum of associated variances of numeric values of other numeric attributes;
constructing, using co-occurrence of categorical values associated with categorical attributes and numeric values associated with numeric attributes, a co-occurrence matrix;
calculating a similarity value associated with each pair of the categorical values; and
assigning, to each categorical value associated with the base categorical attribute and to produce at least a first portion of the first set of data, a mean of corresponding numeric values of the base numeric attribute;
selecting, from variables associated with the first set of data, one or more variables;
reducing, based on the one or more variables, dimensions of the first set of data to generate a second set of data, the reducing of the dimensions by removing duplicate data from the first set of data and transforming a larger number of correlated attributes to a lesser number of linearly uncorrelated attributes;
increasing storage capacity of the non-transitory computer program product and speed of processing of existing data in the non-transitory computer program product by the at least one programmable processor by associating the linearly uncorrelated attributes and the removal of the duplicate data;
generating, based on the second set of data, clusters associated with corresponding entities;
assigning a unique product identifier of each product to a corresponding cluster to generate groupings of products; and
predicting, using the groupings of products and for one or more entities, a likelihood of an entity to purchase a particular product in a time period in future.

2. The computer program product of claim 1, wherein the first set of categorical data comprises categorical values for categorical attributes comprising at least one of a gender and a residential status associated with one or more of a plurality of entities.

3. The computer program product of claim 1, wherein the first set of numeric data comprises numeric values for numeric attributes comprising at least one of credit score, risk score and credit line utilization associated with one or more of a plurality of entities.

4. The computer program product of claim 1, wherein the converting of the data to the first set of data comprises: associating a similarity value with the categorical values based on frequency of co-occurrence of categorical values in the data; and assigning numeric values to the categorical values based on the similarity value.

5. The computer program product of claim 1, wherein the converting of the data to the first set of data further comprises: assigning, to each categorical value associated with other categorical attributes and to produce at least a second portion of the first set of data, a value characterized by:

$$\sum_{i=1}^{d} a_i \, v_i$$

wherein: d is a number of base categorical values in the base categorical attribute, a_i is the similarity value associated with the categorical value of the other categorical attributes and the i-th base categorical value of the base categorical attribute, and v_i is the mean of corresponding numeric values of the base numeric attribute.

6. The computer program product of claim 5, wherein the similarity value is characterized by:

$$D_{XY} = \frac{m(X, Y)}{m(X) + m(Y) - m(X, Y)}$$

wherein: X is a first categorical value; Y is a second categorical value; D_XY is a similarity value characterizing similarity between the first categorical value X and the second categorical value Y; m(X) is a number of occurrences of the first categorical value X; m(Y) is a number of occurrences of the second categorical value Y; and m(X,Y) is a number of simultaneous occurrences of the first categorical value X and the second categorical value Y.

7. The computer program product of claim 5, wherein the second set of numeric data comprises the first portion of the first set of data and the second portion of the first set of data.

8. The computer program product of claim 1, wherein the first set of data excludes categorical data.

9. The computer program product of claim 1, wherein the one or more variables characterize one or more characteristics of an entity.

10. The computer program product of claim 1, wherein the one or more variables characterize at least one of life-stage and lifestyle of an entity.

11. The computer program product of claim 1, wherein the reducing of the dimensions comprises: reducing, based on the one or more variables, the duplicate data from the first set of data to generate the second set of data, wherein the transforming of the larger number of correlated attributes to the lesser number of linearly uncorrelated attributes is performed using orthogonal transformation.

12. The computer program product of claim 11, wherein the reducing of the duplicate data from the first set comprises removing some data associated with two or more attributes that characterize common information, wherein the orthogonal transformation is a part of a principal component analysis technique.

13. The computer program product of claim 12, wherein the two or more attributes comprise income and value of property.

14. The computer program product of claim 1, wherein the generating of the clusters associated with corresponding entities comprises: generating a plurality of points in space such that distance between any tw
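The categorical-to-numeric conversion recited in claims 1, 5 and 6 can be sketched on a toy data set. The sketch below is one possible reading, not the patented implementation: the records, column layout, and function names are hypothetical, the numeric column is assumed to be already normalized, and because there is only one numeric column it is simply taken as the base numeric attribute (the variance-based selection in claim 1 is not shown). The similarity is computed as m(X,Y) / (m(X) + m(Y) - m(X,Y)), and each non-base categorical value receives the similarity-weighted sum of the base-value means.

```python
from collections import defaultdict

# Hypothetical toy records: (residential_status, gender, normalized_credit_score).
ROWS = [
    ("owner", "F", 0.9),
    ("owner", "M", 0.7),
    ("renter", "F", 0.4),
    ("renter", "M", 0.2),
    ("family", "M", 0.5),
    ("family", "M", 0.3),
]
CATEGORICAL = (0, 1)  # column indices of the categorical attributes
NUMERIC = 2           # single numeric column, assumed to be the base numeric attribute


def pick_base_categorical(rows, cat_cols):
    """Return the categorical column with the most distinct values."""
    return max(cat_cols, key=lambda c: len({r[c] for r in rows}))


def base_means(rows, base_col, num_col):
    """Mean of the base numeric attribute for each base categorical value."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[base_col]].append(r[num_col])
    return {value: sum(xs) / len(xs) for value, xs in groups.items()}


def similarity(rows, col_x, x, col_y, y):
    """D_XY = m(X,Y) / (m(X) + m(Y) - m(X,Y)) from co-occurrence counts."""
    m_x = sum(1 for r in rows if r[col_x] == x)
    m_y = sum(1 for r in rows if r[col_y] == y)
    m_xy = sum(1 for r in rows if r[col_x] == x and r[col_y] == y)
    return m_xy / (m_x + m_y - m_xy)


def encode_other(rows, other_col, base_col, means):
    """Value for each non-base categorical value: sum over i of a_i * v_i."""
    return {
        x: sum(similarity(rows, other_col, x, base_col, y) * v
               for y, v in means.items())
        for x in sorted({r[other_col] for r in rows})
    }


BASE = pick_base_categorical(ROWS, CATEGORICAL)  # residential_status: 3 distinct values
MEANS = base_means(ROWS, BASE, NUMERIC)          # per-status mean credit score
ENCODED = encode_other(ROWS, 1, BASE, MEANS)     # numeric stand-ins for "F" and "M"
print(BASE, MEANS, ENCODED)
```

On this toy data, "M" co-occurs most often with "family", so its encoded value is pulled toward the "family" mean, which is the intended effect of weighting by co-occurrence similarity.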
Market modelling; Market analysis; Collecting market data · CPC title
Physics · mapped topic
Marketing; Price estimation or determination; Fundraising · CPC title
based on statistics · CPC title
Market predictions or forecasting for commercial activities · CPC title