Data-driven product grouping

US9785890B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9785890-B2
Application number: US-201213572528-A
Country: US
Kind code: B2
Filing date: Aug 10, 2012
Priority date: Aug 10, 2012
Publication date: Oct 10, 2017
Grant date: Oct 10, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Data for a plurality of entities that can be offered a plurality of products can be obtained. The data can include categorical data and numeric data. Based on business constraints, some or all of the data can be selected. The selected data can be converted to another set of numeric data, wherein the categorical values are converted to numeric values. Dimensions of the converted data can be reduced to generate another set of data. Based on this another set of data, clusters of entities can be formed. The products can be grouped by assigning a unique product identifier of each product to a corresponding cluster. This grouping of products can be used by a predictive model to predict a likelihood of an entity to purchase a particular product in a future time period. Related methods, apparatus, systems, techniques and articles are also described.
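The clustering and product-grouping steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it skips the categorical conversion and dimension reduction, uses a plain k-means with a deterministic start as a stand-in for whatever clustering the patent uses, and assumes (this rule is not stated on this page) that a product is assigned to the cluster holding most of its current holders. All function names, the sample entities, and the product identifiers `P1` to `P3` are hypothetical.

```python
from collections import Counter


def min_max_normalize(rows):
    """Scale every numeric column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def group_products(labels, holdings):
    """Assign each product id to the cluster with most of its holders (assumed rule)."""
    votes = {}
    for label, products in zip(labels, holdings):
        for pid in products:
            votes.setdefault(pid, Counter())[label] += 1
    return {pid: counts.most_common(1)[0][0] for pid, counts in votes.items()}


# Hypothetical entities: [credit score, credit line utilization].
entities = [[720, 0.10], [400, 0.90], [700, 0.20], [420, 0.80]]
# Hypothetical products currently held by each entity.
holdings = [{"P1"}, {"P2"}, {"P1", "P3"}, {"P2"}]

labels = kmeans(min_max_normalize(entities), k=2)
groups = group_products(labels, holdings)
print(labels)
print(groups)
```

A predictive model could then take the cluster id of a product as an input feature; that modelling step is outside the scope of this sketch.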

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer program product storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising:

  obtaining data for a plurality of entities that are offered a plurality of products, the data comprising a first set of categorical data and a first set of numeric data;

  converting the data to a first set of data, the first set of data comprising a second set of numeric data, the converting of the data to the first set of data comprising:

    normalizing the first set of numeric data;

    determining, from the first set of categorical data, a base categorical attribute that is associated with a number of categorical values that is more than a number of categorical values associated with other categorical attributes;

    determining, from the first set of numeric data, a base numeric attribute that is associated with numeric values that have a sum of associated variances that is less than a sum of associated variances of numeric values of other numeric attributes;

    constructing, using co-occurrence of categorical values associated with categorical attributes and numeric values associated with numeric attributes, a co-occurrence matrix;

    calculating a similarity value associated with each pair of the categorical values; and

    assigning, to each categorical value associated with the base categorical attribute and to produce at least a first portion of the first set of data, a mean of corresponding numeric values of the base numeric attribute;

  selecting, from variables associated with the first set of data, one or more variables;

  reducing, based on the one or more variables, dimensions of the first set of data to generate a second set of data, the reducing of the dimensions by removing duplicate data from the first set of data and transforming a larger number of correlated attributes to a lesser number of linearly uncorrelated attributes;

  increasing storage capacity of the non-transitory computer program product and speed of processing of existing data in the non-transitory computer program product by the at least one programmable processor by associating the linearly uncorrelated attributes and the removal of the duplicate data;

  generating, based on the second set of data, clusters associated with corresponding entities;

  assigning a unique product identifier of each product to a corresponding cluster to generate groupings of products; and

  predicting, using the groupings of products and for one or more entities, a likelihood of an entity to purchase a particular product in a time period in future.

2. The computer program product of claim 1, wherein the first set of categorical data comprises categorical values for categorical attributes comprising at least one of a gender and a residential status associated with one or more of a plurality of entities.

3. The computer program product of claim 1, wherein the first set of numeric data comprises numeric values for numeric attributes comprising at least one of credit score, risk score and credit line utilization associated with one or more of a plurality of entities.

4. The computer program product of claim 1, wherein the converting of the data to the first set of data comprises: associating a similarity value with the categorical values based on frequency of co-occurrence of categorical values in the data; and assigning numeric values to the categorical values based on the similarity value.

5. The computer program product of claim 1, wherein the converting of the data to the first set of data further comprises: assigning, to each categorical value associated with other categorical attributes and to produce at least a second portion of the first set of data, a value characterized by:

  $$\sum_{i=1}^{d} a_i \cdot v_i$$

  wherein: $d$ is a number of base categorical values in the base categorical attribute, $a_i$ is the similarity value associated with the categorical value of the other categorical attributes and $i$-th base categorical value of the base categorical attribute, and $v_i$ is the mean of corresponding numeric values of the base numeric attribute.

6. The computer program product of claim 5, wherein the similarity value is characterized by:

  $$D_{XY} = \frac{m(X, Y)}{m(X) + m(Y) - m(X, Y)}$$

  wherein: $X$ is a first categorical value; $Y$ is a second categorical value; $D_{XY}$ is a similarity value characterizing similarity between the first categorical value $X$ and the second categorical value $Y$; $m(X)$ is a number of occurrences of the first categorical value $X$; $m(Y)$ is a number of occurrences of the second categorical value $Y$; and $m(X, Y)$ is a number of simultaneous occurrences of the first categorical value $X$ and the second categorical value $Y$.

7. The computer program product of claim 5, wherein the second set of numeric data comprises the first portion of the first set of data and the second portion of the first set of data.

8. The computer program product of claim 1, wherein the first set of data excludes categorical data.

9. The computer program product of claim 1, wherein the one or more variables characterize one or more characteristics of an entity.

10. The computer program product of claim 1, wherein the one of more variables characterize at least one of life-stage and lifestyle of an entity.

11. The computer program product of claim 1, wherein the reducing of the dimensions comprises: reducing, based on the one or more variables, the duplicate data from the first set of data to generate the second set of data, wherein the transforming of the larger number of correlated attributes to the lesser number of linearly uncorrelated attributes is performed using orthogonal transformation.

12. The computer program product of claim 11, wherein the reducing of the duplicate data from the first set comprises removing some data associated with two or more attributes that characterize common information, wherein the orthogonal transformation is a part of a principal component analysis technique.

13. The computer program product of claim 12, wherein the two or more attributes comprise income and value of property.

14. The computer program product of claim 1, wherein the generating of the clusters associated with corresponding entities comprises: generating a plurality of points in space such that distance between any tw
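The categorical-to-numeric conversion recited in claims 1, 5 and 6 can be sketched as follows. This is a hedged illustration, not the patentee's code: it assumes a single numeric attribute (`risk_score`, already normalized) so that the base numeric attribute needs no selection by variance, and it computes co-occurrence counts directly from the records instead of building an explicit co-occurrence matrix. The function names and the sample records are hypothetical.

```python
def pick_base_categorical(records, categorical_attrs):
    """Claim 1: the base attribute is the one with the most distinct values."""
    return max(categorical_attrs, key=lambda a: len({r[a] for r in records}))


def count(records, *pairs):
    """Number of records in which every given (attribute, value) pair occurs."""
    return sum(all(r[a] == v for a, v in pairs) for r in records)


def similarity(records, x, y):
    """Claim 6: D_XY = m(X,Y) / (m(X) + m(Y) - m(X,Y)); x, y are (attribute, value)."""
    m_xy = count(records, x, y)
    denom = count(records, x) + count(records, y) - m_xy
    return m_xy / denom if denom else 0.0


def base_value_means(records, base_attr, numeric_attr):
    """Claim 1: each base categorical value gets the mean of its numeric values."""
    groups = {}
    for r in records:
        groups.setdefault(r[base_attr], []).append(r[numeric_attr])
    return {value: sum(nums) / len(nums) for value, nums in groups.items()}


def encode_other_value(records, other, base_attr, means):
    """Claim 5: sum over base values of a_i * v_i for a non-base categorical value."""
    return sum(
        similarity(records, other, (base_attr, base_value)) * v_i
        for base_value, v_i in means.items()
    )


# Hypothetical records; risk_score is assumed to be normalized already.
records = [
    {"residential_status": "own", "gender": "F", "risk_score": 0.2},
    {"residential_status": "own", "gender": "M", "risk_score": 0.4},
    {"residential_status": "rent", "gender": "F", "risk_score": 0.6},
    {"residential_status": "rent", "gender": "M", "risk_score": 0.8},
    {"residential_status": "family", "gender": "F", "risk_score": 1.0},
]

base = pick_base_categorical(records, ["gender", "residential_status"])
means = base_value_means(records, base, "risk_score")
gender_f = encode_other_value(records, ("gender", "F"), base, means)
gender_m = encode_other_value(records, ("gender", "M"), base, means)
print(base, means, gender_f, gender_m)
```

In this toy data `residential_status` has three distinct values against two for `gender`, so it becomes the base attribute; the two gender values are then expressed as similarity-weighted sums of the base-value means.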

Assignees

Inventors

Classifications

  • Market modelling; Market analysis; Collecting market data · CPC title

  • G06N99/005 · Primary

    Physics · mapped topic

  • Marketing; Price estimation or determination; Fundraising · CPC title

  • based on statistics · CPC title

  • Market predictions or forecasting for commercial activities · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9785890B2 cover?
Data for a plurality of entities that can be offered a plurality of products can be obtained. The data can include categorical data and numeric data. Based on business constraints, some or all of the data can be selected. The selected data can be converted to another set of numeric data, wherein the categorical values are converted to numeric values. Dimensions of the converted data can be redu…
Who is the assignee on this patent?
Sowani Amit, Malhotra Eeshan, Rahman Shafi Ur, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06N99/005. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 10, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).