Probabilistic clustering of an item

US9852193B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9852193-B2
Application number: US-69488510-A
Country: US
Kind code: B2
Filing date: Jan 27, 2010
Priority date: Aug 10, 2009
Publication date: Dec 26, 2017
Grant date: Dec 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A clustering and recommendation machine determines that an item is included in a cluster of items. The machine accesses item data descriptive of the item. The machine accesses a vector that represents the cluster and calculates the likelihood that the item is included in the cluster, based on the item variable and the probability parameter. The machine determines that the item is included in the cluster, based on the likelihood. The machine also recommends an item to a potential buyer. The machine accesses behavior data that represents a first event type pertinent to a first cluster of items. The machine calculates a probability that a second event type pertaining to a second cluster of items will co-occur with the first event type. The machine identifies an item from the second cluster to be recommended and presents a recommendation of the item to the potential buyer.
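
The recommendation half of the abstract can be illustrated with a minimal sketch. The abstract does not say how the co-occurrence probability is calculated, so this example assumes a simple empirical conditional frequency over per-user event sets. The function names, the `(event_type, cluster)` tuples, and the sample sessions are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_probabilities(sessions):
    """Estimate P(second event | first event) as an empirical frequency.

    Each session is a set of (event_type, cluster) pairs observed for one
    user. This estimator is an assumption, not the patent's stated method.
    """
    first_counts = Counter()
    pair_counts = Counter()
    for events in sessions:
        unique = set(events)
        first_counts.update(unique)
        # Count every ordered pair of distinct events seen together.
        pair_counts.update(permutations(unique, 2))
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def recommend_event(probs, first_event):
    """Return the event most likely to co-occur with first_event, or None."""
    candidates = {b: p for (a, b), p in probs.items() if a == first_event}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical behavior data: which clusters users viewed and bought from.
sessions = [
    {("view", "camera"), ("buy", "memory_card")},
    {("view", "camera"), ("buy", "memory_card")},
    {("view", "camera"), ("buy", "tripod")},
    {("view", "phone"), ("buy", "phone_case")},
]
probs = cooccurrence_probabilities(sessions)
best = recommend_event(probs, ("view", "camera"))
print(best)
```

An item drawn from the cluster named in `best` would then be the candidate to present to the potential buyer.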

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: accessing item data that corresponds to an item available for sale and inclusive of an item variable that represents information describing an attribute of the item, the item variable being a binary variable that represents an occurrence of a term in textual data descriptive of the item available for sale; accessing a vector that is representative of a cluster of items having attributes corresponding to the attribute of the item, the vector representing a product of which the item is a specimen, the vector including a probability distribution that is pertinent to the cluster of items and represented as a plurality of probability parameters that include a probability parameter modeling the binary variable that represents the occurrence of the term in the textual data descriptive of the item available for sale; calculating a result based on the binary variable that represents the occurrence of the term in the textual data descriptive of the item available for sale including calculating a logarithm based on the probability parameter modeling the binary variable and calculating the result based on the logarithm, the result representing a likelihood that the item is a specimen of the product represented by the cluster of items having attributes corresponding to the attribute of the item and represented by the vector, the calculating being performed by a module implemented using a processor of a machine; determining that the item is included in the cluster based on the result; and presenting a recommendation of the item available for sale to a potential buyer of the item, the recommendation indicating that the item is the specimen of the product represented by the cluster of items.

2. The computer-implemented method of claim 1, further comprising accessing item data that includes at least one of: a categorical variable representing an attribute pertinent to the item; or a continuous variable representing a number pertinent to the item.

3. The computer-implemented method of claim 1, wherein the probability distribution is selected from a group consisting of: a binomial distribution; a multinomial distribution; and a Gaussian distribution.

4. The computer-implemented method of claim 1, wherein the probability parameter is selected from a group consisting of: a Bernoulli success probability; a multinomial parameter; and a Gaussian mean.

5. The computer-implemented method of claim 1, wherein the calculating of the result includes calculating a sum based on the logarithm and based on the item variable.

6. The computer-implemented method of claim 1, wherein the calculating of the result includes calculating a sum based on the logarithm and not based on the item variable.

7. The computer-implemented method of claim 1 further comprising: calculating a sum based on the result, the sum representing a total logarithmic likelihood that a plurality of items is included in the cluster; calculating an argument of a maximum of the sum; and estimating the vector representative of the cluster based on the argument of the maximum.

8. The computer-implemented method of claim 1, wherein the item data is generated by a seller of the item, the method further comprising receiving the item data.

9. The computer-implemented method of claim 1, wherein the item data includes at least one of a title, a description, an attribute name, an attribute value, a price, or a size of the item.

10. The computer-implemented method of claim 1 further comprising: storing a map file that includes a correspondence between the item and the cluster; and performing a lookup operation using the map file in response to a query pertinent to at least one of the item or the product represented by the cluster of items.

11. The computer-implemented method of claim 1 further comprising: determining a size of the cluster of items, the size indicating a number of specimens of the product represented by the cluster; performing a comparison of the size to a threshold; and determining that the cluster is to be discarded based on the comparison.

12. A system comprising: an access module, implemented using at least one hardware processor, configured to: access item data that corresponds to an item available for sale and inclusive of an item variable that represents information describing an attribute of the item, the item variable being a binary variable that represents an occurrence of a term in textual data descriptive of the item available for sale; and access a vector that is representative of a cluster of items having attributes corresponding to the attribute of the item, the vector representing a product of which the item is a specimen, the vector including a probability distribution that is pertinent to the cluster of items and represented as a plurality of probability parameters that include a probability parameter modeling the binary variable that represents the occurrence of the term in the textual data descriptive of the item available for sale; a processor configured by a probability module that configures the processor to calculate a result based on the binary variable that represents the occurrence of the term in the textual data descriptive of the item available for sale including calculating a logarithm based on the probability parameter modeling the binary variable and calculating the result based on the logarithm, the result representing a likelihood that the item is a specimen of the product represented by the cluster of items having attributes corresponding to the attribute of the item and represented by the vector; a determination module to determine that the item is included in the cluster based on the result; and a recommendation module to present a recommendation of the item available for sale to a potential buyer of the item, the recommendation indicating that the item is the specimen of the product represented by the cluster of items.

13. The system of claim 12, wherein the probability module, in calculating the result, is to calculate a sum based on the logarithm and not based on the item variable.

14. The system of claim 12 further comprising an estimation module to calculate a sum based on the result, the sum representing a total logarithmic likelihood that a plurality of items is included in the cluster; calculate an argument of a maximum of the sum; and estimate the vector representative of the cluster based on the argument of the maximum.

15. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising: accessing item data that corresponds to an item available for sale and inclusive of an item variable that represents information describing an attribute of the item, the item variable being a binary variable that represents an occurrence of a term in textual data descriptive of the item available for sale; accessing a vector that is representative of a cluster of items having attributes corresponding to the attribute of the item, the vector representing a product of which the item is a specimen, the vector including a probability distribution that is pertinent to the cluster of items and represented as a plurality of probability parameters that include a probability parameter modeling the binary variable that represents the occurrence of the term in the textual data descriptive of the item available for sale; calculating a result based on the binary variable that represents the occurrence of the term in the textual d
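
The calculation recited in claims 1, 5 to 7, and 11 can be sketched roughly as follows. This is an illustrative reading, not the patent's implementation: it assumes each cluster vector holds one Bernoulli success probability per term, that the result is a summed log-likelihood, that the estimate in claim 7 is the usual closed-form maximum-likelihood fraction, and that clusters below the size threshold are the ones discarded (the claim does not state the direction of the comparison). All function names and sample values are hypothetical.

```python
import math


def log_likelihood(item_terms, cluster_params, eps=1e-9):
    """Log-likelihood of an item's binary term variables under one cluster.

    cluster_params maps term -> assumed Bernoulli success probability.
    The log(1 - p) terms do not depend on the item, so they could be
    precomputed once per cluster (one possible reading of claim 6).
    """
    total = 0.0
    for term, p in cluster_params.items():
        p = min(max(p, eps), 1 - eps)      # keep log() finite
        x = 1 if term in item_terms else 0  # binary item variable
        total += x * math.log(p) + (1 - x) * math.log(1 - p)
    return total


def assign(item_terms, clusters):
    """Pick the cluster whose vector gives the highest log-likelihood."""
    return max(clusters, key=lambda name: log_likelihood(item_terms, clusters[name]))


def estimate_params(items, vocabulary):
    """Bernoulli maximum-likelihood estimate: fraction of items with the term."""
    n = len(items)
    return {term: sum(term in item for item in items) / n for term in vocabulary}


def drop_small(cluster_sizes, threshold):
    """Keep clusters with at least `threshold` items (assumed direction)."""
    return {name: size for name, size in cluster_sizes.items() if size >= threshold}


# Hypothetical cluster vectors and a listing title reduced to a term set.
clusters = {
    "ipod": {"ipod": 0.9, "nano": 0.6, "lens": 0.05},
    "camera": {"ipod": 0.05, "nano": 0.05, "lens": 0.8},
}
item = {"ipod", "nano"}
chosen = assign(item, clusters)
params = estimate_params([{"ipod", "nano"}, {"ipod"}], ["ipod", "nano", "lens"])
kept = drop_small({"ipod": 12, "misc": 2}, threshold=5)
print(chosen, params, kept)
```

Working in log space turns a product of per-term probabilities into a sum, which avoids numeric underflow when the vocabulary is large.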

Assignees

Inventors

Classifications

  • using statistics or function optimisation, e.g. modelling of probability density functions · CPC title

  • Approximate or statistical queries · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9852193B2 cover?
A clustering and recommendation machine determines that an item is included in a cluster of items. The machine accesses item data descriptive of the item. The machine accesses a vector that represents the cluster and calculates the likelihood that the item is included in the cluster, based on the item variable and the probability parameter. The machine determines that the item is included in th…
Who is the assignee on this patent?
eBay Inc. The page data also lists Ye Chen and John Canny, who are individuals credited as inventors rather than assignees.
What technology area does this patent fall under?
Primary CPC classification G06F16/2462. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 26, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).