Gaussian ranking using matrix factorization

US10180968B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10180968-B2
Application numberUS-201615044020-A
CountryUS
Kind codeB2
Filing dateFeb 15, 2016
Priority dateJul 23, 2015
Publication dateJan 15, 2019
Grant dateJan 15, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment of the present invention, a training engine teaches a matrix factorization model to rank items for users based on implicit feedback data and a rank loss function. In operation, the training engine approximates a distribution of scores to corresponding ranks as an approximately Gaussian distribution. Based on this distribution, the training engine selects an activation function that smoothly maps between scores and ranks. To train the matrix factorization model, the training engine directly optimizes the rank loss function based on the activation function and implicit feedback data. By contrast, conventional training engines that optimize approximations of the rank loss function are typically less efficient and produce less accurate ranking models.

First claim

Opening claim text (preview).

The invention claimed is: 1. A computer-implemented method, comprising: identifying a plurality of items that are associated with a plurality of scores generated by one or more users; determining an activation function that maps the plurality of scores and a plurality of ranks associated with the plurality of items based on an approximate Gaussian distribution of the plurality of scores; computing, based on a matrix factorization model, a first predicted score for input data associated with a first item included in the plurality of items; computing a first value of a rank loss function based on the first predicted score and the activation function; modifying one or more elements included in the matrix factorization model based on the first value of the rank loss function; and generating a list of scored items by applying the modified matrix factorization model to the plurality of items. 2. The method of claim 1 , wherein modifying the one or more elements comprises: computing a derivative of the rank loss function at a first element included in the one or more elements; computing a delta from the first element, wherein the delta is in a direction associated with a negative gradient of the rank loss function based on the derivative of the rank loss function; and adding the delta to the first element. 3. The method of claim 2 , wherein computing the derivative of the rank loss function comprises: computing a value of the activation function at the first predicted score; and computing a derivative of the rank loss function at the first element based on the value of the activation function. 4. The method of claim 1 , wherein the matrix factorization model includes a first matrix and a second matrix, the one or more elements comprises a first element that is included in the first matrix, and further comprising: computing, based on the matrix factorization model, a second predicted score for input data associated with a second item included in the plurality of items; modifying a second element included in the first matrix based on the second predicted score; computing a second value of the rank loss function based on the second predicted score and the activation function; and modifying a third element included in the second matrix based on the first value of the rank loss function and the second value of the rank loss function. 5. The method of claim 4 , wherein the first item is associated with positive feedback from a user and the second item is not associated with any feedback from the user. 6. The method of claim 1 , wherein the activation function comprises a nonlinear approximation to a cumulative distribution function derived from the approximate Gaussian distribution. 7. The method of claim 6 , wherein the nonlinear approximation comprises a logistic function. 8. The method of claim 1 , wherein the rank loss function measures a normalized Discounted Cumulative Gain (nDCG) metric. 9. The method of claim 1 , wherein modifying the one or more elements comprises executing one or more stochastic gradient descent optimization operations on the matrix factorization model to minimize the rank loss function. 10. A computer-implemented computer-readable storage medium including instructions that, when executed by a processor, cause the processor to perform the steps of: identifying a plurality of items that are associated with a plurality of scores generated by one or more users; computing, based on a matrix factorization model, one or more predicted scores for input data associated with one or more items included in the plurality of items; determining a mapping of the plurality of scores and one or more corresponding ranks associated with the plurality of items based on an expected approximate distribution of the plurality of scores; and modifying one or more elements included in the matrix factorization model to minimize a training objective based on the mapping, wherein the training objective includes a rank loss function; and generating a list of scored items by applying the modified matrix factorization model to the plurality of items. 11. The computer-readable storage medium of claim 10 , wherein modifying the one or more elements comprises: computing a derivative of the rank loss function at a first element included in the one or more elements; computing a delta from the first element, wherein the delta is in a direction associated with a decreasing gradient of the rank loss function based on the derivative of the rank loss function; and adding the delta to the first element. 12. The computer-readable storage medium of claim 10 , wherein the mapping comprises a nonlinear approximation to the expected approximate distribution. 13. The computer-readable storage medium of claim 12 , wherein the nonlinear approximation comprises a piecewise quadratic function. 14. The computer-readable storage medium of claim 10 , wherein the rank loss function measures an area under a Receiver Operating Characteristic curve. 15. The computer-readable storage medium of claim 10 , wherein modifying the one or more elements comprises executing one or more listwise optimization operations on the matrix factorization model. 16. The computer-readable storage medium of claim 10 , wherein the matrix factorization model includes a first matrix and a second matrix, and generating the list of scored items comprises: multiplying the first matrix with an input vector associated with a first user and the plurality of items to generate a latent user vector; multiplying the second matrix with the latent user vector to generate an updated predicted score for each item included in the plurality of items; and ranking each item in the plurality of items based on the plurality of updated predicted scores. 17. A system, comprising: at least one memory storing instructions associated with a matrix factorization model and a training engine; and a processor that is coupled to the at least one memory and, when executing the instructions, is configured to: identify a plurality of items that are associated with a plurality of scores generated by one or more users; determine an activation function that maps the plurality of scores and a plurality of ranks associated with the plurality of items based on an approximate Gaussian distribution of the plurality of scores; compute, based on a matrix factorization model, a first predicted score for input data associated with a first item included in the plurality of items; compute a derivative of a rank loss function based on the first predicted score and the activation function; modify one or more elements included in the matrix factorization model based on the derivative of the rank loss function to minimize discrepancies between the plurality of scores and the plurality of ranks; and generate a list of scored items by applying the modified matrix factorization model to the plurality of items. 18. The system of claim 17 , wherein the training engine is configured to compute the derivative of the rank loss function by applying a chain rule to a derivative of the activation function at the first predicted score. 19. The system of claim 17 , wherein the activation function comprises a nonlinear approximation to a cumulative distribution function derived from the approximate Gaussian distribution. 20. The system of claim 17 , wherein the rank loss function measures a normalized Discounted Cumulative Gain (nDCG) metric.

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10180968B2 cover?
In one embodiment of the present invention, a training engine teaches a matrix factorization model to rank items for users based on implicit feedback data and a rank loss function. In operation, the training engine approximates a distribution of scores to corresponding ranks as an approximately Gaussian distribution. Based on this distribution, the training engine selects an activation function…
Who is the assignee on this patent?
Netflix Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/3053. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 15 2019 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).