Content recommendation system using a neural network language model
US-9934515-B1 · Apr 3, 2018 · US
US11954594B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11954594-B1 |
| Application number | US-202117315695-A |
| Country | US |
| Kind code | B1 |
| Filing date | May 10, 2021 |
| Priority date | Jun 5, 2015 |
| Publication date | Apr 9, 2024 |
| Grant date | Apr 9, 2024 |
Official abstract text for this publication.
This document generally describes a neural network training system, including one or more computers, that trains a recurrent neural network (RNN) to receive an input, e.g., an input sequence, and to generate a sequence of outputs from the input sequence. In some implementations, training can include, for each position after an initial position in a training target sequence, selecting a preceding output of the RNN to provide as input to the RNN at the position, including determining whether to select as the preceding output (i) a true output in a preceding position in the output order or (ii) a value derived from an output of the RNN for the preceding position in an output order generated in accordance with current values of the parameters of the recurrent neural network.
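The per-position choice described in the abstract can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_step` is a hypothetical stand-in for one RNN step, `sample_inputs` and `start_token` are illustrative names, and `epsilon` is assumed here to be the probability of feeding back the model's own prediction. The "value derived from an output of the RNN" is taken to be the highest-scoring prediction, which is one of several possible derivations.

```python
import random


def toy_step(prev_token, state, vocab_size=5):
    """Hypothetical stand-in for one RNN step: returns (scores, new_state)."""
    state = (0 if state is None else state) + 1
    best = (prev_token + 1) % vocab_size  # toy rule: always predict prev + 1
    scores = [1.0 if k == best else 0.0 for k in range(vocab_size)]
    return scores, state


def sample_inputs(target, step_fn, epsilon, rng, start_token=0):
    """Return the input actually fed to the model at each position.

    At every position after the first, the input is either (i) the true
    output from the preceding position or (ii) the model's own
    highest-scoring prediction for the preceding position.
    """
    fed_inputs = []
    prev_input, state = start_token, None
    for t in range(len(target)):
        fed_inputs.append(prev_input)
        scores, state = step_fn(prev_input, state)
        predicted = max(range(len(scores)), key=scores.__getitem__)
        # Option (ii) with probability epsilon, option (i) otherwise.
        use_prediction = rng.random() < epsilon
        prev_input = predicted if use_prediction else target[t]
    return fed_inputs


rng = random.Random(0)
always_true = sample_inputs([3, 1, 4], toy_step, epsilon=0.0, rng=rng)
always_predicted = sample_inputs([3, 1, 4], toy_step, epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the loop always feeds the true preceding output; with `epsilon=1.0` it feeds back the model's own predictions, which is what the model sees at inference time.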
Opening claim text (preview).
What is claimed is:
1. A method for training a sequence generation model, wherein: for each particular position after an initial position in an output order of a target sequence, the sequence generation model is configured (i) to receive an input that is based on a set of preceding output scores that the sequence generation model generated for a prediction at a preceding position in the output order and (ii) to generate a current set of output scores for the particular position in the output order, and the current set of output scores for the particular position in the output order comprises a respective score for each of a plurality of possible predictions for the particular position, and the method comprising: obtaining a plurality of training data pairs for the sequence generation model, each training data pair comprising a training input and a training target sequence for the training input, each training target sequence comprising a respective plurality of true outputs arranged according to an output order; and training the sequence generation model on the training data pairs, comprising, for each training data pair and for each particular position after an initial position in the output order of the training target sequence of the training data pair, selecting an input to provide to the sequence generation model for generating an output at the particular position in the output order of the training target sequence, wherein the input is selected from a group comprising (i) a non-predicted input that is based on the true output from a preceding position in the output order of the training target sequence of the training data pair, and (ii) a predicted input that is based on a set of preceding output scores that the sequence generation model generated at a preceding model.
2. The method of claim 1, wherein training the sequence generation model on the training data pairs comprises, for each training data pair and for each particular position after the initial position in the output order of the training target sequence of the training data pair: determining an error between the output generated by the sequence generation model at the particular position in the output order and the true output indicated by the training data pair for the particular position in the output order; and using the error to adjust values of the trainable parameters of the sequence generation model.
3. The method of claim 1, wherein the predicted input from a preceding position in the output order is a prediction from the preceding position in the output order that scored highest among all possible predictions for the preceding position.
4. The method of claim 1, wherein selecting the input to provide to the sequence generation model for generating the output at the particular position in the output order of the training target sequence comprises evaluating a stochastic function, wherein the stochastic function assigns a probability of 1−ε to the option of selecting the non-predicted input from the preceding position in the output order as the input to the sequence generation model at the particular position in the output order, and wherein the stochastic function assigns a probability of ε to the option of selecting the predicted input from the preceding position in the output order as the input to the sequence generation model at the particular position in the output order.
5. The method of claim 4, comprising increasing the value of ε as training of the sequence generation model progresses, such that relatively lower values of ε are applied earlier in the training of the sequence generation model and relatively higher values of ε are applied later in the training of the sequence generation model.
6. The method of claim 5, wherein increasing the value of ε comprises increasing the value of ε using linear decay.
7. The method of claim 5, wherein increasing the value of ε comprises increasing the value of ε using exponential decay or inverse sigmoid decay.
8. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations for training a sequence generation model, wherein: for each particular position after an initial position in an output order of a target sequence, the sequence generation model is configured (i) to receive an input that is based on a set of preceding output scores that the sequence generation model generated for a prediction at a preceding position in the output order and (ii) to generate a current set of output scores for the particular position in the output order, and the current set of output scores for the particular position in the output order comprises a respective score for each of a plurality of possible predictions for the particular position, and the operations comprising: obtaining a plurality of training data pairs for the sequence generation model, each training data pair comprising a training input and a training target sequence for the training input, each training target sequence comprising a respective plurality of true outputs arranged according to an output order; and training the sequence generation model on the training data pairs, comprising, for each training data pair and for each particular position after an initial position in the output order of the training target sequence of the training data pair, selecting an input to provide to the sequence generation model for generating an output at the particular position in the output order of the training target sequence, wherein the input is selected from a group comprising (i) a non-predicted input that is based on the true output from a preceding position in the output order of the training target sequence of the training data pair, and (ii) a predicted input that is based on a set of preceding output scores that the sequence generation model generated at a preceding model.
9. The system of claim 8, wherein training the sequence generation model on the training data pairs comprises, for each training data pair and for each particular position after the initial position in the output order of the training target sequence of the training data pair: determining an error between the output generated by the sequence generation model at the particular position in the output order and the true output indicated by the training data pair for the particular position in the output order; and using the error to adjust values of the trainable parameters of the sequence generation model.
10. The system of claim 8, wherein the predicted input from a preceding position in the output order is a prediction from the preceding position in the output order that scored highest among all possible predictions for the preceding position.
11. The system of claim 8, wherein selecting the input to provide to the sequence generation model for generating the output at the particular position in the output order of the training target sequence comprises evaluating a stochastic function, wherein the stochastic function assigns a probability of 1−ε to the option of selecting the non-predicted input from the preceding position in the output order as the input to the sequence generation model at the particular position in the output order, and wherein the stochastic function assigns a probability of ε to the option of selecting the predicted input from the preceding position in the output order as the input to the sequence generation model at the particular position in the output order.
12. The system of claim 11, wherein the operations comprise increasing the value of ε as
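
The claims name linear, exponential, and inverse sigmoid decay without giving formulas. The sketch below assumes one common reading: the probability of feeding the true output (the "1−ε" term) decays over training, so the probability of feeding the predicted input rises. The functional forms, constants, and function names are illustrative assumptions, not taken from the claims.

```python
import math


def linear_true_prob(step, start=1.0, slope=0.01, floor=0.0):
    """Probability of using the true output, decaying linearly to a floor."""
    return max(floor, start - slope * step)


def exponential_true_prob(step, k=0.99):
    """Probability of using the true output, decaying as k**step (0 < k < 1)."""
    return k ** step


def inverse_sigmoid_true_prob(step, k=100.0):
    """Probability of using the true output, decaying along an inverse sigmoid (k >= 1)."""
    exponent = min(step / k, 700.0)  # cap to avoid float overflow in exp
    return k / (k + math.exp(exponent))


def predicted_prob(step, schedule):
    """Probability of using the predicted input: one minus the decayed value."""
    return 1.0 - schedule(step)


schedules = (linear_true_prob, exponential_true_prob, inverse_sigmoid_true_prob)
early = [predicted_prob(0, s) for s in schedules]
late = [predicted_prob(1000, s) for s in schedules]
```

Each schedule starts near zero, so early training mostly sees true outputs, and approaches one later, so the model increasingly trains on its own predictions.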
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Learning methods · CPC title
Supervised learning · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title