Temporal features in a messaging platform
US-9117227-B1 · Aug 25, 2015 · US
US9519858B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9519858-B2 |
| Application number | US-201313763701-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 10, 2013 |
| Priority date | Feb 10, 2013 |
| Publication date | Dec 13, 2016 |
| Grant date | Dec 13, 2016 |
Official abstract text for this publication.
A system is described herein which uses a neural network having an input layer that accepts an input vector and a feature vector. The input vector represents at least part of input information, such as, but not limited to, a word or phrase in a sequence of input words. The feature vector provides supplemental information pertaining to the input information. The neural network produces an output vector based on the input vector and the feature vector. In one implementation, the neural network is a recurrent neural network. Also described herein are various applications of the system, including a machine translation application.
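The data flow in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `forward`, `U`, `F`, and `V`, the layer sizes, the random weights, and the `tanh` and softmax nonlinearities are all assumptions chosen for the example. It shows only that the input vector and the feature vector enter the network as separate inputs, each with its own weight matrix, and that one output vector is produced from both.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax: shift by the max before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(word_vec, feature_vec, U, F, V):
    # The input vector and the feature vector are kept separate and
    # each is multiplied by its own matrix (U and F are hypothetical names).
    hidden = np.tanh(U @ word_vec + F @ feature_vec)  # tanh is an assumption
    # The output vector is read here as a probability distribution over words.
    return softmax(V @ hidden)

# Toy sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
vocab_size, n_topics, n_hidden = 6, 3, 4
U = rng.normal(scale=0.1, size=(n_hidden, vocab_size))
F = rng.normal(scale=0.1, size=(n_hidden, n_topics))
V = rng.normal(scale=0.1, size=(vocab_size, n_hidden))

word_vec = np.zeros(vocab_size)
word_vec[2] = 1.0                          # one-hot encoding of word index 2
feature_vec = np.array([0.7, 0.2, 0.1])    # made-up supplemental feature values

probs = forward(word_vec, feature_vec, U, F, V)
```

Here `probs` has one entry per vocabulary word and sums to 1. Changing `feature_vec` while holding `word_vec` fixed changes the distribution, which is the role the abstract assigns to the supplemental information.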
Opening claim text (preview).
What is claimed is:

1. A method performed using one or more processing devices, the method comprising: receiving a word input vector at an input layer of a neural network, the word input vector representing an individual word from an input sequence of words; receiving a topic feature vector at the input layer of the neural network, the topic feature vector being separate from the word input vector and representing topics expressed in the input sequence of words; using the neural network to generate an output vector at an output layer of the neural network based at least on the word input vector and the topic feature vector, wherein using the neural network includes, by a hidden layer of the neural network: modifying the word input vector using a first learned matrix; and modifying the topic feature vector using a second learned matrix that is separate from the first learned matrix, wherein the output vector represents a word probability given the word input vector and the topic feature vector; and performing a natural language processing operation based at least on the word probability represented by the output vector.

2. The method of claim 1, wherein using the neural network includes: by the hidden layer of the neural network, modifying a time-delayed hidden-state vector with a third learned matrix, wherein the time-delayed hidden-state vector represents an output of the hidden layer in a prior time instance, wherein the word input vector, the topic feature vector, and the time-delayed hidden-state vector are separate vectors, and wherein the first learned matrix, the second learned matrix, and the third learned matrix are separate matrices.

3. The method of claim 2, wherein using the neural network includes, by the output layer of the neural network: modifying the output of the hidden layer with a fourth learned matrix; and modifying the topic feature vector with a fifth learned matrix, wherein the first learned matrix, the second learned matrix, the third learned matrix, the fourth learned matrix, and the fifth learned matrix are separate matrices.

4. The method of claim 2, wherein using the neural network includes, by the hidden layer: performing a first multiplication operation of the word input vector by the first learned matrix to generate a first multiplication output; performing a second multiplication operation of the topic feature vector by the second learned matrix to generate a second multiplication output; performing a third multiplication operation of the time-delayed hidden-state vector by the third learned matrix to generate a third multiplication output; and summing the first multiplication output, the second multiplication output, and the third multiplication output to generate the output of the hidden layer.

5. The method of claim 1, further comprising: generating the topic feature vector using a Latent Dirichlet Allocation (LDA) technique; and as subsequent words from the input sequence are processed using the neural network, incrementally generating next topic feature vectors based at least on previous feature vectors.

6. The method of claim 5, wherein said incrementally generating comprises applying a decay factor to previous topic feature vectors for previous words that have already been processed by the neural network.

7. The method of claim 1, wherein the input sequence of words is part of an input document.

8. A system comprising: at least one processing device; and at least one computer readable medium storing instructions which, when executed by the at least one processing device, cause the at least one processing device to: receive a word input vector at an input layer of a neural network, the word input vector representing an individual word from an input sequence of words; receive a topic feature vector at the input layer of the neural network, the topic feature vector being separate from the word input vector and representing topics expressed in the input sequence of words; use the neural network to generate an output vector at an output layer of the neural network based at least on the word input vector and the topic feature vector, wherein using the neural network includes, by a hidden layer of the neural network: modifying the word input vector using a first learned matrix; and modifying the topic feature vector using a second learned matrix that is separate from the first learned matrix, wherein the output vector represents a word probability given the word input vector and the topic feature vector; and perform a natural language processing operation based at least on the word probability represented by the output vector.

9. The system of claim 8, wherein the instructions, when executed by the at least one processing device, cause the at least one processing device to: by the hidden layer of the neural network, modify a time-delayed hidden-state vector with a third learned matrix, wherein the time-delayed hidden-state vector represents an output of the hidden layer in a prior time instance, wherein the word input vector, the topic feature vector, and the time-delayed hidden-state vector are separate vectors, and wherein the first learned matrix, the second learned matrix, and the third learned matrix are separate matrices.

10. The system of claim 9, wherein the instructions, when executed by the at least one processing device, cause the at least one processing device to: by the output layer of the neural network: modify the output of the hidden layer with a fourth learned matrix; and modify the topic feature vector with a fifth learned matrix, wherein the first learned matrix, the second learned matrix, the third learned matrix, the fourth learned matrix, and the fifth learned matrix are separate matrices.

11. The system of claim 9, wherein the instructions, when executed by the at least one processing device, cause the at least one processing device to: by the hidden layer of the neural network: perform a first multiplication operation of the word input vector by the first learned matrix to generate a first multiplication output; perform a second multiplication operation of the topic feature vector by the second learned matrix to generate a second multiplication output; perform a third multiplication operation of the time-delayed hidden-state vector by the third learned matrix to generate a third multiplication output; and sum the first multiplication output, the second multiplication output, and the third multiplication output to generate the output of the hidden layer.

12. The system of claim 8, wherein the instructions, when executed by the at least one processing device, cause the at least one processing device to: generate the topic feature vector using a Latent Dirichlet Allocation (LDA) technique; and as subsequent words from the input sequence are processed using the neural network, incrementally generate next topic feature vectors based at least on previous feature vectors.

13. The system of claim 12, wherein the instructions, when executed by the at least one processing device, cause the at least one processing device to: apply a decay factor to previous topic feature vectors for previous words that have already been processed by the neural network.

14. The system of claim 8, wherein the input sequence of words is part of an input document.

15. At least one computer readable storage medium storing instructions which, when executed by at least one processing device, cause the at least one processing device to perform acts comprising: receiving a word input vector at an input layer of a n
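The recurrent arrangement in claims 2 to 6 can be illustrated with a short sketch. Everything named here is hypothetical: `rnn_step`, `update_topics`, the matrices `U`, `F`, `W`, `V`, `G` (standing in for the first through fifth learned matrices), the toy sizes, and the random weights. The claims state that the hidden layer sums three matrix products and that the output layer uses a fourth and a fifth matrix; the sigmoid, the softmax, the addition of the two output-layer products, and the specific decay rule below are assumptions made for the example, and the per-word topic rows are random stand-ins for values an LDA model would supply.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_step(w, f, s_prev, U, F, W, V, G):
    # Hidden layer: three separate products, then summed (claims 2 and 4).
    pre = U @ w + F @ f + W @ s_prev
    s = sigmoid(pre)              # nonlinearity is an assumption
    # Output layer: hidden output times a fourth matrix, topic vector
    # times a fifth matrix (claim 3); adding them is an assumption.
    y = softmax(V @ s + G @ f)
    return s, y

def update_topics(f_prev, word_topics, decay):
    # Hypothetical decay rule: down-weight the running topic vector by
    # `decay`, blend in the current word's topic weights, renormalize.
    mixed = decay * f_prev + (1.0 - decay) * word_topics
    return mixed / mixed.sum()

rng = np.random.default_rng(0)
vocab, topics, hidden = 5, 3, 4
U = rng.normal(scale=0.1, size=(hidden, vocab))
F = rng.normal(scale=0.1, size=(hidden, topics))
W = rng.normal(scale=0.1, size=(hidden, hidden))
V = rng.normal(scale=0.1, size=(vocab, hidden))
G = rng.normal(scale=0.1, size=(vocab, topics))
# Stand-in for LDA output: one topic distribution per vocabulary word.
word_topic_table = rng.dirichlet(np.ones(topics), size=vocab)

s = np.zeros(hidden)                   # time-delayed hidden state
f = np.full(topics, 1.0 / topics)      # start from a uniform topic vector
outputs = []
for word_id in [1, 4, 2]:              # a toy input sequence of word indices
    w = np.zeros(vocab)
    w[word_id] = 1.0                   # one-hot word input vector
    f = update_topics(f, word_topic_table[word_id], decay=0.9)
    s, y = rnn_step(w, f, s, U, F, W, V, G)
    outputs.append(y)
```

Each entry of `outputs` is a probability distribution over the toy vocabulary. With `decay` close to 1 the topic vector changes slowly from word to word, and with `decay` close to 0 it tracks only the most recent word.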
Learning methods · CPC title
Statistical methods, e.g. probability models · CPC title
Architecture, e.g. interconnection topology · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Supervised learning · CPC title