Deep Neural Network-Based Decision Network
US-2018268298-A1 · Sep 20, 2018 · US
US11409945B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11409945-B2 |
| Application number | US-202017027130-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 21, 2020 |
| Priority date | May 19, 2017 |
| Publication date | Aug 9, 2022 |
| Grant date | Aug 9, 2022 |
Official abstract text for this publication.
A system is provided for natural language processing. In some embodiments, the system includes an encoder for generating context-specific word vectors for at least one input sequence of words. The encoder is pre-trained using training data for performing a first natural language processing task. A neural network performs a second natural language processing task on the at least one input sequence of words using the context-specific word vectors. The first natural language processing task is different from the second natural language processing task, and the neural network is trained separately from the encoder. In some embodiments, the first natural language processing task can be machine translation, and the second natural language processing task can be one of sentiment analysis, question classification, entailment classification, and question answering.
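The data flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_encoder` is a hypothetical stand-in for the pre-trained encoder (it only computes a running mean so that each output depends on the preceding words), and `concat_with_context` is an illustrative helper name. Concatenating each word vector with its context-specific vector follows the wording of claim 2.

```python
from typing import Callable, List

Vector = List[float]
Sequence = List[Vector]


def toy_encoder(word_vectors: Sequence) -> Sequence:
    """Hypothetical stand-in for an encoder pre-trained on a first task.

    Each output is the running mean of the inputs seen so far, so the
    vector for a word depends on the words before it.
    """
    outputs: Sequence = []
    running = [0.0] * len(word_vectors[0]) if word_vectors else []
    for position, vec in enumerate(word_vectors, start=1):
        running = [r + v for r, v in zip(running, vec)]
        outputs.append([r / position for r in running])
    return outputs


def concat_with_context(
    word_vectors: Sequence,
    encoder: Callable[[Sequence], Sequence],
) -> Sequence:
    """Append each word's context-specific vector to its word vector."""
    context_vectors = encoder(word_vectors)
    if len(context_vectors) != len(word_vectors):
        raise ValueError("encoder must return one vector per input word")
    return [w + c for w, c in zip(word_vectors, context_vectors)]


# Two words with 2-dimensional word vectors (made-up values).
words = [[1.0, 0.0], [0.0, 1.0]]
inputs_for_task_network = concat_with_context(words, toy_encoder)
```

In a fuller system, the resulting sequence would be fed to the network for the second task, which is trained separately from the encoder.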
Opening claim text (preview).
What is claimed is:
1. A system comprising: a neural network for performing a first natural language processing task, the neural network comprising: a plurality of activation function units adapted to respectively apply a feedforward network with a rectified linear function on a plurality of input vector sequences to produce respective activation sequences; a plurality of encoders adapted to receive the activation sequences and respectively generate a first task specific representation relating to a first input vector sequence of the plurality of input vector sequences and a second task specific representation relating to a second input vector sequence of the plurality of input vector sequences; and a biattention mechanism adapted to generate conditioning information that indicates interdependent representations relating to the first task specific representation and the second task specific representation.
2. The system of claim 1, wherein the first input vector sequence comprises a first concatenated word vector and its corresponding context-specific word vector generated from a first word sequence and the second input vector sequence comprises a second concatenated word vector and its corresponding context-specific word vector generated from a second word sequence.
3. The system of claim 1, wherein the first natural language processing task performed by the neural network is one of sentiment classification and entailment classification.
4. The system of claim 1, wherein the neural network is trained using a dataset for one of sentiment analysis, question classification, entailment classification, and question answering.
5. The system of claim 2, further comprising an encoder that is pre-trained on a second natural language processing task, the encoder capable of generating a context-specific word vector for one of the first and second word sequences, the context-specific word vector forming at least a part of one of the first and second input vector sequences.
6. The system of claim 5, wherein the second natural language processing task is machine-translation.
7. The system of claim 5, wherein the first natural language processing task is different from the second natural language processing task.
8. The system of claim 1, wherein adapting by the biattention mechanism comprises: computing an affinity matrix based on the first and second task specific representations; extracting, based on the affinity matrix, a first attention weight relating to the first task specific representation and a second attention weight relating to the second task specific representation; and generating, based on the first and second attention weights, a first context summary and a second context summary to condition the first and second task specific representations.
9. The system of claim 8, wherein the neural network further comprises: a first integrator capable of integrating the first context summary to generate a first integrated output; and a second integrator capable of integrating the second context summary to generate a second integrated output.
10. The system of claim 9, wherein the neural network further comprises: a first pool mechanism capable of aggregating the first integrated output to generate a first pooled representation relating to the first task specific representation; and a second pool mechanism capable of aggregating the second integrated output to generate a second pooled representation relating to the second task specific representation.
11. The system of claim 10, wherein the neural network further comprises a maxout layer capable of combining the first and second pooled representations to generate a result for the first natural language processing task.
12. A method for performing a first natural language processing task, the method comprising: executing a plurality of activation function units adapted to respectively apply a feedforward network with a rectified linear function on a plurality of input vector sequences to produce respective activation sequences; generating, based on the execution of the plurality of activation function units on the plurality of input vector sequences, a first task specific representation relating to a first input vector sequence of the plurality of input vector sequences and a second task specific representation relating to a second input vector sequence of the plurality of input vector sequences; and computing, based on the first and second task specific representations, an interdependent representation related to the first task specific representation and the second task specific representation.
13. The method of claim 12, wherein the first input vector sequence comprises a first concatenated word vector and its corresponding context-specific word vector generated from a first word sequence and the second input vector sequence comprises a second concatenated word vector and its corresponding context-specific word vector generated from a second word sequence.
14. The method of claim 12, wherein the first natural language processing task is one of sentiment classification and entailment classification.
15. The method of claim 13, further comprising generating, using an encoder that is pre-trained on a second natural language processing task, a context-specific word vector for one of the first and second word sequences, the context-specific word vector forming at least a part of one of the first and second input vector sequences.
16. The method of claim 15, wherein the first natural language processing task is different from the second natural language processing task.
17. The method of claim 12, wherein computing the interdependent representation comprises: computing an affinity matrix based on the first and second task specific representations; extracting, based on the affinity matrix, a first attention weight relating to the first task specific representation and a second attention weight relating to the second task specific representation; and generating, based on the first and second attention weights, a first context summary and a second context summary to condition the first and second task specific representations.
18. The method of claim 12, further comprising: integrating a first context summary to generate a first integrated output; and integrating a second context summary to generate a second integrated output.
19. The method of claim 18, further comprising: aggregating the first integrated output to generate a first pooled representation relating to the first task specific representation; and aggregating the second integrated output to generate a second pooled representation relating to the second task specific representation.
20. The method of claim 19, further comprising: combining the first and second pooled representations to generate a result for the first natural language processing task.
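The biattention steps recited in claims 8 and 17 (affinity matrix, attention weights, context summaries) can be illustrated with a small pure-Python sketch. The claims do not say how the affinity is computed or how the weights are normalized, so the dot-product affinity, the row-wise softmax, and the direction in which each summary is formed are all assumptions made here for illustration. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def biattention(x: Matrix, y: Matrix) -> Tuple[Matrix, Matrix]:
    """Return context summaries for two task-specific representations.

    x has one row per position of the first sequence, y one row per
    position of the second; both share the same feature dimension.
    """
    # Assumption: affinity is a plain dot product, affinity[i][j] = x_i . y_j.
    affinity = matmul(x, transpose(y))
    # Assumption: attention weights are a softmax over the other sequence.
    weights_x = [softmax(row) for row in affinity]
    weights_y = [softmax(row) for row in transpose(affinity)]
    # Each summary is a weighted mix of the other sequence's rows.
    summary_x = matmul(weights_x, y)  # one row per position of x
    summary_y = matmul(weights_y, x)  # one row per position of y
    return summary_x, summary_y


# Made-up inputs: two positions in the first sequence, one in the second.
x = [[1.0, 0.0], [0.0, 1.0]]
y = [[1.0, 0.0]]
summary_x, summary_y = biattention(x, y)
```

With a single position in `y`, every row of `summary_x` equals that position, while `summary_y` blends the two rows of `x` according to their affinity with it. The integrators, pooling mechanisms, and maxout layer of claims 9 to 11 would consume these summaries and are not modeled here.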
Combinations of networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Transfer learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title