Natural language processing using context-specific word vectors

US11409945B2 · US · B2

Patent metadata

Field               Value
Publication number  US-11409945-B2
Application number  US-202017027130-A
Country             US
Kind code           B2
Filing date         Sep 21, 2020
Priority date       May 19, 2017
Publication date    Aug 9, 2022
Grant date          Aug 9, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system is provided for natural language processing. In some embodiments, the system includes an encoder for generating context-specific word vectors for at least one input sequence of words. The encoder is pre-trained using training data for performing a first natural language processing task. A neural network performs a second natural language processing task on the at least one input sequence of words using the context-specific word vectors. The first natural language process task is different from the second natural language processing task and the neural network is separately trained from the encoder. In some embodiments, the first natural processing task can be machine translation, and the second natural processing task can be one of sentiment analysis, question classification, entailment classification, and question answering.
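The data flow in the abstract can be sketched in a few lines: a frozen, separately trained encoder turns a word sequence into context-specific vectors, and each of those is paired with the original word vector before being handed to the downstream network. This is only an illustration under stated assumptions. `toy_context_encoder` is a hypothetical stand-in (a simple averaging function), not the machine-translation encoder the patent describes, and the concatenation step follows the wording of claim 2.

```python
from typing import Callable, List

Vector = List[float]
Sequence = List[Vector]


def toy_context_encoder(word_vectors: Sequence) -> Sequence:
    """Hypothetical stand-in for a frozen, pre-trained encoder.

    Each output mixes a word's own vector with the mean of the whole
    sequence, so the same word gets a different vector in a different
    sentence. A real encoder would be a trained network.
    """
    dim = len(word_vectors[0])
    count = len(word_vectors)
    mean = [sum(v[i] for v in word_vectors) / count for i in range(dim)]
    return [[0.5 * v[i] + 0.5 * mean[i] for i in range(dim)] for v in word_vectors]


def build_inputs(
    word_vectors: Sequence, encoder: Callable[[Sequence], Sequence]
) -> Sequence:
    """Concatenate each word vector with its context-specific vector."""
    # The encoder is only called here, never updated: it is trained separately.
    context_vectors = encoder(word_vectors)
    return [w + c for w, c in zip(word_vectors, context_vectors)]


# The first word is identical in both sentences, but its context differs.
sentence_a = [[1.0, 0.0], [0.0, 1.0]]
sentence_b = [[1.0, 0.0], [1.0, 0.0]]
inputs_a = build_inputs(sentence_a, toy_context_encoder)
inputs_b = build_inputs(sentence_b, toy_context_encoder)
```

In this toy run the first half of `inputs_a[0]` and `inputs_b[0]` (the plain word vector) is the same, while the second half (the context-specific part) differs, which is the property the downstream task network is meant to exploit.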

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a neural network for performing a first natural language processing task, the neural network comprising: a plurality of activation function units adapted to respectively apply a feedforward network with a rectified linear function on a plurality of input vector sequences to produce respective activation sequences; a plurality of encoders adapted to receive the activation sequences and respectively generate a first task specific representation relating to a first input vector sequence of the plurality of input vector sequences and a second task specific representation relating to a second input vector sequence of the plurality of input vector sequences; and a biattention mechanism adapted to generate conditioning information that indicates interdependent representations relating to the first task specific representation and the second task specific representation.

2. The system of claim 1, wherein the first input vector sequence comprises a first concatenated word vector and its corresponding context-specific word vector generated from a first word sequence and the second input vector sequence comprises a second concatenated word vector and its corresponding context-specific word vector generated from a second word sequence.

3. The system of claim 1, wherein the first natural language processing task performed by the neural network is one of sentiment classification and entailment classification.

4. The system of claim 1, wherein the neural network is trained using a dataset for one of sentiment analysis, question classification, entailment classification, and question answering.

5. The system of claim 2, further comprising an encoder that is pre-trained on a second natural language processing task, the encoder capable of generating a context-specific word vector for one of the first and second word sequences, the context-specific word vector forming at least a part of one of the first and second input vector sequences.

6. The system of claim 5, wherein the second natural language processing task is machine-translation.

7. The system of claim 5, wherein the first natural language processing task is different from the second natural language processing task.

8. The system of claim 1, wherein adapting by the biattention mechanism comprises: computing an affinity matrix based on the first and second task specific representations; extracting, based on the affinity matrix, a first attention weight relating to the first task specific representation and a second attention weight relating to the second task specific representation; and generating, based on the first and second attention weights, a first context summary and a second context summary to condition the first and second task specific representations.

9. The system of claim 8, wherein the neural network further comprises: a first integrator capable of integrating the first context summary to generate a first integrated output; and a second integrator capable of integrating the second context summary to generate a second integrated output.

10. The system of claim 9, wherein the neural network further comprises: a first pool mechanism capable of aggregating the first integrated output to generate a first pooled representation relating to the first task specific representation; and a second pool mechanism capable of aggregating the second integrated output to generate a second pooled representation relating to the second task specific representation.

11. The system of claim 10, wherein the neural network further comprises a maxout layer capable of combining the first and second pooled representations to generate a result for the first natural language processing task.

12. A method for performing a first natural language processing task, the method comprising: executing a plurality of activation function units adapted to respectively apply a feedforward network with a rectified linear function on a plurality of input vector sequences to produce respective activation sequences; generating, based on the execution of the plurality of activation function units on the plurality of input vector sequences, a first task specific representation relating to a first input vector sequence of the plurality of input vector sequences and a second task specific representation relating to a second input vector sequence of the plurality of input vector sequences; and computing, based on the first and second task specific representations, an interdependent representation related to the first task specific representation and the second task specific representation.

13. The method of claim 12, wherein the first input vector sequence comprises a first concatenated word vector and its corresponding context-specific word vector generated from a first word sequence and the second input vector sequence comprises a second concatenated word vector and its corresponding context-specific word vector generated from a second word sequence.

14. The method of claim 12, wherein the first natural language processing task is one of sentiment classification and entailment classification.

15. The method of claim 13, further comprising generating, using an encoder that is pre-trained on a second natural language processing task, a context-specific word vector for one of the first and second word sequences, the context-specific word vector forming at least a part of one of the first and second input vector sequences.

16. The method of claim 15, wherein the first natural language processing task is different from the second natural language processing task.

17. The method of claim 12, wherein computing the interdependent representation comprises: computing an affinity matrix based on the first and second task specific representations; extracting, based on the affinity matrix, a first attention weight relating to the first task specific representation and a second attention weight relating to the second task specific representation; and generating, based on the first and second attention weights, a first context summary and a second context summary to condition the first and second task specific representations.

18. The method of claim 12, further comprising: integrating a first context summary to generate a first integrated output; and integrating a second context summary to generate a second integrated output.

19. The method of claim 18, further comprising: aggregating the first integrated output to generate a first pooled representation relating to the first task specific representation; and aggregating the second integrated output to generate a second pooled representation relating to the second task specific representation.

20. The method of claim 19, further comprising: combining the first and second pooled representations to generate a result for the first natural language processing task.
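The three biattention steps recited in claims 8 and 17 (affinity matrix, attention weights, context summaries) can be illustrated with a small sketch. The claims do not fix the formulas, so the choices here are assumptions: the affinity matrix is taken as the dot product of every position in one sequence with every position in the other, and the attention weights are taken as a column-wise softmax of that matrix. All function names are hypothetical, and the integrator, pooling, and maxout stages of claims 9 to 11 are not shown.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def softmax_columns(a: Matrix) -> Matrix:
    """Normalize each column so its entries are positive and sum to 1."""
    normalized_cols = []
    for col in zip(*a):
        peak = max(col)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        normalized_cols.append([e / total for e in exps])
    return transpose(normalized_cols)


def biattention(x: Matrix, y: Matrix) -> Tuple[Matrix, Matrix]:
    """Return context summaries (c_x, c_y) for two representations.

    x has one row per position of the first sequence, y one row per
    position of the second; both share the same feature dimension.
    """
    affinity = matmul(x, transpose(y))          # len(x) rows, len(y) columns
    a_x = softmax_columns(affinity)             # weights over x, per y position
    a_y = softmax_columns(transpose(affinity))  # weights over y, per x position
    c_x = matmul(transpose(a_x), x)             # one summary of x per y position
    c_y = matmul(transpose(a_y), y)             # one summary of y per x position
    return c_x, c_y


# Toy task-specific representations: 3 positions and 2 positions, 2 features.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [[1.0, 0.0], [0.0, 2.0]]
c_x, c_y = biattention(x, y)
```

Under these assumptions each row of `c_x` is a weighted average of the rows of `x`, with weights chosen by how strongly each position of `x` matches one position of `y`, and likewise for `c_y`. That is one way to read the claim language about summaries that "condition" the two representations on each other.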

Assignees

  • Salesforce Com Inc

Inventors

Classifications

  • Combinations of networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Transfer learning · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11409945B2 cover?
A system is provided for natural language processing. In some embodiments, the system includes an encoder for generating context-specific word vectors for at least one input sequence of words. The encoder is pre-trained using training data for performing a first natural language processing task. A neural network performs a second natural language processing task on the at least one input sequen…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 9, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).