US11657233B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11657233-B2 |
| Application number | US-202217673709-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 16, 2022 |
| Priority date | Apr 18, 2019 |
| Publication date | May 23, 2023 |
| Grant date | May 23, 2023 |
Official abstract text for this publication.
Systems and methods for unifying question answering and text classification via span extraction include a preprocessor for preparing a source text and an auxiliary text based on a task type of a natural language processing task, an encoder for receiving the source text and the auxiliary text from the preprocessor and generating an encoded representation of a combination of the source text and the auxiliary text, and a span-extractive decoder for receiving the encoded representation and identifying a span of text within the source text that is a result of the NLP task. The task type is one of entailment, classification, or regression. In some embodiments, the source text includes one or more of text received as input when the task type is entailment, a list of classifications when the task type is entailment or classification, or a list of similarity options when the task type is regression.
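The preprocessing step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `prepare_texts`, the `DEFAULT_OPTIONS` table, the option strings, and the choice of which input becomes the auxiliary text are all hypothetical and chosen only for illustration. What it does follow from the abstract is that the option list is appended to the source text, and that a second input text joins the source text for entailment and regression.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical option lists; the values are illustrative assumptions only.
DEFAULT_OPTIONS: Dict[str, List[str]] = {
    "entailment": ["entailment", "contradiction", "neutral"],
    "classification": ["positive", "negative"],
    "regression": ["0.0", "0.25", "0.5", "0.75", "1.0"],  # similarity options
}


def prepare_texts(
    task_type: str,
    text_inputs: List[str],
    options: Optional[List[str]] = None,
) -> Tuple[str, str]:
    """Return (source_text, auxiliary_text) for a span-extraction model.

    Assumption: the first input is used as the auxiliary text, and for
    entailment/regression the second input becomes part of the source text.
    """
    if task_type not in DEFAULT_OPTIONS:
        raise ValueError(f"unsupported task type: {task_type!r}")

    # Options are either supplied by the caller or looked up by task type.
    opts = options if options is not None else DEFAULT_OPTIONS[task_type]
    option_text = " ".join(opts)

    auxiliary = text_inputs[0]
    if task_type == "classification":
        # The source text holds only the candidate classifications.
        source = option_text
    else:
        # Entailment or regression: append the options to the second input.
        source = f"{text_inputs[1]} {option_text}"
    return source, auxiliary


if __name__ == "__main__":
    print(prepare_texts("classification", ["A wonderful film."]))
    print(prepare_texts("entailment", ["A man is sleeping.", "A man is awake."]))
```

Because the answer options are part of the source text, a single span-extraction model can return a class label or a similarity value by pointing at the corresponding option tokens.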
Opening claim text (preview).
What is claimed is:

1. A system for performing a natural language processing (NLP) task comprising: a communication interface receiving an input text for an NLP task; a memory storing a plurality of processor-executable instructions; and a processor executing the instructions to: prepare a source text and an auxiliary text from the input text by appending one or more option outputs of the NLP task to the source text based on a task type of the NLP task, wherein the preparing comprises including a list of similarity options in the source text when the task type is regression; concatenating the appended source text including the one or more option outputs and the auxiliary text into a vector input; encoding, via an encoder, the vector input into an encoded representation; and identifying, by a span-extractive decoder from the encoded representation, a span of text within the appended source text including the one or more option outputs as a result of the NLP task.
2. The system of claim 1, wherein the encoder is a multi-layer attention-based encoder.
3. The system of claim 1, wherein the span-extractive decoder comprises: a first softmax for combining a trainable parameter vector associated with start token positions of the span of text and a portion of the encoded representation corresponding to the source text and generating a distribution of possible start tokens for the span of text; a first argument maximum module for selecting a start token for the span of text based on the distribution of possible start tokens for the span of text; a second softmax for combining a trainable parameter vector associated with end token positions of the span of text and the portion of the encoded representation corresponding to the source text and generating a distribution of possible end tokens for the span of text; and a second argument maximum module for selecting an end token for the span of text based on the distribution of possible end tokens for the span of text.
4. The system of claim 1, wherein the span of text is identified by: combining a trainable parameter vector associated with start token positions of the span of text and a portion of the encoded representation corresponding to the first text string to generate a distribution of possible start tokens for the span of text; selecting a start token for the span of text based on the distribution of possible start tokens for the span of text; combining a trainable parameter vector associated with end token positions of the span of text and the portion of the encoded representation corresponding to the first text string and generating a distribution of possible end tokens for the span of text; and selecting an end token for the span of text based on the distribution of possible end tokens for the span of text.
5. The system of claim 4, wherein the processor further executes instructions to use another of the one or more text inputs as part of the source text when the task type is entailment or regression.
6. The system of claim 1, wherein the processor further executes instructions to include a list of classifications in the source text when the task type is entailment or classification.
7. The system of claim 6, wherein the list of classifications is included in one of the one or more text inputs.
8. The system of claim 6, wherein the list of classifications is looked-up based on the task type.
9. The system of claim 1, wherein the processor further executes instructions to generate an embedding for the combination of the start text and the auxiliary text, the embedding including information as to whether an embedded token corresponds to a token in the start text or a token in the auxiliary text.
10. A method for performing a natural language processing (NLP) task comprising: receiving, via a communication interface, an input text for an NLP task; preparing, by a processor, a source text and an auxiliary text from the input text by appending one or more option outputs of the NLP task to the source text based on a task type of the NLP task, wherein the preparing comprises including a list of similarity options in the source text when the task type is regression; concatenating, by the processor, the appended source text including the one or more option outputs and the auxiliary text into a vector input; encoding, via an encoder, the vector input into an encoded representation; and identifying, by a span-extractive decoder from the encoded representation, a span of text within the appended source text including the one or more option outputs as a result of the NLP task.
11. The method of claim 10, wherein generating the encoded representation comprises using a plurality of attention-based encoding layers.
12. The method of claim 10, wherein identifying the span of text comprises: combining a trainable parameter vector associated with start token positions of the span of text and a portion of the encoded representation corresponding to the source text to generate a distribution of possible start tokens for the span of text; selecting a start token for the span of text based on the distribution of possible start tokens for the span of text; combining a trainable parameter vector associated with end token positions of the span of text and the portion of the encoded representation corresponding to the source text and generating a distribution of possible end tokens for the span of text; and selecting an end token for the span of text based on the distribution of possible end tokens for the span of text.
13. The method of claim 10, wherein the auxiliary text is prepared by receiving the auxiliary text as an input.
14. The method of claim 10, wherein preparing the source text comprises receiving a portion of the source text as an input when the task type is entailment or regression.
15. The method of claim 10, wherein preparing the source text comprises including a list of classifications in the source text when the task type is entailment or classification, the list of classifications being received as an input or being looked-up based on the task type.
16. The method of claim 10, further comprising generating an embedding for the combination of the start text and the auxiliary text, the embedding including information as to whether an embedded token corresponds to a token in the start text or a token in the auxiliary text.
17. A non-transitory machine-readable medium comprising executable code which when executed by one or more processors associated with a computing device are adapted to cause the one or more processors to perform a method comprising: receiving, via a communication interface, an input text for an NLP task; preparing, by a processor, a source text and an auxiliary text from the input text by appending one or more option outputs of the NLP task to the source text based on a task type of the NLP task, wherein the preparing comprises including a list of similarity options in the source text when the task type is regression; concatenating, by the processor, the appended source text including the one or more option outputs and the auxiliary text into a vector input; encoding, via an encoder, the vector input into an encoded representation; and identifying, by a span-extractive decoder from the encoded representation, a span of text within the appended source text including the one or more option outputs as a result of the NLP task.
18. The non-transitory machine-readable medium of claim 17, wherein the span of text is identified by: combining a trainable para
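The span selection recited in claims 3, 4, and 12 (a softmax over start positions, a softmax over end positions, and an argument maximum on each) can be sketched in plain Python as follows. This is an illustrative reading of the claim language, not the patented implementation: the names `extract_span`, `d_start`, and `d_end` are hypothetical, "combining" is assumed here to mean a dot product, and the toy vectors stand in for a real encoder's output.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def extract_span(
    encoded_source: Sequence[Sequence[float]],
    d_start: Sequence[float],
    d_end: Sequence[float],
) -> Tuple[int, int]:
    """Pick start and end token indices within the source text.

    encoded_source: one vector per source-text token (the portion of the
    encoded representation that corresponds to the source text).
    d_start, d_end: trainable parameter vectors (fixed toy values here).
    """
    p_start = softmax([dot(h, d_start) for h in encoded_source])
    p_end = softmax([dot(h, d_end) for h in encoded_source])
    # Independent argmax on each distribution, as the claims describe;
    # this sketch does not enforce end >= start.
    start = max(range(len(p_start)), key=p_start.__getitem__)
    end = max(range(len(p_end)), key=p_end.__getitem__)
    return start, end


if __name__ == "__main__":
    tokens = ["entailment", "contradiction", "neutral"]
    encoded = [[0.1, 0.2], [0.9, 0.8], [0.3, 0.1]]  # toy encoder output
    start, end = extract_span(encoded, d_start=[1.0, 0.5], d_end=[0.5, 1.0])
    print(tokens[start : end + 1])
```

With the toy values above, both distributions peak at the second token, so the extracted span is the single option token at index 1.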
Transfer learning · CPC title
Feedforward networks · CPC title
Supervised learning · CPC title
Learning methods · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title