Building multimodal collaborative dialogs with task frames
US-2016328270-A1 · Nov 10, 2016 · US
US2016154792A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016154792-A1 |
| Application number | US-201414556874-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 1, 2014 |
| Priority date | Dec 1, 2014 |
| Publication date | Jun 2, 2016 |
| Grant date | — |
Official abstract text for this publication.
Methods and systems are provided for contextual language understanding. A natural language expression may be received at a single-turn model and a multi-turn model for determining an intent of a user. For example, the single-turn model may determine a first prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The multi-turn model may determine a second prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The first prediction and the second prediction may be combined to produce a final prediction relative to the intent of the natural language expression. An action may be performed based on the final prediction of the natural language expression.
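The flow described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical and not taken from the publication: the `Prediction` record, the keyword rules standing in for trained models, the fixed scores, and the "keep the higher score" combination rule are all assumptions chosen only to show how a follow-up utterance can inherit domain and intent from the previous turn.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    domain: str
    intent: str
    slots: dict
    score: float  # assumed confidence in [0, 1]; not specified by the source


def single_turn_model(utterance: str) -> Prediction:
    """Toy stand-in for a single-turn model: looks at the utterance in isolation."""
    if "weather" in utterance.lower():
        return Prediction("weather", "get_forecast", {}, 0.9)
    return Prediction("unknown", "unknown", {}, 0.2)


def multi_turn_model(utterance: str, previous: Optional[Prediction]) -> Prediction:
    """Toy stand-in for a multi-turn model: reuses context from the previous turn."""
    if previous is None:
        # No conversational context yet, so this model has nothing to offer.
        return Prediction("unknown", "unknown", {}, 0.0)
    slots = dict(previous.slots)
    if "tomorrow" in utterance.lower():
        slots["date"] = "tomorrow"
    return Prediction(previous.domain, previous.intent, slots, 0.7)


def combine(first: Prediction, second: Prediction) -> Prediction:
    """Assumed combination rule: keep whichever prediction scores higher."""
    return first if first.score >= second.score else second


def run_session(utterances: list) -> list:
    """Run both models on every turn and carry the final prediction forward."""
    previous = None
    finals = []
    for utterance in utterances:
        final = combine(
            single_turn_model(utterance),
            multi_turn_model(utterance, previous),
        )
        finals.append(final)
        previous = final
    return finals


if __name__ == "__main__":
    for p in run_session(["What is the weather in Seattle", "How about tomorrow"]):
        print(p)
```

In this toy session the second utterance contains no domain cue on its own, so the single-turn stand-in scores it low and the context-aware stand-in supplies the final prediction.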
Opening claim text (preview).
What is claimed is:
1. A system comprising: at least one processor; and memory encoding computer executable instructions that, when executed by at least one processor, perform a method for contextual language understanding comprising: receiving at least a first natural language expression and a second natural language expression, wherein the first natural language expression and the second natural language expression include at least one of words, terms, and phrases; determining a first prediction of at least one of a domain classification, intent classification, and slot type of the first natural language expression; determining a second prediction of at least one of a domain classification, intent classification, and slot type of the second natural language expression using at least one of the first natural language expression and contextual information; and performing an action based on the second prediction of the second natural language expression.
2. The system of claim 1, wherein the first prediction and the second prediction are determined using a single model.
3. The system of claim 1, wherein the first prediction is determined using a single-turn model, and wherein the second prediction is determined using a multi-turn model.
4. The system of claim 3, further comprising combining the first prediction and the second prediction to produce a final prediction relative to an intent of the second natural language expression.
5. The system of claim 1, wherein the first natural language expression and the second natural language expression are at least one of a spoken language input and a textual input.
6. The system of claim 2, wherein determining the first prediction comprises evaluating the first natural language expression in isolation.
7. The system of claim 6, wherein evaluating the first natural language expression in isolation comprises at least: classifying the first natural language expression into a supported domain of the single model; classifying the first natural language expression into a supported intent of the single model; and extracting at least one semantic word from the first natural language expression and filling at least one supported slot type of the turn model with the at least one semantic word.
8. The system of claim 2, wherein evaluating the second natural language expression using contextual information comprises at least: classifying the second natural language expression into a supported domain of the single model using contextual information; classifying the second natural language expression into a supported intent of the single model using contextual information; and extracting at least one semantic word from the second natural language expression and filling at least one supported slot type of the multi-turn model with the at least one semantic word using contextual information.
9. The system of claim 1, wherein the contextual information includes at least one of information extracted from the first received natural language expression, a response to the first received natural language expression, client context, and knowledge content.
10. The system of claim 1, wherein determining a first prediction comprises calculating a first score indicative of a probability of the first prediction being correct.
11. The system of claim 10, wherein determining a second prediction comprises calculating a second score indicative of a probability of the second prediction being correct.
12. The system of claim 4, wherein combining the first prediction and the second prediction to produce a final prediction comprises: assigning a first weight to the single-turn model; assigning a second weight to the multi-turn model; and combining the first score and the second score utilizing the first assigned weight and the second assigned weight.
13. A system comprising: a statistical model for receiving at least a first natural language expression and a second natural language expression during a conversational session, wherein the at least first and second natural language expressions include at least one of words, terms, and phrases; a single-turn model for determining a first prediction of at least one of a domain classification, intent classification, and slot type of each of the at least first and second natural language expressions; a multi-turn model for determining a second prediction of at least one of a domain classification, intent classification, and slot type of each of the at least first and second natural language expressions; a combination model for combining the first prediction and the second prediction of each of the at least first and second natural language expressions to produce a final prediction relative to an intent of at least the second natural language expression; and a final model for performing an action based on the final prediction of at least the second natural language expression.
14. The system of claim 13, wherein performing an action based on the final prediction comprises responding to the second natural language expression.
15. The system of claim 14, wherein responding to the second natural language expression includes an answer to the second natural language expression based on the final prediction of at least the second natural language expression.
16. The system of claim 14, wherein responding to the second natural language expression includes at least one of asking a question and performing a task.
17. The system of claim 13, wherein determining a first prediction for the at least first and second natural language expressions comprises evaluating the first and second natural language expressions in isolation.
18. The system of claim 13, wherein determining a second prediction for the at least first and second natural language expressions comprises evaluating the first and second natural language expressions using contextual information.
19. The system of claim 18, wherein evaluating the second natural language expression using contextual information comprises evaluating a combination of the first natural language expression, the first prediction for the at least first and second natural language expressions, client context, and knowledge content.
20. One or more computer-readable storage media, having computer-executable instructions that, when executed by at least one processor, perform a method for building a statistical model for contextual language understanding, the method comprising: receiving a first natural language expression, wherein the first natural language expression includes at least one of words, terms, and phrases; performing a first action based on a first prediction determined by a single-turn model and a second prediction determined by a multi-turn model; receiving a second natural language expression, wherein the second natural language expression includes at least one of words, terms, and phrases; evaluating at least the first natural language expression, the first action, the first prediction, the second prediction, and the second natural language expression to generate contextual information; aggregating the contextual information into the multi-turn model; and performing a second action based on evaluating at least the first natural language expression, the first action, the first prediction, the second prediction, and the second natural language expression.
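Claims 10 through 12 describe scoring each prediction and combining the scores using a weight assigned to each model, without stating a formula. One plausible reading is a weighted sum per candidate label, sketched below. The function name `combine_scores`, the default weights, and the linear form are assumptions made for illustration, not the claimed method.

```python
def combine_scores(first_scores, second_scores, first_weight=0.5, second_weight=0.5):
    """Combine per-label scores from two models with a weighted sum.

    first_scores / second_scores map a candidate label (for example an intent
    name) to that model's score. A label missing from one model counts as 0.0.
    The weighted-sum form is an assumption; the source only says the scores are
    combined "utilizing" the two weights.
    """
    labels = set(first_scores) | set(second_scores)
    combined = {
        label: first_weight * first_scores.get(label, 0.0)
        + second_weight * second_scores.get(label, 0.0)
        for label in labels
    }
    # Iterate labels in sorted order so ties resolve deterministically.
    best = max(sorted(combined), key=combined.get)
    return best, combined


if __name__ == "__main__":
    single_turn = {"get_forecast": 0.2, "set_alarm": 0.5}
    multi_turn = {"get_forecast": 0.9, "set_alarm": 0.1}
    best, scores = combine_scores(single_turn, multi_turn, 0.4, 0.6)
    print(best, scores)
```

With the hypothetical weights 0.4 and 0.6, the context-aware score dominates, so a label the single-turn model ranked second can still become the final prediction.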
Discourse or dialogue representation · CPC title
Semantic analysis · CPC title
Natural language query formulation or dialogue systems · CPC title
Physics · mapped topic