Multilevel speech recognition method and apparatus
US-2016012820-A1 · Jan 14, 2016 · US
US9607102B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9607102-B2 |
| Application number | US-201414478121-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 5, 2014 |
| Priority date | Sep 5, 2014 |
| Publication date | Mar 28, 2017 |
| Grant date | Mar 28, 2017 |
Official abstract text for this publication.
Disclosed methods and systems are directed to task switching in dialog processing. The methods and systems may include activating a primary task, receiving one or more ambiguous natural language commands, and identifying a first candidate task for each of the one or more ambiguous natural language commands. The methods and systems may also include identifying, for each of the one or more ambiguous natural language commands and based on one or more rules, a second candidate task of the plurality of tasks corresponding to the ambiguous natural language command, determining whether to modify at least one of the one or more rules-based task switching rules based on whether a quality metric satisfies a threshold quantity, and when the second quality metric satisfies the threshold quantity, changing the task switching rule for the corresponding candidate task from a rules-based model to the optimized statistical based task switching model.
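The threshold test at the end of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SwitchPolicy` and `update_policy`, the mode strings, and the reading of "satisfies the threshold" as "greater than or equal to" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchPolicy:
    """Hypothetical record of how switching decisions are made for one task."""
    task: str
    mode: str = "rules"  # assumed values: "rules" or "statistical"


def update_policy(policy: SwitchPolicy, quality_metric: float,
                  threshold: float) -> SwitchPolicy:
    """Move a task to the statistical model once the metric meets the threshold.

    Assumption: "satisfies the threshold" is taken to mean metric >= threshold.
    """
    if quality_metric >= threshold:
        return SwitchPolicy(policy.task, "statistical")
    return policy  # below threshold: keep the existing policy unchanged
```

Under these assumptions, a task stays on its rule until the metric reaches the threshold, and each task's policy is changed independently of the others.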
Opening claim text (preview).
What is claimed is:

1. A method comprising: receiving, by a computing device comprising a natural language understanding automatic speech recognition computing engine device, a first natural language input comprising one or more words; activating, by the natural language understanding automatic speech recognition computing engine device, a first task based on the first natural language input, wherein the first task is associated with one or more first task agents configured for retrieving information associated with the first task; prompting, by the natural language understanding automatic speech recognition computing engine device and via a first of the one or more first task agents, for a subsequent natural language input based on a first transcription of the first natural language input and based on a first intent associated with the first task; receiving, by the natural language understanding automatic speech recognition computing engine device and while the first task is activated, a second natural language input comprising one or more words; responsive to determining, by the natural language understanding automatic speech recognition computing engine device, that a task activation switching parameter associated with the first task is not a false value and that a second intent associated with the second natural language input is different from the first intent associated with the first task, determining, by the natural language understanding automatic speech recognition computing engine device: one or more candidate second tasks that are capable of being activated based on one or more task switching rules that identify one or more tasks that are allowed to interrupt the activated first task, wherein an interrupted task is arranged in a task stack memory component of the computing device; and one or more candidate third tasks that are incapable of being activated based on the one or more task switching rules; activating, by the natural language understanding automatic speech recognition computing engine device, one of the one or more candidate second tasks, wherein the activated candidate second task is associated with one or more second task agents configured for retrieving information associated with the activated candidate second task; responsive to satisfying the one or more second task agents with the first natural language input, the second natural language input, or an additional natural language input, performing, by the computing device, an action associated with the activated second candidate task; determining, by the natural language understanding automatic speech recognition computing engine device, whether the second natural language input satisfies the one or more first task agents associated with the first task; performing, by the computing device, an action associated with the first task responsive to the second natural language input satisfying the one or more first task agents; and prompting, by the natural language understanding automatic speech recognition computing engine device and via one of the one or more first task agents, for a second subsequent natural language input based on the first transcription of the first natural language input responsive to the second natural language input not satisfying the one or more first task agents.

2. The method of claim 1, further comprising: responsive to activating the one candidate second task, pausing the first task at a first point; resuming the first task at the first point responsive to the second input not satisfying the one or more first task agents; prompting, via the first of the one or more first task agents, for the second subsequent natural language input; and performing the action associated with the first task responsive to receiving at least one additional natural language input that satisfies the one or more first task agents.

3. The method of claim 1, wherein the activating the one of the one or more candidate second tasks comprises: determining a confidence score for each of the one or more candidate second tasks via a statistical-based task switching model; identifying one of the one or more scored candidate second tasks that is associated with a high score; and activating the identified scored candidate second task associated with the high score.

4. The method of claim 1, further comprising: determining a confidence score for each of the one or more candidate second tasks and for each of the one or more third candidate tasks via a statistical-based task switching model; and responsive to identifying a scored candidate third task associated with a high score compared to the scored one or more second candidate tasks, modifying at least one of the one or more task switching rules, wherein the modification is based on the scored candidate third task associated with the high score.

5. The method of claim 4, wherein the modifying the at least one of the or more task switching rules comprises modifying a task switching rule associated with the scored candidate third task associated with the high score, wherein the determining the one or more candidate second tasks that are capable of being activated based on the one or more task switching rules comprises determining that the scored candidate third task associated with the high score is capable of being activated based on the modified task switching rule; and wherein the activating the one of the one or more candidate second tasks comprises activating the scored candidate third task associated with the high score.

6. The method of claim 1, wherein the satisfying the one or more second task agents comprises: prompting, via the one or more second task agents, for an additional natural language input associated with the second task; receiving at least one natural language input associated with the second task; and satisfying the one or more second task agents with a transcription of the received at least one natural language input associated with the second task.

7. The method of claim 1, further comprising: receiving, while the first task and the activated candidate second task are activated, a third natural language input comprising one or more words; responsive to determining that a third intent associated with the third natural language input is different from the first intent associated with the first task and from the second intent associated with the second natural language input, determining one or more candidate fourth tasks that are capable of being activated based on one or more second task switching rules that identify one or more tasks that are allowed to interrupt the activated first task and the activated candidate second task; activating one of the one or more candidate fourth tasks, wherein the activated candidate fourth task is associated with one or more fourth task agents; and responsive to satisfying the one or more fourth task agents, performing an action associated with the activated fourth candidate task.

8. The method of claim 7, wherein satisfying the one or more second task agents comprises satisfying the one or more second task agents with a transcription of the third natural language input.

9. A system, comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the system to perform: receiving, by a computing device comprising a natural language understanding automatic speech recognition computing engine device, a first natural language input comprising one or more words; activating, by the natural language understanding automatic speech recognition computing engine device, a first task based on the first natural language input, wherein the first task is ass
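
The interruption flow recited in claims 1 to 3 (switching rules that name which tasks may interrupt the active one, a stack of interrupted tasks, and confidence-based selection among allowed candidates) can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: `DialogManager`, `pick_best`, the dictionary layout of the rules, the per-task switching flag, and the task names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class DialogManager:
    # Hypothetical rule layout: active task -> tasks allowed to interrupt it.
    rules: Dict[str, Set[str]]
    # Hypothetical per-task switching flag; False blocks every interruption.
    switching_enabled: Dict[str, bool] = field(default_factory=dict)
    active: Optional[str] = None
    stack: List[str] = field(default_factory=list)  # interrupted tasks

    def candidates(self, tasks: List[str]) -> Tuple[List[str], List[str]]:
        """Split tasks into those the rules allow to interrupt and those they do not."""
        if not self.switching_enabled.get(self.active, True):
            return [], list(tasks)
        allowed = self.rules.get(self.active, set())
        can = [t for t in tasks if t in allowed]
        cannot = [t for t in tasks if t not in allowed]
        return can, cannot

    def switch_to(self, task: str) -> None:
        """Pause the active task on the stack and activate the interrupting task."""
        if self.active is not None:
            self.stack.append(self.active)
        self.active = task

    def complete(self) -> Optional[str]:
        """Finish the active task and resume the most recently interrupted one."""
        self.active = self.stack.pop() if self.stack else None
        return self.active


def pick_best(can: List[str], scores: Dict[str, float]) -> Optional[str]:
    """Choose the allowed candidate with the highest confidence score, if any."""
    return max(can, key=lambda t: scores.get(t, 0.0)) if can else None
```

In this sketch the scores would come from whatever statistical model is in use; here they are simply passed in as a dictionary. A last-in, first-out stack is one straightforward way to resume the first task at the point where it was paused, as claim 2 describes.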
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Physics · mapped topic
Speech recognition (G10L17/00 takes precedence) · CPC title
Parsing for meaning understanding · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title