US11488582B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11488582-B2 |
| Application number | US-202016889672-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 1, 2020 |
| Priority date | Nov 26, 2008 |
| Publication date | Nov 1, 2022 |
| Grant date | Nov 1, 2022 |
Official abstract text for this publication.
Disclosed herein are systems, computer-implemented methods, and computer-readable media for dialog modeling. The method includes receiving spoken dialogs annotated to indicate dialog acts and task/subtask information, parsing the spoken dialogs with a hierarchical, parse-based dialog model which operates incrementally from left to right and which only analyzes a preceding dialog context to generate parsed spoken dialogs, and constructing a functional task structure of the parsed spoken dialogs. The method can further either interpret user utterances with the functional task structure of the parsed spoken dialogs or plan system responses to user utterances with the functional task structure of the parsed spoken dialogs. The parse-based dialog model can be a shift-reduce model, a start-complete model, or a connection path model.
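As a rough illustration of the shift-reduce variant named in the abstract, the sketch below shifts each annotated utterance onto a stack, inspects the stack, and reduces it into a subtask subtree when the subtask changes. It is a toy under stated assumptions, not the patented method: the `Utterance` and `Subtask` types, the `parse_incrementally` function, and the label values are hypothetical, and the reduce decision here reads the annotated subtask label directly, whereas the publication describes trained models making that decision.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Utterance:
    text: str
    dialog_act: str  # annotated dialog act (hypothetical label set)
    subtask: str     # annotated subtask label (hypothetical label set)


@dataclass
class Subtask:
    label: str
    children: List[Union["Subtask", Utterance]] = field(default_factory=list)


def parse_incrementally(utterances: List[Utterance]) -> Subtask:
    """Left-to-right shift-reduce pass that only looks at preceding context."""
    stack: List[Utterance] = []
    tree = Subtask("dialog")
    for utt in utterances:
        # Inspect the stack: a change of subtask closes the open subtask.
        if stack and stack[-1].subtask != utt.subtask:
            # Reduce: turn the stacked utterances into one subtree.
            tree.children.append(Subtask(stack[0].subtask, list(stack)))
            stack.clear()
        stack.append(utt)  # Shift the current utterance.
    if stack:
        # Final reduce for whatever is still open at the end of the dialog.
        tree.children.append(Subtask(stack[0].subtask, list(stack)))
    return tree


dialog = [
    Utterance("Hello, how can I help?", "Greeting", "opening"),
    Utterance("I'd like to order a jacket.", "Request", "order-item"),
    Utterance("Which size?", "Ask", "order-item"),
    Utterance("Thanks, goodbye.", "Close", "closing"),
]
tree = parse_incrementally(dialog)
for node in tree.children:
    print(node.label, len(node.children))
```

The resulting tree is flat (one level of subtasks under the dialog root); a fuller treatment would allow nested subtasks and would replace the label comparison with a learned classifier over the stack contents.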
Opening claim text (preview).
We claim:
1. A method of operating a spoken dialog system, the method comprising: parsing, via a processor of the spoken dialog system, a subtask structure of a spoken dialog with a selected parse-based dialog model, wherein the selected parse-based dialog model is generated according to operations comprising: training a plurality of hierarchical, parsed-based dialog models, wherein the plurality of hierarchical, parsed-based dialog models comprises at least one of: a shift-reduce model, a start-complete model, or a connection path model, and wherein: when the plurality of hierarchical, parsed-based dialog models comprises the shift-reduce model, the shift-reduce model has a first stack and a tree for performing one or more operations comprising (a) shifting each utterance onto the first stack, (b) inspecting the first stack, and (c) based on a stack inspection, performing a reduce action that creates subtrees in the tree; when the plurality of hierarchical, parsed-based dialog models comprises the start-complete model, the start-complete model uses a second stack to maintain a global parse state and produces a dialog task structure; when the plurality of hierarchical, parsed-based dialog models comprises the connection path model, the connection path model performs one or more operations comprising (a) predicting a connection path from a root to a terminal for each received spoken dialog, and (b) creating a parse tree representing the connection path for each received spoken dialog; and selecting a parse-based dialog model from the shift-reduce model, the start-complete model, or the connection path model, to yield the selected parse-based dialog model.
2. The method of claim 1, further comprising: constructing a functional task structure of the spoken dialog parsed by the spoken dialog system.
3. The method of claim 2, further comprising: predicting a likely next dialog act in the spoken dialog using the functional task structure and the selected parse-based dialog model, the likely next dialog act corresponding to a next utterance comprising a clause to be spoken by a speaker, wherein the predicting occurs prior to receiving the next utterance.
4. The method of claim 3, further comprising: selecting a language model for the next utterance based on the likely next dialog act to yield a selected language model.
5. The method of claim 4, further comprising measuring a dialog efficiency at different dialog stages based on the selected language model.
6. The method of claim 1, wherein each of the plurality of hierarchical, parsed-based dialog models operates incrementally from left to right and only analyzes an immediately preceding dialog context.
7. The method of claim 1, wherein, when the plurality of hierarchical, parsed-based dialog models comprises the connection path model, the connection path model does not use any stack to maintain a global parse state.
8. A spoken dialog system comprising: a processor; and a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations, the operations comprising: parsing a subtask structure of a spoken dialog with a selected parse-based dialog model, wherein the selected parse-based dialog model is generated according to operations comprising: training a plurality of hierarchical, parsed-based dialog models, wherein the plurality of hierarchical, parsed-based dialog models comprises at least one of: a shift-reduce model, a start-complete model, or a connection path model, and wherein: when the plurality of hierarchical, parsed-based dialog models comprises the shift-reduce model, the shift-reduce model has a first stack and a tree for performing one or more operations comprising (a) shifting each utterance onto the first stack, (b) inspecting the first stack, and (c) based on a stack inspection, performing a reduce action that creates subtrees in the tree; when the plurality of hierarchical, parsed-based dialog models comprises the start-complete model, the start-complete model uses a second stack to maintain a global parse state and produces a dialog task structure; when the plurality of hierarchical, parsed-based dialog models comprises the connection path model, the connection path model performs one or more operations comprising (a) predicting a connection path from a root to a terminal for each received spoken dialog, and (b) creating a parse tree representing the connection path for each received spoken dialog; and selecting a parse-based dialog model from the shift-reduce model, the start-complete model, or the connection path model, to yield the selected parse-based dialog model.
9. The spoken dialog system of claim 8, wherein the computer-readable storage medium stores additional instructions which, when executed by the processor, cause the processor to perform operations further comprising: constructing a functional task structure of the spoken dialog parsed by the spoken dialog system.
10. The spoken dialog system of claim 9, wherein the computer-readable storage medium stores additional instructions which, when executed by the processor, cause the processor to perform operations further comprising: predicting a likely next dialog act in the spoken dialog using the functional task structure and the hierarchical, parsed-based dialog model, the likely next dialog act corresponding to a next utterance comprising a clause to be spoken by a speaker, wherein the predicting occurs prior to receiving the next utterance.
11. The spoken dialog system of claim 10, wherein the computer-readable storage medium stores additional instructions which, when executed by the processor, cause the processor to perform operations further comprising: selecting a language model for the next utterance based on the likely next dialog act to yield a selected language model.
12. The spoken dialog system of claim 11, wherein the computer-readable storage medium stores additional instructions which, when executed by the processor, cause the processor to perform operations further comprising: measuring a dialog efficiency at different dialog stages based on the selected language model.
13. The spoken dialog system of claim 8, wherein each of the plurality of hierarchical, parsed-based dialog models operates incrementally from left to right and only analyzes an immediately preceding dialog context.
14. The spoken dialog system of claim 8, wherein, when the plurality of hierarchical, parsed-based dialog models comprises the connection path model, the connection path model does not use any stack to maintain a global parse state.
15. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations, the operations comprising: parsing a subtask structure of a spoken dialog with a selected parse-based dialog model, wherein the selected parse-based dialog model is generated according to operations comprising: training a plurality of hierarchical, parsed-based dialog models, wherein the plurality of hierarchical, parsed-based dialog models comprises at least one of: a shift-reduce model, a start-complete model, or a connection path model, and wherein: when the plurality of hierarchical, parsed-based dialog models comprises the shift-reduce model, the shift-reduce model has a first stack and a tree for performing one or more operations comprising (a) shifting each utterance onto the first stack, (b) inspecting the first stack, and (c) based on a stack inspection, performing a reduce action that creates
Segmentation; Word boundary detection · CPC title
using natural language modelling · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Hierarchical processing, e.g. outlines · CPC title
Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets · CPC title