Interface for a virtual digital assistant
US-2017161018-A1 · Jun 8, 2017 · US
US9858925B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9858925-B2 |
| Application number | US-201113250854-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 30, 2011 |
| Priority date | Jun 5, 2009 |
| Publication date | Jan 2, 2018 |
| Grant date | Jan 2, 2018 |
Official abstract text for this publication.
A virtual assistant uses context information to supplement natural language or gestural input from a user. Context helps to clarify the user's intent and to reduce the number of candidate interpretations of the user's input, and reduces the need for the user to provide excessive clarification input. Context can include any available information that is usable by the assistant to supplement explicit user input to constrain an information-processing problem and/or to personalize results. Context can be used to constrain solutions during various phases of processing, including, for example, speech recognition, natural language processing, task flow processing, and dialog generation.
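As a rough illustration of how context might reduce the number of candidate interpretations, the following Python sketch filters and then ranks candidates against a small context record. It is not taken from the patent: the `Interpretation` and `Context` structures, the `relevance` scoring rule, and the example values are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interpretation:
    """One candidate reading of the user's input (hypothetical structure)."""
    text: str
    domain: str                       # e.g. "phone" or "music"
    entities: frozenset = frozenset() # names mentioned in this reading


@dataclass
class Context:
    """Illustrative context: the active application domain and known contact names."""
    active_domain: str
    known_names: set = field(default_factory=set)


def relevance(candidate: Interpretation, context: Context) -> int:
    """Assumed scoring rule: reward a matching domain and each known name."""
    score = 2 if candidate.domain == context.active_domain else 0
    score += len(candidate.entities & context.known_names)
    return score


def narrow_and_rank(candidates, context):
    """Drop candidates the context does not support, then sort the rest by relevance."""
    kept = [c for c in candidates if relevance(c, context) > 0]
    return sorted(kept, key=lambda c: relevance(c, context), reverse=True)


candidates = [
    Interpretation("call her", "phone"),
    Interpretation("play herb", "music"),
    Interpretation("call Herb", "phone", frozenset({"Herb"})),
]
context = Context(active_domain="phone", known_names={"Herb"})
ranked = narrow_and_rank(candidates, context)
```

With these assumed values, the music reading is discarded and the reading that names a known contact is ranked first, so the ranked list is a smaller, ordered subset of the original candidates.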
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for disambiguating user input to perform a task on a computing device having at least one processor, comprising: at an output device, prompting a user for input; at an input device, receiving spoken user input; at a processor communicatively coupled to the output device and to the input device, receiving context information from a context source; at the processor, generating a first plurality of candidate interpretations of the received spoken user input; at the processor, disambiguating the intent of a word in the first plurality of candidate interpretations based on the context information to generate a second plurality of candidate interpretations, wherein the second plurality of candidate interpretations is a subset of the first plurality of candidate interpretations; at the processor, sorting the second plurality of candidate interpretations by relevance based on the context information; at the processor, deriving a representation of user intent based on the sorted second plurality of candidate interpretations; at the processor, identifying at least one task and at least one parameter for the task, based at least in part on the derived representation of user intent; at the processor, executing the at least one task using the at least one parameter, to derive a result; at the processor, generating a dialog response based on the derived result; and at the output device, outputting the generated dialog response.
2. The method of claim 1, wherein: prompting the user comprises prompting the user via a conversational interface; and receiving the spoken user input comprises: receiving the spoken user input via the conversational interface; and converting the spoken user input to a text representation.
3. The method of claim 2, wherein converting the spoken user input to a text representation comprises: generating a plurality of candidate text interpretations of the spoken user input; and ranking at least a subset of the generated candidate text interpretations; wherein at least one of the generating and ranking steps is performed using the received context information.
4. The method of claim 3, wherein the received context information used in at least one of the generating and ranking steps comprises at least one selected from the group consisting of: data describing an acoustic environment in which the spoken user input is received; data received from at least one sensor; vocabulary obtained from a database associated with the user; vocabulary associated with application preferences; vocabulary obtained from usage history; and current dialog state.
5. The method of claim 1, wherein prompting the user comprises generating at least one prompt based at least in part on the received context information.
6. The method of claim 1, wherein disambiguating the received spoken user input based on the context information to derive a representation of user intent comprises performing natural language processing on the received spoken user input based at least in part on the received context information.
7. The method of claim 6, wherein the received context information used in disambiguating the received spoken user input comprises at least one selected from the group consisting of: data describing an event; application context; input previously provided by the user; known information about the user; location; date; environmental conditions; and history.
8. The method of claim 1, wherein performing natural language processing comprises selecting among a plurality of candidate interpretations of the received spoken user input using the received context information.
9. The method of claim 1, wherein performing natural language processing comprises determining a referent for at least one pronoun in the received spoken user input.
10. The method of claim 1, wherein identifying at least one task and at least one parameter for the task comprises identifying at least one task and at least one parameter for the task based at least in part on the received context information.
11. The method of claim 10, wherein identifying at least one task and at least one parameter for the task based at least in part on the received context information comprises: receiving a plurality of candidate representations of user intent; determining a preferred interpretation of user intent based on at least one selected from the group consisting of: at least one domain model; at least one task flow model; and at least one dialog flow model.
12. The method of claim 10, wherein the received context information used in identifying at least one task and at least one parameter for the task comprises at least one selected from the group consisting of: data describing an event; data from a database associated with the user; data received from at least one sensor; application context; input previously provided by the user; known information about the user; location; date; environmental conditions; and history.
13. The method of claim 1, wherein generating a dialog response comprises generating a dialog response based at least in part on the received context information.
14. The method of claim 13, wherein generating a dialog response based at least in part on the received context information comprises at least one selected from the group consisting of: generating a dialog response including a named referent; generating a dialog response including a symbolic name associated with a telephone number; determining which of a plurality of names to use for a referent; determining a level of detail for the generated response; and filtering a response based on previous output.
15. The method of claim 13, wherein the received context information used in generating a dialog response comprises at least one selected from the group consisting of: data from a database associated with the user; application context; input previously provided by the user; known information about the user; location; date; environmental conditions; and history.
16. The method of claim 1, wherein the received context information comprises at least one selected from the group consisting of: context information stored at a server; and context information stored at a client.
17. The method of claim 1, wherein receiving context information from a context source comprises: requesting the context information from a context source; and receiving the context information in response to the request.
18. The method of claim 1, wherein receiving context information from a context source comprises: receiving at least a portion of the context information prior to receiving the spoken user input.
19. The method of claim 1, wherein receiving context information from a context source comprises: receiving at least a portion of the context information after receiving the spoken user input.
20. The method of claim 1, wherein receiving context information from a context source comprises: receiving static context information as part of an initialization step; and receiving additional context information after receiving the spoken user input.
21. The method of claim 1, wherein receiving context information from a context source comprises: receiving push notification of a change in context information; and responsive to the push notification, updating locally stored context information.
22.
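The ways of receiving context described in claims 17 through 21 (request and response, static context at initialization, and push notification of changes) could be sketched roughly as follows. This is an illustrative assumption, not the patented implementation: the `ContextStore` class, its method names, and the notification format are hypothetical.

```python
class ContextStore:
    """Hypothetical local cache of context information held by the assistant."""

    def __init__(self, static_context):
        # Static context is loaded once, as part of an initialization step.
        self._data = dict(static_context)

    def request(self, key, default=None):
        """Request/response access: return the stored value for a context key."""
        return self._data.get(key, default)

    def apply_push(self, notification):
        """Update locally stored context in response to a pushed change.

        The notification is assumed to be a dict with "key" and "value" fields.
        """
        self._data[notification["key"]] = notification["value"]


# Static context arrives at initialization; a later push updates the local copy.
store = ContextStore({"locale": "en-US"})
store.apply_push({"key": "location", "value": "home"})
```

In this sketch the store copies the static context so that later pushes do not mutate the caller's dictionary; a real system would also need to decide how to handle stale or conflicting updates, which is not addressed here.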
using context dependencies, e.g. language models · CPC title
Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer · CPC title
using natural language modelling · CPC title