US10769385B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10769385-B2 |
| Application number | US-201816204467-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 29, 2018 |
| Priority date | Jun 9, 2013 |
| Publication date | Sep 8, 2020 |
| Grant date | Sep 8, 2020 |
Official abstract text for this publication.
A text string with a first and a second portion is provided. A domain of the text string is determined by applying a first word-matching process to the first portion of the text string. It is then determined whether the second portion of the text string matches a word of a set of words associated with the domain by applying a second word-matching process to the second portion of the text string. Upon determining that the second portion of the text string matches the word of the set of words, a user intent is determined from the text string based at least in part on the domain and the word of the set of words.
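The two-stage matching described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the abstract does not say what the two word-matching processes are, so the exact lookup for the first portion, the fuzzy match (Python's `difflib.SequenceMatcher`) for the second portion, the `0.8` threshold, and every name and vocabulary entry below are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical vocabularies, invented for illustration (not from the patent).
DOMAIN_TRIGGERS = {"call": "phone", "play": "music"}
DOMAIN_WORDS = {
    "phone": ["mom", "dad", "office"],
    "music": ["jazz", "beatles", "podcast"],
}


def determine_domain(first_portion):
    """First word-matching process (assumed here to be an exact lookup)."""
    return DOMAIN_TRIGGERS.get(first_portion.strip().lower())


def match_domain_word(second_portion, domain, threshold=0.8):
    """Second word-matching process (assumed here to be a fuzzy match
    restricted to the set of words associated with the domain)."""
    target = second_portion.strip().lower()
    best_word, best_ratio = None, 0.0
    for word in DOMAIN_WORDS.get(domain, []):
        ratio = SequenceMatcher(None, target, word).ratio()
        if ratio > best_ratio:
            best_word, best_ratio = word, ratio
    return best_word if best_ratio >= threshold else None


def infer_intent(text):
    """Split the text string into two portions and apply both processes."""
    first, _, second = text.partition(" ")
    domain = determine_domain(first)
    if domain is None:
        return None
    word = match_domain_word(second, domain)
    if word is None:
        return None
    # The intent is based at least in part on the domain and the matched word.
    return {"domain": domain, "word": word}
```

In this sketch, `infer_intent("play beetles")` resolves to the `music` domain and the word `beatles`, because the looser second process tolerates a misrecognized word once the domain has narrowed the candidate set.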
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by an electronic device with one or more processors and memory, cause the device to: receive audio input containing a user utterance; perform speech-to-text processing on the audio input to determine a plurality of text representations of the user utterance and a plurality of speech recognition scores for the plurality of text representations; perform natural language processing on each of the plurality of text representations to determine a plurality of candidate user intents and a plurality of intent deduction scores for the plurality of candidate user intents; determine a plurality of composite scores for the plurality of candidate user intents based on a combination of the plurality of speech recognition scores and the plurality of intent deduction scores; select a user intent from the plurality of candidate user intents based on the plurality of composite scores; and perform a task corresponding to the selected user intent.
2. The computer readable storage medium of claim 1, wherein each intent deduction score of the plurality of intent deduction scores is based on a number of words in a respective text representation of the plurality of text representations that correspond to a domain of a respective candidate user intent of the plurality of candidate user intents.
3. The computer readable storage medium of claim 1, wherein each intent deduction score of the plurality of intent deduction scores is based on a quality of match between words in a respective text representation of the plurality of text representations and predefined words corresponding to a domain of a respective candidate user intent of the plurality of candidate user intents.
4. The computer readable storage medium of claim 1, wherein each intent deduction score of the plurality of intent deduction scores is based on whether or not a property of a domain of a respective candidate user intent of the plurality of candidate user intents can be resolved from a respective text representation of the plurality of text representations.
5. The computer readable storage medium of claim 1, wherein each intent deduction score of the plurality of intent deduction scores is based on whether or not a natural language processor is able to identify a specific task from a respective text representation of the plurality of text representations.
6. The computer readable storage medium of claim 1, wherein the selected user intent is determined from a first text representation of the plurality of text representations, and wherein an intent deduction score for the selected user intent is a highest score among the plurality of intent deduction scores, and wherein a speech recognition score for the first text representation is not a highest score among the plurality of speech recognition scores.
7. The computer readable storage medium of claim 1, wherein the instructions further cause the device to: rank the plurality of candidate user intents according to the plurality of composite scores, wherein the selected user intent has a highest composite score of the plurality of composite scores.
8. An electronic device, comprising: one or more processors; memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: receiving audio input containing a user utterance; performing speech-to-text processing on the audio input to determine a plurality of text representations of the user utterance and a plurality of speech recognition scores for the plurality of text representations; performing natural language processing on each of the plurality of text representations to determine a plurality of candidate user intents and a plurality of intent deduction scores for the plurality of candidate user intents; determining a plurality of composite scores for the plurality of candidate user intents based on a combination of the plurality of speech recognition scores and the plurality of intent deduction scores; selecting a user intent from the plurality of candidate user intents based on the plurality of composite scores; and performing a task corresponding to the selected user intent.
9. The device of claim 8, wherein each intent deduction score of the plurality of intent deduction scores is based on a number of words in a respective text representation of the plurality of text representations that correspond to a domain of a respective candidate user intent of the plurality of candidate user intents.
10. The device of claim 8, wherein each intent deduction score of the plurality of intent deduction scores is based on a quality of match between words in a respective text representation of the plurality of text representations and predefined words corresponding to a domain of a respective candidate user intent of the plurality of candidate user intents.
11. The device of claim 8, wherein each intent deduction score of the plurality of intent deduction scores is based on whether or not a property of a domain of a respective candidate user intent of the plurality of candidate user intents can be resolved from a respective text representation of the plurality of text representations.
12. The device of claim 8, wherein each intent deduction score of the plurality of intent deduction scores is based on whether or not a natural language processor is able to identify a specific task from a respective text representation of the plurality of text representations.
13. The device of claim 8, wherein the selected user intent is determined from a first text representation of the plurality of text representations, and wherein an intent deduction score for the selected user intent is a highest score among the plurality of intent deduction scores, and wherein a speech recognition score for the first text representation is not a highest score among the plurality of speech recognition scores.
14. The device of claim 8, wherein the one or more programs further include instructions for: ranking the plurality of candidate user intents according to the plurality of composite scores, wherein the selected user intent has a highest composite score of the plurality of composite scores.
15. A method for inferring user intent from speech input, comprising: at an electronic device with one or more processors and memory storing one or more programs for execution by the one or more processors: receiving audio input containing a user utterance; performing speech-to-text processing on the audio input to determine a plurality of text representations of the user utterance and a plurality of speech recognition scores for the plurality text representations; performing natural language processing on each of the plurality of text representations to determine a plurality of candidate user intents and a plurality of intent deduction scores for the plurality of candidate user intents; determining a plurality of composite scores for the plurality of candidate user intents based on a combination of the plurality of speech recognition scores and the plurality of intent deduction scores; selecting a user intent from the plurality of candidate user intents based on the plurality of composite scores; and performing a task corresponding to the selected user intent.
16. The method of claim 15, wherein each intent deduction score of the plurality of intent deduction scores
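The scoring and selection steps recited in the independent claims (and the ranking in claims 7 and 14) might look something like the sketch below. The claims only say the composite score is "based on a combination" of the two scores. The weighted sum, the 0 to 1 score range, the `Candidate` type, and all example values here are assumptions made for illustration, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str                 # one text representation from speech-to-text
    intent: str               # candidate user intent deduced from that text
    recognition_score: float  # speech recognition score (assumed range 0..1)
    intent_score: float       # intent deduction score (assumed range 0..1)


def composite_score(candidate, recognition_weight=0.5):
    """Assumed combination: a weighted sum of the two scores."""
    return (recognition_weight * candidate.recognition_score
            + (1.0 - recognition_weight) * candidate.intent_score)


def rank_candidates(candidates, recognition_weight=0.5):
    """Rank candidate intents by composite score, highest first."""
    return sorted(
        candidates,
        key=lambda c: composite_score(c, recognition_weight),
        reverse=True,
    )


def select_intent(candidates, recognition_weight=0.5):
    """Select the candidate intent with the highest composite score."""
    if not candidates:
        raise ValueError("no candidate intents to select from")
    return rank_candidates(candidates, recognition_weight)[0]


# Hypothetical example values, invented for illustration.
candidates = [
    Candidate("play the beetles", "search_web", 0.80, 0.30),
    Candidate("play the beatles", "play_music", 0.70, 0.90),
]
selected = select_intent(candidates)
```

With these made-up numbers the selected intent is `play_music`: its text representation does not have the highest speech recognition score, but its intent deduction score is high enough to give it the highest composite score, which is the situation claims 6 and 13 describe.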
Discourse or dialogue representation · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Parsing for meaning understanding · CPC title