Dialogue evaluation via multiple hypothesis ranking
US-2015142420-A1 · May 21, 2015 · US
US9953637B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9953637-B1 |
| Application number | US-201414225135-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 25, 2014 |
| Priority date | Mar 25, 2014 |
| Publication date | Apr 24, 2018 |
| Grant date | Apr 24, 2018 |
Official abstract text for this publication.
Features are disclosed for processing user utterances and applying user-supplied corrections to future user utterances. If a user utterance is determined to relate to a speech processing error that occurred when processing a previous utterance, information about the error or a correction thereto may be stored. Such information may be referred to as correction information. Illustratively, the correction information may be stored in a skip list. Subsequent utterances may be processed based at least partly on the correction information. For example, speech processing results generated from processing subsequent utterances that include a term associated with the error may be removed or re-scored in order to reduce or prevent the chance that an error will be repeated.
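The abstract's last step, removing or re-scoring results that contain a term tied to an earlier error, can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Hypothesis` and `CorrectionStore` types, the `apply_corrections` function, the substring match, and the multiplicative `penalty`. The "skip list" is modeled as a plain set of terms to skip, which is an assumption about the intended meaning rather than the probabilistic skip-list data structure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hypothesis:
    """One candidate speech processing result with a confidence score."""
    text: str
    score: float


@dataclass
class CorrectionStore:
    """Hypothetical stand-in for the 'skip list' of correction information."""
    erroneous_terms: set[str] = field(default_factory=set)

    def add_correction(self, term: str) -> None:
        # Record a term that the user indicated was recognized in error.
        self.erroneous_terms.add(term.lower())

    def matches(self, hypothesis: Hypothesis) -> bool:
        # Assumption: a simple case-insensitive substring match.
        text = hypothesis.text.lower()
        return any(term in text for term in self.erroneous_terms)


def apply_corrections(
    hypotheses: list[Hypothesis],
    store: CorrectionStore,
    mode: str = "remove",
    penalty: float = 0.5,
) -> list[Hypothesis]:
    """Remove or re-score hypotheses that contain a stored erroneous term."""
    if mode not in ("remove", "rescore"):
        raise ValueError(f"unknown mode: {mode!r}")
    kept = []
    for hyp in hypotheses:
        if store.matches(hyp):
            if mode == "remove":
                continue  # drop the hypothesis entirely
            # Assumed re-scoring rule: scale the score down by `penalty`.
            hyp = Hypothesis(hyp.text, hyp.score * penalty)
        kept.append(hyp)
    # Return the surviving hypotheses ranked best-first.
    return sorted(kept, key=lambda h: h.score, reverse=True)


# Illustrative data: the user earlier corrected a misrecognition of "boston".
store = CorrectionStore()
store.add_correction("Boston")
n_best = [Hypothesis("play boston", 0.9), Hypothesis("play austin", 0.8)]

removed = apply_corrections(n_best, store, mode="remove")
rescored = apply_corrections(n_best, store, mode="rescore")
print([h.text for h in removed])
print([h.text for h in rescored])
```

In this sketch, removal guarantees the earlier error cannot be repeated, while re-scoring only makes it less likely, since a penalized hypothesis can still win if no alternative scores higher.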
Opening claim text (preview).
What is claimed is:

1. A system for executing an action based on an utterance, the system comprising:
a computer-readable memory storing executable instructions; and
one or more processors in communication with the computer-readable memory, wherein the one or more processors are programmed by the executable instructions to at least:
obtain first audio data regarding a first utterance of a user;
generate first speech processing results based at least partly on the first audio data, the first speech processing results comprising a first semantic representation of at least a portion of the first utterance;
generate a response for presentation to the user, the response related to the first semantic representation;
obtain second audio data regarding a second utterance of the user;
generate second speech processing results based at least partly on the second audio data;
determine, based at least partly on the second speech processing results, that the second utterance relates to an error in the first speech processing results, wherein the first semantic representation is associated with the error;
generate, using the first semantic representation, a plurality of textual representations, wherein individual textual representations of the plurality of textual representations are associated with a meaning corresponding to the first semantic representation;
add an entry to an error list separate from the first speech processing results and the second speech processing results, wherein the entry indicates the plurality of textual representations are erroneous;
generate, subsequent to adding the entry to the error list and prior to obtaining third audio data regarding a third utterance of the user, speech processing results for a plurality of intervening utterances of the user;
obtain the third audio data regarding the third utterance of the user;
generate third speech processing results based at least partly on the third audio data, wherein the third speech processing results comprise a first speech processing hypothesis and a second speech processing hypothesis, wherein the first speech processing hypothesis is associated with a first executable action, and wherein the second speech processing hypothesis is associated with a second executable action distinct from the first executable action;
determine, using the error list, that at least a portion of the first speech processing hypothesis corresponds to a textual representation of the plurality of textual representations;
remove the first speech processing hypothesis from the third speech processing results based at least partly on the portion of the first speech processing hypothesis corresponding to the textual representation; and
execute the second executable action instead of the first executable action based at least partly on the second speech processing hypothesis remaining in the third speech processing results after the first speech processing hypothesis is removed.

2. The system of claim 1, wherein the error comprises one of: an automatic speech recognition misrecognition, or a natural language understanding misinterpretation.

3. The system of claim 1, wherein the instructions to add the entry to the error list comprise instructions to perform natural language generation using the first semantic representation to generate the plurality of textual representations of the first semantic representation.

4. The system of claim 1, wherein the instructions to determine that at least the portion of the speech processing hypothesis corresponds to the entry in the error list comprise instructions to: compare at least a portion of the plurality of speech processing hypotheses to at least a portion of entries in the error list; and determine that at least the portion of the speech processing hypothesis is equal to at least a portion of the entry in the error list.

5. A computer-implemented method for executing an action based on audio data, the computer-implemented method comprising:
under control of one or more computing devices configured with specific computer-executable instructions,
generating first speech processing results comprising a first semantic representation of at least a portion of a first user utterance, the first speech processing results generated using a speech processing system and audio data regarding at least the portion of the first user utterance;
determining that a second semantic representation of at least a portion of a second user utterance relates to a correction to the first semantic representation;
generating, using the first semantic representation, one or more lexical representations associated with a meaning corresponding to the first semantic representation;
storing correction information in an error list separate from the first speech processing results and separate from second speech processing results comprising the second semantic representation, wherein the correction information indicates that the one or more lexical representations are erroneous;
determining, using the correction information, that at least a portion of a first speech processing hypothesis, of a plurality of speech processing hypotheses for a third user utterance, corresponds to a lexical representation of the one or more lexical representations;
removing the first speech processing hypothesis from the plurality of speech processing hypotheses based at least partly on the determining that at least the portion of the first speech processing hypothesis corresponds to the lexical representation;
generating a third semantic representation of at least a portion of the third user utterance using a second speech processing hypothesis of the plurality of speech processing hypotheses instead of the first speech processing hypothesis based at least partly on the second speech processing hypothesis remaining in the plurality of speech processing hypotheses after the first speech processing hypothesis is removed, wherein the first speech processing hypothesis is associated with a first executable action, and wherein the second speech processing hypotheses is associated with a second executable action distinct from the first executable action; and
executing the second executable action.

6. The computer-implemented method of claim 5, wherein the correction information comprises at least a portion of the first semantic representation.

7. The computer-implemented method of claim 5, wherein the correction information comprises information regarding one of: an intent or a slot value.

8. The computer-implemented method of claim 5, wherein the first speech processing hypothesis comprises the first semantic representation.

9. The computer-implemented method of claim 8, further comprising modifying a score associated with the first speech processing hypothesis in the plurality of speech processing hypotheses.

10. The computer-implemented method of claim 5, further comprising modifying at least one of an automatic speech recognition model or a natural language understanding model based at least partly on the correction information.

11. The computer-implemented method of claim 5, wherein the correction relates to one of: an automatic speech recognition misrecognition, or a natural language understanding misinterpretation.

12. The computer-implemented method of claim 5, wherein the generating the one or more lexical representations comprises expanding, using a natural language generation component, the first semantic representation into the one or more lexical representations.

13. The computer-implemented method of claim 5, wherein determining that at least the portion of the first speech processing