What technology area does this patent fall under?

Primary CPC classification G10L15/08. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 24, 2018 (kind code B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Speech processing using skip lists

US9953637B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9953637-B1
Application number	US-201414225135-A
Country	US
Kind code	B1
Filing date	Mar 25, 2014
Priority date	Mar 25, 2014
Publication date	Apr 24, 2018
Grant date	Apr 24, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Features are disclosed for processing user utterances and applying user-supplied corrections to future user utterances. If a user utterance is determined to relate to a speech processing error that occurred when processing a previous utterance, information about the error or a correction thereto may be stored. Such information may be referred to as correction information. Illustratively, the correction information may be stored in a skip list. Subsequent utterances may be processed based at least partly on the correction information. For example, speech processing results generated from processing subsequent utterances that include a term associated with the error may be removed or re-scored in order to reduce or prevent the chance that an error will be repeated.
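The remove-or-re-score step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Hypothesis` class, the `apply_skip_list` function, the multiplicative `penalty`, and the "boston"/"austin" utterances are all hypothetical. It also assumes "skip list" here means a list of terms to skip, not the probabilistic data structure of the same name.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str     # candidate transcription (hypothetical field)
    score: float  # higher means more likely (hypothetical field)


def apply_skip_list(hypotheses, skip_terms, penalty=None):
    """Drop hypotheses containing a skipped term, or down-weight them.

    penalty=None removes flagged hypotheses; otherwise their score is
    multiplied by `penalty` (an assumed re-scoring rule).
    """
    kept = []
    for hyp in hypotheses:
        words = hyp.text.lower().split()
        flagged = any(term in words for term in skip_terms)
        if not flagged:
            kept.append(hyp)
        elif penalty is not None:
            kept.append(Hypothesis(hyp.text, hyp.score * penalty))
        # flagged and no penalty given: the hypothesis is removed
    return sorted(kept, key=lambda h: h.score, reverse=True)


# The user previously corrected a misrecognition involving "boston".
skip_terms = {"boston"}
n_best = [Hypothesis("play boston", 0.6), Hypothesis("play austin", 0.4)]

removed = apply_skip_list(n_best, skip_terms)
rescored = apply_skip_list(n_best, skip_terms, penalty=0.5)
```

With removal, only "play austin" survives. With re-scoring, "play boston" stays in the list but drops below "play austin", so the earlier error is less likely to be repeated without being ruled out entirely.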

First claim

Opening claim text (preview).

What is claimed is:

1. A system for executing an action based on an utterance, the system comprising: a computer-readable memory storing executable instructions; and one or more processors in communication with the computer-readable memory, wherein the one or more processors are programmed by the executable instructions to at least: obtain first audio data regarding a first utterance of a user; generate first speech processing results based at least partly on the first audio data, the first speech processing results comprising a first semantic representation of at least a portion of the first utterance; generate a response for presentation to the user, the response related to the first semantic representation; obtain second audio data regarding a second utterance of the user; generate second speech processing results based at least partly on the second audio data; determine, based at least partly on the second speech processing results, that the second utterance relates to an error in the first speech processing results, wherein the first semantic representation is associated with the error; generate, using the first semantic representation, a plurality of textual representations, wherein individual textual representations of the plurality of textual representations are associated with a meaning corresponding to the first semantic representation; add an entry to an error list separate from the first speech processing results and the second speech processing results, wherein the entry indicates the plurality of textual representations are erroneous; generate, subsequent to adding the entry to the error list and prior to obtaining third audio data regarding a third utterance of the user, speech processing results for a plurality of intervening utterances of the user; obtain the third audio data regarding the third utterance of the user; generate third speech processing results based at least partly on the third audio data, wherein the third speech processing results comprise a first speech processing hypothesis and a second speech processing hypothesis, wherein the first speech processing hypothesis is associated with a first executable action, and wherein the second speech processing hypothesis is associated with a second executable action distinct from the first executable action; determine, using the error list, that at least a portion of the first speech processing hypothesis corresponds to a textual representation of the plurality of textual representations; remove the first speech processing hypothesis from the third speech processing results based at least partly on the portion of the first speech processing hypothesis corresponding to the textual representation; and execute the second executable action instead of the first executable action based at least partly on the second speech processing hypothesis remaining in the third speech processing results after the first speech processing hypothesis is removed.

2. The system of claim 1, wherein the error comprises one of: an automatic speech recognition misrecognition, or a natural language understanding misinterpretation.

3. The system of claim 1, wherein the instructions to add the entry to the error list comprise instructions to perform natural language generation using the first semantic representation to generate the plurality of textual representations of the first semantic representation.

4. The system of claim 1, wherein the instructions to determine that at least the portion of the speech processing hypothesis corresponds to the entry in the error list comprise instructions to: compare at least a portion of the plurality of speech processing hypotheses to at least a portion of entries in the error list; and determine that at least the portion of the speech processing hypothesis is equal to at least a portion of the entry in the error list.

5. A computer-implemented method for executing an action based on audio data, the computer-implemented method comprising: under control of one or more computing devices configured with specific computer-executable instructions, generating first speech processing results comprising a first semantic representation of at least a portion of a first user utterance, the first speech processing results generated using a speech processing system and audio data regarding at least the portion of the first user utterance; determining that a second semantic representation of at least a portion of a second user utterance relates to a correction to the first semantic representation; generating, using the first semantic representation, one or more lexical representations associated with a meaning corresponding to the first semantic representation; storing correction information in an error list separate from the first speech processing results and separate from second speech processing results comprising the second semantic representation, wherein the correction information indicates that the one or more lexical representations are erroneous; determining, using the correction information, that at least a portion of a first speech processing hypothesis, of a plurality of speech processing hypotheses for a third user utterance, corresponds to a lexical representation of the one or more lexical representations; removing the first speech processing hypothesis from the plurality of speech processing hypotheses based at least partly on the determining that at least the portion of the first speech processing hypothesis corresponds to the lexical representation; generating a third semantic representation of at least a portion of the third user utterance using a second speech processing hypothesis of the plurality of speech processing hypotheses instead of the first speech processing hypothesis based at least partly on the second speech processing hypothesis remaining in the plurality of speech processing hypotheses after the first speech processing hypothesis is removed, wherein the first speech processing hypothesis is associated with a first executable action, and wherein the second speech processing hypotheses is associated with a second executable action distinct from the first executable action; and executing the second executable action.

6. The computer-implemented method of claim 5, wherein the correction information comprises at least a portion of the first semantic representation.

7. The computer-implemented method of claim 5, wherein the correction information comprises information regarding one of: an intent or a slot value.

8. The computer-implemented method of claim 5, wherein the first speech processing hypothesis comprises the first semantic representation.

9. The computer-implemented method of claim 8, further comprising modifying a score associated with the first speech processing hypothesis in the plurality of speech processing hypotheses.

10. The computer-implemented method of claim 5, further comprising modifying at least one of an automatic speech recognition model or a natural language understanding model based at least partly on the correction information.

11. The computer-implemented method of claim 5, wherein the correction relates to one of: an automatic speech recognition misrecognition, or a natural language understanding misinterpretation.

12. The computer-implemented method of claim 5, wherein the generating the one or more lexical representations comprises expanding, using a natural language generation component, the first semantic representation into the one or more lexical representations.

13. The computer-implemented method of claim 5, wherein determining that at least the portion of the first speech processing
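As a rough illustration of the flow recited in claim 1, the sketch below expands an erroneous semantic representation into several textual forms, records them as one error-list entry, and then filters a later set of hypotheses before choosing an action. Everything named here is an assumption made for the example: the `TEMPLATES` table stands in for a natural language generation component, and `SemanticRepresentation`, `ErrorList`, `choose_action`, the substring match, and the action strings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SemanticRepresentation:
    intent: str  # e.g. "PlayMusic" (hypothetical intent name)
    slot: str    # e.g. an artist or city name


@dataclass
class Hypothesis:
    text: str    # textual form of the hypothesis
    action: str  # executable action tied to this hypothesis


# Hypothetical stand-in for natural language generation: each intent maps
# to several phrasings that carry the same meaning.
TEMPLATES = {
    "PlayMusic": ["play {slot}", "put on {slot}", "listen to {slot}"],
}


def expand(sem):
    """Generate textual representations of one semantic representation."""
    return {t.format(slot=sem.slot) for t in TEMPLATES.get(sem.intent, [])}


@dataclass
class ErrorList:
    entries: list = field(default_factory=list)  # each entry is a set of texts

    def add(self, texts):
        self.entries.append(set(texts))

    def matches(self, hyp):
        # Assumed matching rule: any erroneous text appears in the hypothesis.
        text = hyp.text.lower()
        return any(bad in text for entry in self.entries for bad in entry)


def choose_action(hypotheses, error_list):
    """Remove hypotheses matching the error list; return the first remaining action."""
    remaining = [h for h in hypotheses if not error_list.matches(h)]
    return remaining[0].action if remaining else None


# A second utterance flagged the first result ("play boston") as an error.
errors = ErrorList()
errors.add(expand(SemanticRepresentation("PlayMusic", "boston")))

# Hypotheses for a later, third utterance.
third = [
    Hypothesis("play boston", "play_artist:boston"),
    Hypothesis("play austin", "play_artist:austin"),
]
action = choose_action(third, errors)
```

Here the first hypothesis matches an error-list entry and is removed, so the action tied to the second hypothesis is selected. Keeping the error list separate from any single set of results is what lets it persist across intervening utterances.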

Assignees

Amazon Tech Inc

Inventors

Classifications

G10L15/08 · Primary
Speech classification or search · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L15/1815
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title

Patent family

Related publications grouped by family.

View patent family 61951773

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9953637B1 cover?: Features are disclosed for processing user utterances and applying user-supplied corrections to future user utterances. If a user utterance is determined to relate to a speech processing error that occurred when processing a previous utterance, information about the error or a correction thereto may be stored. Such information may be referred to as correction information. Illustratively, the co…
Who is the assignee on this patent?: Amazon Tech Inc