What technology area does this patent fall under?

Primary CPC classification G06F3/0236. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 5, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Corrective feedback loop for automated speech recognition

US9384735B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9384735-B2
Application number	US-201414341054-A
Country	US
Kind code	B2
Filing date	Jul 25, 2014
Priority date	Apr 5, 2007
Publication date	Jul 5, 2016
Grant date	Jul 5, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receiving, at the client device from the user, an affirmation of the result; storing, at the client device, the result in association with an identifier corresponding to the audio message; and communicating, to a second remote server, the stored result together with the identifier.
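The client-side steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the `Client` class, the `transcribe` and `upload` callables (standing in for the first and second remote servers), and the `user_affirms` callback are all hypothetical names introduced here.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Client:
    """Hypothetical client device following the steps listed in the abstract."""
    transcribe: Callable[[bytes], str]    # stand-in for the first remote server (ASR)
    upload: Callable[[str, str], None]    # stand-in for the second remote server
    stored: Dict[str, str] = field(default_factory=dict)

    def handle_audio(self, audio: bytes, user_affirms: Callable[[str], bool]) -> str:
        message_id = uuid.uuid4().hex     # identifier corresponding to the audio message
        result = self.transcribe(audio)   # send the audio, receive the transcribed result
        if user_affirms(result):          # the user affirms the result
            self.stored[message_id] = result   # store the result with the identifier
            self.upload(message_id, result)    # send the stored result and identifier
        return result


# Usage with stub servers (assumed behaviour, for illustration only).
received: List[Tuple[str, str]] = []
client = Client(
    transcribe=lambda audio: "call me later",
    upload=lambda message_id, text: received.append((message_id, text)),
)
client.handle_audio(b"\x00\x01", user_affirms=lambda text: True)
```

In this sketch only affirmed results are stored and forwarded, which mirrors the order of the steps in the abstract. How the real system identifies messages or contacts its servers is not stated on this page.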

First claim

Opening claim text (preview).

What is claimed is:
1. A system comprising: a computing device in communication with an electronic data store, the computing device configured to: obtain audio data comprising speech from a client device; receive an identifier of an application from the client device, wherein the application is associated with an initial language model; generate a transcription of the speech using the initial language model; transmit the transcription to the client device for presentation to a user; receive feedback on the transcription from the client device; and based at least in part on the feedback, generate an updated language model, wherein the electronic data store is configured to store at least one of the initial language model and the updated language model.
2. The system of claim 1, wherein the feedback comprises at least one of an affirmation of the transcription, a disapproval of the transcription, or a correction to the transcription.
3. The system of claim 1, wherein the computing device is further configured to generate one or more alternate transcriptions of the speech using the initial language model.
4. The system of claim 2, wherein the computing device is further configured to transmit the one or more alternate transcriptions to the client device.
5. The system of claim 4, wherein the feedback comprises a selection of an alternate transcription.
6. The system of claim 4, wherein the one or more alternate transcriptions each have a transcription confidence value that satisfies a threshold.
7. The system of claim 1, wherein the electronic data store is further configured to store one or more algorithms that, when executed, implement an automatic speech recognition engine.
8. A non-transitory computer-readable medium having stored thereon a computer-executable component configured to execute in one or more processors of a computing device, the computer-executable component being further configured to: receive first audio data comprising first speech; transcribe the first speech using a first language model to generate a first transcription; provide the first transcription to a first client device; receive feedback on the first transcription from the first client device; based at least in part on the feedback on the first transcription, update the first language model; select a second language model; and based at least in part on the feedback on the transcription, update the second language model, wherein the second language model is not used to generate the first transcription.
9. The non-transitory computer-readable medium of claim 8, wherein: the first audio data comprising speech is associated with a user of the first client device; and the first language model is associated with the user of the first client device.
10. The non-transitory computer-readable medium of claim 8, wherein the computer-executable component is further configured to: receive second audio data comprising second speech; and transcribe the second speech with the updated first language model to generate a second transcription.
11. The non-transitory computer-readable medium of claim 8, wherein the first audio data comprising first speech is received from the first client device.
12. The non-transitory computer-readable medium of claim 8, wherein the first audio data comprising first speech is received from a second client device.
13. The non-transitory computer-readable medium of claim 8, wherein the feedback comprises at least one of an affirmation of the first transcription, a disapproval of the first transcription, or a correction to the first transcription.
14. A computer-implemented method comprising: under control of one or more computing devices configured with specific computer-executable instructions, receiving audio data comprising speech from a first client device; receiving an identifier of an application from the first client device, wherein a first language model is associated with the application; generating speech recognition results from the speech using the first language model; providing the speech recognition results to the first client device; receiving feedback on the speech recognition results from the first client device; and updating the first language model based at least in part on the feedback.
15. The computer-implemented method of claim 14, wherein the audio data is received from a second client device.
16. The computer-implemented method of claim 14, wherein the speech recognition results comprise a transcription of the speech.
17. The computer-implemented method of claim 16, wherein the feedback relates to at least one of a letter of the transcription, a syllable of the transcription, a word of the transcription, a phrase of the transcription, or a sentence of the transcription.
18. The computer-implemented method of claim 16, further comprising: generating a transcription identifier associated with the transcription; transmitting the identifier to the first client device with the transcription; and receiving the identifier from the first client device with the feedback on the speech recognition results.
19. The computer-implemented method of claim 14, further comprising generating one or more alternative speech recognition results using the first language model.
20. The computer-implemented method of claim 19, further comprising providing to the first client device an alternative speech recognition result from the one or more alternative speech recognition results with a confidence value that satisfies a threshold.
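The server-side loop described in claims 1, 14, and 18 (a per-application language model, a transcription sent with an identifier, and feedback used to update the model) might look roughly like the sketch below. Everything here is an assumption made for illustration: a toy unigram word-count model replaces a real language model, a list of candidate hypotheses replaces acoustic decoding of the audio data, and the names `LanguageModel`, `FeedbackServer`, `transcribe`, and `feedback` are hypothetical.

```python
import itertools
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LanguageModel:
    """Toy unigram model: word counts stand in for a real language model."""
    counts: Counter = field(default_factory=Counter)

    def score(self, text: str) -> int:
        return sum(self.counts[word] for word in text.lower().split())

    def update(self, text: str) -> None:
        self.counts.update(text.lower().split())


class FeedbackServer:
    def __init__(self) -> None:
        self.models: Dict[str, LanguageModel] = {}      # one model per application identifier
        self.pending: Dict[int, Tuple[str, str]] = {}   # transcription id -> (app id, text)
        self._ids = itertools.count(1)

    def transcribe(self, app_id: str, hypotheses: List[str]) -> Tuple[int, str]:
        """Return (transcription id, best hypothesis) under the application's model."""
        model = self.models.setdefault(app_id, LanguageModel())
        best = max(hypotheses, key=model.score)   # ties go to the earliest hypothesis
        transcription_id = next(self._ids)
        self.pending[transcription_id] = (app_id, best)
        return transcription_id, best

    def feedback(self, transcription_id: int, kind: str,
                 correction: Optional[str] = None) -> None:
        """Apply an affirmation or correction; a disapproval changes nothing here."""
        app_id, text = self.pending.pop(transcription_id)
        model = self.models[app_id]
        if kind == "affirm":
            model.update(text)
        elif kind == "correct" and correction is not None:
            model.update(correction)


# Usage: a correction changes which hypothesis wins the next time.
server = FeedbackServer()
hypotheses = ["wreck a nice beach", "recognize speech"]
tid, first = server.transcribe("notes-app", hypotheses)
server.feedback(tid, "correct", "recognize speech")
_, second = server.transcribe("notes-app", hypotheses)
```

Keying the models by application identifier reflects the claim language that a language model "is associated with" the application. How a real system weighs feedback, or treats a disapproval, is not specified on this page.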

Assignees

Amazon Tech Inc

Inventors

Classifications

G10L15/30
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title
G10L15/19
Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules · CPC title
G06F3/0236 · Primary
using selection techniques to select from displayed items · CPC title
G10L15/26 · Primary
Speech to text systems (G10L15/08 takes precedence) · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

View patent family 41089753

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9384735B2 cover?: A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receivin…
Who is the assignee on this patent?: Amazon Tech Inc