Word-level correction of speech input

US9263048B2 · US · B2

Patent metadata
Publication number: US-9263048-B2
Application number: US-201514747306-A
Country: US
Kind code: B2
Filing date: Jun 23, 2015
Priority date: Jan 5, 2010
Publication date: Feb 16, 2016
Grant date: Feb 16, 2016

Abstract

Official abstract text for this publication.

The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
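The select-and-replace steps in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Slot` class, the `present` and `replace_word` functions, and the sample words are all hypothetical, and the per-position candidate list is a flattened stand-in for the word lattice, not the structure the patent describes. Audio capture and the call to the transcription system are left out.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Candidate words for one position in the utterance (hypothetical structure)."""
    candidates: list[str]  # best candidate first
    chosen: int = 0        # index of the word currently shown

    @property
    def word(self) -> str:
        return self.candidates[self.chosen]

    def alternates(self) -> list[str]:
        # Every candidate except the one currently shown.
        return [w for i, w in enumerate(self.candidates) if i != self.chosen]


def present(slots: list[Slot]) -> str:
    """Join the currently chosen word of every slot into display text."""
    return " ".join(slot.word for slot in slots)


def replace_word(slots: list[Slot], position: int, alternate: str) -> None:
    """Swap the word shown at `position` for one of that slot's alternates."""
    slot = slots[position]
    if alternate not in slot.alternates():
        raise ValueError(f"{alternate!r} is not an alternate for {slot.word!r}")
    slot.chosen = slot.candidates.index(alternate)


# Made-up recognition result for illustration only.
slots = [Slot(["meet"]), Slot(["at"]), Slot(["ten", "tan"])]
print(present(slots))          # transcribed words as first presented
print(slots[2].alternates())   # alternates offered once the user selects the last word
replace_word(slots, 2, "tan")  # the user picks an alternate
print(present(slots))          # text after the replacement
```

Raising an error for a word that is not among the alternates keeps the sketch close to the abstract, where replacements come only from the lattice.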

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: providing a first transcription of an utterance, wherein the first transcription of the utterance includes one or more words; receiving data indicating a selection of a word from among the one or more words included in the first transcription of the utterance; in response to receiving the data indicating the selection of the word, providing one or more alternate words for the selected word; receiving data indicating a selection of a particular alternate word from among the one or more alternate words for the selected word; selecting a second transcription of the utterance that includes the particular alternate word and that is identified as having a speech recognition confidence measure value that satisfies one or more criteria; and replacing the first transcription of the utterance with the second transcription of the utterance.

2. The computer-implemented method of claim 1, wherein the one or more words are selected from a hierarchical word lattice.

3. The computer-implemented method of claim 2, wherein the hierarchical word lattice comprises nodes corresponding to the one or more words of the first transcription of the utterance and words of the second transcription of the utterance, and edges between the nodes that identify possible paths through the word lattice, wherein each path has an associated probability of being correct.

4. The computer-implemented method of claim 3, wherein providing the one or more alternate words for the selected word comprises: identifying one or more alternate words for the selected word for which an edge exists to the other words of the first transcription in the hierarchical word lattice; and providing the one or more alternate words for the selected word for which an edge exists to the other words of the first transcription as the one or more alternate words for the selected word, without providing other words from the hierarchical word lattice for which an edge does not exist to the other words of the first transcription as alternate words for the selected word.

5. The computer-implemented method of claim 1, wherein the first transcription of the utterance is a transcription of the utterance that is identified as having a highest speech recognizer confidence measure value among all transcriptions of the utterance.

6. The computer-implemented method of claim 1, wherein the second transcription of the utterance is a transcription of the utterance that includes the particular alternate word and that is identified as having a highest speech recognizer confidence measure value among all transcriptions of the utterance that include the particular alternate word.

7. The computer-implemented method of claim 1, wherein: the first transcription of the utterance and the second transcription of the utterance are provided for output at a touchscreen display of a computing device; and the data indicating the selection of the word from among the one or more words included in the first transcription and the data indicating the selection of the particular alternate word from among the one or more alternate words for the selected word are received in response to user input at the touchscreen display of the computing device.

8. A system for correcting words in transcribed text, the system comprising: an automated speech recognizer operable to receive speech audio data and in response transcribe the speech audio data in a word lattice; and a computing device comprising: a microphone operable to receive speech audio and generate the speech audio data, a network interface operable to send the speech audio data to the automated speech recognizer and in response receive the word lattice from the automated speech recognizer, a display screen operable to present one or more transcribed words from the word lattice, a user interface operable to receive a user selection of at least one of the transcribed words, and one or more processors and a memory storing instructions that when executed by the processors cause the computing device to perform operations to: provide a first transcription of an utterance, wherein the first transcription of the utterance includes one or more words; receive data indicating a selection of a word from among the one or more words included in the first transcription of the utterance; in response to receiving the data indicating the selection of the word, provide one or more alternate words for the selected word; receive data indicating a selection of a particular alternate word from among the one or more alternate words for the selected word; select a second transcription of the utterance that includes the particular alternate word and that is identified as having a speech recognition confidence measure value that satisfies one or more criteria; and replace the first transcription of the utterance with the second transcription of the utterance.

9. The system of claim 8, wherein the one or more words are selected from a hierarchical word lattice.

10. The system of claim 9, wherein the hierarchical word lattice comprises nodes corresponding to the one or more words of the first transcription of the utterance and words of the second transcription of the utterance, and edges between the nodes that identify possible paths through the word lattice, wherein each path has an associated probability of being correct.

11. The system of claim 10, wherein providing the one or more alternate words for the selected word comprises: identifying one or more alternate words for the selected word for which an edge exists to the other words of the first transcription in the hierarchical word lattice; and providing the one or more alternate words for the selected word for which an edge exists to the other words of the first transcription as the one or more alternate words for the selected word, without providing other words from the hierarchical word lattice for which an edge does not exist to the other words of the first transcription as alternate words for the selected word.

12. The system of claim 8, wherein the first transcription of the utterance is a transcription of the utterance that is identified as having a highest speech recognizer confidence measure value among all transcriptions of the utterance.

13. The system of claim 8, wherein the second transcription of the utterance is a transcription of the utterance that includes the particular alternate word and that is identified as having a highest speech recognizer confidence measure value among all transcriptions of the utterance that include the particular alternate word.

14. The system of claim 8, wherein: the first transcription of the utterance and the second transcription of the utterance are provided for output at a touchscreen display of a computing device; and the data indicating the selection of the word from among the one or more words included in the first transcription and the data indicating the selection of the particular alternate word from among the one or more alternate words for the selected word are received in response to user input at the touchscreen display of the computing device.

15. A computer program product, encoded on a non-transitory computer-readable medium, operable to cause one or more processors to perform operations for correcting words in transcribed text, the operations comprising: providing a first transcription of an utterance, wherein the first transcription of the utterance includes one or more words; receiving data indicating a selection of a word from among the one or more words included in the
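Claims 3 to 6 describe a lattice of word nodes joined by edges, alternates limited to words that have edges to the neighbouring words of the first transcription, and a second transcription chosen as the highest-confidence path that contains the picked alternate. The following Python sketch illustrates one way to read those claims. Everything in it is an assumption made for illustration: the sample lattice, the function names, and in particular the confidence measure, which is taken here to be the product of edge probabilities along a path (the claims only say that each path has an associated probability).

```python
from math import prod

# Hypothetical lattice: node id -> word, and (source, target) -> edge probability.
WORDS = {0: "<s>", 1: "meet", 2: "meat", 3: "at", 4: "hat",
         5: "ten", 6: "tan", 7: "tent", 8: "</s>"}
EDGES = {
    (0, 1): 0.7, (0, 2): 0.3,
    (1, 3): 1.0,
    (2, 3): 0.6, (2, 4): 0.4,
    (3, 5): 0.6, (3, 6): 0.4,
    (4, 7): 1.0,
    (5, 8): 1.0, (6, 8): 1.0, (7, 8): 1.0,
}
START, END = 0, 8


def paths(node=START, prefix=(START,)):
    """Yield every start-to-end path through the lattice as a tuple of node ids."""
    if node == END:
        yield prefix
        return
    for src, dst in EDGES:
        if src == node:
            yield from paths(dst, prefix + (dst,))


def confidence(path):
    """Assumed confidence measure: product of the edge probabilities on the path."""
    return prod(EDGES[edge] for edge in zip(path, path[1:]))


def best_path(containing=None):
    """Highest-confidence path, optionally restricted to paths through one node."""
    candidates = [p for p in paths() if containing is None or containing in p]
    return max(candidates, key=confidence)


def alternates(path, index):
    """Nodes that can replace path[index] and keep edges to both neighbours.

    `index` must be an interior position (not the start or end marker).
    """
    before, current, after = path[index - 1], path[index], path[index + 1]
    return [n for n in WORDS
            if n != current and (before, n) in EDGES and (n, after) in EDGES]


def text(path):
    """Render a path as words, dropping the start and end markers."""
    return " ".join(WORDS[n] for n in path[1:-1])


first = best_path()                  # first transcription (claim 5)
print(text(first))
for node in alternates(first, 3):    # alternates for the last word (claim 4)
    second = best_path(containing=node)  # second transcription (claim 6)
    print(WORDS[node], "->", text(second))
```

In this sample lattice "tent" is never offered as an alternate for the last word because it has no edge from "at". Because the second transcription is re-selected as a whole path, picking an alternate can also change neighbouring words, which matches the claim language about replacing the first transcription with the second rather than swapping a single word.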

Assignees

Google Inc

Inventors

Classifications

CPC titles for this publication:

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

  • Orthographic correction, e.g. spell checking or vowelisation

  • G10L15/22 (primary): Procedures used during a speech recognition process, e.g. man-machine dialogue

  • Assessment or evaluation of speech recognition systems

  • Sound input; Sound output (speech processing G10L)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9263048B2 cover?
It covers a computer-implemented method for correcting words in transcribed text. The method receives speech audio data from a microphone, sends it to a transcription system, and receives a word lattice transcribed from that audio. It then presents transcribed words from the lattice, lets the user select one, presents alternate words from the lattice for the selection, and replaces the selected word with the alternate the user picks.
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 16, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).