What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Published Jul 18, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Word-level correction of speech input

US9711145B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9711145-B2
Application number	US-201615350309-A
Country	US
Kind code	B2
Filing date	Nov 14, 2016
Priority date	Jan 5, 2010
Publication date	Jul 18, 2017
Grant date	Jul 18, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
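The correction flow in the abstract (present transcribed words, select one, show alternates from the lattice, replace) can be illustrated with a minimal sketch. It is not the patent's implementation. It assumes the word lattice has been simplified to a sequence of slots, each holding scored alternate words; a real lattice is generally a directed graph of word hypotheses. The names `Slot`, `transcript`, `alternates_for`, and `replace_word`, and the example words and scores, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    """One transcript position with scored alternates (hypothetical structure)."""
    alternates: Dict[str, float]   # word -> recognizer score, higher is better
    chosen: Optional[str] = None   # set when the user picks an alternate

    def best(self) -> str:
        # Use the user's choice if present, otherwise the top-scoring word.
        if self.chosen is not None:
            return self.chosen
        return max(self.alternates, key=self.alternates.get)


def transcript(slots: List[Slot]) -> str:
    """Text presented to the user: one word per slot."""
    return " ".join(slot.best() for slot in slots)


def alternates_for(slots: List[Slot], index: int) -> List[str]:
    """Alternate words for the selected position, best score first."""
    slot = slots[index]
    current = slot.best()
    others = [word for word in slot.alternates if word != current]
    return sorted(others, key=slot.alternates.get, reverse=True)


def replace_word(slots: List[Slot], index: int, word: str) -> None:
    """Replace the selected word with an alternate taken from the same slot."""
    if word not in slots[index].alternates:
        raise ValueError(f"{word!r} is not an alternate at position {index}")
    slots[index].chosen = word


# Made-up scores for illustration only.
slots = [
    Slot({"we're": 0.6, "where": 0.3}),
    Slot({"coming": 0.9}),
    Slot({"about": 0.5, "at": 0.4}),
    Slot({"11:30": 0.8, "7:30": 0.2}),
]
print(transcript(slots))         # we're coming about 11:30
print(alternates_for(slots, 2))  # ['at']
replace_word(slots, 2, "at")
print(transcript(slots))         # we're coming at 11:30
```

Keeping the alternates alongside the displayed word means a correction needs no second round trip to the transcription system, which matches the abstract's description of drawing alternates from the lattice that was already received.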

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: providing, for output, a first user interface that includes a virtual keyboard including a control for initiating speech-to-text input; receiving (i) data indicating a selection of the control included in the virtual keyboard that is included in the user interface, and (ii) audio data comprising an utterance that was spoken after the control included in the virtual keyboard was selected; generating, by an automated speech recognizer, a speech recognition lattice corresponding to the utterance that was spoken after the control included in the virtual keyboard was selected; and providing, for output, a second user interface that includes a representation of the speech recognition lattice corresponding to the utterance that was spoken after the control included in the virtual keyboard was selected.

2. The computer-implemented method of claim 1, wherein the representation of the speech recognition lattice is a transcription corresponding to the utterance that was spoken after the control included in the keyboard was selected that is identified as a best hypothesis from among one or more transcriptions corresponding to the utterance that was spoken after the control included in the keyboard was selected.

3. The computer-implemented method of claim 1, comprising: receiving data indicating a selection of a portion of the representation of the speech recognition lattice; providing, for output, a third user interface that includes one or more alternates for the selected portion of the representation of the speech recognition lattice; receiving data indicating a selection of a particular alternate from among the one or more alternates for the selected portion of the representation of the speech recognition lattice; and providing, for output, a fourth user interface that includes a second representation of the speech recognition lattice that includes the particular alternate selected from among the one or more alternates for the selected portion of the representation of the speech recognition lattice.

4. The computer-implemented method of claim 1, comprising: receiving data indicating a selection of a portion of the representation of the speech recognition lattice; providing, for output, a third user interface that includes a control for removing the selected portion of the representation of the speech recognition lattice; receiving data indicating a selection of the control for removing the selected portion of the representation of the speech recognition lattice; and providing, for output, a fourth user interface that includes a second representation of the speech recognition lattice that does not include the selected portion of the representation of the speech recognition lattice.

5. The computer-implemented method of claim 1, wherein the second user interface includes a second representation of the speech recognition lattice corresponding to the utterance that was spoken after the control included in the virtual keyboard was selected.

6. The computer-implemented method of claim 5, comprising: providing, for output in the second user interface, a control for replacing the representation of the speech recognition lattice with the second representation of the speech recognition lattice; receiving data indicating a selection of the control for replacing the representation of the speech recognition lattice with the second representation of the speech recognition lattice; and providing, for output, a third user interface that includes the second representation of the speech recognition lattice in place of the representation of the speech recognition lattice.

7. The computer-implemented method of claim 1, comprising: receiving data indicating a selection of a portion of the representation of the speech recognition lattice; and providing, for output, a third user interface that includes a second representation of the speech recognition lattice corresponding to the utterance that was spoken after the control included in the virtual keyboard was selected, wherein a portion of the second representation of the speech recognition lattice that corresponds to the selected portion of the representation of the speech recognition lattice is different from the selected portion of the representation of the speech recognition lattice.

8. A system for correcting words in transcribed text, the system comprising: an automated speech recognizer operable to receive speech audio data and in response transcribe the speech audio data in a word lattice; and a computing device comprising: a microphone operable to receive speech audio and generate the speech audio data, a network interface operable to send the speech audio data to the automated speech recognizer and in response receive the word lattice from the automated speech recognizer, a display screen operable to present one or more transcribed words from the word lattice, a user interface operable to receive a user selection of at least one of the transcribed words, and one or more processors and a memory storing instructions that when executed by the processors cause the computing device to perform operations to: provide, for output, a first user interface that includes a virtual keyboard including a control for initiating speech-to-text input; receive (i) data indicating a selection of the control included in the virtual keyboard that is included in the user interface, and (ii) audio data comprising an utterance that was spoken after the control included in the virtual keyboard was selected; generate, by the automated speech recognizer, a speech recognition lattice corresponding to the utterance that was spoken after the control included in the virtual keyboard was selected; and provide, for output, a second user interface that includes a representation of the speech recognition lattice corresponding to the utterance that was spoken after the control included in the virtual keyboard was selected.

9. The system of claim 8, wherein the representation of the speech recognition lattice is a transcription corresponding to the utterance that was spoken after the control included in the keyboard was selected that is identified as a best hypothesis from among one or more transcriptions corresponding to the utterance that was spoken after the control included in the keyboard was selected.

10. The system of claim 8, wherein the operations comprise: receiving data indicating a selection of a portion of the representation of the speech recognition lattice; providing, for output, a third user interface that includes one or more alternates for the selected portion of the representation of the speech recognition lattice; receiving data indicating a selection of a particular alternate from among the one or more alternates for the selected portion of the representation of the speech recognition lattice; and providing, for output, a fourth user interface that includes a second representation of the speech recognition lattice that includes the particular alternate selected from among the one or more alternates for the selected portion of the representation of the speech recognition lattice.

11. The system of claim 8, wherein the operations comprise: receiving data indicating a selection of a portion of the representation of the speech recognition lattice; providing, for output, a third user interface that includes a control for removing the selected portion of the representation of the speech recognition lattice; receiving data indicating a selection of the control for removing the selected portion of the representation of the speech recognition lattice; and providing, for outpu

Assignees

Google Inc

Inventors

Classifications

G10L15/22 (Primary)
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L15/30
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title
G10L15/01
Assessment or evaluation of speech recognition systems · CPC title
G06F3/04886
by partitioning the display area of the touch-screen or the surface of the digitising tablet into independently controllable areas, e.g. virtual keyboards or menus · CPC title
G10L15/28
Constructional details of speech recognition systems · CPC title

Patent family

Related publications grouped by family.

View patent family 44225217

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9711145B2 cover?: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the t…
Who is the assignee on this patent?: Google Inc