Word-level correction of speech input
US-9711145-B2 · Jul 18, 2017 · US
US10672394B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10672394-B2 |
| Application number | US-201715849967-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 21, 2017 |
| Priority date | Jan 5, 2010 |
| Publication date | Jun 2, 2020 |
| Grant date | Jun 2, 2020 |
Official abstract text for this publication.
The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
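The flow in the abstract (receive a word lattice, present the transcribed words, offer alternates for a selected word, then replace it) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Edge` type, the toy lattice, the scores, and the helper names `best_path`, `alternates`, and `replace_word` are all hypothetical, and the sketch assumes that alternates are simply the other lattice edges spanning the same pair of nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: int    # source node id
    end: int      # destination node id
    word: str     # word hypothesized between the two nodes
    score: float  # hypothetical log-probability; higher is more likely


# Toy lattice; every word and score here is invented for illustration.
LATTICE = [
    Edge(0, 1, "we're", -0.4),
    Edge(0, 1, "were", -1.2),
    Edge(1, 2, "coming", -0.3),
    Edge(1, 2, "come in", -1.5),
    Edge(2, 3, "about", -0.5),
    Edge(2, 3, "at", -0.9),
    Edge(3, 4, "eleven", -0.6),
    Edge(3, 4, "seven", -1.1),
]


def best_path(lattice, start, final):
    """Return the highest-scoring edge sequence from start to final.

    Assumes node ids increase along every edge, so sorting edges by
    their start node visits them in topological order.
    """
    best = {start: (0.0, [])}
    for edge in sorted(lattice, key=lambda e: e.start):
        if edge.start not in best:
            continue
        total = best[edge.start][0] + edge.score
        if edge.end not in best or total > best[edge.end][0]:
            best[edge.end] = (total, best[edge.start][1] + [edge])
    return best[final][1]


def alternates(lattice, chosen):
    """Other edges spanning the same pair of nodes, best first."""
    same_span = [
        e for e in lattice
        if (e.start, e.end) == (chosen.start, chosen.end) and e != chosen
    ]
    return sorted(same_span, key=lambda e: e.score, reverse=True)


def replace_word(path, index, alternate):
    """Return a copy of the path with one edge swapped for an alternate."""
    new_path = list(path)
    new_path[index] = alternate
    return new_path


path = best_path(LATTICE, 0, 4)          # words presented to the user
options = alternates(LATTICE, path[3])   # user selects the fourth word
corrected = replace_word(path, 3, options[0])  # user picks the top alternate
print(" ".join(e.word for e in path))
print(" ".join(e.word for e in corrected))
```

Keeping the lattice on the client after transcription is what lets the correction step run without another round trip to the transcription system, since the alternates are already present in the received data.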
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: providing a transcription of an utterance in a region of a touch-sensitive display; receiving data indicating single touch selection of a particular word in the transcription of the utterance in the region of the touch-sensitive display; determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a particular type of touch selection; and in response to determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is the particular type of touch selection, providing an updated transcription of the utterance in the region of the touch-sensitive display.
2. The method of claim 1, wherein determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a particular type of touch selection comprises determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a long press touch selection.
3. The method of claim 1, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing a new transcription of the utterance in which the particular word is removed.
4. The method of claim 1, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing a new transcription of the utterance in which the particular word is replaced with a different word.
5. The method of claim 1, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing the updated transcription of the utterance without displaying an alternates list.
6. The method of claim 1, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises a new transcription in which a multi-word phrase that includes the particular word is replaced with a different multi-word phrase that does not include the particular word.
7. The method of claim 1, comprising: obtaining a word lattice based on performing automated speech recognition on audio data; and selecting the transcription based on the word lattice.
8. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: providing a transcription of an utterance in a region of a touch-sensitive display; receiving data indicating single touch selection of a particular word in the transcription of the utterance in the region of the touch-sensitive display; determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a particular type of touch selection; and in response to determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is the particular type of touch selection, providing an updated transcription of the utterance in the region of the touch-sensitive display.
9. The system of claim 8, wherein determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a particular type of touch selection comprises determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a long press touch selection.
10. The system of claim 8, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing a new transcription of the utterance in which the particular word is removed.
11. The system of claim 8, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing a new transcription of the utterance in which the particular word is replaced with a different word.
12. The system of claim 8, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing the updated transcription of the utterance without displaying an alternates list.
13. The system of claim 8, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises a new transcription in which a multi-word phrase that includes the particular word is replaced with a different multi-word phrase that does not include the particular word.
14. The system of claim 8, wherein the operations comprise: obtaining a word lattice based on performing automated speech recognition on audio data; and selecting the transcription based on the word lattice.
15. A computer-readable non-transitory medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: providing a transcription of an utterance in a region of a touch-sensitive display; receiving data indicating single touch selection of a particular word in the transcription of the utterance in the region of the touch-sensitive display; determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a particular type of touch selection; and in response to determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is the particular type of touch selection, providing an updated transcription of the utterance in the region of the touch-sensitive display.
16. The medium of claim 15, wherein determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a particular type of touch selection comprises determining that the single touch selection of the particular word in the transcription of the utterance in the region of the touch-sensitive display is a long press touch selection.
17. The medium of claim 15, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing a new transcription of the utterance in which the particular word is removed.
18. The medium of claim 15, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing a new transcription of the utterance in which the particular word is replaced with a different word.
19. The medium of claim 15, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises providing the updated transcription of the utterance without displaying an alternates list.
20. The medium of claim 15, wherein providing the updated transcription of the utterance in the region of the touch-sensitive display comprises a new transcription in which a multi-word phrase that includes the particular word is replaced with a different multi-word phrase that does not include the particular word.
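The gesture-driven update recited in claims 1 through 5 can be sketched in a few lines. This is only an illustration under stated assumptions: the `TouchEvent` type, the `update_transcription` helper, and the 500 ms `LONG_PRESS_MS` threshold are hypothetical (the claims name a long press but give no duration), and the choice to leave the transcription unchanged for any other touch is also an assumption, since the claims do not say what other touch types do.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed threshold; the claims do not specify a duration for a long press.
LONG_PRESS_MS = 500


@dataclass
class TouchEvent:
    word_index: int   # index of the touched word in the transcription
    duration_ms: int  # how long the single touch was held


def is_long_press(event: TouchEvent) -> bool:
    """Classify the single touch as the 'particular type' (claim 2)."""
    return event.duration_ms >= LONG_PRESS_MS


def update_transcription(
    words: List[str],
    event: TouchEvent,
    replacement: Optional[str] = None,
) -> List[str]:
    """Return an updated word list when the touch is a long press.

    replacement=None removes the touched word (claim 3); otherwise the
    word is swapped for the replacement (claim 4). No alternates list
    is shown at any point (claim 5). Other touches leave the words
    unchanged, which is an assumption of this sketch.
    """
    if not is_long_press(event):
        return list(words)
    updated = list(words)
    if replacement is None:
        del updated[event.word_index]
    else:
        updated[event.word_index] = replacement
    return updated


words = ["we're", "coming", "about", "eleven"]
removed = update_transcription(words, TouchEvent(2, 800))
replaced = update_transcription(words, TouchEvent(3, 800), "seven")
unchanged = update_transcription(words, TouchEvent(3, 120), "seven")
print(removed, replaced, unchanged, sep="\n")
```

In a fuller version, the replacement word could come from the word lattice of claim 7 rather than being passed in by the caller.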
Orthographic correction, e.g. spell checking or vowelisation · CPC title
Interaction with lists of selectable items, e.g. menus · CPC title
Physics · mapped topic
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title