Re-recognizing speech with external data sources

US2017301352A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2017301352-A1
Application number: US-201715637526-A
Country: US
Kind code: A1
Filing date: Jun 29, 2017
Priority date: Feb 5, 2016
Publication date: Oct 19, 2017
Grant date:

Abstract

Official abstract text for this publication.

Methods, including computer programs encoded on a computer storage medium, for improving speech recognition based on external data sources. In one aspect, a method includes obtaining an initial candidate transcription of an utterance using an automated speech recognizer and identifying, based on a language model that is not used by the automated speech recognizer in generating the initial candidate transcription, one or more terms that are phonetically similar to one or more terms that do occur in the initial candidate transcription. Additional actions include generating one or more additional candidate transcriptions based on the identified one or more terms and selecting a transcription from among the candidate transcriptions.
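The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method as disclosed: the vocabulary and its log-probabilities in EXTERNAL_LM are invented, phonetic_key is a crude spelling-based stand-in for a real phonetic representation, and every function name here is hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical external language model: unigram log-probabilities (assumed values).
# The recognizer that produced the initial transcription does not use it.
EXTERNAL_LM = {"citizen": -2.0, "kane": -2.5, "cain": -4.0, "play": -1.5}


def phonetic_key(word: str) -> str:
    """Crude stand-in for a phonetic form: merge a few similar-sounding
    letters, then collapse repeated characters."""
    table = str.maketrans({"c": "k", "q": "k", "z": "s", "y": "i"})
    key = word.lower().translate(table)
    collapsed = []
    for ch in key:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def phonetic_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the phonetic keys of two words."""
    return SequenceMatcher(None, phonetic_key(a), phonetic_key(b)).ratio()


def similar_terms(term, vocabulary, threshold=0.6):
    """Terms from the external vocabulary that sound like `term`."""
    return [w for w in vocabulary
            if w != term and phonetic_similarity(term, w) >= threshold]


def additional_candidates(initial, vocabulary, threshold=0.6):
    """Build new candidates by swapping one word for a similar-sounding term."""
    words = initial.split()
    candidates = []
    for i, word in enumerate(words):
        for alt in similar_terms(word, vocabulary, threshold):
            candidates.append(" ".join(words[:i] + [alt] + words[i + 1:]))
    return candidates


def lm_score(transcription, lm, oov_penalty=-10.0):
    """Sum of unigram log-probabilities; unknown words get a fixed penalty."""
    return sum(lm.get(w, oov_penalty) for w in transcription.split())


def select_transcription(initial, lm):
    """Pick the best-scoring transcription among the initial and new candidates."""
    pool = [initial] + additional_candidates(initial, lm)
    return max(pool, key=lambda t: lm_score(t, lm))
```

With these toy scores, select_transcription("play citizen cane", EXTERNAL_LM) prefers "play citizen kane", because "kane" is in the external vocabulary and "cane" is not. A real system would compare phoneme sequences rather than spellings.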

First claim

Opening claim text (preview).

What is claimed is:

1. (canceled)

2. A computer-implemented method comprising: providing an utterance to a speech recognizer that uses a language model that includes a specified vocabulary; and based on processing the utterance using the speech recognizer and a post-processor, generating a transcription of the utterance that includes a term that is not in the specified vocabulary of the speech recognizer.

3. The method of claim 2, wherein the language model indicates likelihoods that words or sequences of words in the specified vocabulary appear.

4. The method of claim 2, wherein the post-processor uses a second language model that includes the term that is not in the specified vocabulary of the speech recognizer.

5. The method of claim 4, wherein the second language model indicates likelihoods that words or sequences of words in another specified vocabulary that includes the term appear.

6. The method of claim 2, wherein based on processing the utterance using the speech recognizer and a post-processor, generating a transcription of the utterance that includes a term that is not in the specified vocabulary of the speech recognizer comprises: obtaining, from the speech recognizer, an initial transcription of the utterance that does not include the term; and generating the transcription that includes the term from the initial transcription.

7. The method of claim 6, wherein generating the transcription that includes the term from the initial transcription comprises: receiving, from the speech recognizer, an acoustic match score that reflects a phonetic similarity between the initial transcription and the utterance; and generating the transcription that includes the term from the initial transcription with the acoustic match score.

8. The method of claim 2, wherein providing an utterance to a speech recognizer that uses a language model that includes a specified vocabulary comprises: providing acoustic data that reflects the utterance to the speech recognizer.

9. A system comprising: one or more computers; and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: providing an utterance to a speech recognizer that uses a language model that includes a specified vocabulary; and based on processing the utterance using the speech recognizer and a post-processor, generating a transcription of the utterance that includes a term that is not in the specified vocabulary of the speech recognizer.

10. The system of claim 9, wherein the language model indicates likelihoods that words or sequences of words in the specified vocabulary appear.

11. The system of claim 9, wherein the post-processor uses a second language model that includes the term that is not in the specified vocabulary of the speech recognizer.

12. The system of claim 11, wherein the second language model indicates likelihoods that words or sequences of words in another specified vocabulary that includes the term appear.

13. The system of claim 9, wherein based on processing the utterance using the speech recognizer and a post-processor, generating a transcription of the utterance that includes a term that is not in the specified vocabulary of the speech recognizer comprises: obtaining, from the speech recognizer, an initial transcription of the utterance that does not include the term; and generating the transcription that includes the term from the initial transcription.

14. The system of claim 13, wherein generating the transcription that includes the term from the initial transcription comprises: receiving, from the speech recognizer, an acoustic match score that reflects a phonetic similarity between the initial transcription and the utterance; and generating the transcription that includes the term from the initial transcription with the acoustic match score.

15. The system of claim 9, wherein providing an utterance to a speech recognizer that uses a language model that includes a specified vocabulary comprises: providing acoustic data that reflects the utterance to the speech recognizer.

16. A non-transitory computer-readable medium storing instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: providing an utterance to a speech recognizer that uses a language model that includes a specified vocabulary; and based on processing the utterance using the speech recognizer and a post-processor, generating a transcription of the utterance that includes a term that is not in the specified vocabulary of the speech recognizer.

17. The medium of claim 16, wherein the language model indicates likelihoods that words or sequences of words in the specified vocabulary appear.

18. The medium of claim 16, wherein the post-processor uses a second language model that includes the term that is not in the specified vocabulary of the speech recognizer.

19. The medium of claim 18, wherein the second language model indicates likelihoods that words or sequences of words in another specified vocabulary that includes the term appear.

20. The medium of claim 16, wherein based on processing the utterance using the speech recognizer and a post-processor, generating a transcription of the utterance that includes a term that is not in the specified vocabulary of the speech recognizer comprises: obtaining, from the speech recognizer, an initial transcription of the utterance that does not include the term; and generating the transcription that includes the term from the initial transcription.

21. The medium of claim 20, wherein generating the transcription that includes the term from the initial transcription comprises: receiving, from the speech recognizer, an acoustic match score that reflects a phonetic similarity between the initial transcription and the utterance; and generating the transcription that includes the term from the initial transcription with the acoustic match score.
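Claims 6 and 7 describe a post-processor that starts from the recognizer's initial transcription and its acoustic match score. The claims do not say how that score is used, so the sketch below makes one possible choice explicit: it discounts the recognizer's score by how far a rewritten candidate drifts from the initial text, then adds log-probabilities from a second language model. The Hypothesis type, the post_process function, the weighting, and all numeric values are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Hypothesis:
    """Recognizer output: text plus an acoustic match score in (0, 1]."""
    text: str
    acoustic_score: float


def post_process(initial, alternatives, second_lm, lm_weight=1.0, floor=1e-6):
    """Return the best-scoring text among the initial one and the alternatives.

    second_lm maps words to probabilities; words it lacks get `floor`.
    """
    def combined(text):
        # Assumption: approximate a candidate's acoustic score by scaling the
        # recognizer's score with string similarity to the initial text.
        similarity = SequenceMatcher(None, initial.text, text).ratio()
        acoustic = max(initial.acoustic_score * similarity, floor)
        # Sum of log-probabilities under the second language model.
        lm_logprob = sum(math.log(second_lm.get(w, floor)) for w in text.split())
        return math.log(acoustic) + lm_weight * lm_logprob

    return max([initial.text, *alternatives], key=combined)


# Toy data: "kane" is known only to the second language model.
second_lm = {"citizen": 0.01, "kane": 0.005}
initial = Hypothesis(text="citizen cane", acoustic_score=0.9)
best = post_process(initial, ["citizen kane"], second_lm)
```

Here the small acoustic discount for changing one letter is outweighed by the second language model, which knows "kane" but not "cane". Raising or lowering lm_weight shifts how readily the post-processor overrides the recognizer.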

Assignees

Google Inc

Inventors

Classifications

  • G10L15/19 · Primary

    Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules · CPC title

  • for comparison or discrimination · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • G10L15/32 · Primary

    Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

  • Feature extraction for speech recognition; Selection of recognition unit · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017301352A1 cover?
Methods, including computer programs encoded on a computer storage medium, for improving speech recognition based on external data sources. In one aspect, a method includes obtaining an initial candidate transcription of an utterance using an automated speech recognizer and identifying, based on a language model that is not used by the automated speech recognizer in generating the initial candi…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/19. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 19, 2017 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).