Cross-Language Speech Recognition and Translation

US2016336008A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016336008-A1
Application number  US-201514714046-A
Country             US
Kind code           A1
Filing date         May 15, 2015
Priority date       May 15, 2015
Publication date    Nov 17, 2016
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Technologies are described herein for cross-language speech recognition and translation. An example method of speech recognition and translation includes receiving an input utterance in a first language, the input utterance having at least one name of a named entity included therein and being pronounced in a second language, utilizing a customized language model to process at least a portion of the input utterance, and identifying the at least one name of the named entity from the input utterance utilizing a phonetic representation of the at least one name of the named entity. The phonetic representation has a pronunciation of the at least one name in the second language.
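As a rough illustration of the identification step in the abstract, the sketch below looks up a name by its second-language pronunciation inside a recognized phoneme sequence. The lexicon contents, the phoneme symbols, and the `find_name` helper are all hypothetical and are not taken from the patent. Exact sub-sequence matching stands in for whatever scoring a real recognizer would use.

```python
from typing import Dict, List, Optional, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical lexicon: each name maps to pronunciations keyed by language code.
# The phoneme symbols are invented for illustration only.
LEXICON: Dict[str, Dict[str, Phonemes]] = {
    "Zhang Wei": {
        "en": ("z", "ae", "ng", "w", "ey"),
        "zh": ("zh", "a", "ng", "w", "ei"),
    },
    "Paris": {
        "en": ("p", "ae", "r", "ih", "s"),
        "fr": ("p", "a", "r", "i"),
    },
}


def find_name(phonemes: List[str], language: str) -> Optional[Tuple[str, int]]:
    """Return (name, start index) for the first lexicon name whose
    pronunciation in `language` occurs as a contiguous run in `phonemes`."""
    for name, pronunciations in LEXICON.items():
        target = pronunciations.get(language)
        if target is None:
            continue  # no pronunciation stored for this language
        width = len(target)
        for start in range(len(phonemes) - width + 1):
            if tuple(phonemes[start:start + width]) == target:
                return name, start
    return None


# An utterance whose surrounding words are in one language, with the name
# pronounced the way the second language ("zh" here) pronounces it.
utterance = ["k", "ao", "l", "zh", "a", "ng", "w", "ei"]
match = find_name(utterance, "zh")
```

Keying pronunciations by language lets the same name be found whether the speaker uses the first-language or the second-language pronunciation, which is the situation the abstract describes.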

First claim

Opening claim text (preview).

What is claimed is:

1. A device for speech recognition comprising a speech recognition component deployed thereon and configured to: receive an input utterance in a first language, the input utterance having at least one name of a named entity included therein and being pronounced in a second language; utilize a customized language model to process at least a portion of the input utterance; and identifying the at least one name of the named entity from the input utterance utilizing a phonetic representation of the at least one name of the named entity, the phonetic representation having a pronunciation of the at least one name in the second language.

2. The device of claim 1, wherein the speech recognition component is further configured to: create an output utterance based on the input utterance, the output utterance comprising one or more of: a phonetic representation of the at least one name of the named entity in the second language; or a phonetic representation of the at least one name of the named entity in the first language.

3. The device of claim 1, wherein the customized language model comprises a context-free language model or an n-gram language model.

4. The device of claim 1, wherein the speech recognition component is further configured to: retrieve the phonetic representation from a lexicon of phonetic pronunciations of names for named entities, the lexicon including a plurality of pronunciations in both the first language and the second language for the same names of named entities.

5. The device of claim 1, wherein the speech recognition component is further configured to output an output utterance comprising the at least one name of the named entity to a communication application in operative communication with the computer.

6. A method of speech recognition and translation for processing utterances in both a first language and a second language, the method comprising performing computer-implemented operations at a computing network including: categorizing names of named entities associated with a first user, the names being in the first language; constructing a lexicon of phonetic pronunciations of the names for the named entities, the lexicon including a plurality of pronunciations in the first language and the second language; constructing a customized language model for each type of named entity of the named entities; and processing utterances received from the first user in the first language to recognize names of named entities, the names of named entities comprising names pronounced in the second language.

7. The method of claim 6, further comprising: collecting the names of the named entities from one or more sources of named entities, the one or more sources of named entities being associated with the first user.

8. The method of claim 7, wherein the one or more sources of named entities comprises at least one of: a contact list associated with the first user; location information associated with the first user; conversation data associated with the first user; or social media data associated with the first user.

9. The method of claim 7, wherein the utterances received from the first user are created in a communication application, and wherein the one or more sources of named entities are retrieved from the communication application.

10. The method of claim 6, wherein categorizing the named entities comprises categorizing named entities as a name of a person or a name of a geographic location.

11. The method of claim 10, wherein categorizing the named entities further comprises categorizing named entities as out of vocabulary (OOV) entities.

12. The method of claim 6, wherein constructing the lexicon of phonetic pronunciations comprises: mapping letters of a name of a named entity using a set of language rules for the first language; converting the mapped letters of the name to a standard phonetic representation; converting the standard phonetic representation to a phonetic representation of pronunciation in the second language; and adding the phonetic representation of the pronunciation to the lexicon of phonetic pronunciations.

13. The method of claim 6, further comprising: categorizing new names of named entities associated with a second user, the new names being in the second language; and constructing a lexicon of phonetic pronunciations for the named entities, the lexicon including a plurality of pronunciations in the first language and the second language.

14. The method of claim 13, further comprising: constructing the customized language model for at least one type of named entity of the new names of named entities.

15. The method of claim 14, further comprising: translating utterances received from the second user in the second language to new output utterances in the first language, the new output utterances comprising at least one phonetic pronunciation of a new name of the named entities in the first language.

16. A speech recognition and translation system configured to translate a first utterance in a first language into a second utterance in a second language, the system comprising at least one computer executing a speech recognition component configured to: receive an input utterance in the first language, the input utterance having at least one name of a named entity included therein; utilize a customized language model or a generic language model to translate a portion of the input utterance into an output utterance in the second language; identify the at least one name of the named entity from the input utterance; determine a phonetic representation of the at least one name of the named entity to the output utterance, the phonetic representation having a pronunciation of the at least one name in the second language; and output the output utterance according to the phonetic representation.

17. The system of claim 16, further comprising a named entity categorization component configured to categorize names of named entities as a name of a person, a name of a geographic location, or the name of an object.

18. The system of claim 16, further comprising a cross-language lexicon component configured to construct a lexicon of phonetic pronunciations of names for named entities, the lexicon including a plurality of pronunciations in the second language.

19. The system of claim 18, wherein constructing the lexicon of phonetic pronunciations comprises: mapping letters of a name of a named entity using a set of language rules for the first language; converting the mapped letters of the name to a standard phonetic representation; converting the standard phonetic representation to a phonetic representation of pronunciation in the second language; and adding the phonetic representation of the pronunciation to the lexicon of phonetic pronunciations.

20. The system of claim 16, further comprising a customized language model component configured to construct the customized language model.
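The four lexicon-construction steps recited in claims 12 and 19 (map letters with first-language rules, convert to a standard phonetic representation, convert to a second-language pronunciation, add to the lexicon) could be sketched roughly as follows. Every rule table, symbol, and function name here is a toy assumption made for illustration. The claims do not specify any particular rules or phonetic alphabet.

```python
from typing import Dict, List, Tuple

# Step 1 rules (hypothetical): letter groups of the first language mapped to
# abstract units, tried longest match first.
FIRST_LANGUAGE_RULES: Dict[str, str] = {
    "zh": "ZH", "ng": "NG", "a": "A", "w": "W", "e": "E", "i": "I",
}

# Step 2 table (hypothetical): abstract units to a "standard" phonetic alphabet.
TO_STANDARD: Dict[str, str] = {
    "ZH": "tS", "NG": "N", "A": "a", "W": "w", "E": "e", "I": "i",
}

# Step 3 table (hypothetical): standard symbols to second-language phonemes.
TO_SECOND_LANGUAGE: Dict[str, str] = {
    "tS": "ch", "N": "ng", "a": "aa", "w": "w", "e": "eh", "i": "iy",
}

Lexicon = Dict[str, Dict[str, Tuple[str, ...]]]


def map_letters(name: str) -> List[str]:
    """Step 1: map letters to units using the first-language rules."""
    text = name.lower().replace(" ", "")
    units: List[str] = []
    i = 0
    while i < len(text):
        for size in (2, 1):  # prefer two-letter groups such as "zh"
            chunk = text[i:i + size]
            if chunk in FIRST_LANGUAGE_RULES:
                units.append(FIRST_LANGUAGE_RULES[chunk])
                i += len(chunk)
                break
        else:
            i += 1  # skip letters the toy rules do not cover
    return units


def add_pronunciation(lexicon: Lexicon, name: str, second_language: str) -> None:
    """Steps 2 to 4: convert to standard symbols, then to second-language
    phonemes, and store the result in the lexicon under the name."""
    standard = [TO_STANDARD[unit] for unit in map_letters(name)]
    second = tuple(TO_SECOND_LANGUAGE[symbol] for symbol in standard)
    lexicon.setdefault(name, {})[second_language] = second


lexicon: Lexicon = {}
add_pronunciation(lexicon, "Zhang Wei", "en")
```

Going through an intermediate standard representation means each language needs only one table into and one table out of that representation, instead of a separate table for every language pair.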

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence) · CPC title

  • Statistical methods, e.g. probability models · CPC title

  • Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules · CPC title

  • Named entity recognition · CPC title

  • G10L15/187 · Primary

    Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016336008A1 cover?
Technologies are described herein for cross-language speech recognition and translation. An example method of speech recognition and translation includes receiving an input utterance in a first language, the input utterance having at least one name of a named entity included therein and being pronounced in a second language, utilizing a customized language model to process at least a portion of…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/187. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 17, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).