Process for improving pronunciation of proper nouns foreign to a target language text-to-speech system

US2016358596A1 · US · A1

Patent metadata
Publication number: US-2016358596-A1
Application number: US-201514733289-A
Country: US
Kind code: A1
Filing date: Jun 8, 2015
Priority date: Jun 8, 2015
Publication date: Dec 8, 2016
Grant date:

Abstract

Official abstract text for this publication.

A system and method configured for use in a text-to-speech (TTS) system is provided. Embodiments may include identifying, using one or more processors, a word or phrase as a named entity and identifying a language of origin associated with the named entity. Embodiments may further include transliterating the named entity to a script associated with the language of origin. If the TTS system is operating in the language of origin, embodiments may include passing the transliterated script to the TTS system. If the TTS system is not operating in the language of origin, embodiments may include generating a phoneme sequence in the language of origin using a grapheme to phoneme (G2P) converter.
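The control flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the lookup tables, the `Result` type, and the function name `process_named_entity` are all hypothetical, the tables stand in for the named-entity, language-of-origin, transliteration, and G2P components, and the phoneme symbols are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

# Hypothetical toy tables. A real system would use much larger resources
# (names databases, contextual analysis, a trained G2P converter).
NAME_ORIGINS: Dict[str, str] = {"tchaikovsky": "ru"}
TRANSLITERATIONS: Dict[str, str] = {"tchaikovsky": "Чайковский"}
# Illustrative phoneme symbols only; not a verified transcription.
TOY_G2P: Dict[str, List[str]] = {
    "Чайковский": ["tɕ", "ɪ", "j", "k", "o", "f", "s", "kʲ", "ɪ", "j"],
}


@dataclass
class Result:
    kind: str  # "script" or "phonemes"
    payload: Union[str, List[str]]


def process_named_entity(word: str, tts_language: str) -> Optional[Result]:
    """Follow the branch described in the abstract for a single word."""
    key = word.lower()
    # Steps 1 and 2: named-entity check and language of origin (table lookup).
    origin = NAME_ORIGINS.get(key)
    if origin is None:
        return None  # not recognised as a named entity
    # Step 3: transliterate to the script of the language of origin.
    native_script = TRANSLITERATIONS[key]
    # Step 4a: the TTS system already operates in the language of origin.
    if tts_language == origin:
        return Result("script", native_script)
    # Step 4b: otherwise produce phonemes in the language of origin via G2P.
    return Result("phonemes", TOY_G2P[native_script])
```

Calling `process_named_entity("Tchaikovsky", "ru")` returns the transliterated script for the TTS system to read directly, while `process_named_entity("Tchaikovsky", "en")` returns a phoneme sequence in the language of origin.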

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method configured for use in a text-to-speech (TTS) system comprising: identifying, using one or more processors, a word or phrase as a named entity; identifying a language of origin associated with the named entity; transliterating the named entity to a script associated with the language of origin; if the TTS system is operating in the language of origin, passing the transliterated script to the TTS system; and if the TTS system is not operating in the language of origin, generating a phoneme sequence in the language of origin using a grapheme to phoneme (G2P) converter.

2. The method of claim 1, further comprising: if the TTS system is not operating in the language of origin, mapping the phoneme sequence to a sequence of target language phonemes.

3. The method of claim 2, wherein mapping includes generating a map of most likely unigram, bigram, and trigram mappings from the phoneme sequence to the sequence of target language phonemes.

4. The method of claim 1, wherein identifying a word or phrase as a named entity includes one or more of table lookup and contextual analysis.

5. The method of claim 1, wherein identifying a language of origin associated with the named entity includes one or more of table lookup and shortest distance measures to an existing names database.

6. The method of claim 1, further comprising: augmenting a text to speech dictionary based upon, at least in part, the phoneme sequence.

7. The method of claim 6, wherein the text to speech dictionary is associated with an automatic speech recognition (ASR) system.

8. A non-transitory computer-readable storage medium having stored thereon instructions, which when executed by a processor result in one or more operations configured for use in a text-to-speech (TTS) system, the operations comprising: identifying, using one or more processors, a word or phrase as a named entity; identifying a language of origin associated with the named entity; transliterating the named entity to a script associated with the language of origin; if the TTS system is operating in the language of origin, passing the transliterated script to the TTS system; and if the TTS system is not operating in the language of origin, generating a phoneme sequence in the language of origin using a grapheme to phoneme (G2P) converter.

9. The non-transitory computer-readable storage medium of claim 8, further comprising: if the TTS system is not operating in the language of origin, mapping the phoneme sequence to a sequence of target language phonemes.

10. The non-transitory computer-readable storage medium of claim 9, wherein mapping includes generating a map of most likely unigram, bigram, and trigram mappings from the phoneme sequence to the sequence of target language phonemes.

11. The non-transitory computer-readable storage medium of claim 8, wherein identifying a word or phrase as a named entity includes one or more of table lookup and contextual analysis.

12. The non-transitory computer-readable storage medium of claim 8, wherein identifying a language of origin associated with the named entity includes one or more of table lookup and shortest distance measures to an existing names database.

13. The non-transitory computer-readable storage medium of claim 8, further comprising: augmenting a text to speech dictionary based upon, at least in part, the phoneme sequence.

14. The non-transitory computer-readable storage medium of claim 13, wherein the text to speech dictionary is associated with an automatic speech recognition (ASR) system.

15. A text to speech system comprising: one or more processors configured to identify a word or phrase as a named entity, the one or more processors further configured to identify a language of origin associated with the named entity and transliterate the named entity to a script associated with the language of origin, if the TTS system is operating in the language of origin, the one or more processors further configured to pass the transliterated script to the TTS system, and if the TTS system is not operating in the language of origin, the one or more processors further configured to generate a phoneme sequence in the language of origin using a grapheme to phoneme (G2P) converter.

16. The system of claim 15, wherein if the TTS system is not operating in the language of origin, mapping the phoneme sequence to a sequence of target language phonemes.

17. The system of claim 16, wherein mapping includes generating a map of most likely unigram, bigram, and trigram mappings from the phoneme sequence to the sequence of target language phonemes.

18. The system of claim 15, wherein identifying a word or phrase as a named entity includes one or more of table lookup and contextual analysis.

19. The system of claim 15, wherein identifying a language of origin associated with the named entity includes one or more of table lookup and shortest distance measures to an existing names database.

20. The system of claim 15, further comprising: augmenting a text to speech dictionary based upon, at least in part, the phoneme sequence.
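Claims 3, 10, and 17 refer to a map of unigram, bigram, and trigram mappings from source-language phonemes to target-language phonemes. One way such a map could be applied is sketched below. The claims do not say how the map is built or applied, so the greedy longest-match strategy, the function name `map_phonemes`, and the toy table are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical mapping table keyed by source n-grams of length 1 to 3.
# The entries are illustrative, not taken from the patent.
TOY_NGRAM_MAP: Dict[Phonemes, Phonemes] = {
    ("tɕ",): ("tʃ",),
    ("kʲ",): ("k",),
    ("ɪ", "j"): ("i",),
}


def map_phonemes(
    source: List[str],
    table: Dict[Phonemes, Phonemes] = TOY_NGRAM_MAP,
    max_n: int = 3,
) -> List[str]:
    """Map source phonemes to target phonemes, preferring longer n-grams.

    Assumption: greedy left-to-right matching that tries trigrams first,
    then bigrams, then unigrams. Unmapped phonemes pass through unchanged.
    """
    target: List[str] = []
    i = 0
    while i < len(source):
        for n in range(min(max_n, len(source) - i), 0, -1):
            key = tuple(source[i:i + n])
            if key in table:
                target.extend(table[key])
                i += n
                break
        else:
            # No n-gram starting at position i is in the table.
            target.append(source[i])
            i += 1
    return target
```

A system that selects the "most likely" mappings would presumably score competing segmentations rather than match greedily; the sketch only shows how n-grams of different lengths can share one table.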

Assignees

Nuance Communications Inc

Inventors

Classifications

  • Named entity recognition · CPC title

  • G10L13/08 · Primary

    Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination · CPC title

  • Natural language generation · CPC title

  • Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title

  • Detection of language · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016358596A1 cover?
A system and method configured for use in a text-to-speech (TTS) system is provided. Embodiments may include identifying, using one or more processors, a word or phrase as a named entity and identifying a language of origin associated with the named entity. Embodiments may further include transliterating the named entity to a script associated with the language of origin. If the TTS system is o…
Who is the assignee on this patent?
Nuance Communications Inc
What technology area does this patent fall under?
Primary CPC classification G10L13/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 8, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).