Process for improving pronunciation of proper nouns foreign to a target language text-to-speech system
US-2016358596-A1 · Dec 8, 2016 · US
US10102189B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10102189-B2 |
| Application number | US-201514977090-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 21, 2015 |
| Priority date | Dec 21, 2015 |
| Publication date | Oct 16, 2018 |
| Grant date | Oct 16, 2018 |
Official abstract text for this publication.
Provided are methods, devices, and computer-readable media for generating a string of characters based on a set of rules; parsing the string of characters into a string of graphemes; determining one or more phonetic representations for one or more graphemes in the string of graphemes based on a first data structure; determining at least one grapheme representation for one or more of the one or more phonetic representations based on a second data structure; and constructing the phonetic representation of the string of characters based on the grapheme representation that was determined.
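The flow in the abstract (characters to graphemes, graphemes to phonetic symbols, phonetic symbols back to graphemes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the table contents, the weights, and the names `G2P`, `P2G`, `parse_graphemes`, and `respell` are hypothetical, and plain dictionaries stand in for the weighted data structures the publication describes. The sketch also omits the rule-based string generation and the padding of missing characters.

```python
# Hypothetical toy tables: each entry maps a unit to (candidate, weight) pairs.
# The weights are invented for this example.
G2P = {
    "ph": [("f", 0.9), ("p", 0.1)],
    "o": [("oʊ", 0.6), ("ɒ", 0.4)],
    "n": [("n", 1.0)],
    "e": [("", 0.7), ("ɛ", 0.3)],  # "" models a silent letter
}
P2G = {
    "f": [("f", 0.7), ("ph", 0.3)],
    "oʊ": [("o", 0.8), ("oa", 0.2)],
    "n": [("n", 1.0)],
    "": [("", 1.0)],
}


def parse_graphemes(text, known):
    """Greedy longest-match split; multi-letter units are grouped together."""
    units, i = [], 0
    longest = max(len(g) for g in known)
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if chunk in known or size == 1:
                units.append(chunk)
                i += size
                break
    return units


def best(table, key):
    """Return the highest-weight candidate for key, or key itself if unknown."""
    return max(table.get(key, [(key, 1.0)]), key=lambda cw: cw[1])[0]


def respell(text):
    """Characters -> graphemes -> phonetic symbols -> graphemes."""
    graphemes = parse_graphemes(text, G2P)
    phonemes = [best(G2P, g) for g in graphemes]
    return "".join(best(P2G, p) for p in phonemes)


print(parse_graphemes("phone", G2P))
print(respell("phone"))
```

With these toy tables, `phone` is split into `ph`, `o`, `n`, `e` and respelled as `fon`. A real system would draw its mappings and weights from trained data rather than hand-written tables.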
Opening claim text (preview).
The invention claimed is:

1. A method, comprising: generating a string of characters based on a set of rules; parsing the string of characters into a first string of graphemes; adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes; grouping the second string of graphemes into a plurality of pseudo-graphemes, wherein two or more graphemes in the second string of graphemes that are phonetized together are grouped to a single pseudo-grapheme; accessing a first data structure that maps each pseudo-grapheme in the plurality of pseudo-graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a pronunciation of a first grapheme; determining one or more phonetic representations for each pseudo-grapheme in the plurality of pseudo-graphemes based on the first data structure; accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes in a third string of graphemes, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a second grapheme; determining at least one grapheme representation for one or more of the one or more phonetic representations based on the second data structure; constructing a second phonetic representation of the string of characters based on the at least one grapheme representation that was determined; providing the second phonetic representation to a domain name verifier to determine that the phonetic representation is available to be registered as a domain name; and providing an offer to a user to register the second phonetic representation with a domain name system.

2. The method of claim 1, further comprising: ranking each grapheme representation to produce a ranked list, wherein the ranking is based on a likelihood that a grapheme representation sounds similar to a pronunciation sound of the string of characters; and filtering the ranked list to produce a subset of grapheme representations.

3. The method of claim 2, further comprising determining a first composite weight for the one or more phonetic representations based on the first data structure.

4. The method of claim 2, further comprising determining a second composite weight for the one or more graphemes based on the second data structure.

5. The method of claim 4, wherein the filtering is based on the second composite weight.

6. The method of claim 1, further comprising creating the first data structure and the second data structure as information gain trees.

7. The method of claim 1, wherein the set of rules includes at least one of a length of the string of characters, at least one character in the string of characters, and a position of at least one character in the string of characters.

8. A device, comprising: a memory storing instructions; and at least one processor, operably connected to the memory, implemented at least in part in hardware, and configured to execute the instructions to perform operations comprising: generating a string of characters based on a set of rules; parsing the string of characters into a first string of graphemes; adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes; grouping the second string of graphemes into a plurality of pseudo-graphemes, wherein two or more graphemes in the second string of graphemes that are phonetized together are grouped to a single pseudo-grapheme; accessing a first data structure that maps each pseudo-grapheme in the plurality of pseudo-graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a first grapheme; determining one or more phonetic representations for each pseudo-grapheme in the plurality of pseudo-graphemes based on the first data structure; accessing a second data structure that maps the one or more universal phonetic representations to one or more graphemes in a third string of graphemes, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a second grapheme; determining at least one grapheme representation for one or more of the one or more phonetic representation based on the second data structure; constructing a second phonetic representation of the string of characters based on the at least one grapheme representation that was determined; providing the second phonetic representation to a domain name verifier to determine that the phonetic representation is available to be registered as a domain name; and providing an offer to a user to register the second phonetic representation with a domain name system.

9. The device of claim 8, the operations further comprising: ranking each grapheme representation to produce a ranked list, wherein the ranking is based on a likelihood that a grapheme representation sounds similar to a pronunciation sound of the string of characters; and filtering the ranked list to produce a subset of grapheme representations.

10. The device of claim 8, the operations further comprising creating the first data structure and the second data structure as information gain trees.

11. The device of claim 8, the operations further comprising determining a first composite weight for the one or more phonetic representations based on the first data structure.

12. The device of claim 8, further comprising determining a second composite weight for the one or more graphemes based on the second data structure.

13. The device of claim 12, wherein the filtering is based on the second composite weight.

14. The device of claim 8, wherein the set of rules includes at least one of a length of the string of characters, at least one character in the string of characters, and a position of at least one character in the string of characters.

15. A non-transitory computer-readable medium comprising computer-interpretable instructions which, when executed by at least one electronic processor, cause the at least one electronic processor to perform a method of converting a string of characters into a phonetic representation, the method comprising: generating a string of characters based on a set of rules; parsing the string of characters into a first string of graphemes; adding one or more characters to the first string of graphemes to represent missing characters in the string of characters to create a second string of graphemes; grouping the second string of graphemes into a plurality of pseudo-graphemes, wherein two or more graphemes in the second string of graphemes that are phonetized together are grouped to a single pseudo-grapheme; accessing a first data structure that maps each pseudo-grapheme in the plurality of pseudo-graphemes to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plu
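
The ranking and filtering recited in the dependent claims (ranking grapheme representations, then filtering on a composite weight) can be illustrated with a short sketch. The claim text shown here does not say how a composite weight is computed, so this example assumes it is the product of the per-node weights along a candidate. The option list, the weights, the threshold, and the names `rank_candidates` and `filter_candidates` are all hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical grapheme options per phonetic symbol, as (grapheme, weight) pairs.
OPTIONS = [
    [("f", 0.7), ("ph", 0.3)],
    [("o", 0.8), ("oa", 0.2)],
    [("n", 1.0)],
]


def rank_candidates(options):
    """Enumerate every spelling and sort by composite weight, highest first.

    Assumption: the composite weight is the product of the chosen weights.
    """
    scored = []
    for combo in product(*options):
        spelling = "".join(g for g, _ in combo)
        weight = prod(w for _, w in combo)
        scored.append((spelling, weight))
    return sorted(scored, key=lambda sw: sw[1], reverse=True)


def filter_candidates(ranked, min_weight):
    """Keep only candidates whose composite weight reaches the threshold."""
    return [(s, w) for s, w in ranked if w >= min_weight]


ranked = rank_candidates(OPTIONS)
print([s for s, _ in ranked])
print([s for s, _ in filter_candidates(ranked, 0.2)])
```

With these invented weights the ranked order is `fon`, `phon`, `foan`, `phoan`, and a threshold of 0.2 keeps only the first two. Enumerating every combination is fine for a toy example but grows quickly with string length, so a practical system would likely prune candidates as it goes.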
Character encoding · CPC title
Domain name generation or assignment · CPC title
Administrative registration, e.g. for domain names at internet corporation for assigned names and numbers [ICANN] · CPC title
using dictionaries or tables · CPC title
Parsing · CPC title