Machine map label translation
US-2016364384-A1 · Dec 15, 2016 · US
US10133737B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10133737-B2 |
| Application number | US-201113818869-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 26, 2011 |
| Priority date | Aug 26, 2010 |
| Publication date | Nov 20, 2018 |
| Grant date | Nov 20, 2018 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transforming text strings. In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving an input string having a plurality of terms, the input string being in a first form; transforming the input string from the first form to a second form including: applying one or more rules to the input string to identify one or more terms for translation, the one or more identified terms being fewer than the plurality of terms, translating the identified one or more terms to one or more translated terms in the second form, and transliterating the remaining terms of the plurality of terms into transliterated terms in the second form; and concatenating the translated and transliterated terms to form a hybrid output string in the second form.
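The flow in the abstract (identify a subset of terms for translation, transliterate the rest, concatenate) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `TRANSLIT` table is a simplified Cyrillic-to-Latin mapping rather than any standard romanization, `TRANSLATIONS` is a toy dictionary standing in for the translation rules, and `hybrid_label` is a hypothetical function name.

```python
# Toy Cyrillic-to-Latin table (an assumption for illustration, not a standard).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}

# Toy translation rules: generic terms that should be translated, not transliterated.
TRANSLATIONS = {"площадь": "Square", "мост": "Bridge"}


def transliterate(term: str) -> str:
    """Map each character to Latin, preserving the case of the first letter of each mapping."""
    out = []
    for ch in term:
        latin = TRANSLIT.get(ch.lower(), ch)  # unknown characters pass through unchanged
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


def hybrid_label(label: str) -> str:
    """Translate terms found in TRANSLATIONS, transliterate the rest, then concatenate."""
    out = []
    for term in label.split():
        translated = TRANSLATIONS.get(term.lower())
        out.append(translated if translated is not None else transliterate(term))
    return " ".join(out)


print(hybrid_label("Красная площадь"))  # prints "Krasnaya Square"
```

Here only "площадь" is translated while the proper name "Красная" is transliterated, so the output mixes both treatments in source order. The sketch does not reorder terms or handle grammatical agreement.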
Opening claim text (preview).
What is claimed is:
1. A method performed by data processing apparatus, the method comprising: receiving, with one or more processors at a server, an input string having a plurality of terms, the input string being in a first form, wherein a given sequence of the plurality of terms refers to a geographic feature, wherein the given sequence is annotated with a geographic-feature type selected from a plurality of geographic-feature types each indicating a characteristic of an entity in the physical world corresponding to the geographic feature, and wherein the given sequence is stored in an annotated format with the geographic-feature type in a database of geographic labels prior to receiving the input string; transforming, with one or more processors, the input string from the first form to a second form, the transforming including: applying one or more rules to the input string to identify one or more terms for translation, the one or more identified terms being fewer than the plurality of terms, wherein at least some of the rules are applied in response to a match between a feature type of the respective rule and the geographic-feature type with which the given sequence is annotated to indicate the characteristic of the corresponding entity in the physical world, translating the identified one or more terms to one or more translated terms in the second form, and transliterating at least some of the remaining terms of the plurality of terms into transliterated terms in the second form, including selecting one or more transliteration rules for application in accordance with the indicated characteristic of the entity, wherein when the given sequence of the plurality of terms is annotated with a first geographic-feature type of the plurality of geographic-feature types, a first rule of the one or more rules identifies a specific term in the input string for translation in response to a match between a first feature type of the first rule and the first geographic-feature type, and when the given sequence of the plurality of terms is annotated with a second geographic-feature type of the plurality of geographic-feature types, a second rule of the one or more rules identifies the same specific term in the input string for transliteration in response to a match between a second feature type of the second rule and the second geographic-feature type, wherein the specific term is translated or transliterated to a term having the same grammatical form as the specific term; the method further comprising: concatenating, with one or more processors, at least the translated and transliterated terms to form a hybrid output string in the second form; and storing the hybrid output string in the database of geographic labels; and when a map of a geographic region including the entity is requested for display: (i) retrieving the hybrid output string from the database and (ii) providing, via a network interface, the hybrid output string along with map data for presenting the map with the hybrid output string at a client device.
2. The method of claim 1, wherein the first form and the second form are a first writing system and a second writing system respectively.
3. The method of claim 1, wherein the first form and the second form are a first natural language and a second natural language respectively.
4. The method of claim 1, wherein applying one or more rules to the input string comprises: for rules that match the geographic-feature type of the given sequence, determining whether the input string matches a string pattern of the respective one or more matching rules.
5. The method of claim 1, wherein each rule includes a number of respective rule outputs for respective output forms, the respective rule outputs including translated forms of terms identified by the respective rules.
6. The method of claim 1, wherein a matching rule among the one or more rules identifies a first portion of the given sequence to be translated and a second portion of the given sequence to be transliterated.
7. The method of claim 1, wherein transliterating the remaining terms includes: tokenizing the string into a plurality of tokens; transliterating each token from the first form to a second form; and concatenating at least the transliterated tokens in the second form to form a transliterated output string in the second form.
8. A system comprising: one or more computers operable to interact to perform operations comprising: receiving an input string having a plurality of terms, the input string being in a first form, wherein a given sequence of the plurality of terms refers to a geographic feature, wherein the given sequence is annotated with a geographic-feature type selected from a plurality of geographic-feature types each indicating a characteristic of an entity in the physical world corresponding to the geographic feature, and wherein the given sequence is stored in an annotated format with the geographic-feature type in a database of geographic labels prior to receiving the input string; transforming the input string from the first form to a second form, the transforming including: applying one or more rules to the input string to identify one or more terms for translation, the one or more identified terms being fewer than the plurality of terms, wherein at least some of the rules are applied in response to a match between a feature type of the respective rule and the geographic-feature type with which the given sequence is annotated to indicate the characteristic of the corresponding entity in the physical world, translating the identified one or more terms to one or more translated terms in the second form, and transliterating at least some of the remaining terms of the plurality of terms into transliterated terms in the second form, including selecting one or more transliteration rules for application in accordance with the indicated characteristic of the entity, wherein when the given sequence of the plurality of terms is annotated with a first geographic-feature type of the plurality of geographic-feature, a first rule of the one or more rules identifies a specific term in the input string for translation in response to a match between a first feature type of the first rule and the first geographic-feature type, and when the given sequence of the plurality of terms is annotated with a second geographic-feature type of the plurality of geographic-feature types, a second rule of the one or more rules identifies the same specific term in the input string for transliteration in response to a match between a second feature type of the second rule and the second geographic-feature type, wherein the specific term is translated or transliterated to a term having the same grammatical form as the specific term; concatenating at least the translated and transliterated terms to form a hybrid output string in the second form; and storing the hybrid output string in the database of geographic labels; and when map data for a geographic region including the entity is requested for display: (i) retrieving the hybrid output string from the database and (ii) providing, via a network interface, the hybrid output string along with the requested map data for display at a client device.
9. The system of claim 8, wherein the first form and the second form are a first writing system and a second writing system respectively.
10. The system of claim 8, wherein the first form and the second form are a first natural language and a second natural language respectively.
11. The system of claim 8, wherein applying one or more rules to the input string comprises: for rules that match the geographic-feature type of the given sequenc
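The type-dependent rule selection recited in claims 1, 4, and 8 can be illustrated with a short sketch: a rule fires only when its feature type matches the label's annotation and its string pattern matches a term, so the same term may be translated for one feature type and transliterated for another. All names below are hypothetical and not taken from the patent: the `Rule` fields, the `RULES` list, the simplified `TRANSLIT` table (not a standard romanization), and the in-memory `LABEL_DB` dictionary standing in for the database of geographic labels. Grammatical-form handling and type-specific transliteration rules are not modeled.

```python
import re
from dataclasses import dataclass

# Toy Cyrillic-to-Latin table (an assumption for illustration, not a standard).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(term: str) -> str:
    """Character-by-character transliteration; unknown characters pass through."""
    out = []
    for ch in term:
        latin = TRANSLIT.get(ch.lower(), ch)
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


@dataclass(frozen=True)
class Rule:
    feature_type: str      # rule applies only to labels annotated with this type
    pattern: str           # regex matched against a whole lower-cased term
    action: str            # "translate" or "transliterate"
    translation: str = ""  # used only when action == "translate"


# The same term gets different treatment depending on the feature type.
RULES = [
    Rule("bridge", r"мост", "translate", "Bridge"),
    Rule("street", r"мост", "transliterate"),
]


def transform(label: str, feature_type: str) -> str:
    """Build a hybrid label using only rules whose feature type matches the annotation."""
    out = []
    for term in label.split():
        rule = next(
            (r for r in RULES
             if r.feature_type == feature_type
             and re.fullmatch(r.pattern, term.lower())),
            None,
        )
        if rule is not None and rule.action == "translate":
            out.append(rule.translation)
        else:
            out.append(transliterate(term))  # default for unmatched terms
    return " ".join(out)


LABEL_DB: dict[str, str] = {}  # hypothetical stand-in for the label database


def store_label(feature_id: str, label: str, feature_type: str) -> None:
    """Precompute and store the hybrid label ahead of any map request."""
    LABEL_DB[feature_id] = transform(label, feature_type)


def labels_for_map(feature_ids: list[str]) -> dict[str, str]:
    """Retrieve stored hybrid labels for the features in a requested map region."""
    return {fid: LABEL_DB[fid] for fid in feature_ids if fid in LABEL_DB}


print(transform("Крымский мост", "bridge"))   # prints "Krymskiy Bridge"
print(transform("Кузнецкий Мост", "street"))  # prints "Kuznetskiy Most"
```

In this sketch "мост" is translated when the label is annotated as a bridge, but transliterated when it is part of a street name, which mirrors the first-rule and second-rule behavior described in the independent claims.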
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title
Natural language generation · CPC title
Handling non-Latin characters, e.g. kana-to-kanji conversion · CPC title
Rule-based translation · CPC title
Morphological analysis · CPC title