Systems and methods for determining translation accuracy in multi-user multi-lingual communications

US2016232156A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016232156-A1
Application number: US-201615099405-A
Country: US
Kind code: A1
Filing date: Apr 14, 2016
Priority date: Feb 8, 2013
Publication date: Aug 11, 2016
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments described herein facilitate multi-lingual communications. The systems and methods of some embodiments enable multi-lingual communications through different modes of communication including, for example, Internet-based chat, e-mail, text-based mobile phone communications, postings to online forums, postings to online social media services, and the like. Certain embodiments implement communication systems and methods that translate text between two or more languages. Users of the systems and methods may be incentivized to submit corrections for inaccurate or erroneous translations, and may receive a reward for these submissions. Systems and methods for assessing the accuracy of translations are described.

First claim

Opening claim text (preview).

1. A method comprising: performing by one or more computer processors: obtaining a translation of a message in a first language to a second language wherein the translation was submitted by a user; generating a part-of-speech (POS) n-gram representation of the translation comprising a plurality of POS n-grams of two or more different lengths, wherein generating the POS n-gram representation comprises associating a respective probability with each of the POS n-grams; and determining an accuracy of the translation based on a combination of the probabilities.

2. The method of claim 1 wherein associating a respective probability with each of the POS n-grams comprises: determining the probability as a ratio of a count of occurrences of the POS n-gram in a corpus for the second language to a count of all POS n-grams in the corpus having a same length as the POS n-gram.

3. The method of claim 2 wherein the corpus comprises chatspeak.

4. The method of claim 1 wherein determining the accuracy of the translation based on a combination of the probabilities comprises: linearly interpolating the probabilities to generate the accuracy.

5. The method of claim 4 wherein linearly interpolating the probabilities to generate the accuracy comprises: weighting probabilities of the POS n-grams according to their lengths wherein POS n-grams having lengths greater than other POS n-grams are weighted more than the other POS n-grams.

6. The method of claim 4, further comprising: excluding from the interpolation probabilities that do not exceed a threshold.

7. The method of claim 1 wherein generating the POS n-gram representation of the translation comprises: generating a first POS n-gram from the translation of a first length; calculating a probability of the first POS n-gram; and determining that the probability of the first POS n-gram does not exceed a threshold and, based thereon, breaking the first POS n-gram into two or more second POS n-grams having lengths that are less than the first length.

8. The method of claim 1, further comprising: revoking the user's translation privileges if the accuracy falls below a threshold value.

9. A system comprising: a non-transitory computer readable medium having instructions stored thereon; and at least one processor configured to execute the instructions to perform operations comprising: obtaining a translation of a message in a first language to a second language wherein the translation was submitted by a user; generating a part-of-speech (POS) n-gram representation of the translation comprising a plurality of POS n-grams of two or more different lengths, wherein generating the POS n-gram representation comprises associating a respective probability with each of the POS n-grams; and determining an accuracy of the translation based on a combination of the probabilities.

10. The system of claim 9 wherein associating a respective probability with each of the POS n-grams comprises: determining the probability as a ratio of a count of occurrences of the POS n-gram in a corpus for the second language to a count of all POS n-grams in the corpus having a same length as the POS n-gram.

11. The system of claim 10 wherein the corpus comprises chatspeak.

12. The system of claim 9 wherein determining the accuracy of the translation based on a combination of the probabilities comprises: linearly interpolating the probabilities to generate the accuracy.

13. The system of claim 12 wherein linearly interpolating the probabilities to generate the accuracy comprises: weighting probabilities of the POS n-grams according to their lengths wherein POS n-grams having lengths greater than other POS n-grams are weighted more than the other POS n-grams.

14. The system of claim 12, further comprising: excluding from the interpolation probabilities that do not exceed a threshold.

15. The system of claim 9 wherein generating the POS n-gram representation of the translation comprises: generating a first POS n-gram from the translation of a first length; calculating a probability of the first POS n-gram; and determining that the probability of the first POS n-gram does not exceed a threshold and, based thereon, breaking the first POS n-gram into two or more second POS n-grams having lengths that are less than the first length.

16. The system of claim 9, wherein the operations further comprise: revoking the user's translation privileges if the accuracy falls below a threshold value.

17. A manufacture comprising: non-transitory computer readable media comprising executable instructions, the executable instructions being executable by one or more processors to perform operations comprising: obtaining a translation of a message in a first language to a second language wherein the translation was submitted by a user; generating a part-of-speech (POS) n-gram representation of the translation comprising a plurality of POS n-grams of two or more different lengths, wherein generating the POS n-gram representation comprises associating a respective probability with each of the POS n-grams; and determining an accuracy of the translation based on a combination of the probabilities.

18. The manufacture of claim 17 wherein associating a respective probability with each of the POS n-grams comprises: determining the probability as a ratio of a count of occurrences of the POS n-gram in a corpus for the second language to a count of all POS n-grams in the corpus having a same length as the POS n-gram.

19. The manufacture of claim 18 wherein the corpus comprises chatspeak.

20. The manufacture of claim 17 wherein determining the accuracy of the translation based on a combination of the probabilities comprises: linearly interpolating the probabilities to generate the accuracy.

21. The manufacture of claim 20 wherein linearly interpolating the probabilities to generate the accuracy comprises: weighting probabilities of the POS n-grams according to their lengths wherein POS n-grams having lengths greater than other POS n-grams are weighted more than the other POS n-grams.

22. The manufacture of claim 20, further comprising: excluding from the interpolation probabilities that do not exceed a threshold.

23. The manufacture of claim 17 wherein generating the POS n-gram representation of the translation comprises: generating a first POS n-gram from the translation of a first length; calculating a probability of the first POS n-gram; and determining that the probability of the first POS n-gram does not exceed a threshold and, based thereon, breaking the first POS n-gram into two or more second POS n-grams having lengths that are less than the first length.

24. The manufacture of claim 17, wherein the operations further comprise: revoking the user's translation privileges if the accuracy falls below a threshold value.
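The scoring steps recited in claims 1, 2, 4 to 8 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes the translation and the reference corpus have already been POS-tagged by some external tagger, and every name in it (`ngrams`, `build_counts`, `probability`, `split_if_rare`, `accuracy`, `REVOKE_BELOW`), the toy corpus, the weights, and the threshold values are hypothetical. The interpolation is normalized by the total weight actually applied, which is one possible reading of "linearly interpolating".

```python
from collections import Counter


def ngrams(tags, n):
    """Return every contiguous run of n POS tags as a tuple."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def build_counts(corpus, lengths):
    """Count POS n-grams of each requested length over a tagged corpus."""
    counts = {n: Counter() for n in lengths}
    for tags in corpus:
        for n in lengths:
            counts[n].update(ngrams(tags, n))
    return counts


def probability(gram, counts):
    """Claim 2: occurrences of gram / all corpus n-grams of the same length."""
    table = counts[len(gram)]
    total = sum(table.values())
    return table[gram] / total if total else 0.0


def split_if_rare(gram, counts, threshold):
    """Claim 7: break an n-gram whose probability does not exceed threshold
    into its shorter (n-1)-grams; otherwise keep it whole."""
    if len(gram) < 2 or probability(gram, counts) > threshold:
        return [gram]
    return ngrams(list(gram), len(gram) - 1)


def accuracy(tags, counts, weights, threshold=0.0):
    """Claims 4 to 6: weighted linear combination of n-gram probabilities,
    skipping probabilities that do not exceed threshold."""
    num = den = 0.0
    for n, w in weights.items():
        for gram in ngrams(tags, n):
            p = probability(gram, counts)
            if p > threshold:
                num += w * p
                den += w
    return num / den if den else 0.0


# Hypothetical tagged corpus for the second (target) language.
corpus = [
    ["DET", "NOUN", "VERB"],
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
]
counts = build_counts(corpus, lengths=(1, 2))

# Claim 5: longer n-grams carry more weight (values are assumptions).
weights = {1: 1.0, 2: 2.0}

good = accuracy(["DET", "NOUN", "VERB"], counts, weights)
bad = accuracy(["VERB", "VERB", "DET"], counts, weights)

# Claim 8: hypothetical cutoff below which translation privileges are revoked.
REVOKE_BELOW = 0.3
keeps_privileges = good >= REVOKE_BELOW
```

In this toy setup the tag sequence that matches common corpus patterns scores higher than the scrambled one. Note that excluding low-probability n-grams, as claim 6 recites, removes them from both the numerator and the weight total here; a real system would need to decide how such exclusions should affect the final score.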

Assignees

Machine Zone Inc

Inventors

Classifications

  • Business processes related to social networking or social networking services

  • Orthographic correction, e.g. spell checking or vowelisation

  • G06F40/51 (Primary)

    Translation evaluation

  • Grammatical analysis; Style critique

  • Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016232156A1 cover?
Various embodiments described herein facilitate multi-lingual communications. The systems and methods of some embodiments enable multi-lingual communications through different modes of communication including, for example, Internet-based chat, e-mail, text-based mobile phone communications, postings to online forums, postings to online social media services, and the like. Certain embodiments im…
Who is the assignee on this patent?
Machine Zone Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/51. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 11, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).