Machine translation using global lexical selection and sentence reconstruction

US9323745B2 · US · B2

Patent metadata
Publication number: US-9323745-B2
Application number: US-201414336297-A
Country: US
Kind code: B2
Filing date: Jul 21, 2014
Priority date: Mar 15, 2007
Publication date: Apr 26, 2016
Grant date: Apr 26, 2016

Abstract

Official abstract text for this publication.

Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.
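
As a rough illustration of the two stages the abstract describes, the following Python sketch first selects a bag of target words by scoring each candidate against the whole source sentence, then orders the bag by trying every permutation. The weight table, bias, bigram scores, and all function names are hypothetical toy values invented for this example. Ranking the permutations with a bigram language model is likewise an assumption; the abstract does not say how permutations are scored.

```python
import math
from itertools import permutations

# Hypothetical per-target-word weights over source words (a toy stand-in for a
# trained classifier). Each target word is scored against the whole source
# sentence rather than against one aligned source word.
WEIGHTS = {
    "the": {"el": 2.0, "gato": 0.5},
    "cat": {"gato": 3.0},
    "sleeps": {"duerme": 3.0},
    "dog": {"perro": 3.0},
}
BIAS = -1.5

# Hypothetical bigram log-probabilities for the target language.
BIGRAM_LOGP = {
    ("<s>", "the"): -0.2,
    ("the", "cat"): -0.3,
    ("cat", "sleeps"): -0.4,
    ("sleeps", "</s>"): -0.2,
}
UNSEEN_LOGP = -5.0  # fallback score for any bigram not in the table


def select_bag(source_words, threshold=0.5):
    """Keep every target word whose conditional probability exceeds the threshold."""
    src = set(source_words)
    bag = []
    for target, weights in WEIGHTS.items():
        score = BIAS + sum(w for s, w in weights.items() if s in src)
        prob = 1.0 / (1.0 + math.exp(-score))  # logistic squashing of the score
        if prob > threshold:
            bag.append(target)
    return bag


def lm_score(words):
    """Sum bigram log-probabilities over the padded word sequence."""
    seq = ["<s>", *words, "</s>"]
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(seq, seq[1:]))


def reconstruct(bag):
    """Try every ordering of the bag and keep the best-scoring one."""
    return max(permutations(bag), key=lm_score)


def translate(source_words, threshold=0.5):
    return list(reconstruct(select_bag(source_words, threshold)))


print(translate(["el", "gato", "duerme"]))
```

Exhaustive permutation search grows factorially with the size of the bag, so a sketch like this is only practical for short phrases.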

First claim

Opening claim text (preview).

We claim:

1. A method comprising: classifying, via a processor, a source phrase in a source language into a phrase meaning; matching, via the processor, the phrase meaning to a target phrase automaton in a target language wherein: the target phrase automaton comprises a plurality of states and a plurality of arcs interconnecting the plurality of states to define a sentence path, each of the plurality of states defining a word position slot for inserting a word from a bag of words, and each of the plurality of arcs associated with a pre-defined insertion cost for inserting the word from the bag of words based on a previous state in the sentence path; and the pre-defined insertion cost is associated with one of penalizing and rewarding the sentence path based on how many words are in the sentence path, in order to produce more words in a target sentence relative to the source phrase; determining, via the processor, a target sentence probability for the sentence path based on one of a lexical translation of words in the source phrase and a phrase-to-phrase mapping; and upon determining that the sentence path has a probability above a threshold, constructing, via the processor, the target sentence using the sentence path.

2. The method of claim 1, wherein the target sentence probability is further based on a target word possibility for each word position, the target word possibility for each word position weighted by a target language model.

3. The method of claim 2, wherein the target word possibility for each word position is detected independently.

4. The method of claim 2, wherein the target word possibility for each word does not use information about previous words and subsequent words.

5. The method of claim 1, further comprising: adjusting a length of the target sentence by adding optional deletions when constructing the target sentence.

6. The method of claim 1, wherein function words in the target sentence possibility serve as attributes on contentful lexical items.

7. The method of claim 6, wherein the attributes are one of definiteness, tenses and case.

8. A system comprising: a processor; and a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: classifying a source phrase in a source language into a phrase meaning; matching the phrase meaning to a target phrase automaton in a target language wherein: the target phrase automaton comprising a plurality of states and a plurality of arcs interconnecting the plurality of states to define a sentence path, each of the plurality of states defining a word position slot for inserting a word from a bag of words, and each of the plurality of arcs associated with a pre-defined insertion cost for inserting the word from the bag of words based on a previous state in the sentence path; and the pre-defined insertion cost is associated with one of penalizing and rewarding the sentence path based on how many words are in the sentence path, in order to produce more words in a target sentence relative to the source phrase; determining a target sentence probability for the sentence path based on one of a lexical translation of words in the source phrase and a phrase-to-phrase mapping; and upon determining that the sentence path has a probability above a threshold, constructing the target sentence using the sentence path.

9. The system of claim 8, wherein the target sentence probability is further based on a target word possibility for each word position, the target word possibility for each word position weighted by a target language model.

10. The system of claim 9, wherein the target word possibility for each word position is detected independently.

11. The system of claim 9, wherein the target word possibility for each word does not use information about previous words and subsequent words.

12. The system of claim 8, the computer-readable storage medium having additional instructions stored which result in operations comprising: adjusting a length of the target sentence by adding optional deletions when constructing the target sentence.

13. The system of claim 8, wherein function words in the target sentence possibility serve as attributes on contentful lexical items.

14. The system of claim 13, wherein the attributes are one of definiteness, tenses and case.

15. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising: classifying a source phrase in a source language into a phrase meaning; matching the phrase meaning to a target phrase automaton in a target language wherein: the target phrase automaton comprising a plurality of states and a plurality of arcs interconnecting the plurality of states to define a sentence path, each of the plurality of states defining a word position slot for inserting a word from a bag of words, and each of the plurality of arcs associated with a pre-defined insertion cost for inserting the word from the bag of words based on a previous state in the sentence path; and the pre-defined insertion cost is associated with one of penalizing and rewarding the sentence path based on how many words are in the sentence path, in order to produce more words in a target sentence relative to the source phrase; determining a target sentence probability for the sentence path based on one of a lexical translation of words in the source phrase and a phrase-to-phrase mapping; and upon determining that the sentence path has a probability above a threshold, constructing the target sentence using the sentence path.

16. The computer-readable storage device of claim 8, wherein the target sentence probability is further based on a target word possibility for each word position, the target word possibility for each word position weighted by a target language model.

17. The computer-readable storage device of claim 9, wherein the target word possibility for each word position is detected independently.

18. The computer-readable storage device of claim 9, wherein the target word possibility for each word does not use information about previous words and subsequent words.

19. The computer-readable storage device of claim 8, having additional instructions stored which result in operations comprising: adjusting a length of the target sentence by adding optional deletions when constructing the target sentence.

20. The computer-readable storage device of claim 8, wherein function words in the target sentence possibility serve as attributes on contentful lexical items.
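
The automaton language in claim 1 can be made more concrete with a small Python sketch. It models states as word position slots, arcs as word insertions that each carry a pre-defined insertion cost, and a threshold test on the best sentence path. The `Arc` type, the toy automaton, the word log-probabilities, and the additive log-domain scoring are all assumptions made for illustration; the claims do not prescribe a scoring formula or a search procedure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    src: int               # previous state (word position slot)
    dst: int               # next state
    word: str              # word inserted from the bag of words
    insertion_cost: float  # pre-defined cost; a negative value rewards the insertion


def enumerate_paths(arcs, state, final, used=frozenset()):
    """Yield every arc sequence from `state` to `final`, using each word at most once."""
    if state == final:
        yield []
        return
    for arc in arcs:
        if arc.src == state and arc.word not in used:
            for rest in enumerate_paths(arcs, arc.dst, final, used | {arc.word}):
                yield [arc, *rest]


def path_score(path, word_logp):
    """Log-domain score: lexical log-probabilities minus the insertion costs."""
    return sum(word_logp[a.word] - a.insertion_cost for a in path)


def construct_sentence(arcs, start, final, word_logp, threshold):
    """Return the words of the best path if its score clears the threshold, else None."""
    paths = list(enumerate_paths(arcs, start, final))
    if not paths:
        return None
    best = max(paths, key=lambda p: path_score(p, word_logp))
    if path_score(best, word_logp) > threshold:
        return [a.word for a in best]
    return None


def build_arcs(cost):
    """Toy automaton: a two-word path (0-1-3) and a three-word path (0-1-2-3)."""
    return [
        Arc(0, 1, "thank", cost),
        Arc(1, 3, "you", cost),
        Arc(1, 2, "you", cost),
        Arc(2, 3, "much", cost),
    ]


# Hypothetical lexical translation log-probabilities for the words in the bag.
WORD_LOGP = {"thank": math.log(0.9), "you": math.log(0.8), "much": math.log(0.7)}

print(construct_sentence(build_arcs(-0.5), 0, 3, WORD_LOGP, threshold=0.0))
print(construct_sentence(build_arcs(0.0), 0, 3, WORD_LOGP, threshold=-1.0))
```

With a negative insertion cost on every arc, the three-word path outscores the two-word path, which mirrors the claim's stated aim of producing more words in the target sentence. With a zero cost, the shorter path wins because each extra word only lowers the lexical score.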

Assignees

Inventors

Classifications

  • G06F40/44 (Primary)

    Statistical methods, e.g. probability models · CPC title

  • Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities · CPC title

  • Processing of additional data, e.g. scrambling of additional data or processing content descriptors · CPC title

  • involving client display capabilities, e.g. screen resolution of a mobile phone (optimising the visualisation of content during browsing in the Internet G06F16/9577; processing of terminal status or physical abilities in wireless networks H04W8/22; authentication in wireless network security H04W12/06) · CPC title

  • Example-based machine translation; Alignment · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
AT&T IP II LP
What technology area does this patent fall under?
Primary CPC classification G06F40/44. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 26, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).