What technology area does this patent fall under?

Primary CPC classification G06F40/58. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 18, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Active error detection and resolution for linguistic translation

US9710463B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9710463-B2
Application number	US-201314099079-A
Country	US
Kind code	B2
Filing date	Dec 6, 2013
Priority date	Dec 6, 2012
Publication date	Jul 18, 2017
Grant date	Jul 18, 2017
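
The publication number in the table packs three of the other fields into one string: country (US), serial number (9710463), and kind code (B2). As a minimal sketch, the hyphenated and compact forms shown on this page can be split with a regular expression. The pattern and the names `PublicationNumber` and `parse_publication_number` are assumptions made for illustration, not part of any patent office API, and the pattern only covers the shapes seen here.

```python
import re
from typing import NamedTuple


class PublicationNumber(NamedTuple):
    country: str  # two-letter country code, e.g. "US"
    number: str   # serial number, kept as a string
    kind: str     # kind code, e.g. "B2" or "A"


# Assumed shape: country code, digits, kind code (a letter plus an optional
# digit), with optional hyphens between the three parts.
_PUB_RE = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(text: str) -> PublicationNumber:
    """Split a publication or application number into its three parts."""
    match = _PUB_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    return PublicationNumber(*match.groups())


if __name__ == "__main__":
    print(parse_publication_number("US-9710463-B2"))
    print(parse_publication_number("US9710463B2"))
```

Both spellings used on this page (US-9710463-B2 in the table, US9710463B2 under the title) parse to the same three fields.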

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A two-way speech-to-speech (S2S) translation system actively detects a wide variety of common error types and resolves them through user-friendly dialog with the user(s). Examples include features including one or more of detecting out-of-vocabulary (OOV) named entities and terms, sensing ambiguities, homophones, idioms, ill-formed input, etc. and interactive strategies for recovering from such errors. In some examples, different error types are prioritized and systems implementing the approach can include an extensible architecture for implementing these decisions.
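
The abstract mentions prioritized error types and an extensible architecture. One way to picture that, as a rough sketch only, is a registry of detector functions that run in priority order, with the first detection deciding which recovery dialog to open. Everything below is invented for illustration: the toy vocabulary and homophone list, the priority numbers, and the names `register`, `first_error`, and `Detection`. The patent text on this page does not say which error type takes precedence.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    error_type: str
    word: str


Detector = Callable[[List[str]], Optional[Detection]]

# Toy resources, invented for this example.
VOCABULARY = {"i", "need", "to", "see", "the", "doctor", "a", "pair", "of", "boots"}
HOMOPHONES = {"pair": ["pair", "pear"], "see": ["see", "sea"]}


def detect_oov(words: List[str]) -> Optional[Detection]:
    """Flag the first word that is missing from the vocabulary."""
    for word in words:
        if word not in VOCABULARY:
            return Detection("oov", word)
    return None


def detect_homophone(words: List[str]) -> Optional[Detection]:
    """Flag the first word that has a known homophone."""
    for word in words:
        if word in HOMOPHONES:
            return Detection("homophone", word)
    return None


# (priority, detector) pairs; a lower number runs first. The ordering is an
# assumption of this sketch.
REGISTRY: List[Tuple[int, Detector]] = []


def register(priority: int, detector: Detector) -> None:
    """Add a detector; new error types plug in without touching the loop."""
    REGISTRY.append((priority, detector))
    REGISTRY.sort(key=lambda item: item[0])


register(10, detect_oov)
register(20, detect_homophone)


def first_error(utterance: str) -> Optional[Detection]:
    """Return the highest-priority detection, or None if nothing is flagged."""
    words = utterance.lower().split()
    for _, detector in REGISTRY:
        found = detector(words)
        if found is not None:
            return found
    return None
```

With this ordering, `first_error("I need to see Doctor Ramirez")` reports the out-of-vocabulary word "ramirez" even though "see" is also a homophone candidate, because the OOV detector is registered with the lower priority number.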

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for linguistic processing for speech-to-speech translation, the method comprising: receiving a linguistic input comprising a sequence of words in a first language from a first user, the linguistic input comprising a first audio input including a speech utterance by the first user; determining a first data representation of the linguistic input; processing, using a computer-implemented analyzer, the first data representation to identify at least part of the data representation as being potentially associated with an error of processing of the linguistic input, wherein the processing comprises identifying said part as at least one characteristic of (a) including out-of-vocabulary (OOV) words, (b) representing a named entity, (c) including a homophone, (d) having an ambiguous word sense, and (e) including an idiom in the first language; performing further processing, using a computer-implemented recovery strategy processor, of the identified at least part of the first data representation to form a modified data representation of the linguistic input; using a machine translator to form a second data representation of the modified data representation; and processing the second data representation using the recovery strategy processor to refine the second data representation through at least one of automated processing or user-assisted processing; determining a linguistic output from the refined second data representation, the linguistic output comprising a sequence of words in a second language; and providing the linguistic output to a second user, the linguistic output comprising a synthesized second audio signal including speech output, wherein identifying said part as including an idiom in the first language comprises performing rule-based idiom expansion and performing statistical idiom detection.
2. The method of claim 1 wherein the first data representation comprises a text representation in the first language.
3. The method of claim 1 wherein the method comprises speech-to-speech translation, and wherein the linguistic input comprises a first audio input including a speech utterance by the first user and the linguistic output comprises a synthesized second audio signal including speech output.
4. The method of claim 3 wherein determining the first data representation of the linguistic input comprises recognizing, using a speech to text module of the computer, the speech utterance in the first audio signal to form the first text representation, and wherein determining the linguistic output comprises using a text to speech module of the computer to form the second audio signal from the modified data representation.
5. The method of claim 1 wherein performing the further processing includes selecting and performing a recovery strategy according the identified characteristic.
6. The method of claim 5 wherein performing the recovery strategy includes soliciting and receiving input for the recovery strategy from a user.
7. The method of claim 6 wherein the user from whom the input for the recovery strategy is solicited and received is the first user.
8. The method of claim 6 wherein performing the recovery strategy includes soliciting and receiving input for the recovery strategy from one or both of the first user and a second user to whom a linguistic output based on the second data representation is presented.
9. The method of claim 5 wherein performing the recovery strategy includes identifying a part of the first data representation with a corresponding part of the linguistic input and wherein forming the data representing the linguistic output comprises forming said data to transfer the part of the linguistic input to a linguistic output without translation.
10. The method of claim 9 wherein the method comprises a speech-to-speech translation system, and wherein the linguistic input comprises a first audio input signal including a speech utterance by the first user and the linguistic output comprises a synthesized second audio signal including speech output, and wherein the second audio signal further comprises a part of the audio input signal.
11. The method of claim 1 wherein performing the further processing includes performing a constrained linguistic translation of the linguistic input.
12. The method of claim 5 wherein performing the recovery strategy includes soliciting and receiving input for the recovery strategy from the first user for disambiguation of a homophone, ambiguous word sense, or an idiom in the first language.
13. The method of claim 1, wherein processing the first data representation further comprises at least one of: tagging one or more unambiguous words over a span of words within the first data representation; tagging one or more ambiguous words over the span of words within the first data representation; and predicting the one or more tagged ambiguous words based, at least in part, upon the one or more tagged unambiguous words.
14. The method of claim 13, wherein predicting the one or more tagged ambiguous words comprises one or more of: (i) determining a phrase pair associated with each of the one or more tagged ambiguous words; (ii) searching an inventory of source phrase keywords; (iii) determining a prediction based upon a supervised model; and (iv) receiving input from the first user.
15. The method of claim 1, wherein performing rule-based idiom expansion comprises performing rule-based idiom expansion comprising performing pronoun expansion and performing verb expansion.
16. The method of claim 1, wherein processing the first data representation further comprises identifying one or more incomplete utterances within the first data representation by identifying fragments with ungrammatical structure.
17. The method of claim 1, wherein the processing further comprises identifying said part as at least two characteristics of (a) including out-of-vocabulary (OOV) words, (b) representing a named entity, (c) including a homophone, (d) having an ambiguous word sense, and (e) including an idiom in the first language.
18. Software stored on a non-transitory computer-readable medium comprising instructions for causing a computer processor to perform a linguistic processing for speech-to-speech translation including: receiving first data representing a linguistic input comprising a sequence of words in a first language from a first user, the linguistic input comprising a first audio input including a speech utterance by the first user; determining a first data representation of the linguistic input; processing, using a computer-implemented analyzer, the first data representation to identify at least part of the data representation as being potentially associated with an error of processing of the linguistic input; performing further processing, using a computer-implemented recovery strategy processor, of the identified as least part of the first data representation to form a modified data representation of the linguistic input, wherein the processing comprises identifying said part as at least one characteristic of (a) including out-of-vocabulary (OOV) words, (b) representing a named entity, (c) including a homophone, (d) having an ambiguous word sense, and (e) including an idiom in the first language; using a machine translator to form a second data representation of the modified data representation; processing the second data representation using the recovery strategy processor to refine the second data representation through at least one of automated proce
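
Claim 15 describes rule-based idiom expansion as pronoun expansion plus verb expansion. A possible reading, sketched below under stated assumptions, is that each idiom is stored as a template whose pronoun and verb slots are expanded into every surface variant, and the utterance is then searched for any variant. The template syntax (`<PRON>`, `<VERB:lemma>`), the word lists, and the function names are hypothetical and are not taken from the patent. The statistical idiom detection that claim 1 pairs with this step is not shown.

```python
from itertools import product
from typing import Dict, List, Set

# Toy resources, invented for this example.
PRONOUNS: List[str] = ["my", "your", "his", "her", "our", "their"]
VERB_FORMS: Dict[str, List[str]] = {
    "pull": ["pull", "pulls", "pulled", "pulling"],
    "kick": ["kick", "kicks", "kicked", "kicking"],
}
IDIOM_TEMPLATES: List[str] = ["<VERB:pull> <PRON> leg", "<VERB:kick> the bucket"]


def expand(template: str) -> Set[str]:
    """Expand pronoun and verb slots into every surface variant."""
    options: List[List[str]] = []
    for token in template.split():
        if token == "<PRON>":
            options.append(PRONOUNS)                 # pronoun expansion
        elif token.startswith("<VERB:") and token.endswith(">"):
            options.append(VERB_FORMS[token[6:-1]])  # verb expansion
        else:
            options.append([token])                  # literal word
    return {" ".join(choice) for choice in product(*options)}


def find_idioms(utterance: str) -> List[str]:
    """Return the expanded idiom variants found in a whitespace-split utterance."""
    # Pad with spaces so that variants only match on word boundaries.
    text = " " + " ".join(utterance.lower().split()) + " "
    hits: List[str] = []
    for template in IDIOM_TEMPLATES:
        for variant in sorted(expand(template)):
            if f" {variant} " in text:
                hits.append(variant)
    return hits
```

For example, `find_idioms("He was pulling my leg")` matches the variant "pulling my leg", which a recovery strategy could then confirm with the speaker before translating. The sketch does not strip punctuation, so real input would need tokenization first.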

Assignees

Raytheon BBN Technologies Corp

Inventors

Classifications

G06F40/58 · Primary
Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title
G10L15/01 · Primary
Assessment or evaluation of speech recognition systems · CPC title
G10L2015/225
Feedback of the input speech · CPC title
G06F17/289 · Primary
Physics · mapped topic
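
The "Physics" topic follows from the structure of the classification symbols: a CPC symbol starts with a section letter, and section G is titled Physics. As a small sketch, a symbol such as G06F40/58 can be split into section, class, subclass, main group, and subgroup. The regular expression and the names `CpcSymbol`, `parse_cpc`, and `section_title` are assumptions for illustration and only cover symbols shaped like the ones listed on this page.

```python
import re
from typing import NamedTuple

# CPC section letters and their titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
}


class CpcSymbol(NamedTuple):
    section: str
    cpc_class: str
    subclass: str
    main_group: str
    subgroup: str


# Assumed shape: section letter, two-digit class, subclass letter,
# main group digits, a slash, then subgroup digits.
_CPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol such as 'G06F40/58' into its components."""
    match = _CPC_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())


def section_title(symbol: str) -> str:
    """Return the title of the section a CPC symbol belongs to."""
    return CPC_SECTIONS[parse_cpc(symbol).section]


if __name__ == "__main__":
    for code in ["G06F40/58", "G10L15/01", "G10L2015/225", "G06F17/289"]:
        print(code, parse_cpc(code), section_title(code))
```

All four symbols listed for this patent begin with G, which is why the page maps it to Physics.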

Patent family

Related publications grouped by family.

View patent family 50980356

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710463B2 cover?: A two-way speech-to-speech (S2S) translation system actively detects a wide variety of common error types and resolves them through user-friendly dialog with the user(s). Examples include features including one or more of detecting out-of-vocabulary (OOV) named entities and terms, sensing ambiguities, homophones, idioms, ill-formed input, etc. and interactive strategies for recovering from such…
Who is the assignee on this patent?: Raytheon BBN Technologies Corp