Automatic interpretation method and apparatus

US10867136B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10867136-B2
Application number  US-201715404941-A
Country             US
Kind code           B2
Filing date         Jan 12, 2017
Priority date       Jul 7, 2016
Publication date    Dec 15, 2020
Grant date          Dec 15, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is an automated interpretation method, apparatus, and system. The automated interpretation method includes encoding a voice signal in a first language to generate a first feature vector, decoding the first feature vector to generate a first language sentence in the first language, encoding the first language sentence to generate a second feature vector with respect to a second language, decoding the second feature vector to generate a second language sentence in the second language, controlling a generating of a candidate sentence list based on any one or any combination of the first feature vector, the first language sentence, the second feature vector, and the second language sentence, and selecting, from the candidate sentence list, a final second language sentence as a translation of the voice signal.
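The flow in the abstract can be read as a chain of four model calls followed by a database lookup and a selection. The sketch below is illustrative only and is not taken from the patent: `InterpretationPipeline`, its field names, and the pluggable `find_previous` and `score` callables are hypothetical, and the encoders and decoders are passed in as plain functions rather than trained neural networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class InterpretationPipeline:
    """Hypothetical wiring of the stages named in the abstract."""

    # voice signal -> first feature vector
    asr_encode: Callable[[Sequence[float]], Vector]
    # first feature vector -> first-language sentence
    asr_decode: Callable[[Vector], str]
    # first-language sentence -> second feature vector
    mt_encode: Callable[[str], Vector]
    # second feature vector -> second-language sentence
    mt_decode: Callable[[Vector], str]
    # lookup of earlier translations (assumed interface to the database)
    find_previous: Callable[[Vector, str, Vector], List[str]]
    # scores one candidate against the second feature vector (assumed)
    score: Callable[[Vector, str], float]

    def interpret(self, voice_signal: Sequence[float]) -> str:
        """Run recognition, translation, candidate lookup, and selection."""
        first_vec = self.asr_encode(voice_signal)
        first_sentence = self.asr_decode(first_vec)
        second_vec = self.mt_encode(first_sentence)
        second_sentence = self.mt_decode(second_vec)

        # The candidate list holds the fresh translation plus any earlier
        # translations the lookup returns.
        candidates = [second_sentence]
        candidates += self.find_previous(first_vec, first_sentence, second_vec)

        # Select the candidate with the highest score.
        return max(candidates, key=lambda c: self.score(second_vec, c))
```

In a real system each callable would wrap a trained model or a database query. Here any function with a matching signature works, which keeps the control flow visible.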

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented interpretation method for translating a sentence from a first language to a second language, the first language and the second language being natural languages, the method comprising:
    receiving a voice signal as an input, the voice signal being an input source sentence uttered in the first language;
    encoding, using a voice recognition encoder, the voice signal by extracting abstracted voice information from the voice signal to generate a first feature vector using the abstracted voice information, the abstracted voice information being stored in a database;
    providing the first feature vector to an input layer of a voice recognition decoder;
    decoding, using the voice recognition decoder, the first feature vector to generate a first language sentence in the first language, the first language sentence being stored in the database and corresponding to the abstracted voice information;
    encoding, using a machine translation encoder, the first language sentence by extracting abstracted sentence information from the first language sentence to generate a second feature vector with respect to the second language using the abstracted sentence information, the abstracted sentence information being stored in the database and corresponding to the abstracted voice information;
    providing the second feature vector to an input layer of a machine translation decoder;
    decoding, using the machine translation decoder, the second feature vector to generate a second language sentence in the second language;
    generating a candidate sentence list to include the second language sentence and one or more previous translation final second language sentences, being retrieved from the database, based on any one or any combination of the first feature vector, the first language sentence, and the second feature vector, wherein the one or more previous translation final second language sentences are previously generated, based on respective previous second feature vectors, through previous recognition-translation processes performed using the machine translation decoder, and are previously stored in the database, wherein the one or more previous translation final second language sentences are determined to be similar to the input source sentence based on a comparison of the abstracted sentence information of the input source sentence and abstracted sentence information of plural previous translation final second language sentences previously stored in the database, and wherein the voice recognition encoder, the voice recognition decoder, the machine translation encoder, and the machine translation decoder are implemented in neural networks being trained; and
    selecting, from the candidate sentence list, a final second language sentence as a translation of the input source sentence corresponding to the input voice signal,
    wherein the generating of the candidate sentence list includes:
        acquiring a first interpretation result matching a first language feature vector, from the database, determined similar to the first feature vector;
        acquiring a second interpretation result matching a previous recognized sentence, from the database, determined similar to the first language sentence; and
        acquiring a third interpretation result matching a second language feature vector, from the database, determined similar to the second feature vector, and
    wherein the generating of the candidate sentence list further includes adding any of previous translation sentences corresponding to any of the first interpretation result, the second interpretation result, and the third interpretation result to the candidate sentence list.

2. The method of claim 1, wherein the generating of the candidate sentence list includes acquiring a candidate sentence, from the database, determined to correspond to any one or any combination of the first feature vector, the first language sentence, and the second feature vector from the database.

3. The method of claim 2, wherein the acquiring of the candidate sentence includes retrieving respective elements determined similar to any of the first feature vector, the first language sentence, and the second feature vector from a plurality of elements stored in the database based on one or more approximate nearest neighbor (NN) algorithms.

4. The method of claim 1, wherein the acquiring of the second interpretation result includes:
    converting the first language sentence into a vector; and
    determining which of plural previous recognized sentences, from the database, are similar to the first language sentence based on the vector.

5. The method of claim 1, wherein the selecting of the final second language sentence includes:
    calculating scores of candidate sentences included in the candidate sentence list based on the second feature vector; and
    selecting a candidate sentence, from the candidate sentence list, having a highest of the calculated scores to be the final second language sentence.

6. The method of claim 1, wherein the generating of the first feature vector includes:
    sampling the voice signal in the first language based on a predetermined frame length;
    generating respective input vectors corresponding to frames;
    sequentially inputting the respective input vectors to the voice recognition encoder used for voice recognition; and
    determining the first feature vector to be an output from the voice recognition encoder for the sequentially input respective input vectors.

7. The method of claim 1, wherein the generating of the first language sentence includes:
    inputting the first feature vector to another voice recognition decoder used for voice recognition;
    generating a predetermined number of sentence sequences based on probabilities of sub-words sequentially output from the other decoder; and
    selecting a sentence sequence having a highest score among the predetermined number of sentence sequences to be the first language sentence.

8. The method of claim 1, wherein the generating of the second feature vector includes:
    dividing the first language sentence into a plurality of sub-words;
    sequentially inputting input vectors respectively indicating the plurality of sub-words to the machine translation encoder used for machine translation; and
    determining the second feature vector to be an output from the machine translation encoder for the sequentially input input vectors.

9. The method of claim 1, further comprising:
    storing the first feature vector, the first language sentence, and the second feature vector in the database; and
    storing any one or any combination of the second language sentence and the final second language sentence corresponding to the first feature vector, the first language sentence, and the second feature vector in the database.

10. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1.

11. The method of claim 1, wherein the controlling of the generating of the candidate sentence list to include the second language sentence and the one or more previous translation final second language sentences is further based the second language sentence.

12. The method of claim 1, wherein the generating of the candidate sentence list includes any one or any combination of:
    acquiring a first interpretation result matching a first language feature vector, from the database, determined similar to the first feature vector;
    acquiring a second interpretation result matching a second language feature vector, from the database, determined similar to the second feature vector, and
    adding any of previous fin
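As a rough illustration of the candidate-list step in claims 1 to 5, the following sketch looks up earlier translations whose stored first feature vector, recognized sentence, or second feature vector is similar to the current one, then picks the highest-scoring candidate. Several things here are assumptions rather than the patent's method: the `Record` layout, the bag-of-letters `sentence_to_vector`, the `threshold` value, and the exhaustive cosine-similarity scan, which stands in for the approximate nearest neighbor search named in claim 3. The scorer is supplied by the caller.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Record:
    """One previous interpretation in a hypothetical in-memory database."""
    first_vec: List[float]    # output of the voice recognition encoder
    first_sentence: str       # recognized first-language sentence
    second_vec: List[float]   # output of the machine translation encoder
    final_translation: str    # previously selected second-language sentence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sentence_to_vector(sentence: str) -> List[float]:
    """Toy bag-of-letters embedding (assumption; claim 4 names no method)."""
    counts = [0.0] * 26
    for ch in sentence.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def build_candidates(db: List[Record], first_vec: Sequence[float],
                     first_sentence: str, second_vec: Sequence[float],
                     new_translation: str, threshold: float = 0.9) -> List[str]:
    """Start from the fresh translation and add similar earlier results."""
    candidates = [new_translation]
    query_sentence_vec = sentence_to_vector(first_sentence)
    for rec in db:
        similar = (
            cosine(rec.first_vec, first_vec) >= threshold
            or cosine(sentence_to_vector(rec.first_sentence),
                      query_sentence_vec) >= threshold
            or cosine(rec.second_vec, second_vec) >= threshold
        )
        if similar and rec.final_translation not in candidates:
            candidates.append(rec.final_translation)
    return candidates


def select_final(candidates: List[str], second_vec: Sequence[float],
                 score: Callable[[Sequence[float], str], float]) -> str:
    """Return the candidate with the highest score (compare claim 5)."""
    return max(candidates, key=lambda c: score(second_vec, c))
```

A production system would likely replace the linear scan with an index built for approximate nearest neighbor queries, since the scan's cost grows with the number of stored records.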

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Vocoder architecture · CPC title

  • Neural networks · CPC title

  • Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title

  • Speech synthesis; Text to speech systems · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10867136B2 cover?
Provided is an automated interpretation method, apparatus, and system. The automated interpretation method includes encoding a voice signal in a first language to generate a first feature vector, decoding the first feature vector to generate a first language sentence in the first language, encoding the first language sentence to generate a second feature vector with respect to a second language…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/51. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 15, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).