What technology area does this patent fall under?

Primary CPC classification G06F40/216. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 17, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Language model translation and training method and apparatus

US10509864B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10509864-B2
Application number	US-201815947915-A
Country	US
Kind code	B2
Filing date	Apr 9, 2018
Priority date	Nov 30, 2017
Publication date	Dec 17, 2019
Grant date	Dec 17, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A language model training method and an apparatus using the language model training method are disclosed. The language model training method includes assigning a context vector to a target translation vector, obtaining feature vectors based on the target translation vector and the context vector, generating a representative vector representing the target translation vector using an attention mechanism for the feature vectors, and training a language model based on the target translation vector, the context vector, and the representative vector.
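The representative-vector step in the abstract can be illustrated with a minimal sketch. It assumes scaled dot-product attention with the context vector acting as the query; the page does not say which attention function the patent uses, so that choice, the function names `softmax` and `attention_pool`, and the toy data are all hypothetical. The training step is not shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(query, feature_vectors):
    """Weight each feature vector by its scaled dot-product similarity to
    the query; return the weighted sum and the attention weights.
    Assumption: this particular attention form is not taken from the patent."""
    scale = math.sqrt(len(query))
    scores = [sum(q * f for q, f in zip(query, vec)) / scale
              for vec in feature_vectors]
    weights = softmax(scores)
    dim = len(feature_vectors[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, feature_vectors))
              for i in range(dim)]
    return pooled, weights

# Hypothetical toy data: three 2-dimensional feature vectors
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [1.0, 0.0]  # stands in for the context vector, used as the query

representative, weights = attention_pool(context, features)
```

In this toy case the first and third feature vectors align equally with the query, so they receive equal weight, and the second receives less. The pooled result plays the role of the representative vector that would then be passed, together with the target translation vector and the context vector, to language model training.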

First claim

Opening claim text (preview).

What is claimed is:
1. A processor implemented language model training method, comprising: assigning a context vector to a target translation vector; obtaining feature vectors based on the target translation vector and the context vector; generating a representative vector representing the target translation vector using an attention mechanism for the feature vectors; and training a language model based on the target translation vector, the context vector, and the representative vector.
2. The method of claim 1, wherein the assigning of the context vector comprises: obtaining the target translation vector by preprocessing a target translation sentence to be translated.
3. The method of claim 2, wherein the obtaining of the target translation vector comprises: obtaining the target translation sentence using speech recognition.
4. The method of wherein the assigning of the context vector comprises: assigning the context vector to the target translation vector for each word.
5. The method of claim 1, wherein the obtaining of the feature vectors comprises: obtaining the feature vectors by performing character embedding on the target translation vector and the context vector.
6. The method of claim 1, wherein the generating of the representative vector comprises: obtaining a correlation among characters in the target translation vector by performing positional encoding on the feature vectors; and generating the representative vector based on the obtained correlation.
7. The method of claim 1, wherein the generating of the representative vector comprises: generating the representative vector using forward estimation or backward estimation for the feature vectors.
8. The method of claim 7, wherein the forward estimation comprises an estimation of which character follows a first character included in the feature vectors, and the backward estimation comprises an estimation of which character follows a second character included in the feature vectors.
9. The method of claim 1, wherein the language model is based on a recurrent neural network (RNN) of a hierarchical structure.
10. The method of claim 9, wherein the training of the language model comprises: updating a connection weight included in the RNN based on the target translation vector, the context vector, and the representative vector.
11. A language model training apparatus, comprising: a preprocessor configured to assign a context vector to a target translation vector; and a processor configured to: obtain feature vectors based on the target translation vector and the context vector, generate a representative vector representing the target translation vector using an attention mechanism for the feature vectors, and train a language model based on the target translation vector, the context vector, and the representative vector.
12. The language model training apparatus of claim 11, wherein the preprocessor is further configured to obtain the target translation vector by preprocessing a target translation sentence to be translated.
13. The language model training apparatus of claim 12, wherein the preprocessor is further configured to obtain the target translation sentence using speech recognition.
14. The language model training apparatus of claim 11, wherein the preprocessor is further configured to assign the context vector to the target translation vector for each word.
15. The language model training apparatus of claim 11, further comprising a memory storing instructions, which when executed by the processor, cause the processor to perform the obtaining of the feature vectors based on the target translation vector and the context vector, perform the generation of the representative vector representing the target translation vector using the attention mechanism for the feature vectors, and perform the training of the language model based on the target translation vector, the context vector, and the representative vector.
16. The language model training apparatus of claim 11, wherein the processor comprises: a language model trainer configured to: obtain the feature vectors based on the target translation vector and the context vector; generate the representative vector representing the target translation vector using the attention mechanism for the feature vectors; and train the language model based on the target translation vector, the context vector, and the representative vector.
17. The language model training apparatus of claim 16, wherein the language model trainer is further configured to obtain the feature vectors by performing character embedding on the target translation vector and the context vector.
18. The language model training apparatus of claim 16, wherein the language model trainer is further configured to obtain a correlation among characters in the target translation vector by performing positional encoding on the feature vectors, and generate the representative vector based on the obtained correlation.
19. The language model training apparatus of claim 16, wherein the language model trainer is further configured to generate the representative vector using forward estimation or backward estimation for the feature vectors.
20. The language model training apparatus of claim 19, wherein the forward estimation comprises an estimation of which character follows a first character included in the feature vectors, and the backward estimation comprises an estimation of which character follows a second character included in the feature vectors.
21. The language model training apparatus of claim 16, wherein the language model is based on a recurrent neural network (RNN) of a hierarchical structure.
22. The language model training apparatus of claim 21, wherein the language model trainer is further configured to update a connection weight included in the RNN based on the target translation vector, the context vector, and the representative vector.
23. The method of claim 1, wherein respective word-unit target translation vectors are generated for each word included in a target sentence.
24. The method of claim 1, wherein each of the feature vectors is a vector corresponding to abstracted speech information.
25. The method of claim 24, wherein the context vector is a query vector, and the attention mechanism comprises an attention function that maps the query vector to an output vector.
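Claims 6 and 18 refer to positional encoding of the feature vectors without stating a formula on this page. As a rough illustration only, the sketch below uses the widely known sinusoidal encoding; that choice is an assumption, not the patent's stated method, and the function names `positional_encoding` and `add_positions` and the sample embeddings are hypothetical.

```python
import math

def positional_encoding(num_positions, dim):
    """Return a num_positions x dim table of sinusoidal position codes.
    Assumption: the sinusoidal form is illustrative, not from the patent."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            # Each pair of dimensions (2k, 2k+1) shares one frequency
            angle = pos / (10000 ** ((2 * (i // 2)) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def add_positions(feature_vectors):
    """Add a position code to each feature vector, element-wise."""
    codes = positional_encoding(len(feature_vectors), len(feature_vectors[0]))
    return [[f + c for f, c in zip(vec, code)]
            for vec, code in zip(feature_vectors, codes)]

# Hypothetical character embeddings for a three-character word
embedded = [[0.5, 0.5, 0.5, 0.5] for _ in range(3)]
encoded = add_positions(embedded)
```

Although the three input embeddings are identical, the encoded vectors differ by position, which is what lets a later attention step take character order into account.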

Assignees

Samsung Electronics Co Ltd

Classifications

G06F40/216 · Primary
using statistical methods · CPC title
G06F40/42
Data-driven translation · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06F40/44
Statistical methods, e.g. probability models · CPC title

Patent family

Related publications grouped by family.

View patent family 66632434

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10509864B2 cover?: A language model training method and an apparatus using the language model training method are disclosed. The language model training method includes assigning a context vector to a target translation vector, obtaining feature vectors based on the target translation vector and the context vector, generating a representative vector representing the target translation vector using an attention me…
Who is the assignee on this patent?: Samsung Electronics Co Ltd