What technology area does this patent fall under?

The primary CPC classification is G06F40/40 (processing or translation of natural language). Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 11, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus with model training and/or sequence recognition

US11468324B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11468324-B2
Application number	US-202016831206-A
Country	US
Kind code	B2
Filing date	Mar 26, 2020
Priority date	Oct 14, 2019
Publication date	Oct 11, 2022
Grant date	Oct 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor-implemented method includes: using an encoder, determining, for each of a plurality of tokens included in an input sequence, a self-attention weight based on a token and one or more tokens that precede the token in the input sequence; using the encoder, determining context information corresponding to the input sequence based on the determined self-attention weights; and using a decoder, determining an output sequence corresponding to the input sequence based on the determined context information.
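The abstract's core step, a self-attention weight for each token computed from that token and the tokens before it, can be illustrated with a minimal sketch. This is not the patent's implementation: scaled dot-product scoring, a single attention head, and the reuse of raw toy embeddings as queries, keys, and values (no learned projections) are all assumptions made for illustration, and the function name `causal_self_attention` is hypothetical.

```python
import math


def causal_self_attention(queries, keys, values):
    """Return (weights, context) for single-head attention with a causal mask.

    queries, keys, values: lists of equal-length float vectors, one per token.
    Token i attends only to tokens 0..i (itself and the tokens before it).
    Scaled dot-product scoring is an assumption, not taken from the patent.
    """
    dim = len(queries[0])
    weights, context = [], []
    for i, q in enumerate(queries):
        # Scores against the current token and the preceding tokens only.
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
                  for j in range(i + 1)]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Tokens that follow token i are masked, so their weight is zero.
        weights.append(row + [0.0] * (len(queries) - i - 1))
        # Context vector: weighted sum of the visible value vectors.
        context.append([sum(w * values[j][d] for j, w in enumerate(row))
                        for d in range(len(values[0]))])
    return weights, context


# Toy embeddings (hypothetical data), used as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = causal_self_attention(tokens, tokens, tokens)
```

Because each row ignores later tokens, the context computed for a prefix of the sequence matches the first rows of the context computed for the full sequence. In this sketch, that is why the same weights can serve both whole-sequence and token-by-token processing.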

First claim

Opening claim text (preview).

What is claimed is:
1. A processor-implemented method comprising: using an encoder, determining, for each of a plurality of tokens included in an input sequence, a self-attention weight based on a token and one or more tokens that precede the token in the input sequence; using the encoder, determining context information corresponding to the input sequence based on the determined self-attention weights; and using a decoder, determining an output sequence corresponding to the input sequence based on the determined context information.
2. The method of claim 1, further comprising training the encoder and the decoder based on the determined output sequence.
3. The method of claim 2, wherein the determining of the self-attention weight comprises: masking token relationships between the token and each of tokens that follow the token in the input sequence; and determining the self-attention weight based on a result of the masking.
4. The method of claim 2, wherein the determining of the self-attention weight comprises: determining the self-attention weight based on the token and each of a preset number of the tokens that precede the token in the input sequence.
5. The method of claim 2, wherein the determining of the self-attention weight comprises: determining the self-attention weight using two or more of the tokens included in the input sequence.
6. The method of claim 2, wherein the determining of the self-attention weight comprises: determining the self-attention weight based on the token and each of remaining tokens excluding a preset number of tokens among the tokens that precede the token in the input sequence.
7. The method of claim 2, wherein the training of the encoder and the decoder comprises: training the encoder and the decoder such that a loss between a true sequence corresponding to the input sequence and the output sequence is less than or equal to a threshold.
8. The method of claim 2, wherein the encoder and the decoder correspond to a transformer model.
9. The method of claim 2, wherein either one or both of the input sequence or the output sequence is any one of speech data, sentence data, image data, biodata, and handwriting data.
10. A non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, configure the one or more processors to perform the method of claim 2.
11. A processor-implemented method comprising: using an encoder, determining, each time a token included in an input sequence is input or obtained, a self-attention weight based on an input token and one or more tokens that precede the input token in the input sequence; determining context information corresponding to the currently input tokens based on the determined self-attention weight; and using a decoder, determining an output sequence corresponding to the currently input tokens based on the determined context information.
12. The method of claim 11, wherein the determining of the self-attention weight comprises: masking token relationships between the token and each of tokens that follow the token among the currently input tokens; and determining the self-attention weight based on a result of the masking.
13. The method of claim 11, wherein the determining of the context information comprises: updating the context information each time the token of the input sequence is input.
14. The method of claim 11, wherein the determining of the self-attention weight comprises: determining the self-attention weight based on the token and each of a preset number of the tokens that precede the token among the currently input tokens.
15. The method of claim 11, wherein the determining of the self-attention weight comprises: determining the self-attention weight using two or more tokens among the currently input tokens.
16. The method of claim 11, wherein the determining of the self-attention weight comprises: determining the self-attention weight based on the token and each of remaining tokens excluding a preset number of tokens among the tokens that precede the token among the currently input tokens.
17. An apparatus comprising: one or more processors configured to: determine, for each of a plurality of tokens included in an input sequence, a self-attention weight based on a token and one or more tokens that precede the token in the input sequence; determine, context information corresponding to the input sequence based on the determined self-attention weight; and determine, an output sequence corresponding to the input sequence based on the determined context information.
18. The apparatus of claim 17, wherein the one or more processors is configured to train, based on the determined output sequence, an encoder for the determining of the self-attention weight and the determining of the context information and a decoder for the determining of the output sequence.
19. The apparatus of claim 18, wherein, for the determining of the self-attention weight, the one or more processors is configured to: mask token relationships between the token and each of tokens that follow the token in the input sequence; and determine the self-attention weight based on a result of the masking.
20. The apparatus of claim 18, wherein, for the determining of the self-attention weight, the one or more processors is configured to: determine the self-attention weight based on the token and each of a preset number of tokens that precede the token in the input sequence.
21. The apparatus of claim 18, wherein, for the determining of the self-attention weight, the one or more processors is configured to: determine the self-attention weight using two or more of the tokens included in the input sequence.
22. The apparatus of claim 18, wherein, for the determining of the self-attention weight, the one or more processors is configured to: determine the self-attention weight based on the token and each of remaining tokens excluding a preset number of tokens among the tokens that precede the token in the input sequence.
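Claims 11, 13, and 14 describe a streaming variant: a self-attention weight is determined each time a token arrives, the context information is updated at that point, and attention may be limited to a preset number of preceding tokens. The sketch below is one possible reading under stated assumptions, not the claimed implementation. The class name `StreamingSelfAttention`, the `window` parameter, the choice of the nearest preceding tokens as the "preset number", and scaled dot-product scoring on raw toy embeddings are all hypothetical.

```python
import math


class StreamingSelfAttention:
    """Toy encoder step that updates context each time a token arrives."""

    def __init__(self, window=None):
        # Hypothetical parameter: maximum number of preceding tokens to
        # attend to. None means every token received so far is visible.
        self.window = window
        self.tokens = []    # tokens received so far
        self.context = []   # one context vector per received token

    def push(self, token):
        """Add one token; return its attention row and context vector."""
        self.tokens.append(token)
        i = len(self.tokens) - 1
        # Visible span: the new token plus up to `window` tokens before it.
        start = 0 if self.window is None else max(0, i - self.window)
        visible = self.tokens[start:i + 1]
        scale = math.sqrt(len(token))
        scores = [sum(a * b for a, b in zip(token, k)) / scale for k in visible]
        # Numerically stable softmax over the visible span.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Context for the new token: weighted sum of the visible tokens.
        vector = [sum(w * v[d] for w, v in zip(row, visible))
                  for d in range(len(token))]
        self.context.append(vector)
        return row, vector


# Attend to the current token and at most one preceding token (toy data).
encoder = StreamingSelfAttention(window=1)
rows = [encoder.push(t)[0] for t in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
```

No token that arrives later can change an earlier row, so the context list only grows. In this sketch, a decoder could therefore start producing output before the full input sequence is available.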

Assignees

Samsung Electronics Co Ltd

Inventors

Lee Hodong

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/09
Supervised learning · CPC title
G06F40/40 · Primary
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 75382908

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11468324B2 cover?: A processor-implemented method includes: using an encoder, determining, for each of a plurality of tokens included in an input sequence, a self-attention weight based on a token and one or more tokens that precede the token in the input sequence; using the encoder, determining context information corresponding to the input sequence based on the determined self-attention weights; and using a dec…
Who is the assignee on this patent?: Samsung Electronics Co Ltd