Method and device with natural language processing

US2022092266A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022092266-A1
Application number  US-202117186830-A
Country             US
Kind code           A1
Filing date         Feb 26, 2021
Priority date       Sep 23, 2020
Publication date    Mar 24, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and device with natural language processing is disclosed. The method includes performing a word embedding of an input sentence, encoding a result of the word embedding, using an encoder of a natural language processing model, to generate a context embedding vector, decoding the context embedding vector, using a decoder of the natural language processing model, to generate an output sentence corresponding to the input sentence, generating a score indicating a relationship between the context embedding vector and each of a plurality of knowledge embedding vectors, determining a first loss based on the output sentence, determining a second loss based on the generated score, and performing training of the natural language processing model, including training the natural language processing model based on the determined first loss, and training the natural language processing model based on the determined second loss.
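The data flow in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the toy vocabulary, the mean-pooling encoder, the softmax decoder, the cross-entropy first loss, and the dot-product score are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math

# Hypothetical toy vocabulary and 2-d embedding table (illustrative values only).
EMBEDDING = {"cats": [1.0, 0.0], "purr": [0.0, 1.0]}
VOCAB = ["cats", "purr"]

def embed(sentence):
    """Word embedding: look up one vector per token."""
    return [EMBEDDING[token] for token in sentence.split()]

def encode(word_vectors):
    """Stand-in encoder (assumption): mean-pool word vectors into a context embedding vector."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(v[i] for v in word_vectors) / n for i in range(dim)]

def decode(context):
    """Stand-in decoder (assumption): softmax over dot products with each vocabulary embedding."""
    logits = [sum(c * e for c, e in zip(context, EMBEDDING[w])) for w in VOCAB]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]

def first_loss(probs, target_word):
    """Assumed first loss: cross-entropy of the decoder output against a target word."""
    return -math.log(probs[VOCAB.index(target_word)])

def score(context, knowledge):
    """Placeholder score (assumption): dot product of context and knowledge embedding vectors."""
    return sum(c * k for c, k in zip(context, knowledge))

context = encode(embed("cats purr"))       # context embedding vector
probs = decode(context)                    # decoder output distribution
loss_1 = first_loss(probs, "purr")         # first loss, based on the output
knowledge_vectors = [[1.0, 1.0], [-1.0, -1.0]]  # hypothetical knowledge embedding vectors
scores = [score(context, k) for k in knowledge_vectors]
```

The second loss named in the abstract would then be computed from `scores`; the abstract does not fix the formula, so none is assumed here.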

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented method, the method comprising: performing a word embedding of an input sentence; encoding a result of the word embedding, using an encoder of a natural language processing model, to generate a context embedding vector; decoding the context embedding vector, using a decoder of the natural language processing model, to generate an output sentence corresponding to the input sentence; generating a score indicating a relationship between the context embedding vector and each of a plurality of knowledge embedding vectors; determining a first loss based on the output sentence; determining a second loss based on the generated score; and performing training of the natural language processing model, including: training the natural language processing model based on the determined first loss; and training the natural language processing model based on the determined second loss.

2. The method of claim 1, wherein the generating of the score comprises: generating a first score indicating a relation between the context embedding vector and a first knowledge embedding vector; and generating a second score indicating a relation between the context embedding vector and a second knowledge embedding vector.

3. The method of claim 2, wherein the first knowledge embedding vector is a knowledge embedding vector that represents a true knowledge of the input sentence, and the second knowledge embedding vector is a knowledge embedding vector that represents a false knowledge of the input sentence.

4. The method of claim 2, wherein the training of the natural language processing model based on the second loss comprises: determining the second loss using a difference between the first score and the second score; and training the word embedding and the encoder based on the second loss.

5. The method of claim 4, wherein the training of the word embedding and the encoder is performed to decrease the second loss.

6. The method of claim 2, wherein the generating of the first score comprises: generating a first concatenated vector by concatenating the context embedding vector and the first knowledge embedding vector; and generating the first score using the first concatenated vector and a score function, and wherein the generating of the second score comprises: generating a second concatenated vector by concatenating the context embedding vector and the second knowledge embedding vector; and generating the second score using the second concatenated vector and the score function.

7. The method of claim 6, wherein the score function is a neural network comprising a plurality of fully-connected layers.

8. The method of claim 1, wherein the training of the natural language processing model based on the second loss comprises: training the word embedding to include knowledge information associated with the input sentence in results of the word embedding, and training the encoder to include the knowledge information in context embedding vector results of the encoder.

9. The method of claim 1, further comprising: generating the knowledge embedding vectors using knowledge graph (KG) embedding.

10. The method of claim 1, wherein the training of the natural language processing model based on the determined first loss includes training the word embedding, the encoder, and the decoder based on the first loss, and wherein the training of the natural language processing model based on the determined second loss includes training the word embedding and the encoder based on the second loss.

11. A non-transitory computer-readable storage medium storing instructions, which when executed by a processor, configure the processor to perform the method of claim 1.

12. A device, the device comprising: a memory storing a natural language processing model; and a processor configured to: perform a word embedding of an input sentence; encode a result of the word embedding, using an encoder of the natural language processing model, to generate a context embedding vector; decode the context embedding vector, using a decoder of the natural language processing model, to generate an output sentence corresponding to the input sentence; generate a score indicating a relationship between the context embedding vector and each of a plurality of knowledge embedding vectors; determine a first loss based on the output sentence; determine a second loss based on the generated score; and perform a training of the natural language processing model, including: a training of the natural language processing model based on the determined first loss; and a training of the natural language processing model based on the determined second loss.

13. The device of claim 12, wherein the processor is configured to: generate a first score indicating a relation between the context embedding vector and a first knowledge embedding vector; and generate a second score indicating a relation between the context embedding vector and a second knowledge embedding vector.

14. The device of claim 13, wherein the first knowledge embedding vector is a knowledge embedding vector that represents a true knowledge of the input sentence, and the second knowledge embedding vector is a knowledge embedding vector that represents a false knowledge of the input sentence.

15. The device of claim 13, wherein the processor is configured to: determine the second loss using a difference between the first score and the second score; and train the word embedding and the encoder based on the second loss.

16. The device of claim 13, wherein the processor is configured to: generate a first concatenated vector by concatenating the context embedding vector and the first knowledge embedding vector, and generate the first score using the first concatenated vector and a score function; and generate a second concatenated vector by concatenating the context embedding vector and the second knowledge embedding vector, and generate the second score using the second concatenated vector and the score function.

17. The device of claim 16, wherein the score function is a neural network comprising a plurality of fully-connected layers.

18. The device of claim 12, wherein the processor is configured to: train the word embedding to include knowledge information associated with the input sentence results of the word embedding; and train the encoder to include the knowledge information in context embedding vector results of the encoder.

19. The device of claim 12, wherein the processor is configured to: generate the knowledge embedding vectors using knowledge graph (KG) embedding.

20. A device comprising: a memory storing a natural language processing model; and a processor configured to: perform word embedding on an input sentence; generate a context embedding vector by encoding a result of the word embedding using an encoder of the natural language processing model; and generate an output sentence corresponding to the input sentence by decoding the context embedding vector using a decoder of the natural language processing model, wherein respective results of each of the word embedding and the generating of the context embedding vector include information of one or more words of the sentence and knowledge information associated with the input sentence.

21. A device comprising: a processor configured to: generate a c
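Claims 2 to 7 describe scoring a context embedding vector against a true and a false knowledge embedding vector by concatenation followed by a fully-connected score network, with the second loss built from the difference of the two scores. The sketch below is one possible reading, not the patent's implementation: the layer sizes, the fixed weights, the ReLU activation, and the hinge (margin) form of the loss are assumptions, since the claims only say that the loss uses the score difference and that training decreases it.

```python
def dense(weights, bias, x):
    """One fully-connected layer; weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    """Element-wise ReLU (assumed activation; the claims do not name one)."""
    return [max(0.0, v) for v in x]

# Hypothetical fixed weights for a 4 -> 2 -> 1 network (illustrative, not trained).
W1 = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
B1 = [0.0, 0.0]
W2 = [[1.0, 1.0]]
B2 = [0.0]

def score_function(context, knowledge):
    """Concatenate the two vectors, then apply two fully-connected layers."""
    concatenated = context + knowledge  # list concatenation: [context..., knowledge...]
    hidden = relu(dense(W1, B1, concatenated))
    return dense(W2, B2, hidden)[0]

def second_loss(score_true, score_false, margin=1.0):
    """Assumed hinge loss on the score difference; zero once true beats false by the margin."""
    return max(0.0, margin - (score_true - score_false))

context = [0.5, 0.5]              # context embedding vector from the encoder
true_knowledge = [1.0, 1.0]       # hypothetical "true knowledge" embedding
false_knowledge = [-1.0, -1.0]    # hypothetical "false knowledge" embedding

s_true = score_function(context, true_knowledge)
s_false = score_function(context, false_knowledge)
loss_2 = second_loss(s_true, s_false)
```

Per claims 4 and 10, gradients of this second loss would update only the word embedding and the encoder, while the first loss would update the word embedding, the encoder, and the decoder.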

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks

  • Combinations of networks

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Supervised learning

  • Auto-encoder networks; Encoder-decoder networks

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022092266A1 cover?
A method and device with natural language processing is disclosed. The method includes performing a word embedding of an input sentence, encoding a result of the word embedding, using an encoder of a natural language processing model, to generate a context embedding vector, decoding the context embedding vector, using a decoder of the natural language processing model, to generate an output sen…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 24, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).