Who is the assignee on this patent?

Electronics & Telecommunications Res Inst

What technology area does this patent fall under?

Primary CPC classification G06F40/30. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 2, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method of domain-adapting large-capacity pre-trained language model using semantic chunk dynamic weight masking

US12488187B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12488187-B2
Application number	US-202318471538-A
Country	US
Kind code	B2
Filing date	Sep 21, 2023
Priority date	Dec 19, 2022
Publication date	Dec 2, 2025
Grant date	Dec 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A domain adaptation procedure, such as fine-tuning training, is required to utilize a large-capacity PLM for a specific domain. Attempts in existing research have been made to improve performance of a PLM through domain adaptor technology based on an N-gram in order to reduce errors on the basis of the results of domain text error analysis of the PLM. Proposed is a method of selecting a semantic chunk through a domain semantic chunk graph and PageRank based on the existing domain adaptor research, with an N-gram as the semantic chunk. Proposed is also a method of domain-adapting a large-capacity PLM using semantic chunk dynamic weight masking, which reflects an output value of a PLM rather than simply integrating embedding values of semantic chunks, in a semantic chunk domain adaptor technology.
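The selection step described in the abstract (and spelled out in claims 8 to 11) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: `toy_embed` is a hypothetical placeholder embedding, the similarity graph is built by exact brute-force cosine similarity rather than the approximate nearest neighbor technique named in claim 10, and the thresholds and function names are invented for illustration.

```python
from collections import Counter
from math import sqrt


def extract_ngrams(sentences, n_max=2):
    """Count all word N-grams up to length n_max in the corpus."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, n_max + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts


def toy_embed(ngram):
    """Hypothetical placeholder embedding (letter counts), not a PLM embedding."""
    vec = [0.0] * 27
    for ch in ngram:
        vec[ord(ch) - ord("a") if "a" <= ch <= "z" else 26] += 1.0
    return vec


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank on a weighted adjacency matrix."""
    n = len(weights)
    rank = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        # Nodes without outgoing edges spread their rank uniformly.
        dangling = sum(rank[i] for i in range(n) if out_sum[i] == 0)
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * weights[i][j] / out_sum[i]
                for i in range(n) if out_sum[i] > 0
            )
            new_rank.append((1 - damping) / n + damping * (incoming + dangling / n))
        rank = new_rank
    return rank


def select_domain_chunks(sentences, top_n=50, min_count=2,
                         threshold=0.5, top_k=5, embed=toy_embed):
    """Frequency-filter N-grams, link similar ones, keep the top-ranked nodes."""
    counts = extract_ngrams(sentences)
    candidates = [g for g, c in counts.most_common(top_n) if c >= min_count]
    if not candidates:
        return []
    vectors = [embed(g) for g in candidates]
    size = len(candidates)
    weights = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:  # assumed edge rule: keep sufficiently similar pairs
                weights[i][j] = weights[j][i] = sim
    ranks = pagerank(weights)
    ordered = sorted(zip(candidates, ranks), key=lambda pair: pair[1], reverse=True)
    return [gram for gram, _ in ordered[:top_k]]
```

Calling `select_domain_chunks(corpus_sentences, top_k=3)` on a list of domain sentences returns up to three N-grams ordered by their PageRank value. The pairwise loop is quadratic in the number of candidates, which is presumably why the claims mention an approximate nearest neighbor search for larger corpora.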

First claim

Opening claim text (preview).

What is claimed is:
1. A method of domain-adapting a pre-trained language model based on a transformer, the method comprising: generating an encoding hidden vector sequence on the basis of an input sentence; updating the encoding hidden vector sequence on the basis of a domain semantic chunk extracted from the input sentence; generating a decoding hidden vector sequence on the basis of an output sentence and the updated encoding hidden vector sequence; updating the decoding hidden vector sequence on the basis of a domain semantic chunk extracted from the output sentence; and computing an output probability on the basis of the updated decoding hidden vector sequence.
2. The method of claim 1, wherein, in the updating of the encoding hidden vector sequence, the encoding hidden vector sequence is generated by applying token separation, embedding, and self-attention on the basis of the input sentence.
3. The method of claim 1, wherein, in the updating of the decoding hidden vector sequence, an output embedding vector is generated by applying token separation and embedding on the basis of the output sentence, and the decoding hidden vector sequence is generated using cross attention and self-attention on the basis of the output embedding vector and the updated encoding hidden vector sequence.
4. The method of claim 1, wherein, in the updating of the encoding hidden vector sequence, an input sentence semantic chunk embedding vector is generated on the basis of the domain semantic chunk extracted from the input sentence, an encoding semantic chunk vector is generated by applying self-attention to the input sentence semantic chunk embedding vector, an encoding semantic chunk position matrix is generated on the basis of a positional relationship between a token of the input sentence and the domain semantic chunk extracted from the input sentence, and the encoding hidden vector sequence is updated on the basis of the encoding semantic chunk vector and the encoding semantic chunk position matrix.
5. The method of claim 4, wherein, in the updating of the encoding hidden vector sequence, an attention-based encoding semantic chunk position matrix is generated through an attention technique by applying a value of the elementwise product of the encoding semantic chunk vector and the encoding semantic chunk position matrix, as a key of attention, and applying the encoding hidden vector sequence as a query of attention, and the encoding hidden vector sequence is updated on the basis of the attention-based encoding semantic chunk position matrix and the encoding semantic chunk matrix.
6. The method of claim 1, wherein, in the updating of the decoding hidden vector sequence, an output sentence semantic chunk embedding vector is generated on the basis of the domain semantic chunk extracted from the output sentence, a decoding semantic chunk vector is generated by applying self-attention to the output sentence semantic chunk embedding vector, a decoding semantic chunk position matrix is generated on the basis of a positional relationship between a token of the output sentence and the domain semantic chunk extracted from the output sentence, and the decoding hidden vector sequence is updated on the basis of the decoding semantic chunk vector and the decoding semantic chunk position matrix.
7. The method of claim 6, wherein, in the updating of the decoding hidden vector sequence, an attention-based decoding semantic chunk position matrix is generated through an attention technique by applying a value of the elementwise product of the decoding semantic chunk vector and the decoding semantic chunk position matrix, as a key of attention, and applying the decoding hidden vector sequence as a query of attention, and the decoding hidden vector sequence is updated on the basis of the attention-based decoding semantic chunk position matrix and the decoding semantic chunk vector.
8. A method of selecting a domain semantic chunk, the method comprising: selecting a predetermined number of N-grams on the basis of a domain corpus; generating embedding values of the N-grams; computing similarities between each of the N-grams on the basis of the embedding values of the N-grams; generating an N-gram graph on the basis of the similarities; determining values of N-gram nodes included in the N-gram graph on the basis of the N-gram graph; and selecting a domain semantic chunk from among the N-grams on the basis of the values of the N-gram nodes.
9. The method of claim 8, wherein, in the selecting of the predetermined number of N-grams, N-grams are extracted from the domain corpus, and the predetermined number of N-grams are selected through filtering based on frequencies of the extracted N-grams.
10. The method of claim 8, wherein, in the generating of the N-gram graph, the similarities between each of the N-grams are computed using an approximate nearest neighbor (ANN) technique on the basis of the embedding values of the N-grams.
11. The method of claim 8, wherein, in the determining of the values of the N-gram nodes, the values of the N-gram nodes are determined using a PageRank algorithm on the basis of the N-gram graph.
12. An apparatus for domain-adapting a pre-trained language model, the apparatus comprising: a semantic chunk selection module configured to select a predetermined number of N-grams on the basis of a domain corpus, to generate an N-gram graph on the basis of embedding values of the N-grams, to select a domain semantic chunk from among the N-grams on the basis of the N-gram graph, and to store the selected domain semantic chunk in a semantic chunk DB; and a domain adaptation module configured to generate an encoding hidden vector sequence on the basis of an input sentence, to extract a domain semantic chunk, found as a result of searching the semantic chunk DB, from the input sentence, to update the encoding hidden vector sequence on the basis of the domain semantic chunk extracted from the input sentence, to generate a decoding hidden vector sequence on the basis of an output sentence and the updated encoding hidden vector sequence, to extract a domain semantic chunk, found as a result of searching the semantic chunk DB, from the output sentence, to update the decoding hidden vector sequence on the basis of the domain semantic chunk extracted from the output sentence, and to compute an output probability for the output sentence on the basis of the updated decoding hidden vector sequence.
13. The apparatus of claim 12, wherein the semantic chunk selection module computes similarities between each of the N-grams on the basis of the embedding values of the N-grams and generates the N-gram graph on the basis of the similarities.
14. The apparatus of claim 12, wherein the semantic chunk selection module determines values of N-gram nodes included in the N-gram graph on the basis of the N-gram graph and selects the domain semantic chunk from among the N-grams on the values of the N-gram nodes.
15. The apparatus of claim 14, wherein the semantic chunk selection module determines the values of the N-gram nodes using a PageRank algorithm on the basis of the N-gram graph.
16. The apparatus of claim 12, wherein the domain adaptation module generates the encoding hidden vector sequence by applying token separation, embedding, and self-attention on the basis of the input sentence.
17. The apparatus of claim 12, wherein the domain adaptation module generates an output embedding vector by applying token separation and embedding on the basis of th…
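
The encoder-side update in claims 4 and 5 leaves tensor shapes and the final combination step open, so the following is only one possible reading, sketched under explicit assumptions: the position matrix is binary (1 where a token lies inside a chunk occurrence), the "elementwise product" is taken as masking each chunk vector per token, attention scores are scaled dot products, and the update is a residual addition. Tokens are whitespace-separated words here, the self-attention over chunk embeddings is skipped, and all function names are hypothetical.

```python
from math import exp, sqrt


def position_matrix(tokens, chunks):
    """P[t][k] = 1.0 if token t lies inside an occurrence of chunk k, else 0.0."""
    matrix = [[0.0] * len(chunks) for _ in tokens]
    for k, chunk in enumerate(chunks):
        words = chunk.split()
        for start in range(len(tokens) - len(words) + 1):
            if tokens[start:start + len(words)] == words:
                for t in range(start, start + len(words)):
                    matrix[t][k] = 1.0
    return matrix


def update_hidden(hidden, chunk_vectors, positions):
    """Add an attention-weighted mix of covering chunk vectors to each token vector."""
    dim = len(hidden[0])
    updated = []
    for h, mask in zip(hidden, positions):
        active = [k for k, m in enumerate(mask) if m > 0]
        if not active:
            # Token is not covered by any domain chunk: leave it unchanged.
            updated.append(list(h))
            continue
        # Query = token hidden vector; key = position mask times chunk vector.
        scores = [
            sum(q * mask[k] * c for q, c in zip(h, chunk_vectors[k])) / sqrt(dim)
            for k in active
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        mixed = [
            sum(w * chunk_vectors[k][d] for w, k in zip(weights, active))
            for d in range(dim)
        ]
        # Assumed update rule: residual addition of the weighted chunk mix.
        updated.append([a + b for a, b in zip(h, mixed)])
    return updated
```

Because the weights depend on the current hidden vector of each token, two tokens covered by the same chunks can receive different mixes, which is one way to read the "dynamic weight" wording in the title. The decoder-side update in claims 6 and 7 has the same shape with the output sentence in place of the input sentence.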

Assignees

Electronics & Telecommunications Res Inst

Classifications

G06F40/284
Lexical analysis, e.g. tokenisation or collocates · CPC title
G06F40/289
Phrasal analysis, e.g. finite state techniques or chunking · CPC title
G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06F40/30 (Primary)
Semantic analysis · CPC title
G06N3/096 (Primary)
Transfer learning · CPC title

Patent family

Related publications grouped by family.

View patent family 91472715
