Method and apparatus for training few-shot event detection model based on multilingual prompt learning

US12299584B2 · US · B2

Patent metadata
Publication number: US-12299584-B2
Application number: US-202418774585-A
Country: US
Kind code: B2
Filing date: Jul 16, 2024
Priority date: Jul 17, 2023
Publication date: May 13, 2025
Grant date: May 13, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for training a few-shot event detection model based on multilingual prompt learning are provided, which includes: acquiring a training data set, applying a multilingual prompt model to any instance to obtain a predicted probability distribution of a trigger tag, so as to obtain a first loss; generating a contrastive instance and a bilingual instance, and performing multilingual prompt and cross-lingual encoding according to the input instance and the bilingual instance by applying the multilingual prompt model to obtain joint event characterization; performing event tag prediction on the joint event characterization by applying a two-level hierarchical prototype network model, and calculating a second loss; performing contrastive learning on respective instances by applying a quaternary contrastive learning module to obtain a third loss; determining a total loss of the few-shot event detection model according to respective losses, and performing model training optimization based on the total loss.
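The abstract does not say how the three losses are combined or what form the first loss takes. As a minimal sketch, assuming the first loss is a cross-entropy over the predicted trigger-tag distribution and the total loss is a weighted sum, the combination could look like the following. The function names, the weights, and the toy numbers are hypothetical and do not come from the patent.

```python
import math


def trigger_cross_entropy(predicted_probs, true_index):
    """Negative log-likelihood of the real trigger tag.

    ASSUMPTION: the abstract only says a first loss is obtained from the
    predicted probability distribution; cross-entropy is one common choice.
    """
    return -math.log(predicted_probs[true_index])


def total_loss(first_loss, second_loss, third_loss, weights=(1.0, 1.0, 1.0)):
    """Combine the three losses into one scalar used for optimization.

    ASSUMPTION: a weighted sum with hypothetical weights; the abstract only
    says the total loss is determined "according to respective losses".
    """
    w1, w2, w3 = weights
    return w1 * first_loss + w2 * second_loss + w3 * third_loss


# Toy values for illustration only.
first = trigger_cross_entropy([0.7, 0.2, 0.1], true_index=0)
print(total_loss(first, second_loss=0.4, third_loss=0.3))
```

In a real training loop the three inputs would be tensors produced by the multilingual prompt model, the two-level hierarchical prototype network, and the quaternary contrastive learning module, and the scalar returned here would be the quantity that is backpropagated.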

First claim

Opening claim text (preview).

What is claimed is:

1. A method, executed by a processor, for training a few-shot event detection model based on multilingual prompt learning, applied to an electronic device, the processor, a memory, an input/output interface and a communication interface in the electronic device implementing communication connection with each other in the electronic device through a bus, comprising:
extracting a training data set comprising a plurality of instances from a preset network database by applying a data collector, each of the plurality of instances comprising event text, a real trigger tag and a real event tag;
taking any instance in the training data set as an input instance, and performing event triggering recognition by applying a multilingual prompt model based on the event text of the input instance to obtain a predicted probability distribution of a trigger tag and obtain a first loss;
generating a contrastive instance and a bilingual instance corresponding to the input instance according to the input instance, and performing multilingual prompting and performing cross-lingual encoding according to the input instance and the bilingual instance by applying the multilingual prompt model to obtain joint event characterization;
performing event tag prediction on the joint event characterization by applying a two-level hierarchical prototype network model that is a hierarchical neural network and comprises multiple layers where each successive layer captures increasingly abstract features from input data, and the two-level hierarchical prototype network model comprises a parent prototype network and a child prototype network, and calculating a second loss of the event tag prediction;
performing quaternary contrastive learning by applying a quaternary contrastive learning module according to the input instance, the contrastive instance and the joint event characterization, and obtaining a third loss; and
determining a total loss of the few-shot event detection model according to the first loss, the second loss and the third loss, the few-shot event detection model comprising the multilingual prompt model, the two-level hierarchical prototype network model and the quaternary contrastive learning module, and performing model training optimization on the few-shot event detection model based on the total loss;
inputting event text to be detected by applying the input/output interface;
performing language prompt learning on the event text to be detected based on a first prompt template in the multilingual prompt model to obtain a corresponding language prompt, and then performing cross-lingual encoding by applying a cross-lingual encoding model in the multilingual prompt model, which is implemented by using an encoder xlm-roberta, to obtain a trigger tag and event embedding;
calculating a child tag probability distribution for which the event embedding is classified into a respective child tag in the child prototype network by applying the two-level hierarchical prototype network model, and determining a maximum value of child tag probability as a finally recognized event tag, so as to perform few-shot event management according to the detected trigger tag and the event tag,
wherein the two-level hierarchical prototype network model is a prototype network, and the memory being configured for storing the event text to be detected, the trigger tag and the event tag; and
wherein the encoder xlm-roberta is a scaled cross-lingual sentence encoder, and is pre-trained on 2.5 terabytes (TB) of data across 100 languages using data filtered from Common Crawl, and achieves results on multiple cross-lingual benchmarks.

2. The method according to claim 1, wherein the performing the event triggering recognition by applying the multilingual prompt model based on the event text of the input instance to obtain the predicted probability distribution of the trigger tag and obtain the first loss comprises:
performing language prompt processing on the event text of the input instance by applying the first prompt template in the multilingual prompt model to obtain a modified language prompt;
performing the event triggering recognition by applying the cross-lingual encoding model in the multilingual prompt model according to the language prompt corresponding to the input instance to obtain the predicted probability distribution of the trigger tag; and
calculating the first loss according to the predicted probability distribution of the trigger tag.

3. The method according to claim 1, wherein the bilingual instance comprises a Chinese instance and a Spanish instance, and the performing multilingual prompt and cross-lingual encoding according to the input instance and the bilingual instance by applying the multilingual prompt model to obtain the joint event characterization comprises:
obtaining a first language prompt based on a real trigger tag in the input instance by applying an English prompt template in the multilingual prompt model according to the input instance;
obtaining a second language prompt and a third language prompt based on the real trigger tag by respectively applying a Chinese prompt template and a Spanish prompt template in the multilingual prompt model according to the Chinese instance and the Spanish instance;
performing cross-lingual encoding by applying the cross-lingual encoding model in the multilingual prompt model respectively according to the first language prompt, the second language prompt and the third language prompt to generate a corresponding first event embedding, second event embedding and third event embedding; and
calculating an average of the first event embedding, the second event embedding and the third event embedding to obtain the joint event characterization of the event embedding,
wherein the first language prompt $f_e$ is obtained based on a real trigger tag $\hat{t}$ in the input instance by applying an English prompt template in the multilingual prompt model according to the input instance: $f_e(x, \hat{t}) = \text{[CLS]}\ ([\hat{t}]\ \text{triggers}\ [z_y]\ \text{event.})\ \text{[SEP]}\ [x]\ \text{[SEP]}$,
wherein the second language prompt $f_e^{zh}$ and the third language prompt $f_e^{es}$ are obtained based on the real trigger tag by respectively applying the Chinese prompt template and the Spanish prompt template in the multilingual prompt model according to the Chinese instance and the Spanish instance.

4. The method according to claim 1, wherein the performing event tag prediction on the joint event characterization by applying the two-level hierarchical prototype network model and calculating the second loss of the event tag prediction comprises:
respectively calculating a parent tag probability distribution for which the joint event characterization is classified into respective parent tags in the parent prototype network and the child tag probability distribution for which the joint event characterization is classified into respective child tags in the child prototype network;
calculating a loss of the parent prototype network based on the parent tag probability distribution, and calculating a loss of the child prototype network based on the child tag probability distribution; and
weighted summing the loss of the parent prototype network and the loss of the child prototype network to obtain the second loss of the two-level hierarchical prototype network model.

5. The method according to claim 4, wherein the respectively calculating the parent tag probability distribution for which the joint event characterization is classified into the respective parent tags in the parent prototype network and the child tag probability distribution for which the joint event characterization is classified into the respective child tags in the child prototype network comprises: calculating a fi
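Claims 1, 3 and 4 describe averaging three event embeddings into a joint event characterization, scoring it against parent and child prototypes, taking a weighted sum of the two losses, and picking the child tag with the highest probability. The sketch below illustrates that flow on toy vectors. It rests on several assumptions that are not in the claim preview: probabilities are a softmax over negative squared Euclidean distance to each prototype (the usual prototypical-network choice, while the text of claim 5 is cut off here), each loss is a negative log-likelihood, and the weight `alpha`, the tag names, and all numbers are hypothetical.

```python
import math


def joint_characterization(embeddings):
    """Element-wise average of equally sized event embeddings (claim 3)."""
    n = len(embeddings)
    return [sum(values) / n for values in zip(*embeddings)]


def prototype_probs(embedding, prototypes):
    """Probability of each tag given its prototype vector.

    ASSUMPTION: softmax over negative squared Euclidean distance; the
    distance metric is not visible in the truncated claim text.
    """
    scores = {
        tag: -sum((e - p) ** 2 for e, p in zip(embedding, proto))
        for tag, proto in prototypes.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {tag: math.exp(score - top) for tag, score in scores.items()}
    total = sum(exps.values())
    return {tag: value / total for tag, value in exps.items()}


def second_loss(parent_probs, child_probs, parent_tag, child_tag, alpha=0.5):
    """Weighted sum of parent and child losses (claim 4).

    ASSUMPTION: negative log-likelihood per level, hypothetical weight alpha.
    """
    parent_loss = -math.log(parent_probs[parent_tag])
    child_loss = -math.log(child_probs[child_tag])
    return alpha * parent_loss + (1.0 - alpha) * child_loss


def predict_event_tag(child_probs):
    """Child tag with the maximum probability (final step of claim 1)."""
    return max(child_probs, key=child_probs.get)


# Toy embeddings standing in for the English, Chinese and Spanish encodings.
joint = joint_characterization([[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]])
parents = {"Conflict": [1.0, 0.0], "Movement": [0.0, 1.0]}  # hypothetical tags
children = {"Attack": [1.0, 0.0], "Transport": [0.0, 1.0]}  # hypothetical tags
p_parent = prototype_probs(joint, parents)
p_child = prototype_probs(joint, children)
print(predict_event_tag(p_child), second_loss(p_parent, p_child, "Conflict", "Attack"))
```

In practice the three embeddings would come from the cross-lingual encoder named in claim 1, and the prototypes would be computed from support instances rather than written by hand.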

Assignees

National Univ Of Defense Technology

Inventors

Classifications

  • Combinations of networks · CPC title

  • Learning methods · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12299584B2 cover?
A method and apparatus for training a few-shot event detection model based on multilingual prompt learning are provided, which includes: acquiring a training data set, applying a multilingual prompt model to any instance to obtain a predicted probability distribution of a trigger tag, so as to obtain a first loss; generating a contrastive instance and a bilingual instance, and performing multil…
Who is the assignee on this patent?
National Univ Of Defense Technology
What technology area does this patent fall under?
Primary CPC classification G06N3/091. Mapped technology areas include Physics.
When was this patent published?
Publication date May 13, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).