Meta-knowledge fine tuning method and platform for multi-task language model

US11354499B2 · US · B2

Patent metadata
Publication number: US-11354499-B2
Application number: US-202117531813-A
Country: US
Kind code: B2
Filing date: Nov 22, 2021
Priority date: Nov 2, 2020
Publication date: Jun 7, 2022
Grant date: Jun 7, 2022

Abstract

Official abstract text for this publication.

Disclosed is a meta-knowledge fine tuning method and platform for a multi-task language model. The method is to obtain highly transferable shared knowledge, that is, meta-knowledge, on different data sets of tasks of the same category, perform interrelation and mutual reinforcement on the learning processes of the tasks of the same category that correspond to different data sets and are in different domains, so as to improve the fine tuning effect of downstream tasks of the same category on data sets of different domains in the application of the language model, and improve the parameter initialization ability and the generalization ability of a general language model for the tasks of the same category.

First claim

Opening claim text (preview).

What is claimed is:

1. A meta-knowledge fine tuning method for a multi-task language model, comprising the following stages:

a first stage, calculating the prototypes of cross-domain data sets of tasks of the same category: embedded features of the prototypes of the corresponding domains of the tasks of the category is intensively learned from the data sets of different domains of the tasks of the same category, and the average embedded feature of all input texts of the tasks of the same category in different domains is taken as a corresponding multi-domain category prototype of the tasks of the same category;

a second stage, calculating typical scores of instances: where $d_{self}$ represents the distance between the embedded feature of each instance and $d_{others}$ represents the distance between the embedded feature of each instance and other domain prototypes; and the typical score of each instance is defined as a linear combination of $d_{self}$ and $d_{others}$; and

a third stage, a meta-knowledge fine tuning network based on typical scores: the typical scores obtained in the second stage is used as weight coefficients of the meta-knowledge fine tuning network, and a multi-task typical sensitive label classification loss function is designed as a learning objective function of meta-knowledge fine tuning; and the loss function penalizes the labels of the instances of all domains that the language model predicts incorrectly;

wherein in the first stage, $D_m^k$ represents a set of input texts $x_i^k$ with a category label $m$ in a $k$-th domain $D^k$ of the data set:

$$D_m^k = \{\, x_i^k \mid (x_i^k, y_i^k) \in D^k,\ y_i^k = m \,\}$$

where $m \in M$, $M$ represents a set of all category labels in the data set; and $(x_i^k, y_i^k)$ represents an $i$-th instance in the $k$-th domain;

the category prototype $c_m^k$ represents the average embedded feature of all input texts with the category label $m$ in the $k$-th domain:

$$c_m^k = \frac{1}{|D_m^k|} \sum_{x_i^k \in D_m^k} \mathcal{E}(x_i^k)$$

wherein $\mathcal{E}(\cdot)$ represents an embedded expression of $x_i^k$ output by a BERT model; and for the BERT model, the average embedded feature is the average pooling of the last layer of Transformer encoder corresponding to the input $x_i^k$.

2. The meta-knowledge fine tuning method for the multi-task language model according to claim 1, wherein in the second stage, the typical score $t_i^k$ of the instance $(x_i^k, y_i^k)$ is expressed as:

$$t_i^k = \alpha \, \frac{\sum_{m \in M} \beta_m \cos(\mathcal{E}(x_i^k), c_m^k)}{\sum_{m \in M} \beta_m} + \frac{1 - \alpha}{K - 1} \cdot \sum_{\tilde{k} = 1}^{K} \mathbf{1}_{(\tilde{k} \neq k)} \sum_{m \in M} \beta_m \cos(\mathcal{E}(x_i \ldots$$
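
The three stages in claim 1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation. Embeddings are assumed to be precomputed vectors (the claim obtains them from a BERT model), and all function names are hypothetical. The weights β_m are assumed to be per-label coefficients. The second term of the typical score is assumed to mirror the first term, averaged over the prototypes of the other domains, because the claim preview is cut off before that term ends. The third-stage loss is assumed to be a cross-entropy weighted by the typical scores.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def class_prototypes(embeddings, labels):
    """Stage 1: prototype c_m^k = mean embedding per label m in one domain."""
    groups = defaultdict(list)
    for embedding, label in zip(embeddings, labels):
        groups[label].append(embedding)
    return {label: mean_vector(vecs) for label, vecs in groups.items()}


def typical_score(embedding, domain, prototypes, beta, alpha=0.5):
    """Stage 2: linear combination of own-domain and other-domain similarity.

    prototypes: {domain: {label: vector}}; beta: {label: weight}.
    Assumption: the other-domain term has the same form as the own-domain
    term and is averaged over the K - 1 remaining domains.
    """
    def weighted_similarity(domain_prototypes):
        numerator = sum(beta[m] * cosine(embedding, c)
                        for m, c in domain_prototypes.items())
        denominator = sum(beta[m] for m in domain_prototypes)
        return numerator / denominator

    self_term = weighted_similarity(prototypes[domain])
    other_terms = [weighted_similarity(p)
                   for k, p in prototypes.items() if k != domain]
    others_term = sum(other_terms) / len(other_terms) if other_terms else 0.0
    return alpha * self_term + (1 - alpha) * others_term


def weighted_cross_entropy(probabilities, labels, scores):
    """Stage 3 (assumed form): mean of -t_i * log p(y_i | x_i)."""
    total = sum(-t * math.log(p[y])
                for p, y, t in zip(probabilities, labels, scores))
    return total / len(labels)
```

In this sketch an instance that sits close to the prototypes of its own domain and of the other domains receives a higher typical score, so it contributes more to the loss. That is one way to read the claim's use of typical scores as weight coefficients.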

Assignees

Zhejiang Lab

Inventors

Classifications

  • Combinations of networks

  • Matching criteria, e.g. proximity measures

  • Supervised learning

  • Transfer learning

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Zhejiang Lab
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 7, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).