What technology area does this patent fall under?

Primary CPC classification G06N3/088. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 8, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for artificial-intelligence model training using unsupervised domain adaptation with multi-source meta-distillation

US2024046107A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2024046107-A1
Application number	US-202217966568-A
Country	US
Kind code	A1
Filing date	Oct 14, 2022
Priority date	Aug 8, 2022
Publication date	Feb 8, 2024
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method has the steps of obtaining a set of training samples from one or more domains, using the set of training samples to query a plurality of artificial-intelligence (AI) models, combining the outputs of the queried AI models, and adapting a target AI model via knowledge distillation using the combined outputs.
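The abstract's last two steps (combining the model outputs, then distilling into the target model) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the fixed weights, the three hypothetical source-model outputs, the learning rate, and the choice to update the target's logits directly (rather than network parameters) are all assumptions made to keep the example small. The weighted combination and the KL-divergence objective follow the wording of claims 4 to 6.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combine_outputs(source_probs, weights):
    """Weighted average of per-model class distributions (soft pseudo-label)."""
    norm = sum(weights)
    n_classes = len(source_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, source_probs)) / norm
        for c in range(n_classes)
    ]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distill_step(target_logits, pseudo_label, lr=0.5):
    """One gradient-descent step on the target logits.

    For q = softmax(z), the gradient of KL(p || q) with respect to z is q - p.
    """
    q = softmax(target_logits)
    return [z - lr * (qi - pi) for z, qi, pi in zip(target_logits, q, pseudo_label)]

# Hypothetical outputs of three source models for one sample (3 classes).
source_probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
weights = [0.5, 0.3, 0.2]  # assumed fixed here; the claims allow a learned combiner
pseudo_label = combine_outputs(source_probs, weights)

logits = [0.0, 0.0, 0.0]  # stand-in for the target model's output on the sample
kl_before = kl_divergence(pseudo_label, softmax(logits))
for _ in range(50):
    logits = distill_step(logits, pseudo_label)
kl_after = kl_divergence(pseudo_label, softmax(logits))
```

In this toy setting the target distribution moves toward the soft pseudo-label, so `kl_after` ends up far smaller than `kl_before`. Claim 2 describes a transformer encoder as one way to perform the combination; the fixed weighted average above is a simpler stand-in.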

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: obtaining a set of training samples from one or more domains; using the set of training samples to query a plurality of artificial-intelligence (AI) models; combining the outputs of the queried AI models; and adapting a target AI model via knowledge distillation using the combined outputs.
2. The method of claim 1, wherein said combining the outputs of the queried AI models comprises: using a transformer encoder for combining the outputs of the queried AI models.
3. The method of claim 1, wherein said obtaining the set of training samples from the one or more domains comprises: obtaining the set of training samples from a plurality of domains, the set of training samples comprises a plurality of subsets of training samples obtained from the plurality of domains; wherein said using the set of training samples to query the plurality of AI models comprises: using each subset of training samples to query the plurality of AI models except an excluded AI model of the plurality of AI models; and wherein the excluded AI models of the plurality of subset of training samples are different AI models.
4. The method of claim 1, wherein said combining the outputs of the queried AI models comprises: weighting the outputs of the queried AI models, and combining the weighted outputs of the queried AI models to obtain a soft pseudo-label; and wherein said adapting the target AI model via the knowledge distillation using the combined outputs comprises: adapting the target AI model via the knowledge distillation using the soft pseudo-label.
5. The method of claim 4, wherein said adapting the target AI model via the knowledge distillation using the combined outputs and the soft pseudo-label comprises: querying the target AI model using the set of training samples; and adapting the target AI model via the knowledge distillation based on Kullback-Leibler (KL) divergence of the output of the queried target AI model and the soft pseudo-label.
6. The method of claim 5, wherein said adapting the target AI model via the knowledge distillation based on the KL divergence of the output of the queried target AI model and the soft pseudo-label comprises: minimizing the KL divergence using a gradient descent method.
7. The method of claim 1 further comprising: evaluating a loss of the target AI model; and updating a plurality of parameters based on the evaluated loss; wherein the plurality of parameters comprises one or more first parameters of the target AI model and a parameter used in said combining the outputs of the queried AI models.
8. The method of claim 7, wherein said evaluating a loss of the target AI model comprises: querying the target AI model using a set of query samples, and evaluating a cross-entropy (CE) loss between the outputs of the queried target AI model and a set of labels corresponding to the set of query samples; and wherein said updating the plurality of parameters based on the evaluated loss comprises: updating the plurality of parameters by minimizing the CE loss.
9. The method of claim 8, wherein said updating the plurality of parameters by minimizing the CE loss comprises: updating the plurality of parameters by minimizing the CE loss using a gradient descent method.
10. An apparatus comprising: at least one processor for performing actions comprising: obtaining a set of training samples from one or more domains; using the set of training samples to query a plurality of AI models; combining the outputs of the queried AI models; and adapting a target AI model via knowledge distillation using the combined outputs.
11. The apparatus of claim 10, wherein said combining the outputs of the queried AI models comprises: using a transformer encoder for combining the outputs of the queried AI models.
12. The apparatus of claim 10, wherein said obtaining the set of training samples from the one or more domains comprises: obtaining the set of training samples from a plurality of domains, the set of training samples comprises a plurality of subsets of training samples obtained from the plurality of domains; wherein said using the set of training samples to query the plurality of AI models comprises: using each subset of training samples to query the plurality of AI models except an excluded AI model of the plurality of AI models; and wherein the excluded AI models of the plurality of subset of training samples are different AI models.
13. The apparatus of claim 10, wherein said combining the outputs of the queried AI models comprises: weighting the outputs of the queried AI models, and combining the weighted outputs of the queried AI models to obtain a soft pseudo-label; and wherein said adapting the target AI model via the knowledge distillation using the combined outputs comprises: adapting the target AI model via the knowledge distillation using the soft pseudo-label.
14. The apparatus of claim 13, wherein said adapting the target AI model via the knowledge distillation using the combined outputs and the soft pseudo-label comprises: querying the target AI model using the set of training samples; and adapting the target AI model via the knowledge distillation based on KL divergence of the output of the queried target AI model and the soft pseudo-label.
15. The apparatus of claim 10, wherein the at least one processor is configured for performing further actions comprising: evaluating a loss of the target AI model; and updating a plurality of parameters based on the evaluated loss; wherein the plurality of parameters comprises one or more first parameters of the target AI model and a parameter used in said combining the outputs of the queried AI models.
16. The apparatus of claim 15, wherein said evaluating a loss of the target AI model comprises: querying the target AI model using a set of query samples, and evaluating a CE loss between the outputs of the queried target AI model and a set of labels corresponding to the set of query samples; and wherein said updating the plurality of parameters based on the evaluated loss comprises: updating the plurality of parameters by minimizing the CE loss.
17. One or more non-transitory computer-readable storage devices comprising computer-executable instructions, wherein the instructions, when executed, cause a processing structure to perform actions comprising: obtaining a set of training samples from one or more domains; using the set of training samples to query a plurality of AI models; combining the outputs of the queried AI models; and adapting a target AI model via knowledge distillation using the combined outputs.
18. The one or more non-transitory computer-readable storage devices of claim 17, wherein said combining the outputs of the queried AI models comprises: using a transformer encoder for combining the outputs of the queried AI models.
19. The one or more non-transitory computer-readable storage devices of claim 17, wherein said obtaining the set of training samples from the one or more domains comprises: obtaining the set of training samples from a plurality of domains, the set of training samples comprises a plurality of subsets of training samples obtained from the plurality of domains; wherein said using the set of training samples to query the plurality of AI models comprises: using each subset of training sam
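Two of the dependent claims lend themselves to a small illustration: the leave-one-out querying of claims 3, 12, and 19, and the cross-entropy evaluation of claims 8 and 16. The sketch below is hedged throughout. It assumes one source model per domain and that the subset drawn from a domain skips that domain's own model, which is one way (not necessarily the patent's way) to make the excluded models differ across subsets. The domain names and probability values are hypothetical.

```python
import math

def leave_one_out_plan(domain_names):
    """Map each domain subset to the source models it queries.

    Assumption: one model per domain, and subset i excludes model i,
    so every subset has a different excluded model.
    """
    return {
        held_out: [m for m in domain_names if m != held_out]
        for held_out in domain_names
    }

def cross_entropy(probs, label, eps=1e-12):
    """CE loss of one predicted distribution against an integer class label."""
    return -math.log(probs[label] + eps)

def mean_ce_loss(batch_probs, labels):
    """Average CE loss over a set of labeled query samples."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)

# Hypothetical domains, each with its own source model.
plan = leave_one_out_plan(["photo", "sketch", "cartoon"])

# Hypothetical target-model outputs on two labeled query samples.
query_probs = [[0.9, 0.1], [0.2, 0.8]]
query_labels = [0, 1]
loss = mean_ce_loss(query_probs, query_labels)
```

Under claims 7 and 8, a loss like `loss` would drive updates to both the target model's parameters and a parameter of the combining step; that update is omitted here because the claims preview does not specify its form.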

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

G06N3/088 · Primary
Non-supervised learning, e.g. competitive learning · CPC title
G06N3/0427
Physics · mapped topic
G06N3/042
Knowledge-based neural networks; Logical representations of neural networks · CPC title
G06N3/08
Learning methods · CPC title
G06N3/045 · Primary
Combinations of networks · CPC title

Patent family

Related publications grouped by family.

View patent family 89769192

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024046107A1 cover?: A method has the steps of obtaining a set of training samples from one or more domains, using the set of training samples to query a plurality of artificial-intelligence (AI) models, combining the outputs of the queried AI models, and adapting a target AI model via knowledge distillation using the combined outputs.
Who is the assignee on this patent?: Huawei Tech Co Ltd