Efficient heterogeneous federated learning method and system based on hybrid distillation, device, and medium

US2026057246A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2026057246-A1
Application number: US-202519371654-A
Country: US
Kind code: A1
Filing date: Oct 28, 2025
Priority date: Jul 22, 2024
Publication date: Feb 26, 2026
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An efficient heterogeneous federated learning method based on hybrid distillation includes: initializing, by a server, global model parameters, and setting a preset total number of training rounds and a number of clients participating in each of the training rounds; loading local datasets in the clients respectively, performing random transformations on the local datasets to generate client distillation data for the clients, sampling multiple sub-networks from an original network of each client, training each sub-network on the client distillation data to obtain updated local model parameters of each client, and uploading the updated local model parameters to the server; and receiving, by the server, the updated local model parameters, performing, by the server, server distillation based on the updated local model parameters and a preset auxiliary dataset to obtain updated global model parameters and an updated global model, and sending, by the server, the updated global model to the clients.
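The round structure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `local_update` is a hypothetical stand-in for client distillation, the aggregation is a plain average weighted by local dataset size (the publication does not specify the weighting here), and the server distillation step on the auxiliary dataset is omitted.

```python
import random


def aggregate(client_params, client_weights):
    """Weighted average of client parameter vectors (assumed aggregation rule)."""
    total = sum(client_weights)
    dim = len(client_params[0])
    return [
        sum(w * p[i] for p, w in zip(client_params, client_weights)) / total
        for i in range(dim)
    ]


def local_update(global_params, local_data):
    """Hypothetical placeholder for client-side training.

    Moves each parameter halfway toward the mean of the local data.
    The real method trains sampled sub-networks on client distillation data.
    """
    mean = sum(local_data) / len(local_data)
    return [p + 0.5 * (mean - p) for p in global_params]


def run_rounds(datasets, total_rounds, clients_per_round, seed=0):
    """Run a preset number of rounds with a fixed number of clients per round."""
    rng = random.Random(seed)
    global_params = [0.0]  # server initializes the global model parameters
    for _ in range(total_rounds):
        # Pick the clients that participate in this round.
        chosen = rng.sample(range(len(datasets)), clients_per_round)
        # Each chosen client starts from the global model and uploads its update.
        updates = [local_update(global_params, datasets[c]) for c in chosen]
        sizes = [len(datasets[c]) for c in chosen]
        # Server combines the uploads (server distillation omitted in this sketch).
        global_params = aggregate(updates, sizes)
    return global_params
```

For example, `run_rounds([[2.0, 2.0], [1.0, 3.0], [2.0]], total_rounds=3, clients_per_round=2)` drives the single toy parameter toward 2.0, the mean shared by all three toy datasets.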

First claim

Opening claim text (preview).

What is claimed is:

1. An efficient heterogeneous federated learning method based on hybrid distillation, comprising:
step 1, initializing, by a server, global model parameters, and setting a preset total number of training rounds and a number of clients participating in each of the training rounds;
step 2, loading local datasets in the clients respectively, performing random transformations on the local datasets to generate client distillation data for the clients, sampling a plurality of sub-networks from an original network of each of the clients, training each of the plurality of sub-networks based on the client distillation data of each of the clients to obtain updated local model parameters of each of the clients, and uploading the updated local model parameters to the server;
step 3, receiving, by the server, the updated local model parameters of each of the clients, performing, by the server, server distillation based on the updated local model parameters of each of the clients and a preset auxiliary dataset to obtain updated global model parameters and an updated global model, and sending, by the server, the updated global model to the clients; and
repeating step 2 to step 3 until the updated global model converges.

2. The efficient heterogeneous federated learning method based on hybrid distillation as claimed in claim 1, wherein the performing random transformations on the local datasets comprises: performing scaling and rotating on the local datasets to obtain the client distillation data for the clients.

3. The efficient heterogeneous federated learning method based on hybrid distillation as claimed in claim 1, wherein the plurality of sub-networks have different network score widths.

4. The efficient heterogeneous federated learning method based on hybrid distillation as claimed in claim 1, wherein the training each of the plurality of sub-networks based on the client distillation data of each of the clients comprises: calculating a Kullback-Leibler (KL) divergence between a softmax output of each of the plurality of sub-networks and an original softmax output of a local model of a corresponding one of the clients as a distillation loss, and dynamically assigning weights for each of the plurality of sub-networks based on prediction confidence of each of the plurality of sub-networks; and updating, based on the distillation loss and a traditional cross-entropy loss, local model parameters of each of the clients by using an optimization algorithm to obtain the updated local model parameters of each of the clients.

5. The efficient heterogeneous federated learning method based on hybrid distillation as claimed in claim 1, wherein the step 3 specifically comprises: receiving, by the server, the updated local model parameters of each of the clients from the clients, and performing, by the server, weight aggregation on the updated local model parameters of each of the clients to obtain a global model; and performing, based on the preset auxiliary dataset and by using a combination of soft prediction distillation and feature distillation, distillation on the global model to obtain the updated global model parameters and the updated global model, and sending the updated global model back to the clients.

6. An efficient heterogeneous federated learning system based on hybrid distillation as claimed in claim 1, comprising:
an initialization module, configured to make a server initialize global model parameters, and set a preset total number of training rounds and a number of clients participating in each of the training rounds;
a client distillation module, configured to load local datasets in the clients respectively, perform random transformations on the local datasets to generate client distillation data for the clients, sample a plurality of sub-networks from an original network of each of the clients, train each of the plurality of sub-networks based on the client distillation data of each of the clients to obtain updated local model parameters of each of the clients, and upload the updated local model parameters to the server; and
a server distillation module, configured for the server to receive the updated local model parameters of each of the clients, perform server distillation based on the updated local model parameters of each of the clients and a preset auxiliary dataset to obtain updated global model parameters and an updated global model, and send the updated global model to the clients;
wherein the client distillation module and the server distillation module are further configured to repeat above steps until the updated global model converges.

7. An electronic device, comprising: a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program to make the electronic device implement the efficient heterogeneous federated learning method based on hybrid distillation as claimed in claim 1.

8. A computer-readable storage medium, wherein the computer-readable storage medium is stored with a computer program, and the computer program is configured to, when executed by a processor, implement the efficient heterogeneous federated learning method based on hybrid distillation as claimed in claim 1.
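The client-side loss in claim 4 can be illustrated with a small sketch. Several details are assumptions rather than facts from the publication: the KL direction is taken as KL(local model || sub-network), prediction confidence is taken as the maximum softmax probability, the confidence weights are normalized to sum to 1, the cross-entropy term is computed on the full local model's output, and `alpha` is a hypothetical mixing coefficient.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Cross-entropy of predicted probabilities against an integer class label."""
    return -math.log(probs[label] + eps)


def hybrid_client_loss(full_logits, subnet_logits_list, label, alpha=0.5):
    """Combine cross-entropy with a confidence-weighted distillation loss.

    Returns (total_loss, weights). All modelling choices here are assumptions.
    """
    teacher = softmax(full_logits)  # output of the full local model
    subnet_probs = [softmax(l) for l in subnet_logits_list]
    # Assumed confidence measure: the largest predicted probability.
    confidences = [max(p) for p in subnet_probs]
    total_conf = sum(confidences)
    weights = [c / total_conf for c in confidences]
    # Distillation loss: weighted KL between the local model and each sub-network.
    distill = sum(
        w * kl_divergence(teacher, p) for w, p in zip(weights, subnet_probs)
    )
    ce = cross_entropy(teacher, label)
    return ce + alpha * distill, weights
```

With this weighting, a sub-network that predicts more confidently contributes more to the distillation term; when every sub-network matches the local model exactly, the distillation term vanishes and only the cross-entropy remains.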

Assignees

Univ Dongguan Technology

Inventors

Classifications

  • Learning methods · CPC title

  • Combinations of networks · CPC title

  • Activation functions · CPC title

  • G06N3/098 · Primary

    Distributed learning, e.g. federated learning · CPC title

  • Backpropagation, e.g. using gradient descent · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2026057246A1 cover?
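It covers a federated learning method based on hybrid distillation. A server initializes global model parameters and sets the total number of training rounds and the number of clients per round. Each client applies random transformations to its local dataset to generate client distillation data, trains multiple sub-networks sampled from its original network, and uploads the updated local model parameters. The server then performs server distillation using those parameters and a preset auxiliary dataset, and sends the updated global model back to the clients. See the Abstract section for the official text.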
Who is the assignee on this patent?
Univ Dongguan Technology
What technology area does this patent fall under?
Primary CPC classification G06N3/098. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 26, 2026 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).