Federated ensemble learning from decentralized data with incremental and decremental updates

US2022121999A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022121999-A1
Application number  US-202017073341-A
Country             US
Kind code           A1
Filing date         Oct 17, 2020
Priority date       Oct 17, 2020
Publication date    Apr 21, 2022
Grant date

Abstract

Official abstract text for this publication.

A computer implemented method includes distributing a plurality of prediction models, where each of a plurality of clients initially includes at least one associated prediction model from the plurality of prediction models, among all of the plurality of clients to provide each of the plurality of clients with each of the plurality of prediction models. The plurality of prediction models is evaluated on at least a portion of a local dataset resident on each of the plurality of clients to output a quantification indicating how each of the prediction models fit at least the portion of the local dataset of each of the plurality of clients. An ensemble model is generated by applying weights to each of the plurality of prediction models based on a value, a gradient, and a Hessian matrix of a user-defined objective.
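The aggregation step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the user-defined objective is taken to be the squared error of a weighted sum of model predictions, and the server takes a single Newton step from zero weights. The names `client_stats`, `add_stats`, `solve`, and `ensemble_weights` are hypothetical. Each client reports only a value, a gradient with one entry per model, and a Hessian with one entry per model pair, so the report does not grow with the local dataset and no raw data leaves the client.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], float]
# (objective value, gradient, Hessian), all evaluated at w = 0
Stats = Tuple[float, List[float], List[List[float]]]


def client_stats(models: Sequence[Model], xs: Sequence[float],
                 ys: Sequence[float]) -> Stats:
    """Local squared-error objective at w = 0 (assumed objective)."""
    k = len(models)
    value = 0.0
    grad = [0.0] * k
    hess = [[0.0] * k for _ in range(k)]
    for x, y in zip(xs, ys):
        preds = [m(x) for m in models]
        value += y * y
        for a in range(k):
            grad[a] -= 2.0 * preds[a] * y
            for b in range(k):
                hess[a][b] += 2.0 * preds[a] * preds[b]
    return value, grad, hess


def add_stats(s: Stats, t: Stats) -> Stats:
    """Statistics from different clients simply add up."""
    k = len(s[1])
    grad = [s[1][a] + t[1][a] for a in range(k)]
    hess = [[s[2][a][b] + t[2][a][b] for b in range(k)] for a in range(k)]
    return s[0] + t[0], grad, hess


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting (assumes a is nonsingular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ensemble_weights(stats_list: Sequence[Stats]) -> List[float]:
    """Server side: sum client statistics, then one Newton step from w = 0."""
    total = stats_list[0]
    for s in stats_list[1:]:
        total = add_stats(total, s)
    _, grad, hess = total
    return solve(hess, [-g for g in grad])


# Two toy models shared by every client; labels follow y = 2x + 3.
models = [lambda x: x, lambda x: 1.0]
stats_a = client_stats(models, [0.0, 1.0, 2.0], [3.0, 5.0, 7.0])
stats_b = client_stats(models, [3.0, 5.0], [9.0, 13.0])
weights = ensemble_weights([stats_a, stats_b])
print(weights)  # approximately [2.0, 3.0]
```

Because the assumed objective is quadratic in the weights, a single Newton step reaches its minimizer; a different objective would generally need repeated rounds.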

First claim

Claim text as published (claims 1 to 20).

What is claimed is:

1. A computer implemented method comprising: distributing a plurality of prediction models, where each of a plurality of clients initially includes at least one associated prediction model from the plurality of prediction models, among all of the plurality of clients to provide each of the plurality of clients with each of the plurality of prediction models; evaluating each of the plurality of prediction models on at least a portion of a local dataset resident on each of the plurality of clients to output a quantification indicating how each of the prediction models fit at least the portion of the local dataset of each of the plurality of clients; and generating an ensemble model by applying weights to each of the plurality of prediction models based on a value, a gradient, and a Hessian matrix of a user-defined objective.

2. The computer implemented method of claim 1, wherein the ensemble model is generated in a decentralized manner without including an exchange of raw data among the plurality of clients.

3. The computer implemented method of claim 1, wherein the vector is independent of a size of each of the datasets resident on each of the plurality of clients.

4. The computer implemented method of claim 1, further comprising performing one or more rounds of distributed gradient descent on the weights.

5. The computer implemented method of claim 1, further comprising limiting a number of models from the plurality of models that are assigned a weight greater than zero to a predetermined threshold.

6. The computer implemented method of claim 1, further comprising excluding a number of clients, and additional associated prediction models resident on respective ones of the number of clients, from the plurality of clients.

7. The computer implemented method of claim 1, further comprising: adding an additional client to the plurality of clients; distributing each of the prediction models of the plurality of prediction models to the additional client; and distributing one or more additional models associated with the additional client to the plurality of clients such that the additional model is evaluated on datasets resident on each of the plurality of clients, and each of the plurality of models, including the one or more additional models, are evaluated on the additional client.

8. The computer implemented method of claim 1, further comprising, removing or ignoring the associated prediction model of a removed client from each of the plurality of clients.

9. The computer implemented method of claim 1, further comprising, upon determining that a model has changed on one of the plurality of clients, re-valuating the changed model on each of the plurality of clients on at least the portion of the dataset resident on each of the plurality of clients.

10. The computer implemented method of claim 1, further comprising, upon determining that at least the portion of the dataset of a changed client of the plurality of clients is changed, re-evaluating each of the plurality of models on at least the portion of the dataset of the changed client.

11. The computer implemented method of claim 1, further comprising optimizing the weights applied to each of the plurality of models to minimize error between a predicted label given by the ensemble model and a ground truth label.

12. The computer implemented method of claim 1, further comprising sending to a central server, each vector for each of the plurality of clients.

13. A computerized federated ensemble learning system comprising: a plurality of clients in communication with a server; one or more prediction models resident at each of the plurality of clients; and a dataset resident at each of the plurality of clients, wherein: each of the one or more prediction models are distributed among each of the plurality of clients; each of the plurality of clients is configured to evaluate each of the plurality of prediction models on at least a portion of the dataset resident on each of the plurality of clients and output a quantification indicating how each of the prediction models fit at least the portion of the local dataset of each of the plurality of clients; and the server is configured to receive the vector from each of the plurality of clients and generate an ensemble model by applying weights to each of the plurality of prediction models.

14. The computerized federated ensemble learning system of claim 13, wherein the ensemble model is generated in a decentralized manner without including an exchange of raw data among the plurality of clients.

15. The computerized federated ensemble learning system of claim 13, wherein the vector is independent of a size of each dataset resident at each of the plurality of clients.

16. The computerized federated ensemble learning system of claim 13, wherein the server is configured to assign the weights such that a predetermined threshold number of models from the plurality of models are assigned a weight greater than zero.

17. A non-transitory computer readable storage medium tangibly embodying a computer readable program code having computer readable instructions that, when executed, causes a computer device to carry out a method of improving computing efficiency of a computing device operating a federated learning system, the method comprising: distributing a plurality of prediction models, where each of a plurality of clients initially includes at least one associated prediction model from the plurality of prediction models, among all of the plurality of clients to provide each of the plurality of clients with each of the plurality of prediction models; evaluating each of the plurality of prediction models on at least a portion of a local dataset resident on each of the plurality of clients to output a quantification indicating how each of the prediction models fit at least the portion of the local dataset of each of the plurality of clients; and generating an ensemble model by applying weights to each of the plurality of prediction models based on a value, a gradient, and a Hessian matrix of a user-defined objective.

18. The non-transitory computer readable storage medium of claim 17, wherein the vector is independent of a size of each dataset resident on each of the plurality of clients.

19. The non-transitory computer readable storage medium of claim 17, wherein the execution of the code by the processor further configures the computing device to perform an act comprising limiting a number of models from the plurality of models that are assigned a weight greater than zero to a predetermined threshold.

20. The non-transitory computer readable storage medium of claim 17, wherein the execution of the code by the processor further configures the computing device to perform an act comprising optimizing the weights applied to each of the plurality of models to minimize error between a predicted label given by the ensemble model and a ground truth label.
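The incremental and decremental updates in claims 6 to 10 can be pictured as bookkeeping over per-client statistics. The sketch below is hypothetical and is not taken from the patent: `StatsRegistry`, its methods, and the dictionary layout are assumptions made for illustration. It assumes each client reports gradient entries keyed by model id and Hessian entries keyed by model-id pairs, and that totals are sums over the clients and models that are still active.

```python
from typing import Dict, List, Tuple

# Hypothetical report layout: one gradient entry per model id and one
# Hessian entry per (model id, model id) pair.
Grad = Dict[str, float]
Hess = Dict[Tuple[str, str], float]


class StatsRegistry:
    """Server-side store of per-client statistics (illustrative only)."""

    def __init__(self) -> None:
        self.stats: Dict[str, Tuple[Grad, Hess]] = {}
        self.owner: Dict[str, str] = {}  # model id -> owning client id

    def add_client(self, client: str, model_ids: List[str]) -> None:
        """Incremental update: register a client and the models it brings."""
        for m in model_ids:
            self.owner[m] = client
        self.stats.setdefault(client, ({}, {}))

    def report(self, client: str, grad: Grad, hess: Hess) -> None:
        """Overwrite entries a client re-evaluated (new or changed model or data)."""
        g, h = self.stats[client]
        g.update(grad)
        h.update(hess)

    def remove_client(self, client: str) -> None:
        """Decremental update: drop the client's statistics and its models."""
        self.stats.pop(client, None)
        self.owner = {m: c for m, c in self.owner.items() if c != client}

    def totals(self) -> Tuple[Grad, Hess]:
        """Sum statistics over active clients, restricted to active models."""
        active = set(self.owner)
        grad: Grad = {m: 0.0 for m in active}
        hess: Hess = {}
        for g, h in self.stats.values():
            for m, v in g.items():
                if m in active:
                    grad[m] += v
            for (a, b), v in h.items():
                if a in active and b in active:
                    hess[(a, b)] = hess.get((a, b), 0.0) + v
        return grad, hess


reg = StatsRegistry()
reg.add_client("A", ["mA"])
reg.add_client("B", ["mB"])
reg.report("A", {"mA": -2.0, "mB": -4.0},
           {("mA", "mA"): 2.0, ("mA", "mB"): 1.0,
            ("mB", "mA"): 1.0, ("mB", "mB"): 3.0})
reg.report("B", {"mA": -1.0, "mB": -3.0},
           {("mA", "mA"): 1.0, ("mB", "mB"): 2.0})
grad_before, hess_before = reg.totals()
reg.remove_client("B")
grad_after, hess_after = reg.totals()
```

In this layout, removing a client needs no recomputation on the remaining clients: the server discards that client's report and ignores entries for its models. Adding a client requires only the new evaluations described in claim 7.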

Assignees

Inventors

Classifications

  • G06N20/20 (Primary)

    Ensemble learning · CPC title

  • Inference or reasoning models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 21, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).