Bayesian nonparametric learning of neural networks

US2021089878A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2021089878-A1
Application number	US-201916576927-A
Country	US
Kind code	A1
Filing date	Sep 20, 2019
Priority date	Sep 20, 2019
Publication date	Mar 25, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In federated learning problems, data is scattered across different servers and exchanging or pooling it is often impractical or prohibited. A Bayesian nonparametric framework is presented for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. An inference approach is presented that allows us to synthesize a more expressive global network without additional supervision, data pooling and with as few as a single communication round. The efficacy of the present invention on federated learning problems simulated from two popular image classification datasets is shown.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for managing efficient machine learning, the method comprising: operating a network in which a plurality of client computing devices are communicatively coupled with a centralized computing device, wherein each of the plurality of client computing devices includes a local machine learning model that is pre-trained on locally accessible data, and wherein the locally accessible data has a common structure across all the plurality of client computing devices; accessing, by the centralized computing device, a plurality of artificial local neurons from each of the local machine learning models; clustering each of the plurality of artificial local neurons into a plurality of specific groups as part of a set of global neurons; and forming a global machine learning model layer by averaging the plurality of artificial local neurons previously clustered into one of a plurality of specific groups as part of a set of global neurons.

2. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with permutation-invariant probabilistic matching each of the plurality of artificial local neurons using Bayesian nonparametrics.

3. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with groups of weight vectors, bias vectors, or a combination of weight vectors and bias vectors associated with each of the plurality of artificial local neurons.

4. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is controlled by hyperparameters.

5. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results in one or more of the plurality of artificial local neurons being left unmatched.

6. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results a number of neurons in the set of global neurons being smaller than a numeric sum of all of the plurality of artificial local neurons.

7. The computer-implemented method of claim 1, wherein the accessing, by the centralized computing device, the plurality of artificial local neurons from each of the plurality of client computing devices requires only a single read communication between the centralized computing device and each of the plurality of client computing devices.

8. The computer-implemented method of claim 1, wherein each of the plurality of client computing devices includes a local machine learning model that is a multilayer artificial neural network.

9. The computer-implemented method of claim 1, wherein each of the plurality of client computing devices includes the local machine learning model that is pre-trained on locally accessible data in which the data changes overtime.

10. The computer-implemented method of claim 1, wherein the locally accessible data has a common structure that is both heterogeneous and overlapping across all the plurality of client computing devices.

11. A computer system for managing efficient machine learning, the computer system comprising: a processor device; and a memory operably coupled to the processor device and storing computer-executable instructions causing: operating a network in which a plurality of client computing devices are communicatively coupled with a centralized computing device, wherein each of the plurality of client computing devices includes a local machine learning model that is pre-trained on locally accessible data, and wherein the locally accessible data has a common structure across all the plurality of client computing devices; accessing, by the centralized computing device, a plurality of artificial local neurons from each of the local machine learning models; clustering each of the plurality of artificial local neurons into a plurality of specific groups as part of a set of global neurons; and forming a global machine learning model layer by averaging the plurality of artificial local neurons previously clustered into one of a plurality of specific groups as part of a set of global neurons.

12. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with permutation-invariant probabilistic matching each of the plurality of artificial local neurons using Bayesian nonparametrics.

13. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with groups of weight vectors, bias vectors, or a combination of weight vectors and bias vectors associated with each of the plurality of artificial local neurons.

14. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is controlled by hyperparameters.

15. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results in one or more of the plurality of artificial local neurons being left unmatched.

16. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results a number of neurons in the set of global neurons being smaller than a numeric sum of all of the plurality of artificial local neurons.

17. The computer system of claim 11, wherein the accessing, by the centralized computing device, the plurality of artificial local neurons from each of the plurality of client computing devices requires only a single read communication between the centralized computing device and each of the plurality of client computing devices.

18. The computer system of claim 11, wherein each of the plurality of client computing devices includes a local machine learning model that is a multilayer artificial neural network.

19. The computer system of claim 11, wherein each of the plurality of client computing devices includes the local machine learning model that is pre-trained on locally accessible data in which the data changes overtime.

20. A computer program product for managing efficient machine learning, the computer program product comprising: a non-transitory computer readable storage medium readable by a processing device and storing program instructions for execution by the processing device, said program instructions comprising: operating a network in which a plurality of client computing devices are communicatively coupled with a centralized computing device, wherein each of the plurality of client computing devices includes a local machine learning model that is pre-trained on locally accessible data, and wherein the locally accessible data
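
The cluster-then-average step recited in claim 1 can be illustrated with a rough sketch. This is not the inference procedure of the patent: claim 2 describes permutation-invariant probabilistic matching using Bayesian nonparametrics, whereas the code below substitutes a simple greedy distance-threshold matcher purely for illustration. The function name `match_and_average`, the `threshold` parameter, and the toy weight values are all hypothetical. Each local neuron is assumed to be a single vector (for example, its weights concatenated with its bias, in the spirit of claim 3).

```python
import numpy as np

def match_and_average(local_layers, threshold):
    """Greedy stand-in for neuron matching (NOT the patent's Bayesian method).

    local_layers: list of arrays, one per client, each of shape (n_neurons, dim).
    threshold: hypothetical hyperparameter; a local neuron joins an existing
        group only if it lies closer than this to the group's current mean.
    Returns (global_neurons, assignments).
    """
    sums = []         # running sum of the vectors in each group
    counts = []       # number of local neurons in each group
    assignments = []  # per client: group index chosen for each local neuron
    for layer in local_layers:
        used = set()  # a client contributes at most one neuron per group
        client_assign = []
        for vec in layer:
            best, best_dist = None, threshold
            for g in range(len(sums)):
                if g in used:
                    continue
                dist = np.linalg.norm(vec - sums[g] / counts[g])
                if dist < best_dist:
                    best, best_dist = g, dist
            if best is None:
                # No group is close enough: start a new global neuron.
                sums.append(np.asarray(vec, dtype=float).copy())
                counts.append(1)
                best = len(sums) - 1
            else:
                sums[best] += vec
                counts[best] += 1
            used.add(best)
            client_assign.append(best)
        assignments.append(client_assign)
    # Each global neuron is the average of the local neurons in its group.
    global_neurons = np.stack([s / c for s, c in zip(sums, counts)])
    return global_neurons, assignments

# Toy data (made up): two clients, two neurons each, two weights per neuron.
client_a = np.array([[1.0, 0.0], [0.0, 1.0]])
client_b = np.array([[0.0, 1.1], [5.0, 5.0]])
global_neurons, assignments = match_and_average([client_a, client_b], threshold=0.5)
print(global_neurons.shape)  # (3, 2)
print(assignments)           # [[0, 1], [1, 2]]
```

In this toy run, four local neurons collapse into three global neurons, loosely mirroring claim 6 (fewer global neurons than the sum of local neurons). The second neuron of the second client finds no close group and starts its own, which is the greedy sketch's rough analogue of a neuron being left unmatched (claim 5). The result of a greedy matcher depends on the order in which clients are processed, which is one reason a real system would prefer a principled matching procedure.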

Assignees

IBM

Inventors

Classifications

G06N3/047 (Primary) · Probabilistic or stochastic networks
G06N3/044 · Recurrent networks, e.g. Hopfield networks
G06N3/045 · Combinations of networks
G06N3/09 · Supervised learning
G06N3/0499 · Feedforward networks

Patent family

Related publications grouped by family.

Patent family ID: 74882130

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021089878A1 cover?: In federated learning problems, data is scattered across different servers and exchanging or pooling it is often impractical or prohibited. A Bayesian nonparametric framework is presented for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. An inference approach is presented that allows us to s…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06N3/047. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 25, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Gated linear networks

Asynchronously training machine learning models across client devices for adaptive intelligence

Communication Efficient Federated Learning

Methods and apparatus for federated training of a neural network using trusted edge devices

Memory Efficient Scalable Deep Learning with Model Parallelization