US2021089878A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021089878-A1 |
| Application number | US-201916576927-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 20, 2019 |
| Priority date | Sep 20, 2019 |
| Publication date | Mar 25, 2021 |
| Grant date | — |
Official abstract text for this publication.
In federated learning problems, data is scattered across different servers, and exchanging or pooling it is often impractical or prohibited. A Bayesian nonparametric framework is presented for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through the framework. An inference approach is presented that synthesizes a more expressive global network without additional supervision or data pooling, and with as few as a single communication round. The efficacy of the present invention is shown on federated learning problems simulated from two popular image classification datasets.
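The abstract does not say why local weights need to be modeled rather than simply averaged. One well-known reason, consistent with the "permutation-invariant" matching named in claim 2, is that the hidden neurons of a network can be reordered without changing its output. The sketch below illustrates this under assumed conditions: the tiny two-neuron network, the `forward` helper, and all numeric values are hypothetical and do not come from the patent.

```python
import math

def forward(x, hidden_w, hidden_b, out_w):
    """One hidden layer of tanh units followed by a single linear output."""
    hidden = [math.tanh(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
              for w, b in zip(hidden_w, hidden_b)]
    return sum(o * h for o, h in zip(out_w, hidden))

# Hypothetical network held by "server A": two hidden neurons, two inputs.
a_w = [[1.0, -2.0], [0.5, 0.3]]
a_b = [0.1, -0.4]
a_out = [2.0, -1.0]

# "Server B" holds the same function with its hidden neurons in the other order.
b_w = [a_w[1], a_w[0]]
b_b = [a_b[1], a_b[0]]
b_out = [a_out[1], a_out[0]]

x = [0.7, -1.2]
y_a = forward(x, a_w, a_b, a_out)
y_b = forward(x, b_w, b_b, b_out)

# Coordinate-wise averaging pairs neuron 0 of A with neuron 0 of B,
# which are different neurons, so the averaged network is a different function.
avg_w = [[(p + q) / 2 for p, q in zip(ra, rb)] for ra, rb in zip(a_w, b_w)]
avg_b = [(p + q) / 2 for p, q in zip(a_b, b_b)]
avg_out = [(p + q) / 2 for p, q in zip(a_out, b_out)]
y_avg = forward(x, avg_w, avg_b, avg_out)

print(y_a, y_b, y_avg)
```

Here `y_a` and `y_b` agree because the two networks compute the same function, while `y_avg` differs noticeably. This is why neurons are matched across servers before their weights are combined.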
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for managing efficient machine learning, the method comprising: operating a network in which a plurality of client computing devices are communicatively coupled with a centralized computing device, wherein each of the plurality of client computing devices includes a local machine learning model that is pre-trained on locally accessible data, and wherein the locally accessible data has a common structure across all the plurality of client computing devices; accessing, by the centralized computing device, a plurality of artificial local neurons from each of the local machine learning models; clustering each of the plurality of artificial local neurons into a plurality of specific groups as part of a set of global neurons; and forming a global machine learning model layer by averaging the plurality of artificial local neurons previously clustered into one of a plurality of specific groups as part of a set of global neurons.
2. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with permutation-invariant probabilistic matching each of the plurality of artificial local neurons using Bayesian nonparametrics.
3. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with groups of weight vectors, bias vectors, or a combination of weight vectors and bias vectors associated with each of the plurality of artificial local neurons.
4. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is controlled by hyperparameters.
5. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results in one or more of the plurality of artificial local neurons being left unmatched.
6. The computer-implemented method of claim 1, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results in a number of neurons in the set of global neurons being smaller than a numeric sum of all of the plurality of artificial local neurons.
7. The computer-implemented method of claim 1, wherein the accessing, by the centralized computing device, the plurality of artificial local neurons from each of the plurality of client computing devices requires only a single read communication between the centralized computing device and each of the plurality of client computing devices.
8. The computer-implemented method of claim 1, wherein each of the plurality of client computing devices includes a local machine learning model that is a multilayer artificial neural network.
9. The computer-implemented method of claim 1, wherein each of the plurality of client computing devices includes the local machine learning model that is pre-trained on locally accessible data in which the data changes over time.
10. The computer-implemented method of claim 1, wherein the locally accessible data has a common structure that is both heterogeneous and overlapping across all the plurality of client computing devices.
11. A computer system for managing efficient machine learning, the computer system comprising: a processor device; and a memory operably coupled to the processor device and storing computer-executable instructions causing: operating a network in which a plurality of client computing devices are communicatively coupled with a centralized computing device, wherein each of the plurality of client computing devices includes a local machine learning model that is pre-trained on locally accessible data, and wherein the locally accessible data has a common structure across all the plurality of client computing devices; accessing, by the centralized computing device, a plurality of artificial local neurons from each of the local machine learning models; clustering each of the plurality of artificial local neurons into a plurality of specific groups as part of a set of global neurons; and forming a global machine learning model layer by averaging the plurality of artificial local neurons previously clustered into one of a plurality of specific groups as part of a set of global neurons.
12. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with permutation-invariant probabilistic matching each of the plurality of artificial local neurons using Bayesian nonparametrics.
13. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is performed with groups of weight vectors, bias vectors, or a combination of weight vectors and bias vectors associated with each of the plurality of artificial local neurons.
14. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons is controlled by hyperparameters.
15. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results in one or more of the plurality of artificial local neurons being left unmatched.
16. The computer system of claim 11, wherein the clustering each of the plurality of artificial local neurons into the plurality of specific groups as part of the set of global neurons results in a number of neurons in the set of global neurons being smaller than a numeric sum of all of the plurality of artificial local neurons.
17. The computer system of claim 11, wherein the accessing, by the centralized computing device, the plurality of artificial local neurons from each of the plurality of client computing devices requires only a single read communication between the centralized computing device and each of the plurality of client computing devices.
18. The computer system of claim 11, wherein each of the plurality of client computing devices includes a local machine learning model that is a multilayer artificial neural network.
19. The computer system of claim 11, wherein each of the plurality of client computing devices includes the local machine learning model that is pre-trained on locally accessible data in which the data changes over time.
20. A computer program product for managing efficient machine learning, the computer program product comprising: a non-transitory computer readable storage medium readable by a processing device and storing program instructions for execution by the processing device, said program instructions comprising: operating a network in which a plurality of client computing devices are communicatively coupled with a centralized computing device, wherein each of the plurality of client computing devices includes a local machine learning model that is pre-trained on locally accessible data, and wherein the locally accessible data
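The clustering and averaging steps recited in the claims can be illustrated with a small sketch. This is not the Bayesian nonparametric matching the claims refer to: it swaps in a simple greedy nearest-mean rule with a distance cutoff, purely to show the shape of the computation. The function name `match_and_average`, the `threshold` hyperparameter, and the example weight vectors are all assumptions made for illustration.

```python
import math

def match_and_average(clients, threshold):
    """Greedy stand-in for the claimed clustering step (not the patented method).

    clients: list of clients, each a list of neurons (weight vectors as lists).
    threshold: hypothetical hyperparameter; a local neuron joins a global group
        only if its Euclidean distance to the group's current mean is below it.
    Returns (global_neurons, groups); groups[g] lists (client, neuron) members.
    """
    sums, groups = [], []
    for c, neurons in enumerate(clients):
        used = set()  # each global group takes at most one neuron per client
        for n, vec in enumerate(neurons):
            best, best_dist = None, threshold
            for g, total in enumerate(sums):
                if g in used:
                    continue
                mean = [t / len(groups[g]) for t in total]
                dist = math.dist(vec, mean)
                if dist < best_dist:
                    best, best_dist = g, dist
            if best is None:
                # No close counterpart: the neuron starts a group of its own.
                sums.append(list(vec))
                groups.append([(c, n)])
                best = len(sums) - 1
            else:
                sums[best] = [t + v for t, v in zip(sums[best], vec)]
                groups[best].append((c, n))
            used.add(best)
    # Each global neuron is the average of the local neurons in its group.
    global_neurons = [[t / len(m) for t in total] for total, m in zip(sums, groups)]
    return global_neurons, groups

clients = [
    [[1.0, 0.0], [0.0, 1.0]],    # client 0
    [[0.1, 0.9], [0.9, 0.1]],    # client 1: similar neurons in a different order
    [[1.1, -0.1], [5.0, 5.0]],   # client 2: one shared neuron, one unique neuron
]
global_neurons, groups = match_and_average(clients, threshold=0.5)
print(len(global_neurons), groups)
```

With these made-up inputs, six local neurons collapse into three global neurons (compare claims 6 and 16), and the neuron `[5.0, 5.0]` has no counterpart on any other client, which is one possible reading of a neuron being "left unmatched" (claims 5 and 15). The greedy rule depends on the order in which clients are processed; a probabilistic matching would not need to share that limitation.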
Probabilistic or stochastic networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Combinations of networks · CPC title
Supervised learning · CPC title
Feedforward networks · CPC title