Resource-Aware Training for Neural Networks
US-2020234128-A1 · Jul 23, 2020 · US
| Field | Value |
|---|---|
| Publication number | US-12387107-B2 |
| Application number | US-202418431680-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 2, 2024 |
| Priority date | May 16, 2023 |
| Publication date | Aug 12, 2025 |
| Grant date | Aug 12, 2025 |
Official abstract text for this publication.
Techniques are described herein for a method of determining a similarity of each neuron in a layer of neurons of a neural network model to each other neuron in the layer of neurons. The method further includes determining a redundant set of neurons and a non-redundant set of neurons based on the similarity of each neuron in the layer. The method further includes fine tuning the set of non-redundant neurons using a first set of training data. The method further includes training the set of redundant neurons using a second set of training data.
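The split described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes cosine similarity between per-neuron weight vectors as the similarity measure and a fixed threshold of 0.95, neither of which is specified in the abstract. The names `cosine` and `split_layer` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two weight vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero weight vector as similar to nothing
    return dot / (norm_u * norm_v)


def split_layer(weights, threshold=0.95):
    """Split a layer's neurons into (non_redundant, redundant) index lists.

    weights: one weight vector per neuron in the layer.
    threshold: hypothetical cutoff; a neuron is marked redundant when it is
    at least this similar to some other neuron in the same layer.
    """
    redundant = set()
    n = len(weights)
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(weights[i], weights[j]) >= threshold:
                redundant.update((i, j))
    non_redundant = [i for i in range(n) if i not in redundant]
    return non_redundant, sorted(redundant)


# Neurons 0 and 2 point in almost the same direction; neuron 1 does not.
layer = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.01]]
non_redundant, redundant = split_layer(layer)
print(non_redundant, redundant)
```

Under these assumptions, the non-redundant indices would be fine-tuned on the first set of training data and the redundant indices trained on the second set.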
Opening claim text (preview).
What is claimed is:
1. A method comprising: determining, for a layer of a neural network model trained to perform a first task, a similarity value between neurons in the layer by comparing each neuron's weight vector in the layer to each other neuron's weight vector in the layer, wherein the neural network model comprises a structure including an arrangement of one or more neurons in one or more layers; grouping the neurons in the layer into a first subset of neurons or a second subset of neurons based on their similarity values, wherein the first subset of neurons comprises a first subnetwork of the neural network model trained to perform the first task; and training the neural network model to perform a second task, wherein training the neural network model to perform the second task includes training a second subnetwork of the neural network model comprising the second subset of neurons, wherein a gradient is backpropagated to each neuron in the second subset of neurons and wherein the neural network model trained to perform the second task comprises the structure including the arrangement of the one or more neurons in the one or more layers.
2. The method of claim 1, wherein clustering the neurons in the layer into the first subset of neurons or the second subset of neurons based on their similarity values further comprises: clustering neurons into the second subset of neurons responsive to determining that the similarity value of two or more neurons in the layer satisfy a threshold similarity score; and clustering neurons into the first subset of neurons responsive to determining that the similarity value does not satisfy the threshold similarity score.
3. The method of claim 2, wherein clustering the neurons in the layer into the first subset of neurons further comprises: selecting a neuron from the second subset of neurons for inclusion in the first subset of neurons.
4. The method of claim 1, further comprising: determining a second similarity value between neurons in the second subset of neurons and neurons in the second subset of neurons by comparing each neuron's weight vector in the second subset of neurons in the layer to each other neuron's weight vector in the second subset of neurons in the layer; and clustering each neuron of the second subset of neurons into a third subset of neurons or a fourth subset of neurons based on the second similarity value.
5. The method of claim 4, further comprising: training the neural network model to perform a third task using the third subset of neurons, wherein a second gradient is backpropagated to each neuron in the third subset of neurons and wherein the neural network model trained to perform the third task comprises the structure including the arrangement of the one or more neurons in the one or more layers.
6. The method of claim 4, wherein clustering each of the second subset of neurons into a third subset of neurons or a fourth subset of neurons based on the second similarity value further comprises: clustering neurons into the third subset of neurons responsive to determining that the second similarity value of two or more neurons in the second subset of neurons satisfy a threshold similarity score; and clustering neurons into the first subset of neurons responsive to determining that the second similarity value does not satisfy the threshold similarity score.
7. The method of claim 1, further comprising: fine tuning the first subset of neurons using a gradient of a neuron of the first subset of neurons determined using a first set of training data, wherein the first set of training data is used to train the neural network model to perform the first task.
8. A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations comprising: determining, for a layer of a neural network model trained to perform a first task, a similarity value between neurons in the layer by comparing each neuron's weight vector in the layer to each other neuron's weight vector in the layer, wherein the neural network model comprises a structure including an arrangement of one or more neurons in one or more layers; grouping the neurons in the layer into a first subset of neurons or a second subset of neurons based on their similarity values, wherein the first subset of neurons comprises a first subnetwork of the neural network model trained to perform the first task; and training the neural network model to perform a second task, wherein training the neural network model to perform the second task includes training a second subnetwork of the neural network model comprising the second subset of neurons, wherein a gradient is backpropagated to each neuron in the second subset of neurons and wherein the neural network model trained to perform the second task comprises the structure including the arrangement of the one or more neurons in the one or more layers.
9. The non-transitory computer-readable medium of claim 8, wherein clustering the neurons in the layer into the first subset of neurons or the second subset of neurons based on their similarity values further comprises instructions that cause the processing device to perform operations comprising: clustering neurons into the second subset of neurons responsive to determining that the similarity value of two or more neurons in the layer satisfy a threshold similarity score; and clustering neurons into the first subset of neurons responsive to determining that the similarity value does not satisfy the threshold similarity score.
10. The non-transitory computer-readable medium of claim 9, wherein clustering the neurons in the layer into the first subset of neurons further comprises instructions that cause the processing device to perform operations comprising: selecting a neuron from the second subset of neurons for inclusion in the first subset of neurons.
11. The non-transitory computer-readable medium of claim 8, storing instructions that further cause the processing device to perform operations comprising: determining a second similarity value between neurons in the second subset of neurons and neurons in the second subset of neurons by comparing each neuron's weight vector in the second subset of neurons in the layer to each other neuron's weight vector in the second subset of neurons in the layer; and clustering each neuron of the second subset of neurons into a third subset of neurons or a fourth subset of neurons based on the second similarity value.
12. The non-transitory computer-readable medium of claim 11, storing instructions that further cause the processing device to perform operations comprising: training the neural network model to perform a third task using the third subset of neurons, wherein a second gradient is backpropagated to each neuron in the third subset of neurons and wherein the neural network model trained to perform the third task comprises the structure including the arrangement of the one or more neurons in the one or more layers.
13. The non-transitory computer-readable medium of claim 11, wherein clustering each of the second subset of neurons into a third subset of neurons or a fourth subset of neurons based on the second similarity value further comprises instructions that cause the processing device to perform operations comprising: clustering neurons into the third subset of neurons responsive to determining that the second similarity value of two or more neurons in the second subset of neurons satisfy a threshold similarity score; and clustering neurons into the first subset of neurons responsive
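One way to picture the training step in claims 1 and 3 is a gradient update that touches only the second subset of neurons while leaving the layer's shape unchanged. The sketch below is illustrative only. It assumes plain SGD, assumes the first subset is held fixed during second-task training (claim 1 does not say this explicitly), and picks the lowest-index neuron as the one moved to the first subset. The names `masked_sgd_step` and `promote_one` are hypothetical.

```python
def masked_sgd_step(weights, grads, trainable, lr=0.1):
    """Apply an SGD update only to the rows (neurons) listed in `trainable`.

    weights, grads: one row per neuron, same shape.
    The returned layer has the same shape as the input, so the arrangement
    of neurons is preserved; rows outside `trainable` are copied unchanged
    (an assumption of this sketch).
    """
    trainable = set(trainable)
    updated = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in trainable:
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
        else:
            updated.append(list(w_row))
    return updated


def promote_one(first, second):
    """Move one neuron from the second subset into the first subset.

    Selection rule (lowest index) is an assumption made for this sketch.
    """
    if not second:
        return sorted(first), []
    keep = min(second)
    return sorted(list(first) + [keep]), [i for i in second if i != keep]


# Two-neuron layer; only neuron 1 belongs to the second subset.
weights = [[1.0, 2.0], [3.0, 4.0]]
grads = [[1.0, 1.0], [1.0, 1.0]]
new_weights = masked_sgd_step(weights, grads, trainable=[1], lr=0.5)
print(new_weights)  # [[1.0, 2.0], [2.5, 3.5]]
```

The further split in claim 4 could be approximated by repeating the same similarity grouping on the rows of the second subset alone.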
Transfer learning · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Backpropagation, e.g. using gradient descent · CPC title