Building of Custom Convolution Filter for a Neural Network Using an Automated Evolutionary Process

US2021174175A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2021174175-A1
Application number  US-201916705488-A
Country             US
Kind code           A1
Filing date         Dec 6, 2019
Priority date       Dec 6, 2019
Publication date    Jun 10, 2021
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Mechanisms are provided for synthesizing a computer implemented neural network. An initially trained neural network is received and modified by introducing a new hidden layer of neurons and new connections that connect the new hidden layer of neurons to an output layer and a previous layer of neurons previously directly connected to the output layer of neurons to generate a modified neural network. The modified neural network is trained through one or more epochs of machine learning to generate modified weight values for the new connections and the new connections are pruned based on the modified weight values to remove a subset of the new connections and leaving remaining connections in the modified neural network. A merge operation is performed on the remaining connections in the modified neural network to generate a custom convolution filter and modified neural network. The modified neural network is then retrained for deployment.
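As a rough illustration of the first steps in the abstract (inserting a new hidden layer between the previous layer and the output layer, then pruning the new connections by weight), the following is a minimal pure-Python sketch. It is not the patent's implementation. It assumes dense layers stored as lists of rows, a linear activation in the new layer, and an identity initialization so the outputs are unchanged immediately after insertion. The names matvec, identity, insert_hidden_layer, and prune_smallest are hypothetical, and the training step between insertion and pruning is omitted.

```python
def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def identity(n):
    """Return an n-by-n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def insert_hidden_layer(w_out):
    """Insert a new hidden layer between the previous layer and the output.

    Assumption: the new layer has as many neurons as the previous layer and
    uses a linear activation, so identity weights on the new connections
    leave the network outputs unchanged right after insertion.
    Returns (previous -> new hidden weights, new hidden -> output weights).
    """
    n_prev = len(w_out[0])
    return identity(n_prev), [row[:] for row in w_out]


def prune_smallest(weights, keep):
    """Zero out all but the `keep` largest-magnitude connections."""
    ranked = sorted(
        ((abs(w), i, j) for i, row in enumerate(weights) for j, w in enumerate(row)),
        reverse=True,
    )
    kept = {(i, j) for _, i, j in ranked[:keep]}
    return [
        [w if (i, j) in kept else 0.0 for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


# Usage: the output is the same before and after inserting the new layer.
w_out = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
x = [1.0, 2.0, 3.0]
w_new, w_out_copy = insert_hidden_layer(w_out)
before = matvec(w_out, x)
after = matvec(w_out_copy, matvec(w_new, x))

# Illustrative (made-up) weights standing in for the result of training.
trained = [[0.9, 0.05, -0.02], [0.1, 1.1, 0.0], [-0.03, 0.04, 0.8]]
pruned = prune_smallest(trained, keep=3)
```

Starting from a function-preserving initialization means any later change in output can be attributed to training of the new connections rather than to the insertion itself. Pruning by magnitude is only one of the selection rules the claims describe.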

First claim

Opening claim text (preview).

What is claimed is:

1. A method, in a data processing system, for synthesizing a computer implemented neural network with a custom convolution filter, the method comprising:
    receiving an initially trained neural network comprising an initial set of weight values associated with connections in the initially trained neural network;
    modifying the initially trained neural network by introducing a first new hidden layer of neurons and new connections that connect the first new hidden layer of neurons to an output layer of neurons of the initially trained neural network and a previous layer of neurons previously directly connected to the output layer of neurons to generate a modified neural network;
    training the modified neural network through one or more epochs of machine learning to generate modified weight values for the new connections;
    pruning the new connections based on the modified weight values to remove a subset of the new connections and leaving remaining connections in the modified neural network;
    performing a merge operation on the remaining connections in the modified neural network to generate a custom convolution filter and modified neural network comprising the custom convolution filter; and
    retraining the modified neural network with one or more additional epochs of machine learning to generate a trained neural network, having the custom convolution filter, for deployment.

2. The method of claim 1, further comprising repeating the modifying of the initially trained neural network, training the modified neural network, pruning the new connections based on the modified weight values, and performing the merge operation for a second new hidden layer of neurons connected to the first new hidden layer.

3. The method of claim 1, further comprising: selecting a subset of neurons of the initially trained neural network as the previous layer of neurons.

4. The method of claim 1, wherein modifying the initially trained neural network further comprises setting weights of the new connections to values which do not perturb output values of output neurons of the output layer of neurons more than a predetermined threshold amount.

5. The method of claim 1, wherein the first hidden layer of neurons comprises a same number of neurons as neurons in the previous layer of neurons, and wherein weights and biases for the new connections are set to be the same as the weights and biases of connections between the previous layer of neurons and the output layer of neurons.

6. The method of claim 1, wherein pruning the new connections comprises selecting the remaining connections as a predetermined number of the new connections that have a relatively larger amount of change in associated weights, as a result of the training, than other ones of the new connections.

7. The method of claim 1, wherein pruning the new connections comprises selecting the remaining connections as a predetermined number of the new connections having a relatively larger associated weight, as a result of the training, than other ones of the new connections.

8. The method of claim 1, wherein pruning the new connections is performed in an iterative manner wherein in each iteration of the pruning, a fixed number of lowest weighted new connections are removed and the training is repeated for a fixed number of epochs.

9. The method of claim 1, wherein the merge operation performed on the remaining connections in the modified neural network comprises:
    generating a predetermined number of weight buckets;
    determining a measure of similarity or dissimilarity between weight vectors of neurons in the first new hidden layer;
    binning neurons into weight buckets of the predetermined number of weight buckets based on the measure of similarity or dissimilarity; and
    for each weight bucket, merging connections associated with neurons in the weight bucket together to generate a single merged neuron and corresponding merged set of connections.

10. The method of claim 1, wherein the method is executed by a graphics processing unit (GPU) of the data processing system utilizing acceleration capabilities of the GPU.

11. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to synthesize a computer implemented neural network with a custom convolution filter, at least by:
    receiving an initially trained neural network comprising an initial set of weight values associated with connections in the initially trained neural network;
    modifying the initially trained neural network by introducing a first new hidden layer of neurons and new connections that connect the first new hidden layer of neurons to an output layer of neurons of the initially trained neural network and a previous layer of neurons previously directly connected to the output layer of neurons to generate a modified neural network;
    training the modified neural network through one or more epochs of machine learning to generate modified weight values for the new connections;
    pruning the new connections based on the modified weight values to remove a subset of the new connections and leaving remaining connections in the modified neural network;
    performing a merge operation on the remaining connections in the modified neural network to generate a custom convolution filter and modified neural network comprising the custom convolution filter; and
    retraining the modified neural network with one or more additional epochs of machine learning to generate a trained neural network, having the custom convolution filter, for deployment.

12. The computer program product of claim 11, wherein the computer readable program further causes the computing device to synthesize the computer implemented neural network with a custom convolution filter, at least by repeating the modifying of the initially trained neural network, training the modified neural network, pruning the new connections based on the modified weight values, and performing the merge operation for a second new hidden layer of neurons connected to the first new hidden layer.

13. The computer program product of claim 11, wherein the computer readable program further causes the computing device to synthesize the computer implemented neural network with a custom convolution filter, at least by selecting a subset of neurons of the initially trained neural network as the previous layer of neurons.

14. The computer program product of claim 11, wherein modifying the initially trained neural network further comprises setting weights of the new connections to values which do not perturb output values of output neurons of the output layer of neurons more than a predetermined threshold amount.

15. The computer program product of claim 11, wherein the first hidden layer of neurons comprises a same number of neurons as neurons in the previous layer of neurons, and wherein weights and biases for the new connections are set to be the same as the weights and biases of connections between the previous layer of neurons and the output layer of neurons.

16. The computer program product of claim 11, wherein pruning the new connections comprises selecting the remaining connections as a predetermined number of the new connections that have a relatively larger amount of change in associated weights, as a result of the training, than other ones of the new connections.

17. The computer program product of claim 11, wherein pru…
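The bucket-based merge recited in claim 9 (create a fixed number of weight buckets, measure similarity between neuron weight vectors, bin the neurons, then merge each bucket into one neuron) could be sketched roughly as follows. This is an illustrative pure-Python sketch under stated assumptions, not the method as implemented by the applicant: the similarity measure is assumed to be Euclidean distance, the first num_buckets neurons are assumed to seed the buckets, and merging is assumed to mean averaging the weight vectors in a bucket. The claim does not fix any of these choices. The names distance, bin_neurons, and merge_buckets are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two weight vectors (assumed dissimilarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bin_neurons(weight_vectors, num_buckets):
    """Assign each neuron index to the bucket whose seed vector is nearest.

    Assumption: the first `num_buckets` neurons serve as the bucket seeds.
    """
    seeds = weight_vectors[:num_buckets]
    buckets = [[] for _ in range(num_buckets)]
    for idx, vec in enumerate(weight_vectors):
        nearest = min(range(num_buckets), key=lambda b: distance(vec, seeds[b]))
        buckets[nearest].append(idx)
    return buckets


def merge_buckets(weight_vectors, buckets):
    """Merge each non-empty bucket into one neuron by averaging its weight vectors."""
    dim = len(weight_vectors[0])
    merged = []
    for members in buckets:
        if members:
            merged.append(
                [sum(weight_vectors[m][d] for m in members) / len(members) for d in range(dim)]
            )
    return merged


# Usage: four neurons with two incoming weights each, merged into two neurons.
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
buckets = bin_neurons(vectors, num_buckets=2)
merged = merge_buckets(vectors, buckets)
```

Neurons that land in the same bucket end up sharing one set of weights, which is the sense in which the merged connections can act like a shared filter. A different similarity measure or merge rule would change which neurons are combined.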

Assignees

Inventors

Classifications

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • Modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021174175A1 cover?
Mechanisms are provided for synthesizing a computer implemented neural network. An initially trained neural network is received and modified by introducing a new hidden layer of neurons and new connections that connect the new hidden layer of neurons to an output layer and a previous layer of neurons previously directly connected to the output layer of neurons to generate a modified neural netw…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 10, 2021 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).