Self-pruning neural networks for weight parameter reduction

US2022129756A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022129756-A1
Application number  US-202217572625-A
Country             US
Kind code           A1
Filing date         Jan 10, 2022
Priority date       Dec 12, 2017
Publication date    Apr 28, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique to prune weights of a neural network using an analytic threshold function h(w) provides a neural network having weights that have been optimally pruned. The neural network includes a plurality of layers in which each layer includes a set of weights w associated with the layer that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination thereof. Each set of weights is based on a cost function C that has been minimized by back-propagating an output of the neural network in response to input training data. The cost function C is also minimized based on a derivative of the cost function C with respect to a first parameter of the analytic threshold function h(w) and on a derivative of the cost function C with respect to a second parameter of the analytic threshold function h(w).
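The abstract describes h(w) only by its behavior, and this page does not reproduce its formula. As a minimal sketch, the Python below assumes one smooth function with the stated shape: close to 0 for weights near zero and close to 1 for weights of large magnitude on either side. The specific form of `h`, the names `alpha` and `beta`, and the helper `edge_half_width` are illustrative assumptions, not text taken from the publication.

```python
import math

def h(w, alpha, beta):
    """Assumed smooth threshold: about 0 near w = 0, about 1 for large |w|.

    ASSUMPTION: this closed form is chosen for illustration only.
    alpha (0 < alpha < 1) controls how sharp the two edges are;
    beta (> 0) controls how far apart the two edges sit.
    """
    return 1.0 / (1.0 + math.exp(-(beta * w) ** 2) / alpha ** 2)

def edge_half_width(alpha, beta):
    """|w| at which the assumed h(w) crosses 0.5 (needs 0 < alpha < 1)."""
    return math.sqrt(2.0 * math.log(1.0 / alpha)) / beta

if __name__ == "__main__":
    alpha, beta = 0.01, 10.0
    # Print h(w) on a small symmetric grid to show the notch around zero.
    for w in (-1.0, -0.3, -0.1, 0.0, 0.1, 0.3, 1.0):
        print(f"w={w:+.2f}  h(w)={h(w, alpha, beta):.6f}")
    print("half-width of the pruned band:", edge_half_width(alpha, beta))
```

Under this assumed form, increasing `beta` narrows the band of weights that are driven toward zero, and decreasing `alpha` makes the transition between the two levels steeper.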

First claim

Opening claim text (preview).

What is claimed is:

1. A data-processing device, comprising: a processor; and a memory, the data-processing device being configured as a neural network comprising a plurality of layers, at least one layer of the plurality of layers comprising a convolutional layer, each layer of the plurality of layers comprising a set of weights w associated with the layer that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination thereof, each set of weights being pruned using an analytic threshold function h(w), the analytic threshold function h(w) comprising a first predetermined value for a first set of continuous weight values centered around 0, and a second predetermined value for a second set of continuous weight values and for a third set of continuous weight values, the first predetermined value being different from the second predetermined value, the second set of continuous weight values being different from and greater than the first set of continuous weight values and the third set of continuous weight values being different from and less than the first set of continuous weight values, when graphed the analytic threshold function h(w) comprising a first parameter that sets a sharpness characteristic of a first edge and of a second edge of the analytic threshold function h(w) between the first predetermined value and the second predetermined value, and a second parameter that sets a distance between the first edge and the second edge of the analytic threshold function h(w).

2. The data-processing device of claim 1, wherein the first predetermined value equals 0 and the second predetermined value equals 1.

3. The data-processing device of claim 1, wherein the analytic threshold function h(w) further comprising a first edge between the first set of continuous weight values and the second set of continuous weight values and a second edge between the first set of continuous weight values and the third set of continuous weight values, the sharpness characteristic of the first edge and of the second edge between the first predetermined value and the second predetermined value being based on a value of the first parameter of the analytic threshold function h(w) and the distance between the first and second edges being based on a value of the second parameter of the analytic threshold function h(w).

4. The data-processing device of claim 3, wherein the analytic threshold function h(w) is proportional to β and inversely proportional to α, in which α is the first parameter, and β is the second parameter.

5. The data-processing device of claim 4, wherein an initial value for the first parameter α and an initial value for the second parameter β is based on a partial second derivative of the analytic threshold function h(w) with respect to w being equal to zero.

6. The data-processing device of claim 5, wherein the first parameter α and the second parameter β for each set of weights is based on a cost function C that is minimized by back-propagating an output of the neural network in response to input training data, on a derivative of the cost function C with respect to a first parameter of the analytic threshold function h(w) and on a derivative of the cost function C with respect to a second parameter of the analytic threshold function h(w), and wherein the cost function C is based on a number of layers, an index of weights in a final layer and one or more regularization parameters.

7. The data-processing device of claim 6, wherein the cost function C is minimized based on the derivative of the cost function C with respect to the first parameter α by updating the first parameter α during back-propagating the output through the neural network, and wherein the cost function C is minimized based on the derivative of the cost function C with respect to the second parameter β by updating the second parameter β during back-propagating the output through the neural network.

8. The data-processing device of claim 7, wherein the cost function C is further minimized by updating values for weights w of each set of weights during back-propagating the output through the neural network.

9. The data-processing device of claim 1, wherein the neural network comprises a deep neural network.

10. A method to prune weights of a neural network, the method comprising: forming a weight function f(w) for weights w associated with each layer of a plurality of layers of the neural network based on an analytic threshold function h(w), the analytic threshold function h(w) comprising a first predetermined value for a first set of continuous weight values centered around 0, and a second predetermined value for a second set of continuous weight values and for a third set of continuous weight values, the first predetermined value being different from the second predetermined value, the second set of continuous weight values being different from and greater than the first set of continuous weight values and the third set of continuous weight values being different from and less than the first set of continuous weight values, when graphed the analytic threshold function h(w) further comprising a first edge between the first set of continuous weight values and the second set of continuous weight values and a second edge between the first set of continuous weight values and the third set of continuous weight values, a sharpness characteristic of each of the first and second edges between the first predetermined value and the second predetermined value being based on a value of a first parameter of the analytic threshold function h(w) and a distance between the first and second edges being based on a value of a second parameter of the analytic threshold function h(w); inputting training data to the neural network to generate an output based on the training data; back-propagating the output through the neural network; and minimizing a difference between the output and the training data to determine a set of weights w that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination thereof, by minimizing a cost function C based on a derivative of the cost function C with respect to the first parameter and based on a derivative of the cost function C with respect to the second parameter.

11. The method of claim 10, wherein the analytic threshold function h(w) is proportional to β and inversely proportional to α, in which α is the first parameter, and β is the second parameter.

12. The method of claim 11, further comprising initializing the first parameter α and the second parameter β based on a partial second derivative of the analytic threshold function h(w) with respect to w being equal to zero.

13. The method of claim 11, wherein the weight function f(w) comprises weights of the neural network multiplied by the analytic threshold function h(w).

14. The method of claim 13, wherein the cost function C is based on a number of layers, an index of weights in a final layer and one or more regularization parameters.

15. The method of claim 14, wherein minimizing the cost function C based on the derivative of the cost function C with respect to the first parameter α comprises updating the first parameter α during back-propagating the output through the neural network, and wherein minimizing the cost function C based on the derivative of the cost function C with respect to the second parameter β comprises updating the second parameter β during back-propagating the output through the neural network.

16. The method of claim 15,
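The claims above describe a weight function f(w) formed by multiplying each weight by h(w), and a cost function C minimized using its derivatives with respect to the weights, α and β. The Python sketch below illustrates that idea for a single weight. It rests on several assumptions that are not in the claim text: the closed form of `h`, a toy squared-error cost with no regularization term, the scalar input `x` and target `y`, and a plain gradient-descent update with learning rate `lr`. The claimed cost function depends on the number of layers and on regularization parameters that are not shown here.

```python
import math

def h(w, alpha, beta):
    """ASSUMED threshold form: about 0 near w = 0, about 1 for large |w|."""
    return 1.0 / (1.0 + math.exp(-(beta * w) ** 2) / alpha ** 2)

def f(w, alpha, beta):
    """Weight function: the raw weight multiplied by the threshold."""
    return w * h(w, alpha, beta)

def cost(w, alpha, beta, x, y):
    """Toy cost (assumption): squared error of a one-weight linear model."""
    return 0.5 * (f(w, alpha, beta) * x - y) ** 2

def grads(w, alpha, beta, x, y):
    """Analytic derivatives of the toy cost with respect to w, alpha, beta."""
    u = math.exp(-(beta * w) ** 2) / alpha ** 2
    hv = 1.0 / (1.0 + u)
    s = u * hv * hv                      # u / (1 + u)^2, shared factor
    dh_dw = 2.0 * beta ** 2 * w * s
    dh_dalpha = 2.0 * s / alpha
    dh_dbeta = 2.0 * beta * w ** 2 * s
    err = (w * hv * x - y) * x           # dC/df by the chain rule
    return (err * (hv + w * dh_dw),      # dC/dw
            err * w * dh_dalpha,         # dC/dalpha
            err * w * dh_dbeta)          # dC/dbeta

def train_step(w, alpha, beta, x, y, lr=0.01):
    """One gradient-descent update of the weight and both threshold parameters."""
    gw, ga, gb = grads(w, alpha, beta, x, y)
    return w - lr * gw, alpha - lr * ga, beta - lr * gb

if __name__ == "__main__":
    w, alpha, beta = 0.3, 0.1, 5.0
    for step in range(5):
        print(step, cost(w, alpha, beta, 1.5, 1.0))
        w, alpha, beta = train_step(w, alpha, beta, 1.5, 1.0)
```

Because the assumed h(w) is differentiable everywhere, the threshold parameters can be updated in the same backward pass as the weights, which is the property the claims rely on when they call the function analytic.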

Assignees

Inventors

Classifications

  • Architecture, e.g. interconnection topology · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • Supervised learning · CPC title

  • G06N3/082 · Primary

    modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022129756A1 cover?
A technique to prune weights of a neural network using an analytic threshold function h(w) provides a neural network having weights that have been optimally pruned. The neural network includes a plurality of layers in which each layer includes a set of weights w associated with the layer that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination …
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/082. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 28, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).