Neural network method and apparatus

US2024346317A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2024346317-A1
Application number: US-202418752163-A
Country: US
Kind code: A1
Filing date: Jun 24, 2024
Priority date: Sep 16, 2019
Publication date: Oct 17, 2024
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for the pruning of a neural network is provided. The method sets a weight threshold value based on a weight distribution of layers included in a neural network, predicts a change of inference accuracy of a neural network by pruning of each layer based on the weight threshold value, determines a current subject layer to be pruned with a weight threshold value among the layers included in the neural network, and prunes a determined current subject layer.
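As a rough illustration of setting a weight threshold from a weight distribution, the sketch below assumes the weights are zero-mean and normally distributed and that pruning is magnitude-based (weights with |w| below the threshold are zeroed). Neither assumption is stated in the abstract itself, and the function names `threshold_for_rate` and `prune_by_magnitude` are hypothetical. Under those assumptions, a target pruning rate p corresponds to the threshold t = sigma * Phi^-1((1 + p) / 2), where Phi^-1 is the inverse standard normal CDF.

```python
import random
from statistics import NormalDist


def threshold_for_rate(target_rate, sigma=1.0):
    """Magnitude threshold t with P(|w| < t) == target_rate for w ~ N(0, sigma^2).

    Assumption: zero-mean normal weights and magnitude-based pruning.
    """
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must be strictly between 0 and 1")
    # P(|w| < t) = 2 * Phi(t / sigma) - 1, solved for t.
    return sigma * NormalDist().inv_cdf((1.0 + target_rate) / 2.0)


def prune_by_magnitude(weights, threshold):
    """Zero every weight whose magnitude is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


# Synthetic stand-in for one layer's weights (not data from the patent).
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

t = threshold_for_rate(0.5)          # aim to prune about half of the weights
pruned = prune_by_magnitude(weights, t)
achieved = pruned.count(0.0) / len(pruned)
print(f"threshold={t:.4f}, achieved pruning rate={achieved:.3f}")
```

With real layers the weight distribution is only approximately normal, so the achieved rate would differ from the target; an empirical quantile of the weight magnitudes is a common alternative.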

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented neural network method of one or more processors pruning a neural network, the method comprising:
    generating a resultant pruned neural network through a performing of plural pruning iterations by the one or more processors until a number of all layers of the neural network or at least all of the layers have been pruned, the plural pruning iterations respectively including:
        determining a weight threshold value to prune the neural network to a target pruning rate, based on a weight distribution of layers included in the neural network;
        pruning plural layers of the neural network by the one or more processors based on the determined weight threshold value;
        calculating a sensitivity of each of the layers corresponding to a change in inference accuracy of the neural network based on an input pruning data set;
        determining a current subject layer to be pruned among each of the layers of the neural network, based on the calculated sensitivity; and
        generating a pruned neural network by pruning the determined current subject layer.

2. The method of claim 1, wherein the generating of the resultant pruned neural network comprises generating the resultant pruned neural network until an inference accuracy of a previously pruned neural network is determined to meet a target inference accuracy threshold.

3. The method of claim 1, wherein the determining of the current subject layer to be pruned comprises determining a subject layer that has a lowest sensitivity among the calculated sensitivities as the current subject layer to be pruned.

4. The method of claim 3, wherein the lowest sensitivity represents that the determined subject layer has a least effect on a decrease in an inference accuracy of a previously pruned neural network compared to an inference accuracy of a currently trained neural network.

5. The method of claim 1, wherein the change in the inference accuracy is predicted based on a difference between an inference accuracy for each of the layers before pruning on each layer is performed and an inference accuracy for each of the layers after pruning on each of the layers is performed.

6. The method of claim 1, wherein the determining of the weight threshold value comprises determining a weight value corresponding to the target pruning rate to be the weight threshold value when the weight distribution corresponds to a standard normal distribution.

7. The method of claim 1, wherein the generating of the pruned neural network comprises pruning the current subject layer by adjusting a pruning rate of weights of the current subject layer by updating the weight threshold value until the inference accuracy of the neural network based on the input pruning data set is decreased to a threshold accuracy.

8. The method of claim 7, wherein the updating of the weight threshold value comprises increasing a current weight threshold value when the inference accuracy of the neural network that includes weights pruned to the current weight threshold value is not decreased to the threshold accuracy.

9. The method of claim 1, wherein the input pruning data set comprises one of a data set generated by randomly extracting a predetermined number of data sources for each class included in a given data set, or a data set generated by selecting valid classes from the given data set and randomly extracting a predetermined number of data sources for each selected valid class.

10. The method of claim 1, wherein the generating of the pruned neural network is performed without retraining of the pruned neural network using the input pruning data set.

11. A neural network apparatus comprising:
    one or more processors configured to execute computer-readable instructions; and
    one or more memories storing the computer-readable instructions, which when executed by the one or more processors configure the one or more processors to generate a resultant pruned neural network through a performing of plural pruning iterations by the one or more processors until a number of all layers of the neural network or at least all of the layers have been pruned, the plural pruning iterations respectively including:
        a determination of a weight threshold value to prune the neural network to a target pruning rate, based on a weight distribution of layers included in the neural network;
        a pruning of plural layers of the neural network by the one or more processors based on the determined weight threshold value;
        a calculation of a sensitivity of each of the layers corresponding to a change in inference accuracy of the neural network based on an input pruning data set;
        a determination of a current subject layer to be pruned among each of the layers of the neural network, based on the calculated sensitivity; and
        a generation of a pruned neural network by pruning the determined current subject layer.

12. The apparatus of claim 11, wherein the plural pruning iterations is repeated until an inference accuracy of a previously pruned neural network is determined to meet a target inference accuracy threshold.

13. The apparatus of claim 11, wherein the one or more processors are further configured to determine a subject layer that has a lowest sensitivity among the calculated sensitivities as the current subject layer to be pruned.

14. The apparatus of claim 13, wherein the lowest sensitivity represents that the determined subject layer has a least effect on a decrease in an inference accuracy of a previously pruned neural network compared to an inference accuracy of a currently trained neural network.

15. The apparatus of claim 11, wherein the change in the inference accuracy is predicted based on a difference between an inference accuracy for each of the layers before pruning on each layer is performed and an inference accuracy for each of the layers after pruning on each of the layers is performed.

16. The apparatus of claim 11, wherein the one or more processors are further configured to determine a weight value corresponding to the target pruning rate to be the weight threshold value when the weight distribution corresponds to a standard normal distribution.

17. The apparatus of claim 11, wherein the one or more processors are further configured to prune the current subject layer by adjusting a pruning rate of weights of the current subject layer by updating the weight threshold value until the inference accuracy of the neural network based on the input pruning data set is decreased to a threshold accuracy.

18. The apparatus of claim 17, wherein the one or more processors are further configured to increase a current weight threshold value when the inference accuracy of the neural network that includes weights pruned to the current weight threshold value is not decreased to the threshold accuracy.

19. The apparatus of claim 11, wherein the input pruning data set comprises one of a data set generated by randomly extracting a predetermined number of data sources for each class included in a given data set, or a data set generated by selecting valid classes from the given data set and randomly extracting a predetermined number of data sources for each selected valid class.

20. The apparatus of claim 11, wherein the generation of the pruned neural network is performed without retraining of the pruned neural network using the input pruning data set.
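The iteration described in the claims (measure each layer's sensitivity, pick the least sensitive layer, then raise its threshold while accuracy holds) can be sketched roughly as follows. This is a simplified reading for illustration, not the patented implementation: the helper names, the `step` and `max_threshold` parameters, and especially the `evaluate` function are assumptions. `evaluate` is a toy accuracy proxy with made-up per-layer importances, standing in for running inference on an input pruning data set. No retraining happens between steps.

```python
def prune_layer(weights, threshold):
    """Zero every weight whose magnitude is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def layer_sensitivities(model, threshold, evaluate, candidates):
    """Accuracy drop caused by pruning each candidate layer on its own."""
    baseline = evaluate(model)
    drops = {}
    for name in candidates:
        trial = dict(model)
        trial[name] = prune_layer(model[name], threshold)
        drops[name] = baseline - evaluate(trial)
    return drops


def prune_network(model, threshold, evaluate, min_accuracy,
                  step=0.1, max_threshold=1.0):
    """Visit every layer once, least sensitive first, without retraining."""
    model = {name: list(weights) for name, weights in model.items()}
    remaining = set(model)
    order = []
    while remaining:
        drops = layer_sensitivities(model, threshold, evaluate, sorted(remaining))
        subject = min(drops, key=drops.get)   # lowest sensitivity
        t = threshold
        # Keep raising the threshold while accuracy stays at or above the floor.
        while t <= max_threshold:
            trial = dict(model)
            trial[subject] = prune_layer(model[subject], t)
            if evaluate(trial) < min_accuracy:
                break
            model = trial
            t += step
        remaining.discard(subject)
        order.append(subject)
    return model, order


# Toy stand-ins (hypothetical): identical weights per layer, made-up importances.
WEIGHTS = [-1.5, -0.8, -0.3, -0.1, 0.05, 0.2, 0.6, 1.2]
IMPORTANCE = {"conv1": 0.30, "conv2": 0.05, "fc": 0.15}


def evaluate(model):
    """Accuracy proxy: 1 minus importance-weighted fraction of zeroed weights."""
    penalty = 0.0
    for name, weights in model.items():
        penalty += IMPORTANCE[name] * weights.count(0.0) / len(weights)
    return 1.0 - penalty


toy_model = {name: list(WEIGHTS) for name in IMPORTANCE}
pruned_model, order = prune_network(toy_model, threshold=0.25,
                                    evaluate=evaluate, min_accuracy=0.86)
print(order)
print({name: w.count(0.0) for name, w in pruned_model.items()})
```

In this toy run the low-importance layer is pruned first and most aggressively, while the most sensitive layer is left untouched because pruning it would push the proxy accuracy below the floor.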

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • Supervised learning · CPC title

  • G06N3/082 · Primary

    modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • Architecture, e.g. interconnection topology · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024346317A1 cover?
A method and apparatus for the pruning of a neural network is provided. The method sets a weight threshold value based on a weight distribution of layers included in a neural network, predicts a change of inference accuracy of a neural network by pruning of each layer based on the weight threshold value, determines a current subject layer to be pruned with a weight threshold value among the lay…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/082. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 17, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).