What technology area does this patent fall under?

Primary CPC classification G06N3/082. Mapped technology areas include Physics.

When was this patent published?

Publication date Feb 24, 2026 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Neural network layer folding

US12561566B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12561566-B2
Application number	US-202117399374-A
Country	US
Kind code	B2
Filing date	Aug 11, 2021
Priority date	Apr 7, 2021
Publication date	Feb 24, 2026
Grant date	Feb 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure describes neural network reduction techniques for decreasing the number of neurons or layers in a neural network. Embodiments of the method, apparatus, non-transitory computer readable medium, and system are configured to receive a trained neural network and replace certain non-linear activation units with an identity function. Next, linear blocks may then be folded to form a single block in places where the non-linear activation units were replaced by an identity function. Such techniques may reduce the number of layers in the neural network, which may optimize power and computation efficiency of the neural network architecture (e.g., without unduly influencing the accuracy of the network model).
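The folding step described in the abstract can be illustrated with a minimal sketch. It assumes the activation between two fully connected layers has already been replaced by the identity, so that W2(W1 x + b1) + b2 equals (W2 W1) x + (W2 b1 + b2) and the two layers collapse into one. The helper names (`matmul`, `matvec`, `affine`, `fold_affine`) and the toy weights are hypothetical and are not taken from the patent.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def affine(w, b, x):
    """Apply the affine map x -> W x + b."""
    return [y + b_i for y, b_i in zip(matvec(w, x), b)]

def fold_affine(w1, b1, w2, b2):
    """Fold y = W2 (W1 x + b1) + b2 into a single map y = W x + b."""
    w = matmul(w2, w1)
    b = [y + b_i for y, b_i in zip(matvec(w2, b1), b2)]
    return w, b

# Toy weights (hypothetical): a 2 -> 3 layer followed by a 3 -> 2 layer.
w1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
b1 = [0.5, -0.5, 1.0]
w2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]]
b2 = [0.0, 1.0]
x = [2.0, -3.0]

two_layer = affine(w2, b2, affine(w1, b1, x))  # identity activation in between
w, b = fold_affine(w1, b1, w2, b2)
one_layer = affine(w, b, x)

print(two_layer, one_layer)  # both are [4.5, 7.0]
```

The folded network stores one 2x2 weight matrix instead of a 3x2 and a 2x3 matrix, which is the kind of layer reduction the abstract refers to.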

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining, through a cloud and by a neural network design apparatus, a neural network provided by a user device and including an affine function and a non-linear activation function; replacing, by the neural network design apparatus, the non-linear activation function with a parameterized activation function that includes a target affine function and a product of a linearity parameter and the non-linear activation function; iteratively adjusting, by the neural network design apparatus, the linearity parameter of the parameterized activation function to obtain an approximately affine activation function based on an auxiliary loss term that encourages the parameterized activation function to approach the target affine function; reducing, by the neural network design apparatus, the neural network by combining the approximately affine activation function with the affine function of the neural network based on the target affine function to obtain a reduced neural network; and sending, through the cloud and by the neural network design apparatus, the reduced neural network to the user device to allow a computing device to execute the reduced neural network.

2. The method of claim 1, wherein: the parameterized activation function includes the non-linear activation function, an additive inverse of a product of the linearity parameter and the non-linear activation function, and a product of the linearity parameter and a target affine function.

3. The method of claim 1, wherein: the parameterized activation function includes further comprises a product of an additional parameter and a target affine function.

4. The method of claim 1, wherein: the iteratively adjusting the linearity parameter comprises selecting a value for the linearity parameter, computing the auxiliary loss term based on the selected value, and updating the value for the linearity parameter based on the auxiliary loss term.

5. The method of claim 1, wherein: the auxiliary loss term encourages the linearity parameter to approach a value that causes the parameterized activation function to approach a target affine function.

6. The method of claim 1, wherein: the combining the approximately affine activation function with the affine function of the neural network comprises combining the approximately affine activation function with a first affine function before the approximately affine activation function and a second affine function after the approximately affine activation function.

7. The method of claim 1, wherein: the combining the approximately affine activation function with the affine function of the neural network comprises eliminating a skip connection of the neural network.

8. The method of claim 1, further comprising: replacing a plurality of non-linear activation functions with a plurality of parameterized activation functions having a same linearity parameter; and combining the plurality of non-linear activation functions with a plurality of affine functions to obtain the reduced neural network.

9. The method of claim 8, wherein: the plurality of non-linear activation functions is bypassed by a same skip connection.

10. The method of claim 8, wherein: the plurality of non-linear activation functions comprises a kernel boundary of a convolutional neural network.

11. The method of claim 1, further comprising: refining the reduced neural network based on a loss function that does not include the auxiliary loss term.

12. The method of claim 1, wherein: the non-linear activation function comprises one or more rectified linear unit (ReLU) blocks and the parameterized activation function comprises one or more parametric ReLU blocks.

13. The method of claim 1, wherein: the neural network comprises a convolutional neural network (CNN).

14. The method of claim 13, wherein: the reduced neural network comprises the CNN with a reduced number of layers.

15. A method comprising: obtaining, through a cloud and by a neural network design apparatus, a neural network provided by a user device and including an affine function and a non-linear activation function; replacing, by the neural network design apparatus, the non-linear activation function with a parameterized activation function that includes target affine function and a product of a linearity parameter and the non-linear activation function; computing, by the neural network design apparatus, an auxiliary loss term based on a value selected for the linearity parameter of the parameterized activation function, wherein the auxiliary loss term encourages the parameterized activation function to approach the target affine function; iteratively updating, by the neural network design apparatus, the value for the linearity parameter of the parameterized activation function based on the auxiliary loss term to obtain an approximately affine activation function; combining, by the neural network design apparatus, the approximately affine activation function with the affine function of the neural network to obtain a reduced neural network; and sending, through the cloud and by the neural network design apparatus, the reduced neural network to the user device to allow a computing device to execute the reduced neural network.

16. The method of claim 15, further comprising: refining the reduced neural network based on a loss function that does not include the auxiliary loss term.

17. A neural network design apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, cause the neural network design apparatus to: (a) obtain, through a cloud, a neural network provided by a user device, and including an affine function and a non-linear activation function; (b) modify the neural network by replacing the non-linear activation function with a parameterized activation function that includes a target affine function and a product of a linearity parameter and the non-linear activation function; (c) iteratively adjust the linearity parameter of the parameterized activation function to obtain an approximately affine activation function based on an auxiliary loss term that encourages the parameterized activation function to approach the target affine function; (d) combine the approximately affine activation function with the affine function of the neural network based on the target affine function to obtain a reduced neural network; and (e) send, through the cloud, the reduced neural network to the user device to allow a computing device to execute the reduced neural network.

18. The neural network design apparatus of claim 17, wherein: the instructions, when executed by the processor, further cause the neural network design apparatus to select a value for the linearity parameter, compute the auxiliary loss term based on the selected value, and update the value for the linearity parameter based on the auxiliary loss term.

19. The neural network design apparatus of claim 17, wherein: the instructions, when executed by the processor, further cause the neural network design apparatus to combine the approximately affine activation function with a first affine function before the approximately affine activation function and a second affine function after the approximately affine activation function.

20. The neural network design apparatus of claim 17, wherein: the instructions, when executed by the processor, furt
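As a rough illustration of claims 1, 2, 4, and 12, the sketch below takes ReLU as the non-linear activation and assumes the identity as the target affine function. Under that assumption, the claim 2 form σ(x) − α·σ(x) + α·T(x) reduces to a parametric ReLU with negative slope α, which becomes exactly the identity at α = 1. The auxiliary loss (1 − α)², the learning rate, the step count, and all function names are hypothetical choices made for illustration; the claims do not specify them, and a real training loop would presumably minimize this term together with the task loss.

```python
def relu(x):
    return max(0.0, x)

def target_affine(x):
    # Assumed target affine function: the identity.
    return x

def parameterized_activation(x, alpha):
    # Claim 2 form: sigma(x) - alpha * sigma(x) + alpha * T(x).
    return relu(x) - alpha * relu(x) + alpha * target_affine(x)

def auxiliary_loss(alpha):
    # Hypothetical penalty; zero exactly when alpha == 1 (fully affine).
    return (1.0 - alpha) ** 2

def auxiliary_grad(alpha):
    # Derivative of (1 - alpha)^2 with respect to alpha.
    return -2.0 * (1.0 - alpha)

def linearize(alpha=0.0, lr=0.1, steps=200):
    """Iteratively update alpha by gradient descent on the auxiliary term only."""
    for _ in range(steps):
        alpha -= lr * auxiliary_grad(alpha)
    return alpha

alpha = linearize()
samples = [-2.0, -0.5, 0.0, 1.5]
outputs = [parameterized_activation(x, alpha) for x in samples]

print(alpha, auxiliary_loss(alpha))
print(outputs)  # approximately equal to `samples` once alpha is near 1
```

At α = 0 the function is plain ReLU, and for intermediate values it behaves like a leaky ReLU. Once α is close to 1 the activation is approximately the identity, so the affine functions on either side of it can be combined into a single affine function, which corresponds to the reduction step in claim 1.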

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06N3/04
Architecture, e.g. interconnection topology · CPC title
G06N3/048
Activation functions · CPC title
G06N3/084
Backpropagation, e.g. using gradient descent · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0495
Quantised networks; Sparse networks; Compressed networks · CPC title

Patent family

Related publications grouped by family.

View patent family 83509435

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12561566B2 cover?: The present disclosure describes neural network reduction techniques for decreasing the number of neurons or layers in a neural network. Embodiments of the method, apparatus, non-transitory computer readable medium, and system are configured to receive a trained neural network and replace certain non-linear activation units with an identity function. Next, linear blocks may then be folded to fo…
Who is the assignee on this patent?: Samsung Electronics Co Ltd

Related patents

Neural network learning device, method, and program

Training of Photonic Neural Networks Through in situ Backpropagation

Automatic thresholds for neural network pruning and retraining
