Neural network compression method, apparatus and device, and storage medium

US12045729B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12045729-B2
Application number	US-202118005620-A
Country	US
Kind code	B2
Filing date	Jan 25, 2021
Priority date	Aug 6, 2020
Publication date	Jul 23, 2024
Grant date	Jul 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network compression method whereby forward inference is performed on target data by using a target parameter sharing network to obtain an output feature map of the last convolutional module, a channel related feature is extracted from the output feature map, the extracted channel related feature and a target constraint condition are input into a target meta-generative network, and an optimal network architecture under the target constraint condition is predicted by using the target meta-generative network to obtain a compressed neural network model.
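The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the use of global average pooling as the "channel related feature", the single scalar constraint, the layer widths, the random weights, and every function name here are assumptions made for illustration. The two fully-connected layers follow the structure stated in claim 8.

```python
import math
import random


def channel_features(feature_map):
    """Reduce a C x H x W feature map to one value per channel.

    Assumption: global average pooling stands in for the patent's
    "channel related feature", which this page does not define.
    """
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]


def linear(x, weights, bias):
    """Fully-connected layer; weights has shape out_dim x in_dim."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(rng, in_dim, out_dim):
    """Random weights as a placeholder for a trained meta-generative network."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def predict_architecture(features, constraint, params, max_channels):
    """Map (channel features, constraint) to a channel count per layer."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, h) for h in linear(features + [constraint], w1, b1)]  # ReLU
    ratios = [1.0 / (1.0 + math.exp(-z)) for z in linear(hidden, w2, b2)]  # sigmoid
    # Keep at least one channel in every layer.
    return [max(1, round(r * c)) for r, c in zip(ratios, max_channels)]


rng = random.Random(0)
# Hypothetical output of the last convolutional module: 8 channels of 4 x 4.
feature_map = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(8)]
max_channels = [16, 32, 64]  # hypothetical widths of the uncompressed layers
params = (init_layer(rng, 8 + 1, 12), init_layer(rng, 12, len(max_channels)))

features = channel_features(feature_map)
architecture = predict_architecture(features, 0.5, params, max_channels)
print(architecture)
```

In this sketch the constraint value 0.5 is a hypothetical budget (for example, a target fraction of the original compute), and the output is one retained channel count per layer. A single forward pass produces the architecture, which is consistent with the claim language about reduced computation load compared with searching over candidate architectures.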

First claim

Opening claim text (preview).

What is claimed is:

1. A neural network compression method performed by a neural network compression device and used for implementing fast neural network compression with reduced computation load on the neural network compression device, comprising: performing forward inference on target data by using a pre-trained target parameter sharing network to obtain an output feature map of a last convolutional module of the pre-trained target parameter sharing network; extracting a channel related feature from the output feature map of the last convolutional module of the pre-trained target parameter sharing network; inputting the extracted channel related feature and a target constraint condition into a target meta-generative network of a pre-trained target weakly supervised meta-learning framework; and predicting an optimal network architecture under the target constraint condition by using the target meta-generative network to obtain a compressed neural network model.

2. The method according to claim 1, wherein the pre-trained target weakly supervised meta-learning framework comprises the target meta-generative network and a target meta-evaluation network connected with the target meta-generative network; and supervised information of the target meta-generative network is from gradient information of the target meta-evaluation network.

3. The method according to claim 2, wherein the pre-trained target parameter sharing network and the pre-trained target weakly supervised meta-learning framework are obtained by the following operations: determining a target neural network model and an initial weakly supervised meta-learning framework, wherein the initial weakly supervised meta-learning framework comprises an initial meta-evaluation network and an initial meta-generative network; controlling the target neural network model to perform learning at a training stage; controlling the initial meta-evaluation network and the initial meta-generative network to perform learning at a validation stage; and repeatedly performing the operations of controlling the target neural network model to perform learning at the training stage and controlling the initial meta-evaluation network and the initial meta-generative network to perform learning at the validation stage until a set first end condition is satisfied, so as to obtain the pre-trained target parameter sharing network and the pre-trained target weakly supervised meta-learning framework.

4. The method according to claim 2, wherein the pre-trained target parameter sharing network and the pre-trained target weakly supervised meta-learning framework are obtained by the following operations: determining a target neural network model and an initial weakly supervised meta-learning framework, wherein the initial weakly supervised meta-learning framework comprises an initial meta-evaluation network and an initial meta-generative network; performing parameter sharing training on the target neural network model to obtain the pre-trained target parameter sharing network; controlling the initial meta-evaluation network and the initial meta-generative network to perform learning at a validation stage; and repeatedly performing the operation of controlling the initial meta-evaluation network and the initial meta-generative network to perform learning at the validation stage until a set second end condition is satisfied, so as to obtain the pre-trained target weakly supervised meta-learning framework.

5. The method according to claim 3, wherein the initial meta-evaluation network is controlled to perform learning at the validation stage by the following operations: generating a set of initial neural network architecture; predicting a weight parameter of a last convolutional module of the target neural network model by using the initial meta-evaluation network according to the initial neural network architecture; constructing a replacement convolutional module for the last convolutional module of the target neural network model by using the initial meta-evaluation network, wherein the replacement convolutional module takes a weight parameter predicted by the initial meta-evaluation network as a weight and takes input data of the last convolutional module of the target neural network model as an input; determining a loss function using an output feature map of the replacement convolutional module; and calculating a gradient according to the loss function by using the initial meta-evaluation network, and performing parameter update.

6. The method according to claim 5, wherein determining the loss function using the output feature map of the replacement convolutional module comprises: inputting the output feature map of the replacement convolutional module into a classifier of the target neural network model to obtain a classification error; calculating a mean square error between the output feature map of the replacement convolutional module and an output feature map of the last convolutional module of the target neural network model; and determining the loss function according to the classification error and the mean square error.

7. The method according to claim 3, wherein the initial meta-generative network is controlled to perform learning at the validation stage by the following operations: performing forward inference by using the target neural network model to obtain an output feature map of a last convolutional module of the target neural network model; extracting a channel related feature from the output feature map of the last convolutional module of the target neural network model; inputting the extracted channel related feature and a current constraint condition into the initial meta-generative network; predicting an optimal network architecture under the current constraint condition by using the initial meta-generative network, and inputting the optimal network architecture into the initial meta-evaluation network; and acquiring a loss function of the optimal network architecture under the current constraint condition by using the initial meta-evaluation network and backward transferring gradient information so that the initial meta-generative network performs gradient computation and parameter update on parameters of the initial meta-generative network based on the gradient information.

8. The method according to claim 2, wherein a network architecture of each of the target meta-evaluation network and the target meta-generative network contains two fully-connected layers, and an input layer of the target meta-generative network and an output layer of the target meta-evaluation network adopt a parameter sharing mechanism.

9. The method according to claim 4, wherein the initial meta-evaluation network is controlled to perform learning at the validation stage by the following operations: generating a set of initial neural network architecture; predicting a weight parameter of a last convolutional module of the target neural network model by using the initial meta-evaluation network according to the initial neural network architecture; constructing a replacement convolutional module for the last convolutional module of the target neural network model by using the initial meta-evaluation network, wherein the replacement convolutional module takes a weight parameter predicted by the initial meta-evaluation network as a weight and takes input data of the last convolutional module of the target neural network model as an input; determining a loss function using an output feature map of the replacement convolutional module; and calculating a gradient according to the loss function by using the initial meta-evaluation network, and performing parameter update.
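Claim 6 builds the meta-evaluation loss from two terms: a classification error and a mean square error between the replacement module's feature map and the original one. The sketch below shows one plausible reading. The claim only says the loss is determined "according to" both terms, so the weighted sum, the `mse_weight` parameter, the use of softmax cross-entropy as the classification error, and the flattened feature vectors are all assumptions.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mean_square_error(a, b):
    """Mean squared difference between two flattened feature maps of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(logits, label, replaced_features, original_features, mse_weight=1.0):
    """Classification error plus weighted feature-map MSE (assumed combination)."""
    return cross_entropy(logits, label) + mse_weight * mean_square_error(
        replaced_features, original_features
    )


# Hypothetical values: classifier logits computed from the replacement module's
# output, and flattened feature maps from the replacement and original modules.
loss = combined_loss([2.0, 0.5, -1.0], 0, [0.1, 0.4], [0.0, 0.5])
print(loss)
```

The MSE term pulls the replacement module's output toward the original module's output, while the classification term keeps the predicted weights useful for the end task. How the patent balances the two terms is not stated in the claim preview on this page.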

Assignees

Inspur Suzhou Intelligent Technology Co Ltd

Inventors

Classifications

G06N3/0895 (Primary): Weakly supervised learning, e.g. semi-supervised or self-supervised learning
G06N3/0475: Generative networks
G06N3/0495: Quantised networks; Sparse networks; Compressed networks
G06N3/0464: Convolutional networks [CNN, ConvNet]
G06N3/0985 (Primary): Hyperparameter optimisation; Meta-learning; Learning-to-learn

Patent family

Related publications grouped by family.

Patent family ID: 73365042

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12045729B2 cover?: A neural network compression method whereby forward inference is performed on target data by using a target parameter sharing network to obtain an output feature map of the last convolutional module, a channel related feature is extracted from the output feature map, the extracted channel related feature and a target constraint condition are input into a target meta-generative network, and an o…
Who is the assignee on this patent?: Inspur Suzhou Intelligent Technology Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06N3/0895. Mapped technology areas include Physics.
When was this patent published?: Publication date Jul 23, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Meta-learning for multi-task learning for neural networks

Neural network compression via weak supervision

Machine learning method and apparatus based on weakly supervised learning
