Neural network model processing method and apparatus

US12333428B2 · US · B2

Patent metadata
Publication number: US-12333428-B2
Application number: US-201917434563-A
Country: US
Kind code: B2
Filing date: Feb 27, 2019
Priority date: Feb 27, 2019
Publication date: Jun 17, 2025
Grant date: Jun 17, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network model processing method includes obtaining a first low-bit neural network model through training, where the model includes a first operation layer and a second operation layer. Each operation layer includes at least one operation. Values/a value of a parameter and/or data used for the operation are/is represented by using N bits, and N is a positive integer less than 8. The neural network model processing method further includes compressing the model to obtain a second low-bit neural network model, where the compressed model includes a third operation layer. The third operation layer is equivalent to the first operation layer and the second operation layer, and an operation layer other than the third operation layer in the at least one operation layer is the same as an operation layer other than the first operation layer and the second operation layer in the at least two operation layers.
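To make the idea of an "equivalent" third operation layer concrete, the following is a minimal sketch and not the patented implementation. It assumes, purely for illustration, that both original layers are integer affine operations (`y = scale * x + shift`), so they can be merged into a single affine operation. The names `Affine`, `fuse`, `fits_in_n_bits`, and the choice `N_BITS = 4` are hypothetical; the abstract only requires that N be a positive integer less than 8.

```python
from dataclasses import dataclass

N_BITS = 4  # hypothetical choice; the abstract only requires 0 < N < 8


def fits_in_n_bits(value: int, n_bits: int = N_BITS) -> bool:
    """Return True if value is representable as a signed n_bits integer."""
    return -(1 << (n_bits - 1)) <= value <= (1 << (n_bits - 1)) - 1


@dataclass(frozen=True)
class Affine:
    """A toy operation layer computing y = scale * x + shift on integers."""
    scale: int
    shift: int

    def __call__(self, x: int) -> int:
        return self.scale * x + self.shift


def fuse(first: Affine, second: Affine) -> Affine:
    """Build one layer equivalent to applying first, then second."""
    # second(first(x)) = s2 * (s1 * x + b1) + b2
    #                  = (s2 * s1) * x + (s2 * b1 + b2)
    return Affine(
        scale=second.scale * first.scale,
        shift=second.scale * first.shift + second.shift,
    )


first = Affine(scale=2, shift=1)    # stands in for the first operation layer
second = Affine(scale=3, shift=-2)  # stands in for the second operation layer
third = fuse(first, second)         # stands in for the third operation layer

# Check equivalence on every input value representable with N_BITS bits.
inputs = [x for x in range(-16, 16) if fits_in_n_bits(x)]
assert all(third(x) == second(first(x)) for x in inputs)
```

In this toy case the compressed model evaluates one operation instead of two for each input, while producing the same output for the same input. Whether the merged parameters still fit in N bits depends on the values involved and is not addressed by this sketch.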

First claim

Opening claim text (preview).

What is claimed is:

1. A neural network model processing method, the neural network model processing method comprising: obtaining a first low-bit neural network model through training, wherein the first low-bit neural network model comprises at least three operation layers, wherein the at least three operation layers comprise a first operation layer and a second operation layer, wherein each of the at least three operation layers comprises at least one operation, wherein one or more values of one or more of a parameter or data used for the at least one operation are represented using N bits, and wherein N is a positive integer less than eight; compressing the first low-bit neural network model to obtain a second low-bit neural network model, wherein the second low-bit neural network model comprises at least two operation layers, wherein the at least two operation layers comprise a third operation layer, wherein the third operation layer is equivalent to a combination of the first operation layer and the second operation layer, and wherein an operation layer other than the third operation layer in the at least two operation layers is the same as an operation layer other than the first operation layer and the second operation layer in the at least three operation layers; searching the at least three operation layers for the first operation layer and the second operation layer; combining the first operation layer and the second operation layer to obtain the third operation layer, wherein an input of the first operation layer is the same as an input of the third operation layer, wherein an output of the first operation layer is an input of the second operation layer, and wherein an output of the second operation layer is the same as an output of the third operation layer; and constructing the second low-bit neural network model based on the third operation layer and the operation layer other than the first operation layer and the second operation layer in the at least three operation layers.

2. The neural network model processing method of claim 1, wherein the first operation layer comprises at least one first operation, wherein the second operation layer comprises at least one second operation, and wherein the neural network model processing method further comprises: combining the at least one first operation and the at least one second operation according to a preset rule to obtain at least one third operation; and constructing the third operation layer based on the at least one third operation, wherein the third operation layer comprises the at least one third operation.

3. The neural network model processing method of claim 1, wherein the first operation layer comprises at least one first operation, wherein the second operation layer comprises at least one second operation, and wherein the neural network model processing method further comprises constructing the third operation layer based on the at least one first operation and the at least one second operation, wherein the third operation layer comprises the at least one first operation and the at least one second operation.

4. The neural network model processing method of claim 1, further comprising storing the second low-bit neural network model.

5. The neural network model processing method of claim 1, further comprising sending the second low-bit neural network model to a terminal device.

6. The neural network model processing method of claim 1, wherein the neural network model processing method is performed by a terminal device.

7. The neural network model processing method of claim 1, wherein the neural network model processing method is performed by a server.

8. A neural network model processing method performed by a terminal device, the neural network model processing method comprising: obtaining a second low-bit neural network model; and updating a first low-bit neural network model to the second low-bit neural network model, wherein the first low-bit neural network model comprises at least three operation layers, wherein the at least three operation layers comprise a first operation layer and a second operation layer, wherein each of the at least three operation layers comprises at least one operation, wherein one or more values of a parameter or data used for the at least one operation are represented using N bits, wherein N is a positive integer less than eight, wherein the second low-bit neural network model comprises at least two operation layers, wherein the at least two operation layers comprise a third operation layer, wherein the third operation layer is equivalent to a combination of the first operation layer and the second operation layer, wherein an operation layer other than the third operation layer in the at least two operation layers is the same as an operation layer other than the first operation layer and the second operation layer in the at least three operation layers, wherein the second low-bit neural network model is based on the third operation layer and the operation layer other than the first operation layer and the second operation layer in the at least three operation layers, wherein the third operation layer is an operation layer obtained by combining the first operation layer and the second operation layer, wherein an input of the first operation layer is the same as an input of the third operation layer, wherein an output of the first operation layer is an input of the second operation layer, and wherein an output of the second operation layer is the same as an output of the third operation layer.

9. The neural network model processing method of claim 8, wherein the first operation layer comprises at least one first operation, wherein the second operation layer comprises at least one second operation, wherein the third operation layer comprises at least one third operation, and wherein the at least one third operation is obtained by combining the at least one first operation and the at least one second operation according to a preset rule.

10. The neural network model processing method of claim 8, wherein the first operation layer comprises at least one first operation, wherein the second operation layer comprises at least one second operation, and wherein the third operation layer comprises the at least one first operation and the at least one second operation.

11. The neural network model processing method of claim 8, further comprising receiving the second low-bit neural network model from a server.

12. The neural network model processing method of claim 8, further comprising locally obtaining the second low-bit neural network model.

13. A neural network model processing system, comprising: a terminal device; and a server communicatively coupled to the terminal device and configured to: obtain a first low-bit neural network model through training, wherein the first low-bit neural network model comprises at least three operation layers, wherein the at least three operation layers comprise a first operation layer and a second operation layer, wherein each of the at least three operation layers comprises at least one operation, wherein one or more values of one or more of a parameter or data used for the at least one operation are represented using N bits, and wherein N is a positive integer less than eight; compress the first low-bit neural network model to obtain a second low-bit neural network model, wherein the second low-bit neural network model comprises at least two operation layers, wherein the at least two operation layers comprise a third operation layer, wherein the third operation layer is equivalent to a combination of the f
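The search, combine, and construct steps recited in claim 1 can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the claimed implementation: a model is represented as a plain list of layers, each layer is an ordered list of integer operations, and a hypothetical `fusable` flag stands in for whatever criterion is actually used to pick the first and second operation layers. The combination shown follows the variant in claim 3, where the third operation layer simply comprises the first and second operations. All class and function names are invented for this example.

```python
from typing import Callable, List, Optional

Operation = Callable[[int], int]


class Layer:
    """A toy operation layer: a named, ordered list of operations."""

    def __init__(self, name: str, ops: List[Operation], fusable: bool = False):
        self.name = name
        self.ops = ops
        # Hypothetical flag: the claims do not say how candidate layers are chosen.
        self.fusable = fusable

    def __call__(self, x: int) -> int:
        for op in self.ops:
            x = op(x)
        return x


def search(layers: List[Layer]) -> Optional[int]:
    """Return index i of the first adjacent pair (i, i + 1) marked fusable."""
    for i in range(len(layers) - 1):
        if layers[i].fusable and layers[i + 1].fusable:
            return i
    return None


def combine(first: Layer, second: Layer) -> Layer:
    """Build a third layer that contains the first and second operations in order."""
    return Layer(f"{first.name}+{second.name}", first.ops + second.ops)


def compress(layers: List[Layer]) -> List[Layer]:
    """Construct the second model: the fused layer plus all untouched layers."""
    i = search(layers)
    if i is None:
        return list(layers)
    third = combine(layers[i], layers[i + 1])
    return layers[:i] + [third] + layers[i + 2:]


def run(layers: List[Layer], x: int) -> int:
    """Feed x through every layer in order."""
    for layer in layers:
        x = layer(x)
    return x


# A first model with three operation layers; the last two are fusion candidates.
first_model = [
    Layer("input_scale", [lambda x: x * 2]),
    Layer("norm", [lambda x: x + 3], fusable=True),
    Layer("scale", [lambda x: x * 2], fusable=True),
]
second_model = compress(first_model)

assert len(second_model) == len(first_model) - 1
assert second_model[0] is first_model[0]  # the other layer is carried over unchanged
assert all(run(first_model, x) == run(second_model, x) for x in range(-8, 8))
```

The untouched layer is reused as the same object in the second model, which mirrors the claim language that layers other than the third operation layer are the same as in the first model. A variant following claim 2 would replace `combine` with a rule that merges the two operations into a single new operation.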

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • G06N3/04 · Primary

    Architecture, e.g. interconnection topology · CPC title

  • Combinations of networks · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12333428B2 cover?
A neural network model processing method includes obtaining a first low-bit neural network model through training, where the model includes a first operation layer and a second operation layer. Each operation layer includes at least one operation. Values/a value of a parameter and/or data used for the operation are/is represented by using N bits, and N is a positive integer less than 8. The neu…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 17, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).