What technology area does this patent fall under?

Primary CPC classification G06N3/045. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 18, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Neural network processing method, computer system and storage medium

US12475356B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12475356-B2
Application number	US-201816612361-A
Country	US
Kind code	B2
Filing date	Dec 17, 2018
Priority date	Dec 29, 2017
Publication date	Nov 18, 2025
Grant date	Nov 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network processing method, comprising the following steps: obtaining a model dataset and model structure parameters of an original network (S100); obtaining an operational attribute of each compute node in the original network; operating the original network according to the model dataset and the model structure parameters of the original network and the operational attribute of each compute node, to obtain an instruction corresponding to each compute node in the original network (S200); and if the operational attribute of the current compute node is a first operational attribute, storing a network weight and the instruction corresponding to the current compute node into a first non-volatile memory, so as to obtain a first offline model corresponding to the original network (S300). Further provided are a computer system and a storage medium. The neural network processing method, the computer system, and the storage medium shorten the time for a processor to operate the same network, and improve the processing speed and efficiency of the processor.
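The steps in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `ComputeNode` class, the attribute labels, the `compile_node` stand-in for "operating the original network", and the JSON file that plays the role of the first non-volatile memory.

```python
import json
import os
import tempfile
from dataclasses import dataclass

# Hypothetical labels; the abstract only names a "first operational attribute".
FIRST_ATTR = "first"
SECOND_ATTR = "second"


@dataclass
class ComputeNode:
    name: str
    weight: list      # network weight of this node (part of the model dataset)
    attribute: str    # operational attribute of this node


def compile_node(node):
    """Stand-in for running the network to obtain a node's instruction (S200)."""
    return f"EXEC {node.name}"


def build_first_offline_model(nodes, path):
    """Persist weight and instruction of every first-attribute node (S300)."""
    model = [
        {"node": n.name, "weight": n.weight, "instruction": compile_node(n)}
        for n in nodes
        if n.attribute == FIRST_ATTR
    ]
    # A file on disk stands in for the first non-volatile memory.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)
    return model


def load_offline_model(path):
    """Reload stored instructions and weights without compiling again."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


nodes = [
    ComputeNode("conv1", [0.1, 0.2], FIRST_ATTR),
    ComputeNode("custom_op", [0.3], SECOND_ATTR),
    ComputeNode("fc1", [0.4], FIRST_ATTR),
]
model_path = os.path.join(tempfile.mkdtemp(), "first_offline_model.json")
saved = build_first_offline_model(nodes, model_path)
reloaded = load_offline_model(model_path)
```

In this sketch the time saving claimed in the abstract corresponds to calling `load_offline_model` on later runs instead of repeating `compile_node` for every node.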

First claim

Opening claim text (preview).

What is claimed is:

1. A neural network processing method, comprising:
acquiring, by an acquisition unit, a model dataset and model structure parameters of an original neural network, wherein the model dataset includes network weights corresponding to respective compute nodes of a plurality of compute nodes in the original neural network, and wherein the model structure parameters include connection values that indicate connections among the plurality of compute nodes in the original neural network;
acquiring, by an operation unit, operational attributes of the respective compute nodes of the plurality of compute nodes in the original neural network, wherein the operational attributes of the respective compute nodes of the plurality of compute nodes includes a first operational attribute that indicates that instructions corresponding to the plurality of compute nodes are executable on an application specific neural network processor, and a second operational attribute that indicates that the instructions corresponding to the plurality of compute nodes are executable on a general purpose processor, wherein the application specific neural network processor serves as a coprocessor communicating with the general purpose processor, and wherein the application specific neural network processor serves as a coprocessor communicating with the general purpose processor;
running, by the operation unit, the original neural network to acquire the instructions that are executable by the application specific neural network processor corresponding to the respective compute nodes of the plurality of compute nodes in the original neural network, according to the model dataset, the model structure parameters, and the operational attributes of the respective compute nodes;
determining, by a control unit, that an operational attribute of a current compute node of the plurality of compute nodes is the first operational attribute; and
based on the determination, storing, by the control unit, network weight and one of the instructions that are executable by the application specific neural network processor without further compilation in a first nonvolatile memory, to generate a first offline model corresponding to the original neural network, wherein, the instructions are executable by the application specific neural network processor without further compilation, the stored instruction and network weight correspond to the current compute node, the first offline model is to be run by the application specific neural network processor without compiling the original neural network, the storing of the network weight and one of the instructions for the generation of the first offline model corresponding to the original neural network includes:
acquiring a memory allocation manner of the original neural network according to the model dataset and the model structure parameters of the original neural network;
storing related data during running of the original neural network in a volatile memory according to the memory allocation manner, wherein the related data during the running of the original neural network includes the network weights, the instructions, input data, and output data corresponding to the respective compute nodes of the original neural network;
acquiring the network weights and instructions corresponding to the respective compute nodes of the plurality of computing nodes having the first operational attribute in the original neural network from the volatile memory;
storing, in the first nonvolatile memory, the network weights and the instructions corresponding to the respective compute nodes of the plurality of computing nodes having the first operational attribute in the original neural network to generate the first offline model;
acquiring, the network weights and the instructions corresponding to the respective compute nodes of the plurality of computing nodes having the second operational attribute in the original neural network from the volatile memory; and
storing, in a second nonvolatile memory, the network weights corresponding to the respective compute nodes of the plurality of computing nodes having the second operational attribute in the original neural network to generate a second offline model.

2. The method of claim 1, wherein the acquisition of the operational attributes of the respective compute nodes of the plurality of compute nodes in the original neural network includes: determining that the respective compute nodes are executable on the application specific neural network processor, based on the determination that the current compute node is executable on the application specific neural network processor, and marking the current compute node with the first operational attribute, and based on the determination that the current compute node is executable on the general purpose processor only, marking the current compute node with the second operational attribute.

3. The method of claim 2, the determination of whether the respective compute nodes of the plurality of compute nodes are executable on the application specific neural network processor includes: searching whether the current compute node has an equivalent compute node via a preset function table, wherein the equivalent compute node being a compute node that is executable on the application specific neural network processor, based on the determination that the current compute node has the equivalent compute node, determining that the current compute node is executable on the application specific neural network processor, and based on the determination that the current compute node does not have the equivalent compute node, determining that the current compute node is executable merely on the general purpose processor.

4. The method of claim 2, wherein the general purpose processor includes one or more of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a field programmable gate array (FGPA), and the second operational attribute includes one or more of a CPU operational attribute, a GPU operational attribute, a DSP operational attribute, and an FGPA attribute.

5. The method of claim 1, wherein the acquisition of the operational attributes of the respective compute nodes of the plurality of compute nodes in the original network includes acquiring the operational attributes of the respective compute nodes of the plurality of compute nodes in the original network from the model dataset or the model structure parameters of the original network.

6. The method of claim 1, further comprising: making all first compute nodes among the plurality of compute nodes more than two second compute nodes of the plurality of compute nodes, which are executed in order, equivalent to a first offline node, wherein the first compute nodes are compute nodes having the first operational attribute, the two second compute nodes of the plurality of compute nodes are compute nodes having the second operational attribute, and the first offline model further includes interface data among the first offline nodes and the more than two second compute nodes.

7. The method of claim 1, further comprising: if the operational attribute of the current compute node is the second operational attribute, storing the network weight and the instruction corresponding to a current compute node of the plurality of compute nodes in the second nonvolatile memory to acquire a second offline model corresponding to the original neural network; wherein the second offline model can include a plurality of second offline sub-models; and wherein the second offline sub-models include instructions and network weights corresponding to compute nodes having a
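Claims 2 and 3 describe marking each compute node by checking a preset function table for an equivalent node that the application specific neural network processor can execute. A rough sketch of that lookup follows. The table contents, operation names, and attribute labels are hypothetical and chosen only for illustration; the claims do not specify the table format.

```python
from dataclasses import dataclass

# Hypothetical attribute labels for the two cases named in the claims.
FIRST_ATTR = "first"    # executable on the application specific processor
SECOND_ATTR = "second"  # executable on the general purpose processor only

# Hypothetical preset function table: operation type -> equivalent
# operation on the application specific processor.
PRESET_FUNCTION_TABLE = {
    "conv2d": "npu_conv2d",
    "relu": "npu_relu",
    "max_pool": "npu_max_pool",
}


@dataclass
class ComputeNode:
    name: str
    op_type: str
    attribute: str = ""


def mark_attributes(nodes, table=PRESET_FUNCTION_TABLE):
    """Mark each node by whether the table lists an equivalent node."""
    for node in nodes:
        node.attribute = FIRST_ATTR if node.op_type in table else SECOND_ATTR
    return nodes


def split_by_attribute(nodes):
    """Return node names per attribute, preserving execution order."""
    first = [n.name for n in nodes if n.attribute == FIRST_ATTR]
    second = [n.name for n in nodes if n.attribute == SECOND_ATTR]
    return first, second


network = [
    ComputeNode("c1", "conv2d"),
    ComputeNode("a1", "relu"),
    ComputeNode("nms", "non_max_suppression"),  # no equivalent in the table
    ComputeNode("p1", "max_pool"),
]
first_nodes, second_nodes = split_by_attribute(mark_attributes(network))
```

Under these assumptions, `first_nodes` lists the nodes whose data would go to the first offline model and `second_nodes` those that would go to the second offline model recited at the end of claim 1.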

Assignees

Cambricon Tech Corp Ltd

Classifications

G06N3/08
Learning methods · CPC title
G06N3/04
Architecture, e.g. interconnection topology · CPC title
G06N3/063
using electronic means · CPC title
G06N3/045 · Primary
Combinations of networks · CPC title
G06N3/065 · Primary
Analogue means · CPC title

Patent family

Related publications grouped by family.

View patent family 67066560

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12475356B2 cover?: A neural network processing method, comprising the following steps: obtaining a model dataset and model structure parameters of an original network (S100); obtaining an operational attribute of each compute node in the original network; operating the original network according to the model dataset and the model structure parameters of the original network and the operational attribute of each…
Who is the assignee on this patent?: Cambricon Tech Corp Ltd