US12475356B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12475356-B2 |
| Application number | US-201816612361-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 17, 2018 |
| Priority date | Dec 29, 2017 |
| Publication date | Nov 18, 2025 |
| Grant date | Nov 18, 2025 |
Official abstract text for this publication.
A neural network processing method, comprising the following steps: obtaining a model dataset and model structure parameters of an original network (S100); obtaining an operational attribute of each compute node in the original network; operating the original network according to the model dataset and the model structure parameters of the original network and the operational attribute of each compute node, to obtain an instruction corresponding to each compute node in the original network (S200); and if the operational attribute of the current compute node is a first operational attribute, storing a network weight and the instruction corresponding to the current compute node into a first non-volatile memory, so as to obtain a first offline model corresponding to the original network (S300). Further provided are a computer system and a storage medium. The neural network processing method, the computer system, and the storage medium shorten the time for a processor to operate the same network, and improve the processing speed and efficiency of the processor.
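The three steps in the abstract (S100 to S300) can be illustrated with a small sketch. It is an illustration under stated assumptions, not the patented implementation: the names `ComputeNode`, `compile_node`, `build_offline_model`, and `load_offline_model`, the attribute labels `"first"` and `"second"`, and the use of a JSON file as the "non-volatile memory" are all hypothetical.

```python
import json
import os
import tempfile
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical attribute labels (not taken from the patent text).
FIRST = "first"    # node can run on the dedicated neural network processor
SECOND = "second"  # node can run only on a general purpose processor


@dataclass
class ComputeNode:
    name: str
    weight: List[float]
    attribute: str


def compile_node(node: ComputeNode) -> str:
    """Stand-in for the step that produces a node's instruction (S200)."""
    return f"EXEC {node.name}"


def build_offline_model(nodes: List[ComputeNode], path: str) -> Dict[str, dict]:
    """Obtain an instruction per node, then persist first-attribute nodes (S300)."""
    # In-memory working set: weight, instruction, and attribute for every node.
    working = {
        n.name: {
            "instruction": compile_node(n),
            "weight": n.weight,
            "attribute": n.attribute,
        }
        for n in nodes
    }
    # Keep only the nodes marked with the first operational attribute.
    offline = {
        name: {"instruction": e["instruction"], "weight": e["weight"]}
        for name, e in working.items()
        if e["attribute"] == FIRST
    }
    # A file on disk plays the role of the first non-volatile memory here.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(offline, fh)
    return offline


def load_offline_model(path: str) -> Dict[str, dict]:
    """Reload stored instructions and weights without recompiling the network."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    demo_nodes = [
        ComputeNode("conv1", [0.1, 0.2], FIRST),
        ComputeNode("custom_op", [0.3], SECOND),
    ]
    demo_path = os.path.join(tempfile.mkdtemp(), "offline_model.json")
    build_offline_model(demo_nodes, demo_path)
    print(sorted(load_offline_model(demo_path)))
```

In this sketch, the speed-up the abstract describes corresponds to calling `load_offline_model` on later runs instead of repeating `compile_node` for every node.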
Opening claim text (preview).
What is claimed is:

1. A neural network processing method, comprising:

acquiring, by an acquisition unit, a model dataset and model structure parameters of an original neural network, wherein the model dataset includes network weights corresponding to respective compute nodes of a plurality of compute nodes in the original neural network, and wherein the model structure parameters include connection values that indicate connections among the plurality of compute nodes in the original neural network;

acquiring, by an operation unit, operational attributes of the respective compute nodes of the plurality of compute nodes in the original neural network, wherein the operational attributes of the respective compute nodes of the plurality of compute nodes include a first operational attribute that indicates that instructions corresponding to the plurality of compute nodes are executable on an application specific neural network processor, and a second operational attribute that indicates that the instructions corresponding to the plurality of compute nodes are executable on a general purpose processor, and wherein the application specific neural network processor serves as a coprocessor communicating with the general purpose processor;

running, by the operation unit, the original neural network to acquire the instructions that are executable by the application specific neural network processor corresponding to the respective compute nodes of the plurality of compute nodes in the original neural network, according to the model dataset, the model structure parameters, and the operational attributes of the respective compute nodes;

determining, by a control unit, that an operational attribute of a current compute node of the plurality of compute nodes is the first operational attribute; and

based on the determination, storing, by the control unit, network weight and one of the instructions that are executable by the application specific neural network processor without further compilation in a first nonvolatile memory, to generate a first offline model corresponding to the original neural network, wherein, the instructions are executable by the application specific neural network processor without further compilation, the stored instruction and network weight correspond to the current compute node, the first offline model is to be run by the application specific neural network processor without compiling the original neural network, the storing of the network weight and one of the instructions for the generation of the first offline model corresponding to the original neural network includes:

acquiring a memory allocation manner of the original neural network according to the model dataset and the model structure parameters of the original neural network;

storing related data during running of the original neural network in a volatile memory according to the memory allocation manner, wherein the related data during the running of the original neural network includes the network weights, the instructions, input data, and output data corresponding to the respective compute nodes of the original neural network;

acquiring the network weights and instructions corresponding to the respective compute nodes of the plurality of computing nodes having the first operational attribute in the original neural network from the volatile memory;

storing, in the first nonvolatile memory, the network weights and the instructions corresponding to the respective compute nodes of the plurality of computing nodes having the first operational attribute in the original neural network to generate the first offline model;

acquiring the network weights and the instructions corresponding to the respective compute nodes of the plurality of computing nodes having the second operational attribute in the original neural network from the volatile memory; and

storing, in a second nonvolatile memory, the network weights corresponding to the respective compute nodes of the plurality of computing nodes having the second operational attribute in the original neural network to generate a second offline model.

2. The method of claim 1, wherein the acquisition of the operational attributes of the respective compute nodes of the plurality of compute nodes in the original neural network includes: determining that the respective compute nodes are executable on the application specific neural network processor, based on the determination that the current compute node is executable on the application specific neural network processor, and marking the current compute node with the first operational attribute, and based on the determination that the current compute node is executable on the general purpose processor only, marking the current compute node with the second operational attribute.

3. The method of claim 2, wherein the determination of whether the respective compute nodes of the plurality of compute nodes are executable on the application specific neural network processor includes: searching whether the current compute node has an equivalent compute node via a preset function table, wherein the equivalent compute node being a compute node that is executable on the application specific neural network processor, based on the determination that the current compute node has the equivalent compute node, determining that the current compute node is executable on the application specific neural network processor, and based on the determination that the current compute node does not have the equivalent compute node, determining that the current compute node is executable merely on the general purpose processor.

4. The method of claim 2, wherein the general purpose processor includes one or more of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a field programmable gate array (FPGA), and the second operational attribute includes one or more of a CPU operational attribute, a GPU operational attribute, a DSP operational attribute, and an FPGA attribute.

5. The method of claim 1, wherein the acquisition of the operational attributes of the respective compute nodes of the plurality of compute nodes in the original network includes acquiring the operational attributes of the respective compute nodes of the plurality of compute nodes in the original network from the model dataset or the model structure parameters of the original network.

6. The method of claim 1, further comprising: making all first compute nodes among the plurality of compute nodes more than two second compute nodes of the plurality of compute nodes, which are executed in order, equivalent to a first offline node, wherein the first compute nodes are compute nodes having the first operational attribute, the two second compute nodes of the plurality of compute nodes are compute nodes having the second operational attribute, and the first offline model further includes interface data among the first offline nodes and the more than two second compute nodes.

7. The method of claim 1, further comprising: if the operational attribute of the current compute node is the second operational attribute, storing the network weight and the instruction corresponding to a current compute node of the plurality of compute nodes in the second nonvolatile memory to acquire a second offline model corresponding to the original neural network; wherein the second offline model can include a plurality of second offline sub-models; and wherein the second offline sub-models include instructions and network weights corresponding to compute nodes having a
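
The attribute marking of claims 2 and 3 and the offline-node grouping of claim 6 can be sketched together. The sketch rests on assumptions: the contents of `FUNCTION_TABLE`, the operator names, and the helpers `mark_attribute` and `group_offline_nodes` are hypothetical, and treating each run of consecutive first-attribute nodes as one offline node is only one reading of claim 6.

```python
from itertools import groupby
from typing import Dict, List, Tuple

# Hypothetical preset function table: operator type -> equivalent operator
# on the application specific neural network processor.
FUNCTION_TABLE: Dict[str, str] = {
    "conv2d": "npu_conv",
    "relu": "npu_relu",
    "pool": "npu_pool",
}

FIRST, SECOND = "first", "second"  # hypothetical attribute labels


def mark_attribute(op_type: str) -> str:
    """Mark a node FIRST if the table lists an equivalent node, else SECOND."""
    return FIRST if op_type in FUNCTION_TABLE else SECOND


def group_offline_nodes(
    ops: List[Tuple[str, str]]
) -> List[Tuple[str, List[str]]]:
    """Split (name, op_type) pairs, given in execution order, into segments.

    Each run of consecutive nodes with the same attribute becomes one segment;
    a FIRST segment stands in for a single offline node.
    """
    segments = []
    for attr, run in groupby(ops, key=lambda op: mark_attribute(op[1])):
        segments.append((attr, [name for name, _ in run]))
    return segments
```

For example, an execution order of `conv2d`, `relu`, an unsupported custom operator, then `conv2d` yields three segments: a first-attribute segment of two nodes, a second-attribute segment of one node, and another first-attribute segment of one node.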