Method, device, and computer program for operating a deep neural network
US-2022019874-A1 · Jan 20, 2022 · US
US11941507B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11941507-B2 |
| Application number | US-202217954109-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 27, 2022 |
| Priority date | Aug 10, 2022 |
| Publication date | Mar 26, 2024 |
| Grant date | Mar 26, 2024 |
Official abstract text for this publication.
Disclosed are a data flow method and apparatus for neural network computation. The data flow method for neural network computation includes initializing the lifecycle of a variable in a computational graph; and defining a propagation rule for a variable in use to flow through a node. A definition of the variable is produced at a precursor node of the node, such that an input set of valid variables flowing through the node contains the variable. The method may be used on neural network computation in a deep learning training system.
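The valid-variable propagation summarized in the abstract resembles a classic backward liveness analysis. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name `analyze_valid_variables`, the dictionary-based graph encoding, and the four-node example graph are all hypothetical. It treats "redefined at a node" as "defined at that node" and starts from empty output sets, as in claim step 5.1.

```python
from typing import Dict, List, Set, Tuple


def analyze_valid_variables(
    successors: Dict[str, List[str]],
    used: Dict[str, Set[str]],
    defined: Dict[str, Set[str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Iterate the two set equations until nothing changes.

    out(n) = union of in(s) over all successors s of n
    in(n)  = (out(n) - defined(n)) | used(n)
    """
    nodes = list(successors)
    # Assumption: output sets start empty; input sets follow from them.
    out_valid: Dict[str, Set[str]] = {n: set() for n in nodes}
    in_valid = {n: (out_valid[n] - defined[n]) | used[n] for n in nodes}

    changed = True
    while changed:
        changed = False
        # Visiting nodes in reverse order converges quickly for backward flow.
        for n in reversed(nodes):
            new_out: Set[str] = set()
            for s in successors[n]:
                new_out |= in_valid[s]
            new_in = (new_out - defined[n]) | used[n]
            if new_out != out_valid[n] or new_in != in_valid[n]:
                out_valid[n], in_valid[n] = new_out, new_in
                changed = True
    return in_valid, out_valid


# Hypothetical straight-line graph:
#   n0: x = input()    n1: y = x + 1    n2: x = y * 2    n3: z = x + y
successors = {"n0": ["n1"], "n1": ["n2"], "n2": ["n3"], "n3": []}
used = {"n0": set(), "n1": {"x"}, "n2": {"y"}, "n3": {"x", "y"}}
defined = {"n0": {"x"}, "n1": {"y"}, "n2": {"x"}, "n3": {"z"}}

in_valid, out_valid = analyze_valid_variables(successors, used, defined)

# Valid variables on each edge (n, s): taken here as the input set of s.
# A compiler could size memory blocks per edge from these sets.
edge_valid = {(n, s): in_valid[s] for n in successors for s in successors[n]}
```

In this example the old value of `x` stops being valid on entry to `n2`, where `x` is redefined, so the edge into `n2` carries only `y`, while the edge into `n3` carries both `x` and `y`.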
Opening claim text (preview).
What is claimed is:

1. A data flow method for neural network computation, comprising the following steps:

step 1, initializing a lifecycle of a variable in a computational graph, which comprises initializing a time period from a start of a definition of the variable to an end of use as the lifecycle of the variable in the computational graph;

step 2, defining a propagation rule for a variable in use to flow through a node in the computational graph, which comprises defining that, when the variable at the node in the computational graph is used, a definition of the variable is produced at a precursor node of the node, such that an input set of valid variables flowing through the node contains the variable;

step 3, designing the propagation rule for a redefined variable to flow through the node, which comprises, when the variable is redefined at the node in the computational graph, ending the lifecycle of the variable at the precursor node of the node while the variable flows through the node;

step 4, defining the propagation rule for an associated variable to flow through the node in the computational graph;

step 5, analyzing valid variables input and output at each node in the computational graph based on a data stream; wherein the step 5 comprises the following specific sub-steps:

step 5.1, initializing a set of input valid variables of the each node, wherein for the each node in the computational graph, a set of output valid variables is an empty set, and the set of input valid variables is derived by removing variables redefined at the each node from the set of output valid variables, and taking a union with all variables used at the each node;

step 5.2, initializing the set of output valid variables of the each node, which comprises, for the each node in the computational graph, initializing elements of the set of output valid variables as variables defined at the each node;

step 5.3, deriving a set of output valid variables of an intermediate node in the computational graph, wherein the set of output valid variables of the intermediate node is obtained by taking a union of input valid variables of all successor nodes of the intermediate node; and

step 5.4, deriving a set of input valid variables of the intermediate node, wherein the set of input valid variables at the intermediate node is obtained by removing variables redefined at the intermediate node from the set of output valid variables of the intermediate node, and taking a union with variables used at the intermediate node;

step 6, collecting a set of valid variables before and after flowing through the node, and collecting the set of valid variables flowing through the each node obtained by analysis based on lifecycles of variables in the computational graph;

step 7, allocating memory cell blocks for valid variables on edges of the computational graph; wherein in the step 7, conditions for allocating the memory cell blocks for the variable at the certain node in the computational graph are defined as follows: the memory cell blocks are only allocated for the valid variables on the edges of the computational graph during the lifecycle of the variable at the node, and during compilation of the computational graph, the memory cell blocks are allocated for each variable in advance according to a number of variables in the set of valid variables collected;

step 8, defining the propagation rule for available expressions in the computational graph;

step 9, analyzing the available expressions input and output at the each node based on the data stream; and

step 10, optimizing the available expressions in the computational graph, which comprises saving computation results of the available expressions at the nodes in the computational graph into intermediate variables, and replacing the available expressions appearing again in a successor node with the intermediate variables;

configuring a computer system according to the computational graph such that the computer system implements the neural network.

2. The method according to claim 1, wherein the available expressions of the each node is determined as: a set difference of: (A) a union of (a) an intersection of the sets of available expressions of all precursor nodes of the each node, and (b) newly appearing expressions in the each node; and (B) any available expressions containing any variable defined at the each node.

3. An apparatus, comprising a non-transitory memory and one or more processors, wherein the non-transitory memory has instructions recorded therein, the instructions when executed by the one or more processors implementing the method according to claim 1.
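The set formula in claim 2 can be illustrated with a small forward dataflow pass. This is a hedged sketch only: the function name `available_expressions`, the tuple encoding of expressions, and the diamond-shaped example graph are assumptions made for illustration and do not come from the patent text.

```python
from typing import Dict, List, Set, Tuple

Expr = Tuple[str, str, str]  # (operator, left operand, right operand)


def available_expressions(
    order: List[str],
    precursors: Dict[str, List[str]],
    generated: Dict[str, Set[Expr]],
    defined: Dict[str, Set[str]],
) -> Tuple[Dict[str, Set[Expr]], Dict[str, Set[Expr]]]:
    """Per node: ((intersection over precursors) | new expressions)
    minus expressions that mention a variable defined at the node."""
    universe: Set[Expr] = set().union(*generated.values())
    # Optimistic start: assume every expression is available everywhere.
    avail_out = {n: set(universe) for n in order}
    avail_in: Dict[str, Set[Expr]] = {n: set() for n in order}

    changed = True
    while changed:
        changed = False
        for n in order:
            preds = precursors[n]
            incoming = (
                set.intersection(*(avail_out[p] for p in preds)) if preds else set()
            )
            candidate = incoming | generated[n]
            killed = {
                e for e in candidate if e[1] in defined[n] or e[2] in defined[n]
            }
            result = candidate - killed
            if result != avail_out[n] or incoming != avail_in[n]:
                avail_out[n], avail_in[n] = result, incoming
                changed = True
    return avail_in, avail_out


# Hypothetical diamond graph:
#   n0: a = p + q
#   n1: b = p + q   (after n0)      n2: p = 5   (after n0)
#   n3: c = p + q   (after n1 and n2)
PQ: Expr = ("+", "p", "q")
order = ["n0", "n1", "n2", "n3"]
precursors = {"n0": [], "n1": ["n0"], "n2": ["n0"], "n3": ["n1", "n2"]}
generated = {"n0": {PQ}, "n1": {PQ}, "n2": set(), "n3": {PQ}}
defined = {"n0": {"a"}, "n1": {"b"}, "n2": {"p"}, "n3": {"c"}}

avail_in, avail_out = available_expressions(order, precursors, generated, defined)

# An expression computed at a node can reuse a saved intermediate variable
# only if it is already available on entry to that node.
reusable = {n: generated[n] & avail_in[n] for n in order}
```

Here `p + q` is available on entry to `n1`, so its result could be read from an intermediate variable as in step 10. It is not available on entry to `n3`, because the path through `n2` redefines `p`.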
Architecture, e.g. interconnection topology · CPC title
Graphical models, e.g. Bayesian networks · CPC title
the resource being the memory · CPC title
Learning methods · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title