Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US2024346312A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024346312-A1 |
| Application number | US-202418750655-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 21, 2024 |
| Priority date | Dec 22, 2021 |
| Publication date | Oct 17, 2024 |
| Grant date | — |
Official abstract text for this publication.
An electronic apparatus may include a memory configured to store data related to a neural network model and at least one processor configured to divide a learning step performed through a plurality of layers of the neural network model into a plurality of steps including a forward propagation step, a gradient calculation step, and a derivative calculation step, and determine an execution order of the plurality of steps, obtain first information regarding in which step of the plurality of steps according to the determined execution order a plurality of tensors used in the plurality of layers are used, based on the determined execution order, integrate the determined execution order based on the first information and second information regarding whether tensors used in neighboring layers from among the plurality of layers are able to be shared, allocate the data to the plurality of tensors by minimizing a region of the memory for allocating data corresponding to the plurality of tensors, based on the integrated execution order, and train the neural network model according to the integrated execution order using the plurality of tensors and the data allocated to the plurality of tensors. Various other embodiments are possible.
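As a rough illustration of the first two operations in the abstract, the sketch below splits one training iteration into forward propagation, gradient calculation, and derivative calculation steps, places them in an execution order, and records the first and last position at which each tensor is used (one possible form of the "first information"). The layer names, tensor names, and the particular ordering are assumptions made for this example and do not come from the publication.

```python
# Hypothetical sketch: names and ordering are illustrative assumptions,
# not the implementation described in the publication.

def build_execution_order(layers):
    """Return (layer, step_kind) pairs in an assumed execution order.

    Assumption: forward steps run first-to-last layer, then each layer,
    last-to-first, runs its gradient step followed by its derivative step.
    """
    order = [(layer, "forward") for layer in layers]
    for layer in reversed(layers):
        order.append((layer, "gradient"))
        order.append((layer, "derivative"))
    return order


def tensor_usage(order, uses):
    """Map each tensor name to (first_index, last_index) in the order.

    `uses` maps a (layer, step_kind) pair to the tensors touched in that step.
    """
    usage = {}
    for index, step in enumerate(order):
        for tensor in uses.get(step, []):
            first, _ = usage.get(tensor, (index, index))
            usage[tensor] = (first, index)
    return usage


# Two hypothetical fully connected layers and the tensors each step touches.
layers = ["fc1", "fc2"]
order = build_execution_order(layers)
uses = {
    ("fc1", "forward"): ["x", "w1", "h"],
    ("fc2", "forward"): ["h", "w2", "y"],
    ("fc2", "gradient"): ["h", "dy", "dw2"],
    ("fc2", "derivative"): ["w2", "dy", "dh"],
    ("fc1", "gradient"): ["x", "dh", "dw1"],
    ("fc1", "derivative"): ["w1", "dh", "dx"],
}
usage = tensor_usage(order, uses)

for name, (first, last) in usage.items():
    print(f"{name}: first used at step {first}, last used at step {last}")
```

Knowing each tensor's first and last use is what lets a later planning pass decide which tensors never need to be live at the same time.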
Opening claim text (preview).
What is claimed is:
1. An electronic apparatus comprising: a memory configured to store data related to a neural network model; and at least one processor, comprising processing circuitry, individually and/or collectively configured to: divide a learning step performed through a plurality of layers of the neural network model into a plurality of steps including a forward propagation step, a gradient calculation step, and a derivative calculation step, and determine an execution order of the plurality of steps; obtain first information regarding in which step of the plurality of steps according to the determined execution order a plurality of tensors used in the plurality of layers are used, based on the determined execution order; integrate the determined execution order based on the first information and second information regarding whether tensors used in neighboring layers from among the plurality of layers are able to be shared; allocate the data to the plurality of tensors by reducing and/or minimizing a region of the memory for allocating data corresponding to the plurality of tensors, based on the integrated execution order; and train the neural network model according to the integrated execution order using the plurality of tensors and the data allocated to the plurality of tensors.
2. The apparatus as claimed in claim 1, wherein the first information is based on information regarding a type of step in which the plurality of tensors are used, from among the plurality of steps.
3. The apparatus as claimed in claim 1, wherein the type of step in which the plurality of tensors are used comprises types indicating each of the forward propagation step, the gradient calculation step, the derivative calculation step, a backpropagation step including the gradient calculation step and the derivative calculation step, a step including the forward propagation step and the backpropagation step, and an overall learning step of the neural network model.
4. The apparatus as claimed in claim 1, wherein the second information comprises first mode information indicating that tensors are in a pre-allocated state, second mode information indicating that a tensor needs to be newly created, third mode information indicating that data of a tensor is changed but the tensor is able to be shared with another tensor in a neighboring layer, fourth mode information indicating that data of a tensor is unchanged and the tensor is able to be shared with the another tensor, and fifth mode information indicating that a tensor is able to be shared with all tensors.
5. The apparatus as claimed in claim 4, wherein the at least one processor is individually and/or collectively configured to, based on an execution order of a step in which a first tensor from among the plurality of tensors is last used being equal to or faster than an execution order in which a second tensor of a layer adjacent to a layer of the first tensor is first used, integrate at least a portion of the determined execution order so that the first tensor and the second tensor are shared.
6. The apparatus as claimed in claim 5, wherein the at least one processor is individually and/or collectively configured to, based on an execution order of a step in which a first tensor from among the plurality of tensors is last used being slower than an execution order of a step in which a second tensor of a layer adjacent to a layer of the first tensor is first used, if second information corresponding to the second tensor is the fourth mode information, integrate at least a portion of the determined execution order so that the first tensor and the second tensor are shared.
7. The apparatus as claimed in claim 1, wherein the at least one processor is individually and/or collectively configured to reduce and/or minimize a region of the memory by determining whether to further create a region of a memory for allocating data corresponding to the plurality of tensors or to overwrite a previously created region of memory, based on the integrated execution order.
8. A controlling method of an electronic apparatus, the method comprising: dividing a learning step performed through a plurality of layers of a neural network model into a plurality of steps including a forward propagation step, a gradient calculation step, and a derivative calculation step, and determining an execution order of the plurality of steps; obtaining first information regarding in which step of a plurality of steps according to the determined execution order a plurality of tensors used in the plurality of layers are to be used, based on the determined execution order; integrating the determined execution order based on the first information and second information regarding whether tensors used in neighboring layers from among the plurality of layers are able to be shared; allocating the data to the plurality of tensors by reducing and/or minimizing a region of the memory for allocating data corresponding to the plurality of tensors, based on the integrated execution order; and training the neural network model according to the integrated execution order using the plurality of tensors and the data allocated to the plurality of tensors.
9. The method as claimed in claim 8, wherein the first information is determined based on information regarding a type of step in which the plurality of tensors are used, from among the plurality of steps.
10. The method as claimed in claim 8, wherein the type of step in which the plurality of tensors are used comprises types indicating each of the forward propagation step, the gradient calculation step, the derivative calculation step, a backpropagation step including the gradient calculation step and the derivative calculation step, a step including the forward propagation step and the backpropagation step, and an overall learning step of the neural network model.
11. The method as claimed in claim 8, wherein the second information comprises first mode information indicating that tensors are in a pre-allocated state, second mode information indicating that a tensor needs to be newly created, third mode information indicating that data of a tensor is changed but the tensor is able to be shared with another tensor in a neighboring layer, fourth mode information indicating that data of a tensor is unchanged and the tensor is able to be shared with the another tensor, and fifth mode information indicating that a tensor is able to be shared with all tensors.
12. The method as claimed in claim 11, wherein the integrating the determined execution order comprises, based on an execution order of a step in which a first tensor from among the plurality of tensors is last used being equal to or faster than an execution order in which a second tensor of a layer adjacent to a layer of the first tensor is first used, integrating at least a portion of the determined execution order so that the first tensor and the second tensor are shared.
13. The method as claimed in claim 12, wherein the integrating the determined execution order comprises, based on an execution order of a step in which a first tensor from among the plurality of tensors is last used being slower than an execution order of a step in which a second tensor of a layer adjacent to a layer of the first tensor is first used, if second information corresponding to the second tensor is the fourth mode information, integrating at least a portion of the determined execution order so that the first tensor and the second tensor are shared.
14. The method as claimed in claim 8, wherein the allocating the data to
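
Claims 4 to 7 describe sharing modes, an ordering test for sharing tensors between adjacent layers, and a choice between creating a new memory region and overwriting an existing one. The sketch below is one possible reading of those claims, not the implementation in the publication: the `ShareMode` names, the `can_share` helper, and the greedy first-fit planner `plan_memory` are all hypothetical, and the greedy strategy is a stand-in technique chosen for brevity.

```python
from enum import Enum


class ShareMode(Enum):
    """Hypothetical labels for the five modes listed in claims 4 and 11."""
    PREALLOCATED = 1      # first mode: tensor is already allocated
    CREATE = 2            # second mode: tensor must be newly created
    MODIFY_SHARED = 3     # third mode: data changes, shareable with a neighbor
    READ_ONLY_SHARED = 4  # fourth mode: data unchanged, shareable with a neighbor
    SHARE_ALL = 5         # fifth mode: shareable with all tensors


def can_share(first_last_use, second_first_use, second_mode):
    """One reading of claims 5 and 6 for tensors in adjacent layers.

    Share when the first tensor's last use is no later than the second
    tensor's first use; otherwise share only if the second tensor is in
    the fourth (data unchanged) mode.
    """
    if first_last_use <= second_first_use:
        return True
    return second_mode is ShareMode.READ_ONLY_SHARED


def plan_memory(lifetimes, sizes):
    """Greedy first-fit planner (an assumed stand-in for claim 7).

    For each tensor, in order of first use, overwrite an existing region
    whose previous occupant is no longer needed and which is large enough;
    otherwise create a new region. Returns (placement, total_size).
    """
    regions = []    # each region: {"size": int, "free_after": int}
    placement = {}  # tensor name -> region index
    for name, (first, last) in sorted(lifetimes.items(), key=lambda kv: kv[1][0]):
        for index, region in enumerate(regions):
            if region["free_after"] < first and region["size"] >= sizes[name]:
                region["free_after"] = last  # overwrite the existing region
                placement[name] = index
                break
        else:
            regions.append({"size": sizes[name], "free_after": last})
            placement[name] = len(regions) - 1
    return placement, sum(region["size"] for region in regions)


# Hypothetical tensors: name -> (first use, last use) in the execution order.
lifetimes = {"a": (0, 1), "b": (1, 2), "c": (2, 3)}
sizes = {"a": 4, "b": 4, "c": 4}
placement, total = plan_memory(lifetimes, sizes)
print(placement, total)
```

In this example, tensor `c` reuses the region first given to `a`, so the plan needs two regions rather than three. A production planner would also have to handle alignment, varying tensor sizes, and the sharing modes together, which this sketch deliberately leaves out.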