Sparse convolutional neural network accelerator
US-10891538-B2 · Jan 12, 2021 · US
US11580361B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11580361-B2 |
| Application number | US-201715494826-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 24, 2017 |
| Priority date | Apr 24, 2017 |
| Publication date | Feb 14, 2023 |
| Grant date | Feb 14, 2023 |
Official abstract text for this publication.
An apparatus to facilitate neural network (NN) training is disclosed. The apparatus includes training logic to receive one or more network constraints and train the NN by automatically determining a best network layout and parameters based on the network constraints.
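The abstract describes choosing a "best network layout and parameters" subject to network constraints. A minimal sketch of that idea follows, assuming a fully connected network, an exhaustive search over a small candidate list, and one multiply-accumulate per weight as the compute estimate. The names `Layout`, `Constraints`, `best_layout`, and `toy_score` are hypothetical and do not come from the patent. `toy_score` is a placeholder for the accuracy that real training and validation would produce.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Layout:
    depth: int  # number of hidden layers
    width: int  # nodes per hidden layer
    bits: int   # parameter precision in bits


@dataclass(frozen=True)
class Constraints:
    max_memory_bytes: int  # memory budget for stored weights
    max_macs: int          # compute budget per inference


def param_count(layout: Layout, n_in: int, n_out: int) -> int:
    """Weight count of a dense net: input layer, hidden-to-hidden layers, output layer."""
    hidden = (layout.depth - 1) * layout.width * layout.width
    return n_in * layout.width + hidden + layout.width * n_out


def satisfies(layout: Layout, limits: Constraints, n_in: int, n_out: int) -> bool:
    """Check both budgets; assumes one multiply-accumulate per weight."""
    params = param_count(layout, n_in, n_out)
    memory_bytes = (params * layout.bits + 7) // 8  # round up to whole bytes
    return memory_bytes <= limits.max_memory_bytes and params <= limits.max_macs


def best_layout(
    candidates: Iterable[Layout],
    limits: Constraints,
    n_in: int,
    n_out: int,
    score: Callable[[Layout], float],
) -> Optional[Layout]:
    """Return the highest-scoring layout that meets the constraints, or None."""
    feasible = [c for c in candidates if satisfies(c, limits, n_in, n_out)]
    return max(feasible, key=score, default=None)


def toy_score(layout: Layout) -> float:
    """Placeholder for validation accuracy (assumption, not a real metric)."""
    return layout.depth * 10 + layout.width + layout.bits / 100


candidates = [Layout(d, w, b) for d, w, b in product((1, 2), (4, 8), (8, 16))]
limits = Constraints(max_memory_bytes=64, max_macs=100)
chosen = best_layout(candidates, limits, n_in=4, n_out=2, score=toy_score)
print(chosen)  # Layout(depth=2, width=4, bits=8)
```

In this toy run the larger two-layer candidates are rejected for exceeding a budget, so the search settles on the deepest layout that still fits. A real system would replace `toy_score` with a training and validation pass.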
Opening claim text (preview).
What is claimed is:

1. An apparatus to facilitate neural network training, comprising:
a memory to store data, including data for neural network training; and
one or more processors including a graphics processing unit (GPU) to perform neural network operations including performing training to generate a trained neural network, the training including operation of a layout controller to determine a neural network layout, a parameter precision controller to determine parameter precision values, and a training parameter controller to determine training parameters;
wherein performing training of a neural network includes the GPU to:
receive a set of training data and a set of validation data for the training of the neural network;
receive a set of one or more network constraints to be applied in processing for the neural network, the set of one or more network constraints being received at the layout controller and the parameter precision controller, the one or more network constraints including memory and compute budget constraints to be applied in processing for operation of the neural network;
determine initial data points for the neural network and initial parameter precision values for nodes within the neural network, the determination of the initial data points and the initial parameter precision values being based at least in part on the one or more network constraints, the layout controller to determine the initial data points for the neural network and the parameter precision controller to determine the initial parameter precision values, the initial data points including initial network parameters to be applied in a network layout of the neural network;
perform training of the neural network utilizing a training routine based at least in part on the received set of training data, the initial data points for the neural network layout, and training parameters that are generated for the neural network, wherein the training includes updating the initial data points based on accuracy data generated for the neural network training;
adjust the parameter precision values for the neural network utilizing at least the one or more network constraints and the accuracy data generated for the neural network training; and
generate a trained neural network including a best network layout and set of network parameters, including enforcing the one or more network constraints in the generation of the trained neural network.

2. The apparatus of claim 1, wherein performing neural network training includes the GPU to determine a neural network model that has a highest accuracy for an inference task while adhering to the one or more network constraints.

3. The apparatus of claim 1, wherein performing training further includes the GPU to: generate the accuracy data based on the generated set of network parameters and on the received set of validation data.

4. The apparatus of claim 1, wherein the set of network parameters comprise one or more of network depth, number of nodes in each layer, convolution dimensions, stride and padding numbers, activation functions at each layer, and pooling layer properties.

5. The apparatus of claim 1, wherein the GPU is further to compress data bits representing weights associated with connections in the neural network.

6. The apparatus of claim 5, wherein compressing the data bits further includes the GPU to execute an operation to load/store consecutive bit values.

7. A method to facilitate neural network training, comprising:
receiving a set of training data and a set of validation data for training of a neural network in a system, the training including operation of a layout controller to determine a neural network layout, a parameter precision controller to determine parameter precision values, and a training parameter controller to determine training parameters;
receiving a set of one or more network constraints to be applied in processing for the neural network, the set of one or more network constraints being received at the layout controller and the parameter precision controller, the one or more network constraints including memory and compute budget constraints to be applied in processing for operation of the neural network;
determining initial data points for the neural network and initial parameter precision values for nodes within the neural network, the determination of the initial data points and the initial parameter precision values being based at least in part on the one or more network constraints, the layout controller to determine the initial data points for the neural network and the parameter precision controller to determine the initial parameter precision values, the initial data points including initial network parameters to be applied in a network layout of the neural network;
performing training of the neural network utilizing a training routine based at least in part on the received set of training data, the initial data points for the neural network layout, and training parameters that are generated for the neural network, wherein the training includes updating the initial data points based on accuracy data generated for the neural network training;
adjusting the parameter precision values for the neural network utilizing at least the one or more network constraints and the accuracy data generated for the neural network training; and
generating a trained neural network including a best network layout and network parameters, including enforcing the one or more network constraints in the generation of the trained neural network.

8. The method of claim 7, wherein performing training of the neural network further comprises: generating the accuracy data based on the generated set of network parameters and on the received set of validation data.

9. At least one non-transitory computer readable medium having instructions, which when executed by one or more processors, cause the processors to:
receive a set of training data and a set of validation data for training of a neural network in a system, the training including operation of a layout controller to determine a neural network layout, a parameter precision controller to determine parameter precision values, and a training parameter controller to determine training parameters;
receive a set of one or more network constraints to be applied in processing for the neural network, the set of one or more network constraints being received at the layout controller and the parameter precision controller, the one or more network constraints including memory and compute budget constraints to be applied in processing for operation of the neural network;
determine initial data points for the neural network and initial parameter precision values for nodes within the neural network, the determination of the initial data points and the initial parameter precision values being based at least in part on the one or more network constraints, the layout controller to determine the initial data points for the neural network and the parameter precision controller to determine the initial parameter precision values, the initial data points including initial network parameters to be applied in a network layout of the neural network;
perform training of the neural network utilizing a training routine based at least in part on the received set of training data, the initial data points for the neural network layout, and training parameters that are generated for the neural network, wherein the training includes updating the initial data points based on accuracy data generated for the neural network training;
adjusting the parameter precision values for the neural network utilizing at least the one or more network constraints and the accuracy data generated
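
Claims 5 and 6 refer to compressing the data bits that represent weights and to loading and storing consecutive bit values, without specifying a bit layout. The sketch below illustrates one possible reading: quantized weights narrower than a byte are packed back to back into a byte string and later unpacked. The function names `pack_bits` and `unpack_bits`, the least-significant-bit-first ordering, and the 1 to 8 bit width limit are all assumptions made for illustration, not details taken from the patent.

```python
from typing import Iterable, List


def pack_bits(values: Iterable[int], bits: int) -> bytes:
    """Store unsigned `bits`-wide integers in consecutive bits, LSB first."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    acc = 0     # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for v in values:
        if not 0 <= v < (1 << bits):
            raise ValueError(f"value {v} does not fit in {bits} bits")
        acc |= v << nbits
        nbits += bits
        while nbits >= 8:          # flush every completed byte
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                      # flush the final partial byte, zero padded
        out.append(acc & 0xFF)
    return bytes(out)


def unpack_bits(data: bytes, bits: int, count: int) -> List[int]:
    """Load `count` consecutive `bits`-wide integers written by pack_bits."""
    mask = (1 << bits) - 1
    acc = 0
    nbits = 0
    out: List[int] = []
    for byte in data:
        acc |= byte << nbits
        nbits += 8
        while nbits >= bits and len(out) < count:
            out.append(acc & mask)
            acc >>= bits
            nbits -= bits
    return out


weights = [1, 2, 3, 4, 5]          # already quantized to 3 bits each
packed = pack_bits(weights, bits=3)
print(len(packed))                 # 2 bytes instead of 5
print(unpack_bits(packed, bits=3, count=len(weights)))  # [1, 2, 3, 4, 5]
```

Five 3-bit weights occupy 15 bits, so they fit in two bytes. The caller must keep the element count, because the zero padding in the last byte is otherwise indistinguishable from data.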
Learning methods · CPC title
Distributed learning, e.g. federated learning · CPC title
Supervised learning · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title