Hand pose estimation
US-2020372246-A1 · Nov 26, 2020 · US
US11144790B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11144790-B2 |
| Application number | US-201916600148-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 11, 2019 |
| Priority date | Oct 11, 2019 |
| Publication date | Oct 12, 2021 |
| Grant date | Oct 12, 2021 |
Official abstract text for this publication.
Presented herein are embodiments of training deep learning models. In one or more embodiments, a compact deep learning model comprises fewer layers, which require fewer floating-point operations (FLOPs). Presented herein are also embodiments of a new learning rate function, which can adaptively change the learning rate between two linear functions. In one or more embodiments, combinations of half-precision floating point format training together with larger batch size in the training process may also be employed to aid the training process.
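The abstract pairs half-precision training with larger batch sizes, and claim 8 ties batch size to the memory limit of the processor unit. The sketch below illustrates that relationship under a deliberately simple assumption: the memory footprint of a batch is the number of stored values per image times the bytes per value, ignoring weights, gradients, and optimizer state. The function name `max_batch_size` and all numbers are hypothetical and do not come from the patent.

```python
def max_batch_size(memory_limit_bytes: int,
                   values_per_image: int,
                   bytes_per_value: int) -> int:
    """Return the largest batch whose estimated footprint is strictly
    below the memory limit (hypothetical helper, simplified memory model)."""
    per_image_bytes = values_per_image * bytes_per_value
    if per_image_bytes <= 0:
        raise ValueError("values_per_image and bytes_per_value must be positive")
    # Subtract 1 so that batch * per_image_bytes < memory_limit_bytes holds.
    return max((memory_limit_bytes - 1) // per_image_bytes, 0)


# Illustrative numbers only: a 1 MB budget and 1,000 stored values per image.
LIMIT = 1_000_000
VALUES = 1_000

fp32_batch = max_batch_size(LIMIT, VALUES, bytes_per_value=4)  # single precision
fp16_batch = max_batch_size(LIMIT, VALUES, bytes_per_value=2)  # half precision

print(fp32_batch, fp16_batch)
```

Under this model, halving the bytes per value roughly doubles the number of images that fit in the same budget, which is one way to read the abstract's pairing of half precision with larger batches.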
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for training an image classification model, the method comprising: forming one or more batches comprising images and their corresponding labels, the images and their corresponding labels being selected from one or more training datasets in which each image has a corresponding label: repeating, for each training epoch until a stop condition is reached, a set of steps comprising: inputting a batch into the image classification model, the image classification model comprising: a convolution module comprising a convolution with a set of filters, a batch normalization operation, and an activation operation; a first residual module comprising at least two convolution modules separated by a max pooling layer, in which each convolution module has its own set of filters; a second residual module comprising at least two convolution modules separated by a max pooling layer, in which each convolution module has its own set of filters; and a fully connected layer that receives an input obtained from an output of the second residual module; determining a loss for the image classification model given the predicted output for the batch; and updating one or more parameters of the image classification model using the loss.
2. The computer-implemented method of claim 1 further comprising: determining a learning rate for each training epoch.
3. The computer-implemented method of claim 2 wherein the step of determining a learning rate for each training epoch comprises: using a piecewise linear function that relates training epoch number to learning rate to determine the learning rate for a training epoch.
4. The computer-implemented method of claim 3 wherein the piecewise linear function comprises: a first linear section in which learning rate increases linearly from zero or near zero to a peak point as training epoch increases; and a second linear section in which learning rate decreases linearly from a peak point to near zero as training epoch increases, wherein the magnitude of the slope of the first linear section is larger than the magnitude of the slope of the second linear section.
5. The computer-implemented method of claim 1 wherein at least one of the residual modules comprise an increasing number of filters to increase feature representation of the image classification model.
6. The computer-implemented method of claim 1 wherein at least one of the first residual module and the second residual module further comprises two convolution modules after the max pooling layer.
7. The computer-implemented method of claim 1 further wherein the number of filters for a convolution is matched to processor unit parallel capabilities of a system used to train the image classification model.
8. The computer-implemented method of claim 1 wherein the number of images selected for a batch is determined such that a memory requirement of the batch is less than a memory limit of a processor unit used to train the image classification model.
9. A system for training an image classification model, the system comprising: one or more processors; and a non-transitory computer-readable medium or media comprising one or more sets of instructions which, when executed by at least one of the one or more processors, causes steps to be performed comprising: forming one or more batches comprising images and their corresponding labels, the images and their corresponding labels being selected from one or more training datasets in which each image has a corresponding label: repeating, for each training epoch until a stop condition is reached, a set of steps comprising: inputting a batch into the image classification model, the image classification model comprising: a convolution module comprising a convolution with a set of filters, a batch normalization operation, and an activation operation; a first residual module comprising at least two convolution modules separated by a max pooling layer, in which each convolution module has its own set of filters; a second residual module comprising at least two convolution modules separated by a max pooling layer, in which each convolution module has its own set of filters; and a fully connected layer that receives an input obtained from an output of the second residual module; determining a loss for the image classification model given the predicted output for the batch; and updating one or more parameters of the image classification model using the loss.
10. The system of claim 9 wherein the non-transitory computer-readable medium or media further comprises one or more sets of instructions which, when executed by at least one of the one or more processors, causes steps to be performed comprising determining a learning rate for each training epoch.
11. The system of claim 10 wherein the step of determining a learning rate for each training epoch comprises: using a piecewise linear function that relates training epoch number to learning rate to determine the learning rate for a training epoch.
12. The system of claim 11 wherein the piecewise linear function comprises: a first linear section in which learning rate increases linearly from zero or near zero to a peak point as training epoch increases; and a second linear section in which learning rate decreases linearly from a peak point to near zero as training epoch increases, wherein the magnitude of the slope of the first linear section is larger than the magnitude of the slope of the second linear section.
13. The system of claim 9 wherein at least one of the residual modules comprise an increasing number of filters to increase feature representation of the image classification model.
14. The system of claim 9 wherein at least one of the first residual module and the second residual module further comprises two convolution modules after the max pooling layer.
15. The system of claim 9 wherein the number of images selected for a batch is determined such that a memory requirement of the batch is less than a memory limit of the at least one processor used to train the image classification model.
16. A computer-implemented method for classifying an image, the method comprising: inputting an input image into a classification model, the classification model comprising: a convolution module comprising a convolution with a set of filters, a batch normalization operation, and an activation operation; a first residual module comprising at least two convolution modules each with its own set of filters separated by a max pooling layer; a second residual module comprising at least two convolution modules separated by a max pooling layer; and a fully connected layer; and outputting a classification label for the input image.
17. The computer-implemented method of claim 16 wherein at least one of the first residual module and the second residual module further comprises two convolution modules after the max pooling layer.
18. The computer-implemented method of claim 16 wherein at least one of the first residual module and the second residual module further comprises: combining an output of the max pooling layer with an output of the last convolution module of the residual module.
19. The computer-implemented method of claim 16 further wherein at least some of the residual modules comprise an increasing number of filters to increase feature representation of the model.
20. The computer-implemented method of claim 16 wherein the number of filters for a convolutio
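Claims 3, 4, 11, and 12 describe a piecewise linear learning rate: a steep linear rise from zero or near zero to a peak, followed by a shallower linear decline to near zero. A minimal Python sketch of such a schedule follows. The function name `piecewise_linear_lr`, its parameters, and the example values (peak at epoch 5 of 30, peak rate 0.4) are assumptions chosen for illustration; the claims preview does not state specific numbers.

```python
def piecewise_linear_lr(epoch: float,
                        peak_epoch: float,
                        total_epochs: float,
                        peak_lr: float,
                        floor_lr: float = 0.0) -> float:
    """Two-segment linear schedule: rise to peak_lr, then decay to floor_lr.

    Hypothetical helper; requires the rise to be steeper than the decay,
    which for equal heights means peak_epoch < total_epochs / 2.
    """
    if not 0 < peak_epoch < total_epochs:
        raise ValueError("need 0 < peak_epoch < total_epochs")
    if 2 * peak_epoch >= total_epochs:
        raise ValueError("rise must be steeper than decay")
    # Keep the epoch inside the schedule's domain.
    epoch = min(max(epoch, 0.0), total_epochs)
    span = peak_lr - floor_lr
    if epoch <= peak_epoch:
        # First linear section: floor_lr -> peak_lr.
        return floor_lr + span * epoch / peak_epoch
    # Second linear section: peak_lr -> floor_lr.
    return peak_lr - span * (epoch - peak_epoch) / (total_epochs - peak_epoch)


# Illustrative values only.
PEAK_EPOCH, TOTAL_EPOCHS, PEAK_LR = 5, 30, 0.4
schedule = [piecewise_linear_lr(e, PEAK_EPOCH, TOTAL_EPOCHS, PEAK_LR)
            for e in range(TOTAL_EPOCHS + 1)]

rise_slope = PEAK_LR / PEAK_EPOCH                   # magnitude of first slope
fall_slope = PEAK_LR / (TOTAL_EPOCHS - PEAK_EPOCH)  # magnitude of second slope
print(schedule[0], schedule[PEAK_EPOCH], schedule[-1], rise_slope > fall_slope)
```

In a training loop, the value for the current epoch would be passed to the optimizer before each epoch's parameter updates; how the patent's embodiments apply it in practice is described in the full specification rather than in this claims preview.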
using neural networks · CPC title
using classification, e.g. of video objects · CPC title
Classification techniques · CPC title
Backpropagation, e.g. using gradient descent · CPC title
relating to the classification model, e.g. parametric or non-parametric approaches · CPC title