Training a neural network using augmented training datasets
US-2019258901-A1 · Aug 22, 2019 · US
US10565475B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10565475-B2 |
| Application number | US-201815961392-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 24, 2018 |
| Priority date | Apr 24, 2018 |
| Publication date | Feb 18, 2020 |
| Grant date | Feb 18, 2020 |
Official abstract text for this publication.
A device receives images of a video stream, models for objects in the images, and physical property data for the objects, and maps the models and the physical property data to the objects in the images to generate augmented data sequences. The device applies different physical properties to the objects in the augmented data sequences to generate augmented data sequences with different applied physical properties, and trains a machine learning (ML) model based on the images to generate a first trained ML model. The device trains the ML model, based on the augmented data sequences with the different applied physical properties, to generate a second trained ML model, and compares the first trained ML model and the second trained ML model. The device determines whether the second trained ML model is optimized based on the comparison, and provides the second trained ML model when optimized.
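The train, compare, and retry loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `PhysicalProperties`, `SearchResult`, `search_augmentation`, and the `toy_*` helpers are hypothetical names, "optimized" is assumed to mean that the second model's test score exceeds the first model's by more than `min_gain`, and the candidates are tried in a fixed order (a grid search, one of the techniques the claims list). The toy helpers stand in for real rendering, training, and testing code and operate on plain numbers.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable, Iterable, Optional, Sequence


@dataclass(frozen=True)
class PhysicalProperties:
    """Hypothetical bundle of physical properties applied to objects."""
    gravity: float
    rotation_deg: float


@dataclass
class SearchResult:
    """The accepted second model and the properties that produced it."""
    model: Any
    properties: PhysicalProperties
    baseline_score: float
    score: float


def search_augmentation(
    images: Sequence[Any],
    candidates: Iterable[PhysicalProperties],
    augment: Callable[[Sequence[Any], PhysicalProperties], Sequence[Any]],
    train: Callable[[Sequence[Any]], Any],
    evaluate: Callable[[Any], float],
    min_gain: float = 0.0,
) -> Optional[SearchResult]:
    """Return the first candidate whose model beats the baseline, else None."""
    # First trained model: trained on the original images only.
    first_model = train(images)
    baseline_score = evaluate(first_model)

    for props in candidates:
        # Apply one set of physical properties to build augmented sequences.
        sequences = augment(images, props)
        # Second trained model: trained on the augmented sequences.
        second_model = train(sequences)
        score = evaluate(second_model)
        # Assumption: "optimized" means a gain over the baseline above min_gain.
        if score - baseline_score > min_gain:
            return SearchResult(second_model, props, baseline_score, score)
    return None


# Toy stand-ins (assumptions for illustration only).
def toy_augment(images, props):
    # Pretend a rotation shifts each numeric "image" by rotation_deg.
    return [x + props.rotation_deg for x in images]


def toy_train(data):
    # The "model" is just the mean of the training values.
    return sum(data) / len(data)


def toy_evaluate(model):
    # Higher is better; the score peaks when the model equals 5.0.
    return 1.0 / (1.0 + abs(model - 5.0))


if __name__ == "__main__":
    grid = [PhysicalProperties(g, r) for g, r in product((9.8,), (0.0, 3.0))]
    result = search_augmentation(
        [1.0, 2.0, 3.0], grid, toy_augment, toy_train, toy_evaluate
    )
    print(result)
```

Passing `augment`, `train`, and `evaluate` as callables keeps the comparison logic separate from any particular detector or renderer, so the same loop could wrap a random or Bayesian search by changing only how `candidates` is generated.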
Opening claim text (preview).
What is claimed is:

1. A device, comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories, to: receive images of a video stream, three-dimensional models for objects in the images, and physical property data for the objects; map the three-dimensional models and the physical property data to the objects in the images to generate augmented data sequences with the objects; apply different physical properties, of the physical property data, to the objects in the augmented data sequences, based on an augmentation policy, to generate augmented data sequences with different applied physical properties; train a machine learning model based on the images of the video stream to generate a first trained machine learning model; train the machine learning model, based on the augmented data sequences with the different applied physical properties, to generate a second trained machine learning model; compare the first trained machine learning model and the second trained machine learning model; determine whether the second trained machine learning model is optimized based on a result of comparing the first trained machine learning model and the second trained machine learning model; and provide the second trained machine learning model and the different applied physical properties when the second trained machine learning model is optimized.

2. The device of claim 1, wherein the one or more processors are further to: modify the different applied physical properties when the second trained machine learning model is not optimized; retrain the machine learning model, based on the modified different applied physical properties, to generate the second trained machine learning model; and repeat the modifying the different applied physical properties and the retraining until the second trained machine learning model is optimized.

3. The device of claim 1, wherein the one or more processors are further to: utilize the second trained machine learning model and the different applied physical properties, when the second trained machine learning model is optimized, to predict an unknown object.

4. The device of claim 1, wherein the one or more processors are further to: receive the machine learning model and the augmentation policy, wherein the augmentation policy includes information indicating how the different physical properties are to be applied to each of the augmented data sequences.

5. The device of claim 1, wherein the machine learning model includes one or more of: a single shot multibox detector (SSD) model, a region-based fully convolutional network (R-FCN) model, a region-based convolution network (R-CNN) model, a fast R-CNN model, or a faster R-CNN model.

6. The device of claim 1, wherein the one or more processors are further to: modify the different applied physical properties, when the second trained machine learning model is not optimized, based on a hyperparameter optimization technique, wherein the hyperparameter optimization technique includes one or more of: a grid search technique, a random search technique, a Bayesian optimization technique, a gradient-based optimization technique, or an evolutionary optimization technique.

7. The device of claim 1, wherein, the one or more processors are further to: test the first trained machine learning model to generate first test results; test the second trained machine learning model to generate second test results; compare the first test results and the second test results; and determine whether the second trained machine learning model is optimized based on a result of comparing the first test results and the second test results.

8. A non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors, cause the one or more processors to: receive images of a video stream, three-dimensional models for objects in the images, and physical property data for the objects, the images of the video stream including metadata that identifies at least two of: the images of the video stream, the objects in the images, classes associated with the objects, boundary boxes for the images, coordinates associated with the objects in the images, or names of the objects, the three-dimensional models including at least two of: three-dimensional representations of the objects, three-dimensional coordinates associated with the objects, normal vectors associated with the objects, or the names of the objects, the physical property data including at least two of: the names of the objects, information associated with deformations of the objects, information associated with gravities for the objects, information associated with rotations of the objects, information associated with renderings of the objects, or information associated with collisions of the objects; map the three-dimensional models and the physical property data to the objects in the images to generate augmented data sequences with the objects; apply different physical properties, of the physical property data, to the objects in the augmented data sequences to generate augmented data sequences with different applied physical properties; train a machine learning model based on the images of the video stream to generate a first machine learning model; train the machine learning model, based on the augmented data sequences with the different applied physical properties, to generate a second machine learning model; test the first machine learning model and the second machine learning model to generate first test results and second test results, respectively; determine whether the second machine learning model is optimized based on comparing the first test results and the second test results; and utilize the second machine learning model and the different applied physical properties, when the second machine learning model is optimized, to make a prediction.

9. The non-transitory computer-readable medium of claim 8, wherein the instructions further comprise: one or more instructions that, when executed by the one or more processors, cause the one or more processors to: provide the second machine learning model and the different applied physical properties when the second machine learning model is optimized.

10. The non-transitory computer-readable medium of claim 8, wherein the instructions further comprise: one or more instructions that, when executed by the one or more processors, cause the one or more processors to: modify the different applied physical properties when the second machine learning model is not optimized; retrain the machine learning model, based on the modified different applied physical properties, to generate the second machine learning model; retest the second machine learning model to generate the second test results; and repeat the modifying the different applied physical properties, the retraining, and the retesting until the second machine learning model is optimized.

11. The non-transitory computer-readable medium of claim 8, wherein the different applied physical properties are configurable.

12. The non-transitory computer-readable medium of claim 8, wherein each of the first machine learning model and second machine learning model includes one or more of: a single shot multibox detector (SSD) model, a region-based fully convolutional network (R-FCN) model, a region-based convolution network (R-CNN) model, a fast R-CNN model, or a faster R-CNN model.

13. The non-transitory
for test results analysis · CPC title
Machine learning · CPC title
Evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Learning methods · CPC title