Generating a machine learning model for objects based on augmenting the objects with physical properties

US10565475B2 · US · B2

Patent metadata
Publication number: US-10565475-B2
Application number: US-201815961392-A
Country: US
Kind code: B2
Filing date: Apr 24, 2018
Priority date: Apr 24, 2018
Publication date: Feb 18, 2020
Grant date: Feb 18, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A device receives images of a video stream, models for objects in the images, and physical property data for the objects, and maps the models and the physical property data to the objects in the images to generate augmented data sequences. The device applies different physical properties to the objects in the augmented data sequences to generate augmented data sequences with different applied physical properties, and trains a machine learning (ML) model based on the images to generate a first trained ML model. The device trains the ML model, based on the augmented data sequences with the different applied physical properties, to generate a second trained ML model, and compares the first trained ML model and the second trained ML model. The device determines whether the second trained ML model is optimized based on the comparison, and provides the second trained ML model when optimized.
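
The train, compare, and provide steps described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Trained`, `compare_and_provide`, `train_and_test`, and `min_gain` are hypothetical, and the rule that "optimized" means the second model's test score exceeds the first model's by more than `min_gain` is an assumption, since the abstract does not define the criterion.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Trained:
    """Hypothetical record for a trained model and its test score."""
    label: str
    test_score: float  # assumed metric; higher is better


def compare_and_provide(
    images: Sequence[dict],
    augmented: Sequence[dict],
    train_and_test: Callable[[Sequence[dict]], float],
    min_gain: float = 0.0,
) -> Optional[Trained]:
    """Train on raw images and on augmented sequences, then compare.

    Returns the second model only when it beats the first by more than
    min_gain (an assumed stand-in for "optimized"); otherwise None.
    """
    first = Trained("first", train_and_test(images))
    second = Trained("second", train_and_test(augmented))
    if second.test_score - first.test_score > min_gain:
        return second
    return None


def toy_train_and_test(data: Sequence[dict]) -> float:
    # Stand-in only: pretends that more varied physical settings score higher.
    return len({frame["gravity"] for frame in data}) / 10.0


# Hypothetical data: raw frames share one gravity value, while the
# augmented sequences apply several different values.
images = [{"gravity": 9.81} for _ in range(4)]
augmented = [{"gravity": g} for g in (1.6, 3.7, 9.81, 24.8)]

result = compare_and_provide(images, augmented, toy_train_and_test)
print(result)
```

In a real system, `train_and_test` would wrap an actual training and evaluation run; the toy function here exists only so the sketch executes end to end.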

First claim

Opening claim text (preview).

What is claimed is:

1. A device, comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories, to: receive images of a video stream, three-dimensional models for objects in the images, and physical property data for the objects; map the three-dimensional models and the physical property data to the objects in the images to generate augmented data sequences with the objects; apply different physical properties, of the physical property data, to the objects in the augmented data sequences, based on an augmentation policy, to generate augmented data sequences with different applied physical properties; train a machine learning model based on the images of the video stream to generate a first trained machine learning model; train the machine learning model, based on the augmented data sequences with the different applied physical properties, to generate a second trained machine learning model; compare the first trained machine learning model and the second trained machine learning model; determine whether the second trained machine learning model is optimized based on a result of comparing the first trained machine learning model and the second trained machine learning model; and provide the second trained machine learning model and the different applied physical properties when the second trained machine learning model is optimized.

2. The device of claim 1, wherein the one or more processors are further to: modify the different applied physical properties when the second trained machine learning model is not optimized; retrain the machine learning model, based on the modified different applied physical properties, to generate the second trained machine learning model; and repeat the modifying the different applied physical properties and the retraining until the second trained machine learning model is optimized.

3. The device of claim 1, wherein the one or more processors are further to: utilize the second trained machine learning model and the different applied physical properties, when the second trained machine learning model is optimized, to predict an unknown object.

4. The device of claim 1, wherein the one or more processors are further to: receive the machine learning model and the augmentation policy, wherein the augmentation policy includes information indicating how the different physical properties are to be applied to each of the augmented data sequences.

5. The device of claim 1, wherein the machine learning model includes one or more of: a single shot multibox detector (SSD) model, a region-based fully convolutional network (R-FCN) model, a region-based convolution network (R-CNN) model, a fast R-CNN model, or a faster R-CNN model.

6. The device of claim 1, wherein the one or more processors are further to: modify the different applied physical properties, when the second trained machine learning model is not optimized, based on a hyperparameter optimization technique, wherein the hyperparameter optimization technique includes one or more of: a grid search technique, a random search technique, a Bayesian optimization technique, a gradient-based optimization technique, or an evolutionary optimization technique.

7. The device of claim 1, wherein, the one or more processors are further to: test the first trained machine learning model to generate first test results; test the second trained machine learning model to generate second test results; compare the first test results and the second test results; and determine whether the second trained machine learning model is optimized based on a result of comparing the first test results and the second test results.

8. A non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors, cause the one or more processors to: receive images of a video stream, three-dimensional models for objects in the images, and physical property data for the objects, the images of the video stream including metadata that identifies at least two of: the images of the video stream, the objects in the images, classes associated with the objects, boundary boxes for the images, coordinates associated with the objects in the images, or names of the objects, the three-dimensional models including at least two of: three-dimensional representations of the objects, three-dimensional coordinates associated with the objects, normal vectors associated with the objects, or the names of the objects, the physical property data including at least two of: the names of the objects, information associated with deformations of the objects, information associated with gravities for the objects, information associated with rotations of the objects, information associated with renderings of the objects, or information associated with collisions of the objects; map the three-dimensional models and the physical property data to the objects in the images to generate augmented data sequences with the objects; apply different physical properties, of the physical property data, to the objects in the augmented data sequences to generate augmented data sequences with different applied physical properties; train a machine learning model based on the images of the video stream to generate a first machine learning model; train the machine learning model, based on the augmented data sequences with the different applied physical properties, to generate a second machine learning model; test the first machine learning model and the second machine learning model to generate first test results and second test results, respectively; determine whether the second machine learning model is optimized based on comparing the first test results and the second test results; and utilize the second machine learning model and the different applied physical properties, when the second machine learning model is optimized, to make a prediction.

9. The non-transitory computer-readable medium of claim 8, wherein the instructions further comprise: one or more instructions that, when executed by the one or more processors, cause the one or more processors to: provide the second machine learning model and the different applied physical properties when the second machine learning model is optimized.

10. The non-transitory computer-readable medium of claim 8, wherein the instructions further comprise: one or more instructions that, when executed by the one or more processors, cause the one or more processors to: modify the different applied physical properties when the second machine learning model is not optimized; retrain the machine learning model, based on the modified different applied physical properties, to generate the second machine learning model; retest the second machine learning model to generate the second test results; and repeat the modifying the different applied physical properties, the retraining, and the retesting until the second machine learning model is optimized.

11. The non-transitory computer-readable medium of claim 8, wherein the different applied physical properties are configurable.

12. The non-transitory computer-readable medium of claim 8, wherein each of the first machine learning model and second machine learning model includes one or more of: a single shot multibox detector (SSD) model, a region-based fully convolutional network (R-FCN) model, a region-based convolution network (R-CNN) model, a fast R-CNN model, or a faster R-CNN model.

13. The non-transitory
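
The modify, retrain, and repeat loop of claims 2, 6, and 10 can be illustrated with a random search over physical property settings, one of the hyperparameter optimization techniques named in claim 6. The sketch below is a hedged illustration rather than the claimed implementation: `SEARCH_SPACE`, its property names and ranges, `train_and_test`, the `max_trials` cap, and the criterion that "optimized" means beating the baseline score are all assumptions introduced here.

```python
import random
from typing import Callable, Dict, Optional, Tuple

PhysicalProperties = Dict[str, float]

# Hypothetical search space: property names and ranges are illustrative only.
SEARCH_SPACE: Dict[str, Tuple[float, float]] = {
    "gravity": (0.0, 20.0),
    "rotation_deg": (0.0, 360.0),
    "deformation": (0.0, 1.0),
}


def sample_properties(rng: random.Random) -> PhysicalProperties:
    """Draw one random setting for every property in the search space."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def search_until_optimized(
    baseline_score: float,
    train_and_test: Callable[[PhysicalProperties], float],
    max_trials: int = 50,
    seed: int = 0,
) -> Tuple[Optional[PhysicalProperties], float, bool]:
    """Modify the properties, retrain, and retest until the score beats the baseline.

    max_trials is a practical guard added for this sketch; the claims
    describe repeating until the second model is optimized.
    """
    rng = random.Random(seed)
    best_props: Optional[PhysicalProperties] = None
    best_score = float("-inf")
    for _ in range(max_trials):
        props = sample_properties(rng)   # modify the applied properties
        score = train_and_test(props)    # retrain and retest (stand-in)
        if score > best_score:
            best_props, best_score = props, score
        if score > baseline_score:       # assumed "optimized" criterion
            return props, score, True
    return best_props, best_score, False


def toy_train_and_test(props: PhysicalProperties) -> float:
    # Stand-in only: scores settings by closeness to Earth gravity.
    return 1.0 - abs(props["gravity"] - 9.81) / 20.0


props, score, found = search_until_optimized(0.9, toy_train_and_test)
print(found, props)
```

Random search is used only because it is the simplest of the listed techniques to show in a few lines; a grid search or Bayesian optimizer could replace `sample_properties` without changing the surrounding loop.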

Assignees

Accenture Global Solutions Ltd

Classifications

  • for test results analysis · CPC title

  • Machine learning · CPC title

  • G06N3/126 · Primary

    Evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Learning methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Accenture Global Solutions Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/126. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 18, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).