Training neural networks using data augmentation policies
US-2021097348-A1 · Apr 1, 2021 · US
US11301733B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11301733-B2 |
| Application number | US-201916416848-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 20, 2019 |
| Priority date | May 18, 2018 |
| Publication date | Apr 12, 2022 |
| Grant date | Apr 12, 2022 |
Official abstract text for this publication.
Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.
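The distinction the abstract draws between operations that leave the bounding shape in place and operations that move it can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the image format (rows of grayscale integers), the box convention `(x_min, y_min, x_max, y_max)` with exclusive maximum edges, and the helper names `brightness` and `flip_horizontal`. None of it is taken from the patent text.

```python
from typing import List, Tuple

Image = List[List[int]]          # assumed format: grayscale rows, values 0..255
Box = Tuple[int, int, int, int]  # assumed: (x_min, y_min, x_max, y_max), max edges exclusive


def brightness(image: Image, box: Box, delta: int) -> Tuple[Image, Box]:
    """Color-style operation: pixel values change, the box stays where it is."""
    out = [[min(255, max(0, p + delta)) for p in row] for row in image]
    return out, box


def flip_horizontal(image: Image, box: Box) -> Tuple[Image, Box]:
    """Geometric operation: mirror the image and move the box with the object."""
    width = len(image[0])
    x_min, y_min, x_max, y_max = box
    out = [row[::-1] for row in image]
    # Mirroring maps column x to (width - 1 - x); with exclusive max edges
    # the new box spans [width - x_max, width - x_min).
    return out, (width - x_max, y_min, width - x_min, y_max)


if __name__ == "__main__":
    img = [[0, 200, 200, 0, 0, 0],
           [0, 200, 200, 0, 0, 0]]
    box = (1, 0, 3, 2)
    print(brightness(img, box, 100))
    print(flip_horizontal(img, box))
```

The point of returning the box alongside the image in both cases is that a policy can then chain operations of either kind without caring which ones touch the annotation.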
Opening claim text (preview).
What is claimed is:
1. A computing system comprising: one or more processors; a controller model comprising a first neural network configured to select augmentation operations; and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: accessing a training dataset that comprises a plurality of training images, wherein one or more training images of the plurality of training images has been annotated with a bounding shape, and wherein a location of the bounding shape for the one or more training images corresponds to a location of a target object within the one or more training images; and for each of a plurality of iterations: selecting, by the first neural network of the controller model, a series of one or more augmentation operations; performing the series of one or more augmentation operations on each of the one or more training images to generate one or more augmented images; training a machine-learned object detection model based at least in part on the one or more augmented images, wherein the machine-learned object detection model comprises a second neural network configured to detect objects in imagery, and wherein the second neural network is separate from the first neural network; after training the machine-learned object detection model, evaluating one or more performance characteristics of the machine-learned object detection model; evaluating a reward function based at least in part on the one or more performance characteristics; and modifying one or more parameters of the first neural network based on the reward function.
2. The computing system of claim 1, wherein selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, the series of one or more augmentation operations from a defined search space that includes a plurality of available augmentation operations.
3. The computing system of claim 2, wherein the plurality of available augmentation operations comprise one or more color operations that modify color channel data associated with at least a portion of the one or more training images.
4. The computing system of claim 3, wherein the one or more color operations comprise one or more of: an auto contrast operation; an equalize operation; a solarize operation; a posterize operation; a contrast operation; a color balance operation; a brightness operation; a sharpness operation; and a cutout operation.
5. The computing system of claim 2, wherein the plurality of available augmentation operations comprise one or more geometric distortion operations that geometrically distort at least a portion of the one or more training images.
6. The computing system of claim 5, wherein the one or more geometric distortion operations comprise one or more of: a shear operation; a translate operation; a rotate operation; a flipping operation; and a shift bounding shape operation.
7. The computing system of claim 2, wherein the plurality of available augmentation operations comprise: one or more operations that augment the one or more training images without modification of the location of the bounding shape or the location of the target object within the one or more training images; and one or more operations that modify the location of the bounding shape and the location of the target object within the one or more training images.
8. The computing system of claim 1, wherein: selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, a respective probability of performance for each of the one or more augmentation operations; and performing the series of one or more augmentation operations comprises performing each of the one or more augmentation operations with probability equal to the respective probability of performance.
9. The computing system of claim 1, wherein: selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, a respective probability that each of the one or more augmentation operations will be applied only to the bounding shape of the one or more training images; and performing the series of one or more augmentation operations comprises applying each augmentation operation to only the bounding shape of the one or more training images with probability equal to the respective probability.
10. The computing system of claim 1, wherein: selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, a respective augmentation magnitude for at least one of the augmentation operations; and performing the series of one or more augmentation operations comprises performing the at least one of the augmentation operations according to the respective augmentation magnitude.
11. The computing system of claim 10, wherein selecting, by the first neural network of the controller model, the respective augmentation magnitude for at least one of the augmentation operations comprises selecting, by the first neural network of the controller model, the respective augmentation magnitude for at least one of the augmentation operations from a respective set of discrete and operation-specific available magnitudes, wherein the set of discrete and operation-specific available magnitudes comprise user-selected hyperparameters.
12. The computing system of claim 1, wherein modifying the one or more parameters of the first neural network based on the reward function comprises backpropagating the reward function through the first neural network to modify the one or more parameters of the first neural network.
13. The computing system of claim 1, wherein the controller model is configured to select the series of one or more augmentation operations through performance of evolutionary mutations, and wherein the operations further comprise, for each of the plurality of iterations, determining whether to retain or discard the series of one or more augmentation operations based at least in part on the one or more performance characteristics.
14. The computing system of claim 1, wherein training the machine-learned object detection model based at least in part on the one or more augmented images comprises: evaluating, for each augmented image, a loss function that compares a predicted location for the bounding shape of the augmented image that was predicted by the machine-learned object detection model based on the augmented image to a ground truth location for the bounding shape; and backpropagating the loss function through the machine-learned object detection model.
15. The computing system of claim 1, wherein, for each iteration, a number of augmentation operations in the series of one or more augmentation operations is selected by the controller model.
16. The computing system of claim 1, wherein, for each iteration, a number of augmentation operations in the series of one or more augmentation operations is a user-selected hyperparameter.
17. The computing system of claim 1, wherein the first neural network of the controller model comprises a recurrent neural network.
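The iteration recited in claim 1 (select, augment, train, evaluate, reward, update), together with the probability and discrete-magnitude selections of claims 8, 10 and 11, can be sketched as a toy search loop. This is a rough illustration under stated assumptions, not the claimed system. The controller here is a plain table of logits rather than a neural network. The update is a REINFORCE-style policy-gradient step with a moving-average baseline, swapped in for whatever update the patent uses. `train_and_evaluate` is a placeholder returning a made-up reward instead of training a detector. The names `OPS`, `PROBS`, `MAGS`, `TableController` and `search` are all hypothetical.

```python
import math
import random

OPS = ["equalize", "rotate", "translate_x"]  # hypothetical subset of a search space
PROBS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]       # assumed discrete application probabilities
MAGS = [0, 1, 2, 3, 4, 5]                     # assumed discrete magnitudes


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TableController:
    """Stand-in controller: one logit vector per decision (not a neural network)."""

    def __init__(self, rng):
        self.rng = rng
        self.logits = {
            "op": [0.0] * len(OPS),
            "prob": [0.0] * len(PROBS),
            "mag": [0.0] * len(MAGS),
        }

    def sample(self):
        """Sample one index per decision from the current softmax distributions."""
        choice = {}
        for key, logits in self.logits.items():
            weights = softmax(logits)
            choice[key] = self.rng.choices(range(len(logits)), weights=weights)[0]
        return choice

    def update(self, choice, advantage, lr=0.1):
        """REINFORCE step: d log pi(a) / d logit_i = 1[i == a] - pi_i."""
        for key, picked in choice.items():
            probs = softmax(self.logits[key])
            for i, p in enumerate(probs):
                grad = (1.0 if i == picked else 0.0) - p
                self.logits[key][i] += lr * advantage * grad


def train_and_evaluate(choice):
    """Placeholder for training a detector on augmented images and scoring it.

    The reward is invented for this sketch; it peaks at ("rotate", p=0.6, magnitude=3).
    """
    reward = 1.0 if OPS[choice["op"]] == "rotate" else 0.0
    reward -= abs(PROBS[choice["prob"]] - 0.6)
    reward -= 0.1 * abs(MAGS[choice["mag"]] - 3)
    return reward


def search(iterations=2000, seed=0):
    rng = random.Random(seed)
    controller = TableController(rng)
    baseline = 0.0
    for _ in range(iterations):
        choice = controller.sample()                # select an operation, probability, magnitude
        reward = train_and_evaluate(choice)         # train and evaluate (stubbed here)
        controller.update(choice, reward - baseline)
        baseline = 0.9 * baseline + 0.1 * reward    # moving-average baseline
    return controller


if __name__ == "__main__":
    controller = search()
    for key, logits in controller.logits.items():
        print(key, [round(p, 2) for p in softmax(logits)])
```

In a real setting the stubbed `train_and_evaluate` is by far the most expensive step, since each call stands for a full training run of the object detection model; the controller update itself is cheap by comparison.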
Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries · CPC title
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Validation; Performance evaluation; Active pattern learning techniques · CPC title
Classification techniques · CPC title
Linear translation of whole images or parts thereof, e.g. panning · CPC title