What technology area does this patent fall under?

Primary CPC classification G06V10/772. Mapped technology areas include Physics.

When was this patent published?

Publication date: April 12, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Learning data augmentation strategies for object detection

US11301733B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11301733-B2
Application number	US-201916416848-A
Country	US
Kind code	B2
Filing date	May 20, 2019
Priority date	May 18, 2018
Publication date	Apr 12, 2022
Grant date	Apr 12, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.
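The iterative loop described in the abstract can be illustrated with a small sketch. It is not the patented implementation. The controller here is a plain softmax over per-operation logits rather than a neural network, the operation list is a hypothetical subset, and `toy_reward` is a made-up stand-in for "train the detector on augmented images, then evaluate it". The update rule is a standard REINFORCE step with a moving-average baseline, chosen only for illustration.

```python
import math
import random

# Hypothetical subset of a search space; names are illustrative only.
OPERATIONS = ["equalize", "rotate", "translate_x", "cutout"]


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(op_name):
    """Assumed stand-in for training and evaluating an object detector.

    A real system would train a detection model on the augmented images
    and return a performance measure; these fixed values are invented.
    """
    return {"equalize": 0.2, "rotate": 0.8, "translate_x": 0.5, "cutout": 0.3}[op_name]


def search(iterations=3000, lr=0.1, seed=0):
    """Run the select / evaluate / update loop and return final probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(OPERATIONS)  # controller parameters in this sketch
    baseline = 0.0                    # moving average of past rewards
    for _ in range(iterations):
        probs = softmax(logits)
        # 1. The controller selects an augmentation operation.
        idx = rng.choices(range(len(OPERATIONS)), weights=probs)[0]
        # 2-3. Augment, train, and evaluate (replaced here by a toy reward).
        reward = toy_reward(OPERATIONS[idx])
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # 4. REINFORCE update: grad of log pi(idx) w.r.t. logits is one_hot - probs.
        for j in range(len(logits)):
            grad = (1.0 if j == idx else 0.0) - probs[j]
            logits[j] += lr * advantage * grad
    return softmax(logits)


if __name__ == "__main__":
    for name, p in zip(OPERATIONS, search()):
        print(f"{name}: {p:.3f}")
```

In this toy setting the controller shifts probability toward whichever operation yields the highest reward. In the disclosure, the reward instead comes from the measured performance of the trained detection model.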

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system comprising: one or more processors; a controller model comprising a first neural network configured to select augmentation operations; and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: accessing a training dataset that comprises a plurality of training images, wherein one or more training images of the plurality of training images has been annotated with a bounding shape, and wherein a location of the bounding shape for the one or more training images corresponds to a location of a target object within the one or more training images; and for each of a plurality of iterations: selecting, by the first neural network of the controller model, a series of one or more augmentation operations; performing the series of one or more augmentation operations on each of the one or more training images to generate one or more augmented images; training a machine-learned object detection model based at least in part on the one or more augmented images, wherein the machine-learned object detection model comprises a second neural network configured to detect objects in imagery, and wherein the second neural network is separate from the first neural network; after training the machine-learned object detection model, evaluating one or more performance characteristics of the machine-learned object detection model; evaluating a reward function based at least in part on the one or more performance characteristics; and modifying one or more parameters of the first neural network based on the reward function.

2. The computing system of claim 1, wherein selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, the series of one or more augmentation operations from a defined search space that includes a plurality of available augmentation operations.

3. The computing system of claim 2, wherein the plurality of available augmentation operations comprise one or more color operations that modify color channel data associated with at least a portion of the one or more training images.

4. The computing system of claim 3, wherein the one or more color operations comprise one or more of: an auto contrast operation; an equalize operation; a solarize operation; a posterize operation; a contrast operation; a color balance operation; a brightness operation; a sharpness operation; and a cutout operation.

5. The computing system of claim 2, wherein the plurality of available augmentation operations comprise one or more geometric distortion operations that geometrically distort at least a portion of the one or more training images.

6. The computing system of claim 5, wherein the one or more geometric distortion operations comprise one or more of: a shear operation; a translate operation; a rotate operation; a flipping operation; and a shift bounding shape operation.

7. The computing system of claim 2, wherein the plurality of available augmentation operations comprise: one or more operations that augment the one or more training images without modification of the location of the bounding shape or the location of the target object within the one or more training images; and one or more operations that modify the location of the bounding shape and the location of the target object within the one or more training images.

8. The computing system of claim 1, wherein: selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, a respective probability of performance for each of the one or more augmentation operations; and performing the series of one or more augmentation operations comprises performing each of the one or more augmentation operations with probability equal to the respective probability of performance.

9. The computing system of claim 1, wherein: selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, a respective probability that each of the one or more augmentation operations will be applied only to the bounding shape of the one or more training images; and performing the series of one or more augmentation operations comprises applying each augmentation operation to only the bounding shape of the one or more training images with probability equal to the respective probability.

10. The computing system of claim 1, wherein: selecting, by the first neural network of the controller model, the series of one or more augmentation operations comprises selecting, by the first neural network of the controller model, a respective augmentation magnitude for at least one of the augmentation operations; and performing the series of one or more augmentation operations comprises performing the at least one of the augmentation operations according to the respective augmentation magnitude.

11. The computing system of claim 10, wherein selecting, by the first neural network of the controller model, the respective augmentation magnitude for at least one of the augmentation operations comprises selecting, by the first neural network of the controller model, the respective augmentation magnitude for at least one of the augmentation operations from a respective set of discrete and operation-specific available magnitudes, wherein the set of discrete and operation-specific available magnitudes comprise user-selected hyperparameters.

12. The computing system of claim 1, wherein modifying the one or more parameters of the first neural network based on the reward function comprises backpropagating the reward function through the first neural network to modify the one or more parameters of the first neural network.

13. The computing system of claim 1, wherein the controller model is configured to select the series of one or more augmentation operations through performance of evolutionary mutations, and wherein the operations further comprise, for each of the plurality of iterations, determining whether to retain or discard the series of one or more augmentation operations based at least in part on the one or more performance characteristics.

14. The computing system of claim 1, wherein training the machine-learned object detection model based at least in part on the one or more augmented images comprises: evaluating, for each augmented image, a loss function that compares a predicted location for the bounding shape of the augmented image that was predicted by the machine-learned object detection model based on the augmented image to a ground truth location for the bounding shape; and backpropagating the loss function through the machine-learned object detection model.

15. The computing system of claim 1, wherein, for each iteration, a number of augmentation operations in the series of one or more augmentation operations is selected by the controller model.

16. The computing system of claim 1, wherein, for each iteration, a number of augmentation operations in the series of one or more augmentation operations is a user-selected hyperparameter.

17. The computing system of claim 1, wherein the first neural network of the controller model comprises a recurrent neural network.
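As a rough illustration of claims 7, 8, 10 and 11, the sketch below applies a series of operations in which each operation has its own probability of being performed and a magnitude drawn from a discrete, operation-specific set. Everything in it is an assumption made for illustration: the image is a nested list of pixel values, the box is `(x_min, y_min, x_max, y_max)` with an exclusive `x_max`, the `MAGNITUDES` table is invented, and `translate_x` only handles non-negative shifts. One operation leaves the box untouched (a color operation) and one moves the box along with the pixels (a geometric operation).

```python
import random

# Hypothetical discrete, operation-specific magnitude sets.
MAGNITUDES = {
    "brightness": [0.5, 1.0, 1.5],  # pixel scale factors
    "translate_x": [0, 1, 2],       # shift in columns (non-negative only)
}


def brightness(image, box, magnitude):
    """Color operation: scale pixel values; the box location is unchanged."""
    scaled = [[min(255, int(p * magnitude)) for p in row] for row in image]
    return scaled, box


def translate_x(image, box, magnitude):
    """Geometric operation: shift pixels right and move the box with them."""
    width = len(image[0])
    shift = int(magnitude)
    # Pad with zeros on the left and crop back to the original width.
    shifted = [([0] * shift + row)[:width] for row in image]
    x_min, y_min, x_max, y_max = box
    # Clip the moved box to the image boundary.
    new_box = (min(x_min + shift, width), y_min, min(x_max + shift, width), y_max)
    return shifted, new_box


OPS = {"brightness": brightness, "translate_x": translate_x}


def apply_policy(image, box, policy, rng):
    """Apply each (op_name, probability, magnitude_index) entry in order.

    Each operation is performed with its own probability; otherwise it is
    skipped and the image and box pass through unchanged.
    """
    for name, prob, mag_idx in policy:
        if rng.random() < prob:
            image, box = OPS[name](image, box, MAGNITUDES[name][mag_idx])
    return image, box


if __name__ == "__main__":
    img = [[10, 20, 30, 40], [50, 60, 70, 80]]
    policy = [("brightness", 0.5, 2), ("translate_x", 0.5, 1)]
    print(apply_policy(img, (0, 0, 2, 2), policy, random.Random(0)))
```

Keeping the box update inside each operation means a caller cannot apply a geometric change to the pixels and forget the annotation. Bounding-shape-only application (claim 9) is not covered by this sketch.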

Assignees

Google LLC

Inventors

Classifications

G06V10/772 · Primary
Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries · CPC title
G06T11/10
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
G06F18/217
Validation; Performance evaluation; Active pattern learning techniques · CPC title
G06F18/24
Classification techniques · CPC title
G06T3/20
Linear translation of whole images or parts thereof, e.g. panning · CPC title

Patent family

Related publications grouped by family.

View patent family 68533799

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11301733B2 cover?: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation…
Who is the assignee on this patent?: Google LLC

Related patents

Training neural networks using data augmentation policies

Method and apparatus for recognizing image and method and apparatus for training recognition model based on data augmentation

Optical receipt processing

Targeted data augmentation using neural style transfer

Image analysis neural network systems
