Optimizing Supervised Generative Adversarial Networks via Latent Space Regularizations
US-2020349393-A1 · Nov 5, 2020 · US
US11869170B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11869170-B2 |
| Application number | US-201917293754-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 18, 2019 |
| Priority date | Nov 16, 2018 |
| Publication date | Jan 9, 2024 |
| Grant date | Jan 9, 2024 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes receiving a training image and a ground truth super-resolution image; processing a first training network input comprising the training image using the neural network to generate a first training super-resolution image; processing a first critic input generated from (i) the training image and (ii) the ground truth super-resolution image using a critic neural network to map the first critic input to a latent representation; processing a second critic input generated from (i) the training image and (ii) the first training super-resolution image using the critic neural network to map the second critic input to a latent representation; determining a gradient of a generator loss function that measures a distance between the latent representations of the critic inputs; and determining an update to the parameters.
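The training step described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the critic is a single fixed linear map, the "super-resolution network" is a one-parameter pixel-doubling function, the distance is squared Euclidean, and the gradient is estimated by finite differences rather than backpropagation. The names `critic_latent`, `toy_generator`, `generator_loss`, and `generator_step` are hypothetical.

```python
def critic_latent(low_res, high_res, weights):
    """Toy critic: map the (low_res, high_res) pair to a latent vector.

    Assumption: a single linear layer over the concatenated pixels.
    """
    features = list(low_res) + list(high_res)
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def perceptual_loss(latent_a, latent_b):
    """Squared Euclidean distance between two latent representations."""
    return sum((a - b) ** 2 for a, b in zip(latent_a, latent_b))


def toy_generator(low_res, scale):
    """Toy stand-in for the network: duplicate each pixel, multiply by `scale`."""
    return [scale * p for p in low_res for _ in range(2)]


def generator_loss(scale, low_res, ground_truth, weights):
    """Distance between the critic latents of the real and generated pairs."""
    generated = toy_generator(low_res, scale)
    z_real = critic_latent(low_res, ground_truth, weights)
    z_fake = critic_latent(low_res, generated, weights)
    return perceptual_loss(z_real, z_fake)


def generator_step(scale, low_res, ground_truth, weights, lr=0.05, eps=1e-6):
    """One gradient-descent update using a central finite-difference gradient."""
    loss_plus = generator_loss(scale + eps, low_res, ground_truth, weights)
    loss_minus = generator_loss(scale - eps, low_res, ground_truth, weights)
    grad = (loss_plus - loss_minus) / (2 * eps)
    return scale - lr * grad


# Illustrative data: a 2-pixel input and its 4-pixel ground truth.
LOW_RES = [1.0, 2.0]
GROUND_TRUTH = [1.0, 1.0, 2.0, 2.0]
CRITIC_WEIGHTS = [
    [0.1, 0.2, 0.3, 0.1, 0.2, 0.1],
    [0.0, 0.1, 0.2, 0.2, 0.1, 0.3],
]

scale = 0.5
loss_before = generator_loss(scale, LOW_RES, GROUND_TRUTH, CRITIC_WEIGHTS)
scale = generator_step(scale, LOW_RES, GROUND_TRUTH, CRITIC_WEIGHTS)
loss_after = generator_loss(scale, LOW_RES, GROUND_TRUTH, CRITIC_WEIGHTS)
```

In this sketch only the generator parameter is updated while the critic weights stay fixed, mirroring the abstract's update to the network parameters from the gradient of the generator loss. A real implementation would presumably use deep networks and automatic differentiation.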
Opening claim text (preview).
What is claimed is:
1. A method of training a super-resolution neural network having a plurality of super-resolution parameters, wherein the super-resolution neural network is configured to receive a network input comprising an input image having a first resolution and to process the network input to generate a super-resolution output image that is a version of the input image with a second, higher resolution, wherein the network input includes a vector representing a desired location in a space of possible super-resolution output images, and the method comprising: receiving a training input image and a ground truth super-resolution output image for the training image; processing a first training network input comprising the training input image using the super-resolution neural network and in accordance with current values of the super-resolution parameters to generate a first training super-resolution output image, wherein the first training network input includes a first vector representing a location of the ground truth super-resolution output image in the space of possible super-resolution output images; processing a first critic input generated from (i) the training input image and (ii) the ground truth super-resolution output image using a critic neural network having a plurality of critic parameters and in accordance with current values of the critic parameters, wherein the critic neural network is configured to receive the first critic input and to process the first critic input in accordance with the current values of the critic parameters to: map the first critic input to a latent representation of the first critic input in a perceptual latent space; processing a second critic input generated from (i) the training input image and (ii) the first training super-resolution output image generated by the super-resolution neural network using the critic neural network and in accordance with the current values of the critic parameters to map the second critic input to a latent representation of the second critic input in the perceptual latent space; determining a gradient with respect to the super-resolution parameters of a generator loss function that includes a perceptual loss that measures a distance between the latent representation of the first critic input and the latent representation of the second critic input; and determining, from the gradient, an update to the current values of the super-resolution parameters.
2. The method of claim 1, wherein the first network input includes a vector of zeroes.
3. The method of claim 1, wherein the first network input includes an embedding of the ground truth super-resolution output image.
4. The method of claim 3, wherein the embedding is generated from the ground truth super-resolution output image by an embedding neural network.
5. The method of claim 1, wherein the critic neural network is further configured to map each latent representation of each critic input to a critic score that represents a perceptual similarity between the images used to generate the critic input.
6. The method of claim 5, wherein processing the first critic input generated from (i) the training input image and (ii) the ground truth super-resolution output image using a critic neural network further comprises generating, from the latent representation of the first critic input, a first critic score that represents a perceptual similarity between the (i) the training input image and (ii) the ground truth super-resolution output image.
7. The method of claim 6, further comprising: processing a second training network input comprising the training input image using the super-resolution neural network and in accordance with current values of the super-resolution parameters to generate a second training super-resolution output image; and processing a third critic input generated from (i) the training input image and (ii) the second training super-resolution output image generated by the super-resolution neural network using the critic neural network and in accordance with the current values of the critic parameters to generate a latent representation of the third critic input in the perceptual latent space and a third critic score that represents a perceptual similarity between the training input image and the second training super-resolution output image; wherein the generator loss function also includes a first Generative Adversarial Network (GAN) loss that encourages increased perceptual similarity between the training input image and the second training super-resolution output image.
8. The method of claim 7, wherein the second training network input comprises a random vector representing a random location in the space of possible super-resolution output images.
9. The method of claim 7, further comprising: determining a gradient with respect to the critic parameters of a critic loss function that includes a second GAN loss to encourage decreased perceptual similarity between the training input image and the second training super-resolution output image while encouraging increased perceptual similarity between the training input image and the ground truth super-resolution output image; and determining, from the gradient, an update to the current values of the critic parameters.
10. The method of claim 1, wherein each critic input includes a super-resolution image and an absolute pixel-wise distance between (i) a low-resolution image and (ii) a downscaled version of the super-resolution image.
11. The method of claim 10, wherein the first critic input includes (i) the ground truth super-resolution output image and (ii) a zero image, and wherein the second critic input includes (iii) the first training super-resolution output image generated by the super-resolution neural network and (iv) an absolute pixel-wise distance between (v) the training input image and (vi) a downscaled version of the first training super-resolution image.
12. A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations for training a super-resolution neural network having a plurality of super-resolution parameters, wherein the super-resolution neural network is configured to receive a network input comprising an input image having a first resolution and to process the network input to generate a super-resolution output image that is a version of the input image with a second, higher resolution, wherein the network input includes a vector representing a desired location in a space of possible super-resolution output images, and the operations comprising: receiving a training input image and a ground truth super-resolution output image for the training image; processing a first training network input comprising the training input image using the super-resolution neural network and in accordance with current values of the super-resolution parameters to generate a first training super-resolution output image, wherein the first training network input includes a first vector representing a location of the ground truth super-resolution output image in the space of possible super-resolution output images; processing a first critic input generated from (i) the training input image and (ii) the ground truth super-resolution output image using a critic neural network having a plurality of critic parameters and in accordance with current values of the critic parameters, wherein the critic neural network is configured to receive the first critic input and to process the first critic input in accordance with the current values of the critic parame
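The critic-input construction recited in claims 10 and 11 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are plain 2D lists of floats, downscaling is done by average pooling with an integer factor (the claims do not name a downscaling method), and the zero image is assumed to have the low-resolution size. The function names `downscale`, `abs_distance`, `critic_input_generated`, and `critic_input_ground_truth` are hypothetical.

```python
def downscale(image, factor):
    """Average-pool a 2D image (list of rows) by an integer factor.

    Assumption: average pooling; the claims only say "downscaled version".
    """
    height, width = len(image), len(image[0])
    return [
        [
            sum(
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / (factor * factor)
            for c in range(width // factor)
        ]
        for r in range(height // factor)
    ]


def abs_distance(image_a, image_b):
    """Absolute pixel-wise distance between two images of equal size."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def critic_input_generated(low_res, super_res, factor):
    """Critic input for a generated image: the image plus its consistency map."""
    return super_res, abs_distance(low_res, downscale(super_res, factor))


def critic_input_ground_truth(low_res, ground_truth):
    """Critic input for the ground truth: the image plus a zero image.

    Assumption: the zero image has the same size as the low-resolution image.
    """
    zero_image = [[0.0] * len(low_res[0]) for _ in low_res]
    return ground_truth, zero_image


# Illustrative data: a 1x2 low-resolution image and a 2x4 generated image.
low_res = [[1.0, 2.0]]
super_res = [[1.0, 1.0, 2.0, 4.0],
             [1.0, 1.0, 2.0, 4.0]]

_, distance_map = critic_input_generated(low_res, super_res, factor=2)
_, zero_map = critic_input_ground_truth(low_res, super_res)
```

Under these assumptions the distance map is zero wherever the generated image downscales back to the low-resolution input, which is why a zero image is a natural counterpart for the ground truth pair.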
Adversarial learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Generative networks · CPC title
using neural networks · CPC title