Generating modified digital images utilizing a global and spatial autoencoder
US-2021358177-A1 · Nov 18, 2021 · US
US12169907B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12169907-B2 |
| Application number | US-202117534631-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 24, 2021 |
| Priority date | Nov 24, 2021 |
| Publication date | Dec 17, 2024 |
| Grant date | Dec 17, 2024 |
Official abstract text for this publication.
Methods and systems for generating a texturized image are disclosed. Some examples may include: receiving an input image, receiving an exemplar texture image, generating, using an encoder, a first latent code vector representation based on the input image, generating, using a generative adversarial network generator, a second latent code vector representation based on the exemplar texture image, blending the first latent code vector representation and the second latent code vector representation to obtain a blended latent code vector representation, generating, by the GAN generator, a texturized image based on the blended latent code vector representation and providing the texturized image as an output image.
Opening claim text (preview).
What is claimed is:

1. A method for generating a texturized image, the method comprising: receiving a plurality of exemplar stylistic images; training a first generative adversarial network (GAN) generator using transfer learning based on the received plurality of exemplar stylistic images; receiving a plurality of training images; training an encoder using the plurality of training images and another second GAN generator, the encoder trained for inversion by learning a posterior distribution of a fixed pre-trained GAN model, and the encoder using the fixed pre-trained GAN model as a decoder; receiving an input image; receiving an exemplar texture image; generating, using the encoder, a first latent code vector representation based on the input image; generating, using the first GAN generator, a second latent code vector representation based on the exemplar texture image; blending the first latent code vector representation and the second latent code vector representation to obtain a blended latent code vector representation by concatenating a first predetermined amount of first sub-codes of the first latent code vector representation and a second predetermined amount of last sub-codes of the second latent code vector representation; generating, by the first GAN generator, a texturized image based on the blended latent code vector representation; and providing the texturized image as an output.

2. The method of claim 1, wherein concatenating the first predetermined amount of first sub-codes of the first latent code vector representation and the second predetermined amount of last sub-codes of the second latent code vector representation comprises concatenating the first eight sub-codes of the first latent code vector representation and the last ten sub-codes of the second latent code vector representation.

3. The method of claim 1, wherein the encoder is a hierarchical variational autoencoder.

4. The method of claim 3, wherein the first latent code vector representation comprises 18×512 dimensions.

5. The method of claim 1, wherein the first GAN generator is an AgileGAN generator.

6. The method of claim 1, wherein training the encoder and training the first GAN generator are executed independently and in parallel.

7. A system, comprising: one or more hardware processors configured by machine-readable instructions to: receive a plurality of exemplar stylistic images; train a first generative adversarial network (GAN) generator using transfer learning based on the received plurality of exemplar stylistic images; receive a plurality of training images; train an encoder using the plurality of training images and second GAN generator, the encoder trained for inversion by learning a posterior distribution of a fixed pre-trained GAN model, and the encoder using the fixed pre-trained GAN model as a decoder; receive an input image; receive an exemplar texture image; generate, using the encoder, a first latent code vector representation based on the input image; generate, using the first GAN generator, a second latent code vector representation based on the exemplar texture image; blend the first latent code vector representation and the second latent code vector representation to obtain a blended latent code vector representation by concatenating a first predetermined amount of first sub-codes of the first latent code vector representation and a second predetermined amount of last sub-codes of the second latent code vector representation; generate, by the first GAN generator, a texturized image based on the blended latent code vector representation; and provide the texturized image as an output.

8. The system of claim 7, wherein concatenating the first predetermined amount of first sub-codes of the first latent code vector representation and the second predetermined amount of last sub-codes of the second latent code vector representation comprises concatenating the first eight sub-codes of the first latent code vector representation and the last ten sub-codes of the second latent code vector representation.

9. The system of claim 7, wherein the encoder is a hierarchical variational autoencoder.

10. The system of claim 9, wherein the first latent code vector representation comprises 18×512 dimensions.

11. The system of claim 7, wherein the first GAN generator is an AgileGAN generator.

12. The system of claim 7, wherein training the encoder and training the first GAN generator are executed independently and in parallel.

13. A non-transitory computer-readable storage medium comprising instructions being executable by one or more processors to cause the one or more processors to: receive a plurality of exemplar stylistic images; train a first generative adversarial network (GAN) generator using transfer learning based on the received plurality of exemplar stylistic images; receive a plurality of training images; train an encoder using the plurality of training images and second GAN generator, the encoder trained for inversion by learning a posterior distribution of a fixed pre-trained GAN model, and the encoder using the fixed pre-trained GAN model as a decoder; receive an input image; receive an exemplar texture image; generate, using the encoder, a first latent code vector representation based on the input image; generate, using the first GAN generator, a second latent code vector representation based on the exemplar texture image; blend the first latent code vector representation and the second latent code vector representation to obtain a blended latent code vector representation by concatenating a first predetermined amount of first sub-codes of the first latent code vector representation and a second predetermined amount of last sub-codes of the second latent code vector representation; generate, by the first GAN generator, a texturized image based on the blended latent code vector representation; and provide the texturized image as an output.

14. The computer-readable storage medium of claim 13, wherein concatenating the first predetermined amount of first sub-codes of the first latent code vector representation and the second predetermined amount of last sub-codes of the second latent code vector representation comprises concatenating the first eight sub-codes of the first latent code vector representation and the last ten sub-codes of the second latent code vector representation.

15. The computer-readable storage medium of claim 13, wherein the encoder is a hierarchical variational autoencoder.

16. The computer-readable storage medium of claim 15, wherein the first latent code vector representation comprises 18×512 dimensions.

17. The computer-readable storage medium of claim 13, wherein the first GAN generator is an AgileGAN generator.

18. The computer-readable storage medium of claim 13, wherein training the encoder and training the first GAN generator are executed independently and in parallel.
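The blending step recited in the claims (the first eight sub-codes of the first latent code concatenated with the last ten sub-codes of the second, for an 18×512 code) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function name `blend_latent_codes`, the list-of-lists representation, and the check that the two counts sum to the code length are assumptions made for illustration, and the encoder and GAN generator that would produce and consume these codes are omitted.

```python
from typing import List, Sequence

SubCode = Sequence[float]  # one sub-code, e.g. 512 values


def blend_latent_codes(
    first_code: Sequence[SubCode],
    second_code: Sequence[SubCode],
    n_first: int = 8,   # "first eight sub-codes" (claim 2)
    n_last: int = 10,   # "last ten sub-codes" (claim 2)
) -> List[SubCode]:
    """Concatenate the first n_first sub-codes of first_code with the
    last n_last sub-codes of second_code (hypothetical helper)."""
    if len(first_code) != len(second_code):
        raise ValueError("both latent codes must have the same number of sub-codes")
    total = len(first_code)
    # Assumption: the blended code keeps the original length so the same
    # generator can consume it; the claims do not state this check.
    if n_first < 0 or n_last < 0 or n_first + n_last != total:
        raise ValueError("n_first + n_last must equal the number of sub-codes")
    head = list(first_code[:n_first])
    # Slice from an explicit start index so n_last == 0 yields an empty tail.
    tail = list(second_code[total - n_last:])
    return head + tail


# Dummy 18x512 codes standing in for encoder and generator outputs.
content_code = [[0.0] * 512 for _ in range(18)]
texture_code = [[1.0] * 512 for _ in range(18)]
blended = blend_latent_codes(content_code, texture_code)
print(len(blended), len(blended[0]))  # 18 512
```

In this sketch the first eight rows come from the input-image code and the remaining ten from the exemplar-texture code, which mirrors the split in claims 2, 8, and 14.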
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Context-preserving transformations, e.g. by using an importance map (panospheric to cylindrical image transformations G06T3/12) · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
using neural networks · CPC title
Face · CPC title