Generating and modifying digital images using a joint feature style latent space of a generative neural network
US-2023316606-A1 · Oct 5, 2023 · US
US12260485B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12260485-B2 |
| Application number | US-202218046077-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 12, 2022 |
| Priority date | Oct 12, 2022 |
| Publication date | Mar 25, 2025 |
| Grant date | Mar 25, 2025 |
Official abstract text for this publication.
A method of generating a style image is described. The method includes receiving an input image of a subject. The method further includes encoding the input image using a first encoder of a generative adversarial network (GAN) to obtain a first latent code. The method further includes decoding the first latent code using a first decoder of the GAN to obtain a normalized style image of the subject, wherein the GAN is trained using a loss function according to semantic regions of the input image and the normalized style image.
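The abstract does not give a formula for the loss "according to semantic regions", so the following is only a minimal sketch of one way such a term could be computed. It assumes each image has already been turned into a per-pixel label map by a segmentation model. The names `region_mask` and `region_loss`, the label strings, and the toy label maps are all hypothetical. The `hair` and `skin` regions follow the examples named in the claims. A real training loss would use differentiable soft probabilities rather than hard labels.

```python
# Illustrative sketch only: the patent text does not specify this formula.
# Assumption: a segmentation model has produced one label string per pixel
# for the input image and for the normalized style image.

def region_mask(label_map, region):
    """Return a 0/1 mask (nested lists) marking pixels labeled `region`."""
    return [[1 if label == region else 0 for label in row] for row in label_map]


def region_loss(input_labels, output_labels, regions=("hair", "skin")):
    """Sum, over regions, of the fraction of pixels where the two masks differ."""
    total = 0.0
    for region in regions:
        mask_in = region_mask(input_labels, region)
        mask_out = region_mask(output_labels, region)
        diffs = [
            abs(a - b)
            for row_in, row_out in zip(mask_in, mask_out)
            for a, b in zip(row_in, row_out)
        ]
        total += sum(diffs) / len(diffs)
    return total


# Hypothetical 2x3 label maps: one for the input, one for the style image.
real_seg = [["hair", "hair", "bg"], ["skin", "skin", "bg"]]
style_seg = [["hair", "skin", "bg"], ["skin", "skin", "bg"]]

print(region_loss(real_seg, real_seg))   # identical maps give zero loss
print(region_loss(real_seg, style_seg))  # one relabeled pixel affects both regions
```

In this toy case a single pixel that moves from `hair` to `skin` is counted once in each region's mask, so the loss penalizes the mismatch in both regions.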
Opening claim text (preview).
What is claimed is:
1. A method of generating a style image, the method comprising: receiving an input image of a subject; encoding the input image using a first encoder of a generative adversarial network (GAN) to obtain a first latent code; decoding the first latent code using a first decoder of the GAN to obtain a normalized style image of the subject, wherein: the GAN is trained using a loss function according to semantic regions of the input image and the normalized style image, and a distribution prior of a W+ space is modeled for training the GAN by inverting a dataset of real face images using a second encoder that is pre-trained.
2. The method of claim 1, further comprising training the GAN by inverting the dataset of real face images to obtain a plurality of latent codes.
3. The method of claim 1, the second encoder is different from the first decoder.
4. The method of claim 1, wherein the second encoder is a pre-trained StyleGAN encoder.
5. The method of claim 1, wherein training the GAN further comprises performing a W+ space transfer learning from the second encoder to the first encoder.
6. The method of claim 5, wherein performing the W+ space transfer learning comprises using a normalized exemplar set with only neutral expressions of the subject.
7. The method of claim 5, wherein performing the W+ space transfer learning comprises using a normalized exemplar set with only neutral poses of the subject.
8. The method of claim 5, wherein performing the W+ space transfer learning comprises using a normalized exemplar set with only neutral lighting of the subject.
9. The method of claim 1, further comprising training the GAN using a difference between a first face segmentation model trained using real face images and a second face segmentation model using style exemplars as the loss function.
10. The method of claim 9, wherein the semantic regions include one or more of hair regions of the subject or skin regions of the subject.
11. A system for generating a style image, the system comprising: a processor; and memory storing instructions that, when executed by the processor, cause the system to perform a set of operations, the set of operations comprising: receiving an input image of a subject; encoding the input image using a first encoder of a generative adversarial network (GAN) to obtain a first latent code; decoding the first latent code using a first decoder of the GAN to obtain a normalized style image of the subject, wherein: the GAN is trained using a loss function according to semantic regions of the input image and the normalized style image, and a distribution prior of a W+ space is modeled for training the GAN by inverting a dataset of real face images using a second encoder that is pre-trained.
12. The method of claim 11, wherein the set of operations further comprise training the GAN by inverting the dataset of real face images to obtain a plurality of latent codes.
13. The method of claim 11, wherein the second encoder is different from the first decoder.
14. The method of claim 11, wherein the second encoder is a pre-trained StyleGAN encoder.
15. The method of claim 11, wherein the set of operations further comprise performing a W+ space transfer learning from the second encoder to the first encoder.
16. The method of claim 15, wherein the set of operations further comprise using a normalized exemplar set with only neutral expressions of the subject.
17. The method of claim 15, wherein the set of operations further comprise using a normalized exemplar set with only neutral poses of the subject.
18. The method of claim 15, wherein the set of operations further comprise using a normalized exemplar set with only neutral lighting of the subject.
19. The method of claim 11, wherein the set of operations further comprise training the GAN using a difference between a first face segmentation model trained using real face images and a second face segmentation model using style exemplars as the loss function.
20. The method of claim 19, wherein the semantic regions include one or more of hair regions of the subject or skin regions of the subject.
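The claims state that a distribution prior of the W+ space is modeled by inverting a dataset of real face images with a pre-trained second encoder, but they do not say what form that prior takes. As a rough sketch, assuming the prior is summarized by a per-dimension mean and standard deviation of the inverted latent codes, the step could look like the following. `pretrained_encoder` is a hypothetical stand-in that maps a tiny list of pixel values to a 2-dimensional code, not a real StyleGAN encoder, and `fit_wplus_prior` is an illustrative name.

```python
import math

# Illustrative sketch only. Assumption: the W+ prior is summarized by a
# per-dimension mean and standard deviation (a diagonal Gaussian). The
# patent text does not state the form of the prior.


def pretrained_encoder(image):
    """Hypothetical stand-in for the pre-trained second encoder.

    Maps a flat list of pixel values to a toy 2-dimensional latent code.
    """
    return [sum(image) / len(image), max(image)]


def fit_wplus_prior(latent_codes):
    """Return per-dimension (mean, std) over a list of flattened latent codes."""
    n = len(latent_codes)
    dim = len(latent_codes[0])
    mean = [sum(code[i] for code in latent_codes) / n for i in range(dim)]
    std = [
        math.sqrt(sum((code[i] - mean[i]) ** 2 for code in latent_codes) / n)
        for i in range(dim)
    ]
    return mean, std


# Invert a toy "dataset of real face images" into latent codes.
dataset = [[0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
codes = [pretrained_encoder(img) for img in dataset]

mean, std = fit_wplus_prior(codes)
print(mean, std)
```

Statistics of this kind could then be used during training, for example to sample latent codes or to regularize the first encoder's outputs toward the region of W+ occupied by real faces. That use is an assumption here and is not spelled out in the claims.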
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Face · CPC title
Training; Learning · CPC title
Artificial neural networks [ANN] · CPC title
Region-based segmentation · CPC title