System for Generating Image, and Non-Transitory Computer-Readable Medium
US-2022415024-A1 · Dec 29, 2022 · US
US12299844B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12299844-B2 |
| Application number | US-202418440248-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 13, 2024 |
| Priority date | Mar 12, 2021 |
| Publication date | May 13, 2025 |
| Grant date | May 13, 2025 |
Official abstract text for this publication.
The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately, efficiently, and flexibly generating harmonized digital images utilizing a self-supervised image harmonization neural network. In particular, the disclosed systems can implement, and learn parameters for, a self-supervised image harmonization neural network to extract content from one digital image (disentangled from its appearance) and appearance from another digital image (disentangled from its content). For example, the disclosed systems can utilize a dual data augmentation method to generate diverse triplets for parameter learning (including input digital images, reference digital images, and pseudo ground truth digital images), via cropping a digital image with perturbations using three-dimensional color lookup tables (“LUTs”). Additionally, the disclosed systems can utilize the self-supervised image harmonization neural network to generate harmonized digital images that depict content from one digital image having the appearance of another digital image.
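The triplet generation described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the helper names (`make_lut`, `apply_lut`, `make_triplet`), the gain-based LUT perturbation, the gain range, and the crop sampling are all assumptions made for illustration. The pseudo ground truth is built here as the content crop's region recolored with the reference crop's LUT, which is one reading of the abstract and of claim 6.

```python
import random

def make_lut(size, gain):
    """Build a size^3 color LUT (size >= 2): identity mapping scaled by a per-channel gain."""
    step = size - 1
    return {
        (i, j, k): tuple(min(1.0, max(0.0, c / step * g)) for c, g in zip((i, j, k), gain))
        for i in range(size) for j in range(size) for k in range(size)
    }

def apply_lut(pixel, lut, size):
    """Map one RGB pixel (floats in [0, 1]) through the LUT with trilinear interpolation."""
    pos = [min(max(c, 0.0), 1.0) * (size - 1) for c in pixel]
    lo = [min(int(p), size - 2) for p in pos]
    frac = [p - l for p, l in zip(pos, lo)]
    out = [0.0, 0.0, 0.0]
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                # Weight of this corner of the surrounding LUT cell
                w = ((frac[0] if di else 1 - frac[0])
                     * (frac[1] if dj else 1 - frac[1])
                     * (frac[2] if dk else 1 - frac[2]))
                node = lut[(lo[0] + di, lo[1] + dj, lo[2] + dk)]
                for ch in range(3):
                    out[ch] += w * node[ch]
    return tuple(out)

def crop(image, top, left, h, w):
    """Cut an h x w window out of an image stored as rows of RGB tuples."""
    return [row[left:left + w] for row in image[top:top + h]]

def recolor(image, lut, size):
    """Apply the LUT to every pixel (appearance augmentation)."""
    return [[apply_lut(px, lut, size) for px in row] for row in image]

def make_triplet(image, crop_h, crop_w, rng, size=4):
    """Return (input crop, reference crop, pseudo ground truth) from a single image."""
    height, width = len(image), len(image[0])
    # Two random appearance perturbations (hypothetical gain range)
    lut_a = make_lut(size, [rng.uniform(0.6, 1.0) for _ in range(3)])
    lut_b = make_lut(size, [rng.uniform(0.6, 1.0) for _ in range(3)])
    # Two random crop positions (content perturbation)
    box_c = (rng.randrange(height - crop_h + 1), rng.randrange(width - crop_w + 1))
    box_r = (rng.randrange(height - crop_h + 1), rng.randrange(width - crop_w + 1))
    content = recolor(crop(image, *box_c, crop_h, crop_w), lut_a, size)
    reference = recolor(crop(image, *box_r, crop_h, crop_w), lut_b, size)
    # Same region as the input crop, same appearance as the reference crop
    pseudo_gt = recolor(crop(image, *box_c, crop_h, crop_w), lut_b, size)
    return content, reference, pseudo_gt

rng = random.Random(0)
image = [[tuple(rng.random() for _ in range(3)) for _ in range(8)] for _ in range(8)]
content, reference, pseudo_gt = make_triplet(image, 4, 4, rng)
```

Because all three crops come from one source image, no manually harmonized ground truth is needed; the supervision signal is produced by the augmentation itself.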
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method comprising: generating, from a digital image, a plurality of digital image crops utilizing dual data augmentation to augment one or more of content or appearance of the plurality of digital image crops, wherein generating the plurality of digital image crops comprises one or more of: augmenting an appearance of a first digital image crop by modifying one or more appearance characteristics of the first digital image crop; or augmenting an appearance of a second digital image crop by modifying one or more appearance characteristics of the second digital image crop; selecting, from among the plurality of digital image crops, one or more pairs of digital image crops comprising content crops and appearance crops; learning, from the one or more pairs of digital image crops, one or more parameters relative to a neural network appearance encoder such that the neural network appearance encoder disentangles appearance from content by generating appearance codes comprising latent features that represent appearance characteristics; learning, from the one or more pairs of digital image crops, one or more parameters relative to a neural network content encoder such that the neural network content encoder disentangles content from appearance by generating content codes comprising latent features that represent content; and learning one or more parameters of a neural network decoder that generates a modified image by decoding a concatenation of a content code and an appearance code.

2. The computer-implemented method of claim 1, wherein generating the plurality of digital image crops comprises utilizing the dual data augmentation to: generate, from the digital image, a first digital image crop and a second digital image crop, the second digital image crop depicting one or more different pixels than the first digital image crop; augment an appearance of the first digital image crop by modifying one or more appearance characteristics of the first digital image crop; and augment an appearance of the second digital image crop by modifying one or more appearance characteristics of the second digital image crop differently from the one or more appearance characteristics of the first digital image crop.

3. The computer-implemented method of claim 1, wherein generating the plurality of digital image crops comprises utilizing the dual data augmentation to: generate, from the digital image, a first appearance-augmented digital image by performing a first modification to one or more appearance characteristics of the digital image; generate, from the digital image, a second appearance-augmented digital image different from the first appearance-augmented digital image by performing a second modification to one or more appearance characteristics of the digital image; and crop the first appearance-augmented digital image and the second appearance-augmented digital image such that the first appearance-augmented digital image and the second appearance-augmented digital image depict at least some overlapping portion.

4. The computer-implemented method of claim 1, wherein selecting the one or more pairs of digital image crops comprises: selecting, from among the plurality of digital image crops, a content crop comprising a cropped portion of the digital image for input into the neural network content encoder; and selecting, from among the plurality of digital image crops, an appearance crop comprising an appearance-augmented portion of the digital image for input into the neural network appearance encoder.

5. The computer-implemented method of claim 1, further comprising: extracting a first content code from a content crop of the plurality of digital image crops utilizing the neural network content encoder; extracting a first appearance code from an appearance crop of the plurality of digital image crops utilizing the neural network appearance encoder; and generating a modified digital image from the content crop and the appearance crop by combining the first content code and the first appearance code utilizing the neural network decoder.

6. The computer-implemented method of claim 5, wherein learning the one or more parameters for the neural network appearance encoder comprises: generating, by cropping and augmenting an appearance of a portion of the digital image, a pseudo ground truth crop comprising a content corresponding to the first content code of the modified digital image and an appearance corresponding to the first appearance code of the modified digital image; comparing the modified digital image with the pseudo ground truth crop utilizing a harmonization loss function; and modifying the one or more parameters for the neural network appearance encoder to reduce a measure of loss associated with the harmonization loss function.

7. The computer-implemented method of claim 5, wherein learning the one or more parameters for the neural network content encoder comprises: generating, by cropping and augmenting an appearance of a portion of the digital image, a pseudo ground truth crop comprising a content corresponding to the first content code of the modified digital image and an appearance corresponding to the first appearance code of the modified digital image; comparing the modified digital image with the pseudo ground truth crop utilizing a reconstruction loss function; and modifying the one or more parameters for the neural network content encoder to reduce a measure of loss associated with the reconstruction loss function.

8. The computer-implemented method of claim 1, wherein generating the plurality of digital image crops comprises utilizing the dual data augmentation to augment appearance utilizing a three-dimensional lookup table to modify colors of the digital image.

9. A system comprising: a memory component comprising a digital image; and a processing device coupled to the memory component, the processing device to perform operations comprising: generating, from a digital image, a plurality of digital image crops utilizing dual data augmentation to augment one or more of content or appearance of the plurality of digital image crops, wherein generating the plurality of digital image crops comprises one or more of: augmenting an appearance of a first digital image crop by modifying one or more appearance characteristics of the first digital image crop; or augmenting an appearance of a second digital image crop by modifying one or more appearance characteristics of the second digital image crop; selecting, from among the plurality of digital image crops, one or more pairs of digital image crops comprising content crops and appearance crops; learning, from the one or more pairs of digital image crops, one or more parameters relative to a neural network appearance encoder such that the neural network appearance encoder disentangles appearance from content by generating an appearance code comprising latent features that represent appearance characteristics; learning, from the one or more pairs of digital image crops, one or more parameters relative to a neural network content encoder such that the neural network content encoder disentangles content from appearance by generating a content code comprising latent features that represent content; and learning one or more parameters of a neural network decoder that generates a modified image by decoding a concatenation of a content code and an appearance code.

10. The system of claim 9, wherein generating the plurality of digital image crops comprises utilizing the dual data augmentation to: generate, from the digital image, a first digital image crop and a second digital image crop, the sec
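The data flow recited in claims 1 and 5 to 7 (extract a content code, extract an appearance code, decode their concatenation, compare against a pseudo ground truth, and adjust parameters to reduce the loss) can be sketched with a toy example. Everything here is an assumption for illustration: the hand-written "encoders" are fixed functions standing in for learned neural networks, only three decoder weights are trained, crops are flat pixel lists, appearance changes are simple per-channel gains, and mean squared error stands in for the harmonization and reconstruction loss functions, whose form this preview does not specify.

```python
import random

def appearance_encoder(crop):
    """Toy appearance code: the per-channel mean color (3 values)."""
    n = len(crop)
    return [sum(px[ch] for px in crop) / n for ch in range(3)]

def content_encoder(crop):
    """Toy content code: pixels divided by their channel mean, which removes a global gain."""
    mean = appearance_encoder(crop)
    return [px[ch] / mean[ch] for px in crop for ch in range(3)]

def decoder(code, weights):
    """Decode a concatenated [content | appearance] code into flat channel values."""
    content, appearance = code[:-3], code[-3:]
    return [weights[i % 3] * v * appearance[i % 3] for i, v in enumerate(content)]

def mse(a, b):
    """Mean squared error, used here as a stand-in loss."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def train_step(code, target, weights, lr=0.5):
    """One gradient-descent step on the decoder weights for the MSE loss."""
    content, appearance = code[:-3], code[-3:]
    out = decoder(code, weights)
    grads = [0.0, 0.0, 0.0]
    for i, (o, t) in enumerate(zip(out, target)):
        ch = i % 3
        grads[ch] += 2 * (o - t) * content[i] * appearance[ch] / len(out)
    return [w - lr * g for w, g in zip(weights, grads)]

rng = random.Random(0)
# Two different regions of one hypothetical source image, as flat pixel lists
region_1 = [tuple(rng.uniform(0.2, 0.9) for _ in range(3)) for _ in range(16)]
region_2 = [tuple(rng.uniform(0.2, 0.9) for _ in range(3)) for _ in range(16)]
gain_a, gain_b = (0.9, 0.7, 0.8), (0.6, 1.0, 0.75)  # two appearance augmentations

content_crop = [tuple(c * g for c, g in zip(px, gain_a)) for px in region_1]
appearance_crop = [tuple(c * g for c, g in zip(px, gain_b)) for px in region_2]
# Pseudo ground truth: content of region_1 with the appearance applied to region_2
pseudo_gt = [c * g for px in region_1 for c, g in zip(px, gain_b)]

# Concatenate the content code and the appearance code before decoding
code = content_encoder(content_crop) + appearance_encoder(appearance_crop)
weights = [0.5, 0.5, 0.5]
loss_before = mse(decoder(code, weights), pseudo_gt)
for _ in range(50):
    weights = train_step(code, pseudo_gt, weights)
loss_after = mse(decoder(code, weights), pseudo_gt)
```

In this toy setup the content code is unchanged by a global per-channel gain, so the decoder can only recover the target appearance from the appearance half of the concatenated code. That is the intuition behind disentangling the two codes, although the claimed encoders learn this separation rather than having it built in.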
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Learning methods · CPC title