Inserting three-dimensional objects into digital images with consistent lighting via global and local lighting information
US-2023037591-A1 · Feb 9, 2023 · US
US2025328987A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025328987-A1 |
| Application number | US-202418640429-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 19, 2024 |
| Priority date | Apr 19, 2024 |
| Publication date | Oct 23, 2025 |
| Grant date | — |
Official abstract text for this publication.
Methods, systems, and non-transitory computer readable storage media are disclosed for generating digital images with a diffusion-based generative neural network conditioned on background-extracted lighting features. The disclosed system determines, in response to a request to generate a digital image, a target background image for inserting a foreground object into the target background image. The disclosed system generates, from the target background image and utilizing a lighting conditioning neural network, a lighting feature representation indicating one or more lighting parameters of the target background image. Additionally, the disclosed system generates, utilizing a diffusion-based generative neural network conditioned on the lighting feature representation, the digital image including the foreground object inserted into the target background image based on a composite image comprising the foreground object and the target background image with a foreground mask corresponding to the foreground object.
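The abstract refers to a composite image built from the foreground object, the target background image, and a foreground mask. This excerpt does not say how that composite is formed, so the following is only a minimal sketch that assumes ordinary per-pixel alpha blending on grayscale images stored as nested lists. The names `Image` and `composite` are hypothetical and do not come from the publication.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of floats in [0, 1].
Image = List[List[float]]


def composite(foreground: Image, background: Image, mask: Image) -> Image:
    """Blend foreground over background, weighted per pixel by the mask.

    A mask value of 1.0 keeps the foreground pixel, 0.0 keeps the
    background pixel, and values in between mix the two.
    """
    if not (len(foreground) == len(background) == len(mask)):
        raise ValueError("foreground, background, and mask must share a height")
    blended: Image = []
    for f_row, b_row, m_row in zip(foreground, background, mask):
        if not (len(f_row) == len(b_row) == len(m_row)):
            raise ValueError("foreground, background, and mask must share a width")
        blended.append(
            [m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)]
        )
    return blended


# Usage: a 1x2 image where the mask selects the foreground only at pixel 0.
result = composite([[1.0, 1.0]], [[0.0, 0.5]], [[1.0, 0.0]])
```

In the disclosed system, a composite like this is only an input: the diffusion-based generative neural network, conditioned on the lighting feature representation, produces the final digital image.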
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: determining, in response to a request to generate a digital image, a target background image for inserting a foreground object into the target background image; generating, from the target background image and utilizing a lighting conditioning neural network, a lighting feature representation indicating one or more lighting parameters of the target background image; and generating, utilizing a diffusion-based generative neural network conditioned on the lighting feature representation, the digital image including the foreground object inserted into the target background image based on a composite image comprising the foreground object and the target background image with a foreground mask corresponding to the foreground object.
2. The computer-implemented method of claim 1, wherein generating the lighting feature representation comprises extracting the one or more lighting parameters from the target background image to an encoding space utilizing the lighting conditioning neural network.
3. The computer-implemented method of claim 1, wherein generating the digital image comprises injecting the lighting feature representation into the diffusion-based generative neural network by providing conditional feature maps corresponding to the lighting feature representation to a plurality of diffusion decoders of the diffusion-based generative neural network.
4. The computer-implemented method of claim 1, further comprising: determining the target background image from a training tuple comprising a foreground image including the foreground object and the foreground mask, the target background image, and an environment map of the target background image; and jointly modifying parameters of the lighting conditioning neural network and the diffusion-based generative neural network to reduce an output of a loss function based on a noise input and the digital image generated utilizing the diffusion-based generative neural network according to the training tuple.
5. The computer-implemented method of claim 4, further comprising: generating, utilizing an environment lighting conditioning neural network, an environment lighting feature representation indicating one or more lighting parameters of the environment map of the target background image; and modifying the parameters of the lighting conditioning neural network by comparing the lighting feature representation to the environment lighting feature representation.
6. The computer-implemented method of claim 5, wherein: generating the environment lighting feature representation utilizing the environment lighting conditioning neural network with the diffusion-based generative neural network; freezing the parameters of the environment lighting conditioning neural network and the diffusion-based generative neural network; and modifying the parameters of the lighting conditioning neural network and parameters of a representation alignment neural network layer between the lighting conditioning neural network and the environment lighting conditioning neural network according to differences between the lighting feature representation and the environment lighting feature representation.
7. The computer-implemented method of claim 1, further comprising generating a synthesis training dataset for modifying the diffusion-based generative neural network by: extracting an object from a training image according to an object mask; generating a synthetic background image by inpainting the training image to remove the object from the training image; and generating a synthetic digital image comprising a modified version of the object inserted into an additional background image utilizing the diffusion-based generative neural network comprising parameters modified based on an environment lighting feature representation of an environment map of the additional background image.
8. The computer-implemented method of claim 7, further comprising modifying the diffusion-based generative neural network by: generating, utilizing the lighting conditioning neural network, an additional lighting feature representation from the synthetic background image; generating, utilizing the diffusion-based generative neural network conditioned on the additional lighting feature representation, an additional digital image including the object inserted into the synthetic background image based on the modified version of the object in the synthetic digital image; and modifying parameters of the diffusion-based generative neural network based on differences between the additional digital image and the training image.
9. A system comprising: one or more memory devices; and one or more processors coupled to the one or more memory devices that cause the system to perform operations comprising: generating, utilizing an environment lighting conditioning neural network, an environment lighting feature representation from an environment map corresponding to a target background image; generating, utilizing a lighting conditioning neural network, a lighting feature representation from the target background image; and modifying parameters of the lighting conditioning neural network to reduce differences between the lighting feature representation and the environment lighting feature representation.
10. The system of claim 9, wherein generating the environment lighting feature representation comprises: determining a training tuple comprising a foreground image including a foreground object, the target background image, and the environment map corresponding to the target background image; and generating the environment lighting feature representation from the environment map utilizing the environment lighting conditioning neural network with frozen parameters in connection with generating a digital image utilizing a diffusion-based generative neural network conditioned on the environment lighting feature representation.
11. The system of claim 10, wherein generating the lighting feature representation comprises generating, utilizing the lighting conditioning neural network with modifiable parameters, the lighting feature representation from the target background image of the training tuple.
12. The system of claim 11, wherein modifying the parameters of the lighting conditioning neural network comprises: determining the differences between the lighting feature representation and the environment lighting feature representation utilizing an alignment neural network layer between the lighting conditioning neural network and the environment lighting conditioning neural network; and modifying the parameters of the lighting conditioning neural network and parameters of the alignment neural network layer to reduce the differences between the lighting feature representation and the environment lighting feature representation.
13. The system of claim 10, further comprising: generating, utilizing a diffusion-based generative neural network conditioned on the lighting feature representation, a digital image including the foreground object inserted into the target background image based on a composite image comprising the foreground object and the target background image with a foreground mask corresponding to the foreground object; and jointly modifying the parameters of the lighting conditioning neural network and parameters of the diffusion-based generative neural network to reduce an output of a loss function based on a noise input and the digital image.
14. The system of claim 9, further comprising modifying paramet
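Claims 9 and 12 describe training the lighting conditioning neural network so that its lighting feature representation moves toward the one produced from the environment map by a frozen environment lighting conditioning neural network, with an alignment layer between the two. The toy sketch below illustrates that idea only. It assumes a single linear map in place of the lighting conditioning network, a per-channel scale and bias in place of the alignment layer, and mean squared error as the measure of "differences"; none of these choices is stated in the claims, and every name in the code is hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


def mse(pred: Vector, target: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def alignment_step(weights: Matrix, scale: Vector, bias: Vector,
                   x: Vector, target: Vector, lr: float = 0.05) -> float:
    """Run one gradient-descent step and return the loss before the update.

    weights     -- trainable stand-in for the lighting conditioning network
    scale, bias -- trainable stand-in for the alignment layer
    x           -- stand-in for features of the target background image
    target      -- frozen environment lighting features (never updated)
    """
    student = matvec(weights, x)
    pred = [a * s + b for a, s, b in zip(scale, student, bias)]
    n = len(target)
    # Derivative of the MSE with respect to each aligned output channel.
    grad_out = [2.0 * (p - t) / n for p, t in zip(pred, target)]
    for i, g in enumerate(grad_out):
        # Update the weights first so the step uses the pre-update scale.
        for j, xj in enumerate(x):
            weights[i][j] -= lr * g * scale[i] * xj
        scale[i] -= lr * g * student[i]
        bias[i] -= lr * g
    return mse(pred, target)


# Usage: repeated steps pull the aligned features toward the frozen target.
weights = [[0.1, 0.0, 0.2], [0.0, 0.3, -0.1]]
scale, bias = [1.0, 1.0], [0.0, 0.0]
background_features = [0.2, 0.8, 0.5]
environment_features = [1.0, -0.5]  # held fixed, like the frozen network's output
losses = [
    alignment_step(weights, scale, bias, background_features, environment_features)
    for _ in range(50)
]
```

Only the stand-ins for the lighting conditioning network and the alignment layer change during these steps; the target vector plays the role of the frozen network's output and is never modified.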
Training; Learning · CPC title
using two or more images, e.g. averaging or subtraction · CPC title
Retouching; Inpainting; Scratch removal · CPC title
using machine learning, e.g. neural networks · CPC title
relating to colour · CPC title