Generating deep harmonized digital images

US11935217B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11935217-B2
Application number	US-202117200338-A
Country	US
Kind code	B2
Filing date	Mar 12, 2021
Priority date	Mar 12, 2021
Publication date	Mar 19, 2024
Grant date	Mar 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately, efficiently, and flexibly generating harmonized digital images utilizing a self-supervised image harmonization neural network. In particular, the disclosed systems can implement, and learn parameters for, a self-supervised image harmonization neural network to extract content from one digital image (disentangled from its appearance) and appearance from another digital image (disentangled from its content). For example, the disclosed systems can utilize a dual data augmentation method to generate diverse triplets for parameter learning (including input digital images, reference digital images, and pseudo ground truth digital images), via cropping a digital image with perturbations using three-dimensional color lookup tables (“LUTs”). Additionally, the disclosed systems can utilize the self-supervised image harmonization neural network to generate harmonized digital images that depict content from one digital image having the appearance of another digital image.
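The triplet generation mentioned in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`make_random_lut`, `apply_lut`, `random_crop`, `make_triplet`), the toy gain-and-gamma LUT, the nearest-entry lookup, and the way crops and LUTs are paired into input, reference, and pseudo ground truth are all assumptions made for illustration. Images are plain nested lists of RGB tuples so the example has no dependencies.

```python
import random


def make_random_lut(rng, size=8):
    """Toy 3D LUT: lut[r][g][b] maps a quantized RGB triple to a new colour.

    Assumption: a random per-channel gain and gamma stands in for whatever
    LUTs the disclosed systems actually use.
    """
    gain = [rng.uniform(0.7, 1.3) for _ in range(3)]
    gamma = [rng.uniform(0.6, 1.6) for _ in range(3)]

    def entry(*index):
        # Map grid indices to [0, 1], then apply gain and gamma per channel.
        return tuple(
            min(1.0, gain[c] * (index[c] / (size - 1)) ** gamma[c])
            for c in range(3)
        )

    return [[[entry(r, g, b) for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def apply_lut(image, lut):
    """Recolour an image (rows of RGB tuples in [0, 1]) by nearest LUT entry."""
    last = len(lut) - 1

    def lookup(pixel):
        r, g, b = (round(v * last) for v in pixel)
        return lut[r][g][b]

    return [[lookup(pixel) for pixel in row] for row in image]


def random_crop(image, rng, crop):
    """Cut a crop x crop window at a random position."""
    top = rng.randint(0, len(image) - crop)
    left = rng.randint(0, len(image[0]) - crop)
    return [row[left:left + crop] for row in image[top:top + crop]]


def make_triplet(image, rng, crop=8):
    """Return (input, reference, pseudo ground truth) from one source image."""
    lut_a, lut_b = make_random_lut(rng), make_random_lut(rng)
    crop_1 = random_crop(image, rng, crop)
    crop_2 = random_crop(image, rng, crop)
    input_image = apply_lut(crop_1, lut_a)   # content 1, appearance A
    reference = apply_lut(crop_2, lut_b)     # content 2, appearance B
    pseudo_gt = apply_lut(crop_1, lut_b)     # content 1, appearance B
    return input_image, reference, pseudo_gt


rng = random.Random(0)
# Synthetic 16 x 16 gradient image with RGB values in [0, 1].
image = [[(x / 15, y / 15, 0.5) for x in range(16)] for y in range(16)]
input_image, reference, pseudo_gt = make_triplet(image, rng)
print(len(input_image), len(input_image[0]))  # 8 8
```

Under this assumed pairing, the pseudo ground truth shares its content with the input and its appearance with the reference, which is the kind of target a network would need in order to learn to transfer appearance without copying content.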

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: one or more memory devices comprising a self-supervised image harmonization neural network comprising: a neural network appearance encoder that extracts an appearance code by disentangling appearance features from content features of a first digital image, the appearance code comprising a latent vector representing one or more appearance characteristics of the first digital image, wherein disentangling the appearance features comprises excluding one or more content features of the first digital image from the latent vector as part of extracting the appearance code; a neural network content encoder that extracts a content code by disentangling content features from appearance features of a second digital image, the content code comprising a latent vector representing a spatial arrangement of the second digital image, wherein disentangling the content features comprises excluding one or more appearance features of the second digital image from the latent vector as part of extracting the content code; and a neural network decoder that generates a modified digital image from the appearance code and the content code.

2. The system of claim 1, further comprising one or more computing devices that are configured to cause the system to extract the appearance code from the first digital image by utilizing the neural network appearance encoder to extract features representing one or more of color, contrast, brightness, or saturation of the first digital image.

3. The system of claim 1, further comprising one or more computing devices that are configured to cause the system to extract the content code from the second digital image by utilizing the neural network content encoder to extract features representing a spatial arrangement defining positions and shapes of objects in the second digital image.

4. The system of claim 1, further comprising one or more computing devices that are configured to cause the system to generate a harmonized digital image by combining a portion of the modified digital image with the first digital image such that the portion of the modified digital image comprises a foreground of the harmonized digital image and the first digital image comprises a background of the harmonized digital image.

5. The system of claim 4, wherein the one or more computing devices are further configured to generate the harmonized digital image by: receiving an indication of user interaction to generate a mask defining the portion of the modified digital image to combine with the first digital image; selecting the portion of the modified digital image indicated by the mask; and combining the portion of the modified digital image with the first digital image utilizing a fitting function to adapt resolutions.

6. The system of claim 1, further comprising one or more computing devices that are configured to cause the system to receive an indication of user interaction selecting the first digital image and the second digital image to combine together to generate the modified digital image.

7. A non-transitory computer readable medium comprising instructions that, when executed by at least one processor, cause a computing device to: extract, from a reference digital image, an appearance code by disentangling appearance features from content features of the reference digital image utilizing a neural network appearance encoder, the appearance code comprising a latent vector representing one or more appearance characteristics of the reference digital image, wherein disentangling the appearance features comprises excluding one or more content features of the reference digital image from the latent vector as part of extracting the appearance code; extract, from an input digital image, a content code by disentangling content features from appearance features of the input digital image utilizing a neural network content encoder, the content code comprising a latent vector representing a spatial arrangement of the input digital image, wherein disentangling the content features comprises excluding one or more appearance features of the input digital image from the latent vector as part of extracting the content code; generate a modified digital image from the appearance code and the content code utilizing a neural network decoder, the modified digital image comprising the one or more appearance characteristics of the reference digital image and the spatial arrangement of the input digital image; and generate a harmonized digital image by combining a portion of the modified digital image with the reference digital image.

8. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the harmonized digital image by utilizing a fitting function learned from low-resolution digital images to combine the portion of the modified digital image with the reference digital image in a high resolution.

9. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the harmonized digital image by receiving a mask indicating the portion of the modified digital image to combine with the reference digital image.

10. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the harmonized digital image in response to receiving indications of user selections of the reference digital image and the input digital image to combine together.

11. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to extract the content code by utilizing the neural network content encoder to extract features representing a spatial arrangement defining positions and shapes of objects in the input digital image.

12. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to extract the appearance code by utilizing the neural network appearance encoder to extract features representing color of the reference digital image without representing texture.

13. A computer-implemented method comprising: extracting, utilizing a neural network appearance encoder, an appearance code by disentangling appearance features from content features of a first digital image, the appearance code comprising a latent vector representing one or more appearance characteristics of the first digital image, wherein disentangling the appearance features comprises excluding one or more content features of the first digital image from the latent vector as part of extracting the appearance code; extracting, utilizing a neural network content encoder, a content code by disentangling content features from appearance features of a second digital image, the content code comprising a latent vector representing a spatial arrangement of the second digital image, wherein disentangling the content features comprises excluding one or more appearance features of the second digital image from the latent vector as part of extracting the content code; and generating, utilizing a neural network decoder, a modified digital image from the appearance code and the content code.

14. The computer-implemented method of claim 13, further comprising extracting the appearance code from the first digital image by utilizing the neural net
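The mask-based combination described in claims 4, 5, and 9 can be sketched as a per-pixel blend. This is only an illustration: the claims do not state a blending formula, so the linear blend, the function name `composite`, and the convention that a mask value of 1 selects the modified image are assumptions. The fitting function for adapting resolutions (claims 5 and 8) is not modelled here.

```python
def composite(modified, reference, mask):
    """Blend two images (rows of RGB tuples) using a mask with values in [0, 1].

    Assumed convention: mask 1 keeps the modified image (foreground),
    mask 0 keeps the reference image (background).
    """
    return [
        [
            tuple(m * fg + (1.0 - m) * bg for fg, bg in zip(fg_px, bg_px))
            for fg_px, bg_px, m in zip(fg_row, bg_row, mask_row)
        ]
        for fg_row, bg_row, mask_row in zip(modified, reference, mask)
    ]


# 2 x 2 example: red foreground over a blue background.
modified = [[(1.0, 0.0, 0.0)] * 2 for _ in range(2)]
reference = [[(0.0, 0.0, 1.0)] * 2 for _ in range(2)]
mask = [[1.0, 0.0],
        [0.5, 0.0]]

harmonized = composite(modified, reference, mask)
print(harmonized[0][0])  # (1.0, 0.0, 0.0)
print(harmonized[1][0])  # (0.5, 0.0, 0.5)
```

A soft mask value such as 0.5 mixes the two sources, which is one simple way to avoid a hard seam at the foreground boundary.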

Assignees

Adobe Inc

Inventors

Classifications

G06T11/10
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0895
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06T5/50 · Primary
using two or more images, e.g. averaging or subtraction · CPC title

Patent family

Related publications grouped by family.

View patent family 83193899

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11935217B2 cover?: The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately, efficiently, and flexibly generating harmonized digital images utilizing a self-supervised image harmonization neural network. In particular, the disclosed systems can implement, and learn parameters for, a self-supervised image harmonization neural network to extract content from one …
Who is the assignee on this patent?: Adobe Inc
What technology area does this patent fall under?: Primary CPC classification G06T5/50. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 19, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

End to End Network Model for High Resolution Image Segmentation

Unsupervised deformable registration for multi-modal images

Automatic object replacement in an image

Systems and methods for rendering avatars with deep appearance models

Harmonizing composite images using deep learning