Generating modified digital images utilizing a global and spatial autoencoder

US2021358177A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2021358177-A1
Application number  US-202016874399-A
Country             US
Kind code           A1
Filing date         May 14, 2020
Priority date       May 14, 2020
Publication date    Nov 18, 2021
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, methods, and non-transitory computer readable media for generating a modified digital image from extracted spatial and global codes. For example, the disclosed systems can utilize a global and spatial autoencoder to extract spatial codes and global codes from digital images. The disclosed systems can further utilize the global and spatial autoencoder to generate a modified digital image by combining extracted spatial and global codes in various ways for various applications such as style swapping, style blending, and attribute editing.
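The recombination step behind style swapping can be illustrated with a minimal sketch. It assumes each image has already been encoded into a spatial code and a global code; the `ImageCodes` container, the `swap_style` helper, and the toy vectors are hypothetical names and values introduced here for illustration and do not come from the patent. The encoder and generator neural networks themselves are not modeled.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass(frozen=True)
class ImageCodes:
    """Hypothetical container for the two codes extracted from one image."""
    spatial: Vector      # stand-in for geometric-layout features (flattened)
    global_code: Vector  # stand-in for overall-appearance features


def swap_style(layout_source: ImageCodes, style_source: ImageCodes) -> ImageCodes:
    """Pair the spatial code of one image with the global code of another."""
    return ImageCodes(spatial=layout_source.spatial,
                      global_code=style_source.global_code)


# Toy values standing in for encoder outputs; real codes would be tensors.
first = ImageCodes(spatial=[0.1, 0.9, 0.4], global_code=[1.0, 0.0])
second = ImageCodes(spatial=[0.7, 0.2, 0.5], global_code=[0.0, 1.0])

# Layout of the first image combined with the appearance of the second.
hybrid = swap_style(first, second)
```

In a full system, `hybrid` would be passed to a trained generator neural network, which decodes the combined codes into the modified digital image.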

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer readable medium comprising instructions that, when executed by at least one processor, cause a computing device to: extract from a digital image, utilizing an encoder neural network, a spatial code comprising features corresponding to a geometric layout of the digital image; extract from the digital image, utilizing the encoder neural network, a global code comprising features corresponding to an overall appearance of the digital image; generate one or more of an additional spatial code or an additional global code; and generate a modified digital image by combining, utilizing a generator neural network, the spatial code with the additional global code, the global code with the additional spatial code, or the additional spatial code with the additional global code.

2. The non-transitory computer readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to extract the spatial code by passing intermediate features from layers of the encoder neural network into one or more layout blocks to increase spatial resolution and to decrease channel dimension.

3. The non-transitory computer readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the one or more of the additional spatial code or the additional global code by: extracting, from a first set of digital images that depict an attribute, a first set of latent codes utilizing the encoder neural network; extracting, from a second set of digital images that do not depict the attribute, a second set of latent codes utilizing the encoder neural network; and generating an attribute direction by determining a difference between an average for the first set of latent codes and an average for the second set of latent codes.

4. The non-transitory computer readable medium of claim 3, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the one or more of the additional spatial code or the additional global code by: modifying the spatial code of the digital image based on a spatial component of the attribute direction and a magnitude to generate the additional spatial code; and modifying the global code of the digital image and a global component of the attribute direction and the magnitude to generate the additional global code.

5. The non-transitory computer readable medium of claim 4, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the modified digital image by combining, utilizing the generator neural network, the additional spatial code and the additional global code to construct the modified digital image.

6. The non-transitory computer readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the one or more of the additional spatial code or the additional global code by extracting from an additional digital image, utilizing the encoder neural network: the additional spatial code comprising features corresponding to a geometric layout of the additional digital image; or the additional global code comprising features corresponding to the overall appearance of the additional digital image.

7. The non-transitory computer readable medium of claim 5, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the modified digital image to include spatial features of the digital image and global features of the additional digital image or global features of the digital image and spatial features of the additional digital image.

8. The non-transitory computer readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the one or more of the additional spatial code or the additional global code by: extracting, from a plurality of digital images, a plurality of global codes utilizing the encoder neural network; and generating a composite global code from the plurality of global codes.

9. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to generate the modified digital image by combining the composite global code with the spatial code utilizing the generator neural network.

10. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to: generate a modified composite global code by combining the composite global code with a global code from the digital image utilizing a slider variable for adjusting a relative weight between the composite global code and the global code from the first digital image; and generate the modified digital image by combining the modified composite global code with the spatial code.

11. A system comprising: one or more memory devices comprising a first digital image, a second digital image, and a global and spatial autoencoder comprising an encoder neural network and a generator neural network; and one or more computing devices that are configured to cause the system to: extract from the first digital image, utilizing the encoder neural network: a first spatial code comprising features corresponding to a geometric layout of the first digital image; and a first global code comprising features corresponding to an overall appearance of the first digital image; extract from the second digital image, utilizing the encoder neural network: a second spatial code comprising features corresponding to a geometric layout of the second digital image; and a second global code comprising features corresponding to an overall appearance of the second digital image; and generate a modified digital image comprising features of the first digital image and features of the second digital image by combining, utilizing the generator neural network, the first spatial code with the second global code or the first global code with the second spatial code.

12. The system of claim 11, wherein the one or more computing devices are further configured to cause the system to learn parameters of the encoder neural network and the generator neural network by utilizing a contrastive loss to shift reconstructed spatial codes and reconstructed global codes from the modified digital image to be more similar to extracted spatial codes and extracted global codes from the first digital image and the second digital image than to stored spatial codes or stored global codes from a digital image code repository.

13. The system of claim 11, wherein the one or more computing devices are further configured to cause the system to extract the first spatial code by passing intermediate features from layers of the encoder neural network into one or more layout blocks to increase spatial resolution and to decrease channel dimension.

14. The system of claim 11, wherein the one or more computing devices are further configured to cause the system to extract the first global code by passing features of the first digital image through residual blocks of the encoder neural network to increase channel dimension and to decrease spatial resolution.

15. The system of claim 11, wherein the one or more computing devices
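The code arithmetic recited in claims 3, 4, 8, and 10 can be sketched in a few lines. Claim 3 states the attribute direction explicitly as a difference of averages. The rest is assumed here for illustration: that a code is modified by adding the direction scaled by the magnitude, that the composite global code is a plain mean, and that the slider variable acts as a linear interpolation weight. The claims do not fix those formulas, and all function names below are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_code(codes: Sequence[Vector]) -> Vector:
    """Element-wise average of a set of equally sized codes."""
    count = len(codes)
    return [sum(column) / count for column in zip(*codes)]


def attribute_direction(with_attr: Sequence[Vector],
                        without_attr: Sequence[Vector]) -> Vector:
    """Claim 3: average of codes depicting the attribute minus average of those that do not."""
    positive, negative = mean_code(with_attr), mean_code(without_attr)
    return [p - n for p, n in zip(positive, negative)]


def apply_direction(code: Vector, direction: Vector, magnitude: float) -> Vector:
    """Assumed form of claim 4: shift a code along the direction by `magnitude`."""
    return [c + magnitude * d for c, d in zip(code, direction)]


def blend_global(global_code: Vector, composite: Vector, slider: float) -> Vector:
    """Assumed form of claim 10: slider=0 keeps the image's own code, slider=1 the composite."""
    return [(1.0 - slider) * g + slider * c for g, c in zip(global_code, composite)]


# Toy latent codes; real codes would come from the encoder neural network.
direction = attribute_direction([[2.0, 4.0], [4.0, 6.0]], [[1.0, 1.0], [1.0, 3.0]])
edited = apply_direction([0.0, 1.0], direction, magnitude=0.5)

# Assumed composite: the mean of several global codes (claim 8 leaves this open).
composite = mean_code([[1.0, 0.0], [3.0, 0.0]])
blended = blend_global([0.0, 4.0], composite, slider=0.25)
```

Under claim 4 the same `apply_direction` step would be run twice, once on the spatial code with the spatial component of the direction and once on the global code with the global component, before the generator combines the two results.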

Assignees

Adobe Inc

Inventors

Classifications

  • Combinations of networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • Adversarial learning · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021358177A1 cover?
The present disclosure relates to systems, methods, and non-transitory computer readable media for generating a modified digital image from extracted spatial and global codes. For example, the disclosed systems can utilize a global and spatial autoencoder to extract spatial codes and global codes from digital images. The disclosed systems can further utilize the global and spatial autoencoder t…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T11/60. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 18, 2021 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).