Semantic mixing and style transfer utilizing a composable diffusion neural network

US2025095114A1 · US · A1

Patent metadata
Field                 Value
Publication number    US-2025095114-A1
Application number    US-202318470240-A
Country               US
Kind code             A1
Filing date           Sep 19, 2023
Priority date         Sep 19, 2023
Publication date      Mar 20, 2025
Grant date            (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating digital images by conditioning a diffusion neural network with input prompts. In particular, in one or more embodiments, the disclosed systems generate, utilizing a reverse diffusion model, an image noise representation from a first image prompt. Additionally, in some embodiments, the disclosed systems generate, utilizing a diffusion neural network conditioned with a first vector representation of the first image prompt, a first denoised image representation from the image noise representation. Moreover, in some embodiments, the disclosed systems generate, utilizing the diffusion neural network conditioned with a second vector representation of a second image prompt, a second denoised image representation from the image noise representation. Furthermore, in some embodiments, the disclosed systems combine the first denoised image representation and the second denoised image representation to generate a digital image.
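The four steps in the abstract (invert the first prompt to noise, denoise twice under different conditioning, then combine) can be illustrated with a toy sketch. Everything below is a hypothetical stand-in rather than the patent's implementation: `toy_embed`, `toy_invert`, `toy_denoise`, `combine`, and `mix_prompts` are invented names, images are plain lists of floats, and the choice of a second weight equal to one minus the first is an assumption made only for this illustration.

```python
import random
from typing import List

Vector = List[float]


def toy_embed(prompt: Vector) -> Vector:
    """Hypothetical embedding model: here simply a copy of the prompt."""
    return list(prompt)


def toy_invert(image: Vector, seed: int = 0) -> Vector:
    """Stand-in for the reverse diffusion model: image -> noise representation.

    Blends the image with seeded Gaussian noise so the result is reproducible.
    """
    rng = random.Random(seed)
    return [0.2 * x + 0.8 * rng.gauss(0.0, 1.0) for x in image]


def toy_denoise(noise_rep: Vector, cond: Vector, rate: float = 0.5) -> Vector:
    """Stand-in for the conditioned diffusion network: pull toward `cond`."""
    return [n + rate * (c - n) for n, c in zip(noise_rep, cond)]


def combine(a: Vector, b: Vector, w1: float, w2: float) -> Vector:
    """Weighted sum of two denoised representations."""
    return [w1 * x + w2 * y for x, y in zip(a, b)]


def mix_prompts(prompt_1: Vector, prompt_2: Vector, w1: float = 0.5) -> Vector:
    # Step 1: noise representation from the FIRST prompt only.
    noise_rep = toy_invert(prompt_1)
    # Step 2: one vector representation per prompt.
    v1, v2 = toy_embed(prompt_1), toy_embed(prompt_2)
    # Step 3: denoise the SAME noise representation under each conditioning.
    d1 = toy_denoise(noise_rep, v1)
    d2 = toy_denoise(noise_rep, v2)
    # Step 4: combine (assumption: the two weights sum to one).
    return combine(d1, d2, w1, 1.0 - w1)


if __name__ == "__main__":
    print(mix_prompts([1.0, 0.0, -1.0], [0.0, 2.0, 0.0], w1=0.7))
```

The point of the sketch is the data flow: both denoised representations start from the same noise representation, which is derived from the first prompt alone, and only the conditioning vector differs between the two branches.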

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: generating, utilizing a reverse diffusion model, an image noise representation from a first image prompt; generating, utilizing a diffusion neural network conditioned with a first vector representation of the first image prompt, a first denoised image representation from the image noise representation; generating, utilizing the diffusion neural network conditioned with a second vector representation of a second image prompt, a second denoised image representation from the image noise representation; and combining the first denoised image representation and the second denoised image representation to generate a digital image.

2. The computer-implemented method of claim 1, wherein combining the first denoised image representation and the second denoised image representation comprises assigning a first weight to the first denoised image representation and a second weight to the second denoised image representation.

3. The computer-implemented method of claim 2, further comprising: providing, for display via a user interface of a client device, a weight control element; and determining the first weight and the second weight based on a user interaction with the weight control element.

4. The computer-implemented method of claim 2, further comprising: combining the first denoised image representation and the second denoised image representation at a first denoising iteration of the diffusion neural network; and combining, at a second denoising iteration of the diffusion neural network, a third denoised image representation and a fourth denoised image representation by assigning a third weight to the third denoised image representation and a fourth weight to the fourth denoised image representation.

5. The computer-implemented method of claim 2, further comprising: determining a function of weights defining a plurality of weights for combining denoised image representations across a plurality of denoising iterations of the diffusion neural network; and combining the first denoised image representation and the second denoised image representation by determining the first weight and the second weight from the function of weights.

6. The computer-implemented method of claim 1, wherein generating the image noise representation from the first image prompt comprises at least one of: generating the image noise representation utilizing a deterministic reverse diffusion model; or generating the image noise representation utilizing a stochastic reverse diffusion model.

7. The computer-implemented method of claim 1, wherein generating the first denoised image representation and the second denoised image representation comprises: generating, utilizing an embedding model, the first vector representation from the first image prompt; and generating, utilizing the embedding model, the second vector representation from the second image prompt.

8. The computer-implemented method of claim 7, further comprising: combining the first denoised image representation and the second denoised image representation by generating a combined denoised image representation; generating, utilizing the diffusion neural network conditioned with the first vector representation, a third denoised image representation from the combined denoised image representation; generating, utilizing the diffusion neural network conditioned with the second vector representation, a fourth denoised image representation from the combined denoised image representation; and combining the third denoised image representation and the fourth denoised image representation by generating an additional combined denoised image representation.

9. The computer-implemented method of claim 1, further comprising: combining the first denoised image representation and the second denoised image representation by generating, for a first denoising iteration of the diffusion neural network, a combined denoised image representation of the first image prompt and the second image prompt; generating, utilizing a second denoising iteration of the diffusion neural network, a third denoised image representation from the combined denoised image representation; and generating, utilizing the second denoising iteration of the diffusion neural network, a fourth denoised image representation from the combined denoised image representation.

10. A system comprising: one or more memory devices comprising a first prompt, a second prompt, a reverse diffusion model, and a diffusion neural network; and one or more processors configured to cause the system to: generate, utilizing the reverse diffusion model, a noise representation of the first prompt; generate, utilizing an embedding model, a first vector representation of the first prompt and a second vector representation of the second prompt; generate a first denoised image representation from the noise representation of the first prompt utilizing the diffusion neural network conditioned with the first vector representation of the first prompt; generate a second denoised image representation from the noise representation of the first prompt utilizing the diffusion neural network conditioned with the second vector representation of the second prompt; and combine the first denoised image representation and the second denoised image representation to generate a digital image.

11. The system of claim 10, wherein the one or more processors are further configured to cause the system to: receive a user interaction with a weight control element via a user interface of a client device; determine, based on the user interaction with the weight control element, a first weight for the first denoised image representation and a second weight for the second denoised image representation; and combine the first denoised image representation and the second denoised image representation according to the first weight and the second weight.

12. The system of claim 10, wherein the one or more processors are further configured to cause the system to: generate a third denoised image representation from a combined denoised image representation of the first prompt and the second prompt, utilizing the diffusion neural network conditioned with the first vector representation of the first prompt; generate a fourth denoised image representation from the combined denoised image representation, utilizing the diffusion neural network conditioned with the second vector representation of the second prompt; and combine the third denoised image representation and the fourth denoised image representation to generate an additional combined denoised image representation of the first prompt and the second prompt.

13. The system of claim 10, wherein the one or more processors are further configured to cause the system to: combine the first denoised image representation and the second denoised image representation to generate a combined denoised image representation of the first prompt and the second prompt in a first denoising iteration of the diffusion neural network; generate a third denoised image representation from the combined denoised image representation utilizing a second denoising iteration of the diffusion neural network; generate a fourth denoised image representation from the combined denoised image representation utilizing the second denoising iteration of the diffusion neural network; and combine the third denoised image representation and the fourth denoised image representation to generate an additional combined denoised image representation of the first prompt and the
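Claims 4, 5, 8, and 9 describe repeating the denoise-and-combine step across iterations, with weights that may change per iteration according to a "function of weights". A minimal sketch of that loop follows. The linear schedule, the toy denoising step, and all names (`linear_weight_schedule`, `toy_denoise_step`, `iterative_mix`) are assumptions chosen for illustration; the claims do not specify a particular schedule or require the weights to sum to one.

```python
from typing import Callable, List

Vector = List[float]
WeightFn = Callable[[int, int], float]


def linear_weight_schedule(step: int, num_steps: int,
                           start: float = 1.0, end: float = 0.0) -> float:
    """Hypothetical 'function of weights': weight for the first prompt at `step`.

    Interpolates linearly from `start` (first iteration) to `end` (last).
    """
    if num_steps <= 1:
        return start
    frac = step / (num_steps - 1)
    return start + frac * (end - start)


def toy_denoise_step(latent: Vector, cond: Vector, rate: float = 0.5) -> Vector:
    """Stand-in for one conditioned denoising iteration: move toward `cond`."""
    return [x + rate * (c - x) for x, c in zip(latent, cond)]


def iterative_mix(noise_rep: Vector, v1: Vector, v2: Vector,
                  num_steps: int, weight_fn: WeightFn) -> Vector:
    """Denoise under both conditionings, combine, and feed the result forward."""
    latent = list(noise_rep)
    for step in range(num_steps):
        # Both branches start from the same combined representation.
        d1 = toy_denoise_step(latent, v1)
        d2 = toy_denoise_step(latent, v2)
        # Per-iteration weights (assumption: w2 = 1 - w1).
        w1 = weight_fn(step, num_steps)
        w2 = 1.0 - w1
        latent = [w1 * a + w2 * b for a, b in zip(d1, d2)]
    return latent


if __name__ == "__main__":
    noise = [0.0, 0.0]
    first, second = [1.0, 1.0], [-1.0, -1.0]
    print(iterative_mix(noise, first, second, 4, linear_weight_schedule))
```

With this toy schedule the first prompt's conditioning dominates early iterations and the second prompt's dominates late ones. A user-controlled weight, as in claims 3 and 11, would correspond to passing a constant function such as `lambda step, n: 0.7` as `weight_fn`.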

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025095114A1 cover?
The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating digital images by conditioning a diffusion neural network with input prompts. In particular, in one or more embodiments, the disclosed systems generate, utilizing a reverse diffusion model, an image noise representation from a first image prompt. Additionally, in some embodiments, the d…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T5/70. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 20, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).