What technology area does this patent fall under?

Primary CPC classification G06T11/60. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 17, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Controllable diffusion model

US12555288B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12555288-B2
Application number	US-202318459526-A
Country	US
Kind code	B2
Filing date	Sep 1, 2023
Priority date	Sep 1, 2023
Publication date	Feb 17, 2026
Grant date	Feb 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, apparatus, and non-transitory computer readable medium for image generation are described. Embodiments of the present disclosure obtain a content input and a style input via a user interface or from a database. The content input includes a target spatial layout and the style input includes a target style. A content encoder of an image processing apparatus encodes the content input to obtain a spatial layout mask representing the target spatial layout. A style encoder of the image processing apparatus encodes the style input to obtain a style embedding representing the target style. An image generation model of the image processing apparatus generates an image based on the spatial layout mask and the style embedding, where the image includes the target spatial layout and the target style.
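
The data flow in the abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the patented implementation: the function names (`encode_content`, `encode_style`, `denoise_step`, `generate`) are hypothetical, the thresholding and per-channel averaging are stand-ins for trained encoders, and the update rule is a placeholder for a learned denoising network.

```python
from typing import List, Tuple

Grid = List[List[float]]   # H x W values, one per spatial location
Features = List[Grid]      # C channels, each an H x W grid


def encode_content(content: Grid, threshold: float = 0.5) -> Grid:
    """Stand-in content encoder: one mask value per location of the content input."""
    return [[1.0 if v > threshold else 0.0 for v in row] for row in content]


def encode_style(style: Features) -> Tuple[float, ...]:
    """Stand-in style encoder: a tuple with one global value (the mean) per channel."""
    return tuple(
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in style
    )


def denoise_step(noisy: Features, mask: Grid, style: Tuple[float, ...],
                 rate: float = 0.5) -> Features:
    """Toy update (assumption): pull each feature toward mask[y][x] * style[c]."""
    return [
        [
            [(1.0 - rate) * v + rate * mask[y][x] * style[c]
             for x, v in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for c, channel in enumerate(noisy)
    ]


def generate(content: Grid, style_input: Features, noisy: Features,
             steps: int = 3) -> Features:
    """Encode both inputs once, then repeatedly denoise the noisy features."""
    mask = encode_content(content)
    style = encode_style(style_input)
    features = noisy
    for _ in range(steps):
        features = denoise_step(features, mask, style)
    return features
```

The point of the sketch is the separation of roles: the mask carries one value per location (where things go), while the embedding carries one value per channel (how things look), and both condition every denoising step.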

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: obtaining a content input and a style input, wherein the content input comprises a target spatial layout and the style input comprises a target style; encoding, by a content encoder, the content input to obtain a spatial layout mask representing the target spatial layout; encoding, by a style encoder, the style input to obtain a style embedding representing the target style; and generating, by an image generation model, an image by denoising noisy features based on the spatial layout mask and the style embedding, wherein the image includes the target spatial layout and the target style.
2. The method of claim 1, wherein: the content input comprises a content image and the style input comprises a style image.
3. The method of claim 1, further comprising: performing a spatial-wise operation based on the spatial layout mask, wherein the image is generated based on the spatial-wise operation.
4. The method of claim 1, further comprising: performing a channel-wise operation based on the style embedding, wherein the image is generated based on the channel-wise operation.
5. The method of claim 1, further comprising: computing a content weight based on a diffusion timestep, wherein the image is generated based on the spatial layout mask according to the content weight.
6. The method of claim 1, further comprising: computing a style weight based on a diffusion timestep, wherein the image is generated based on the style embedding according to the style weight.
7. The method of claim 1, further comprising: generating a noise vector, wherein the image is generated based on the noise vector using a reverse diffusion process.
8. The method of claim 1, wherein: the style embedding includes global semantic information representing the target style.
9. The method of claim 1, wherein: the spatial layout mask comprises a plurality of values corresponding to a plurality of locations of the content input, respectively, and wherein the style embedding comprises a tuple of values that together represent the target style.
10. A method comprising: initializing a content encoder, a style encoder, and an image generation model; receiving training data including an image comprising spatial content and a style attribute; computing an objective function based on the spatial content and the style attribute; and jointly training the content encoder, the style encoder, and the image generation model using an end-to-end process based on the objective function.
11. The method of claim 10, wherein: the content encoder is trained to generate a spatial layout mask representing a target spatial layout.
12. The method of claim 10, wherein: the style encoder is trained to generate a style embedding representing a target style.
13. The method of claim 10, wherein: the image generation model is trained to generate a predicted image including a target spatial layout and a target style based on an output of the content encoder and an output of the style encoder.
14. The method of claim 10, further comprising: generating a latent code based on the image using an image encoder; generating a noisy latent code based on the latent code using a forward diffusion process; and generating a predicted image using the image generation model, wherein the objective function is computed based on the predicted image.
15. The method of claim 14, further comprising: generating a predicted spatial layout mask using the content encoder; and generating a predicted style embedding using the style encoder, wherein the predicted image is generated based on the predicted spatial layout mask and the predicted style embedding.
16. An apparatus comprising: at least one processor; at least one memory including instructions executable by the at least one processor; a content encoder comprising parameters stored in the at least one memory and trained to encode a content input to obtain a spatial layout mask representing a target spatial layout; a style encoder comprising parameters stored in the at least one memory and trained to encode a style input to obtain a style embedding representing a target style; and an image generation model comprising parameters stored in the at least one memory and trained to generate an image by denoising noisy features based on the spatial layout mask and the style embedding, wherein the image includes the target spatial layout and the target style.
17. The apparatus of claim 16, wherein: the content encoder and the style encoder each comprise a residual neural network.
18. The apparatus of claim 16, wherein: the image generation model comprises a denoising unit.
19. The apparatus of claim 16, further comprising: an image encoder configured to generate a latent code based on the image.
20. The apparatus of claim 16, further comprising: a timestep scheduling component configured to compute a content weight based on a diffusion timestep, wherein the image is generated based on the spatial layout mask according to the content weight, and to compute a style weight based on the diffusion timestep, wherein the image is generated based on the style embedding according to the style weight.
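
Claims 3 to 6 and 20 combine a spatial-wise operation, a channel-wise operation, and timestep-dependent weights. The sketch below shows one way these pieces could fit together. Everything specific in it is an assumption for illustration: the claims do not state the schedule or the form of the operations, so the linear schedules in `content_weight` and `style_weight`, the multiplicative form of `modulate`, and all names are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]   # H x W values
Features = List[Grid]      # C channels of H x W values


def content_weight(t: int, num_steps: int) -> float:
    """Assumed linear schedule: layout conditioning is strongest at large t."""
    return t / num_steps


def style_weight(t: int, num_steps: int) -> float:
    """Assumed linear schedule: style conditioning is strongest at small t."""
    return 1.0 - t / num_steps


def modulate(features: Features, mask: Grid, style: Sequence[float],
             t: int, num_steps: int) -> Features:
    """Apply both conditioning signals to the features at diffusion timestep t."""
    w_content = content_weight(t, num_steps)
    w_style = style_weight(t, num_steps)
    out = []
    for c, channel in enumerate(features):
        # Channel-wise operation: one factor per channel, shared by all locations.
        channel_scale = 1.0 + w_style * style[c]
        out.append([
            # Spatial-wise operation: one factor per location, shared by all channels.
            [v * (1.0 + w_content * mask[y][x]) * channel_scale
             for x, v in enumerate(row)]
            for y, row in enumerate(channel)
        ])
    return out
```

With these assumed schedules, `modulate` at `t == num_steps` applies only the mask, and at `t == 0` applies only the style embedding; intermediate timesteps blend the two.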

Assignees

Adobe Inc

Inventors

Classifications

G06T11/40
Filling planar surfaces by adding surface attributes, e.g. adding colours or textures · CPC title
G06T11/60 · Primary
Creating or editing images; Combining images with text · CPC title
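
For readers unfamiliar with CPC symbols, the short sketch below splits a symbol such as G06T11/60 into its hierarchy levels and looks up the section title, which is where the "Physics" technology area comes from. The helper `parse_cpc`, the `CpcSymbol` type, and the regular expression are hypothetical and cover only the common symbol shape seen on this page, not the full CPC scheme.

```python
import re
from typing import NamedTuple

# Titles of CPC sections A to H (section Y is omitted in this sketch).
CPC_SECTIONS = {
    "A": "Human necessities",
    "B": "Performing operations; transporting",
    "C": "Chemistry; metallurgy",
    "D": "Textiles; paper",
    "E": "Fixed constructions",
    "F": "Mechanical engineering; lighting; heating; weapons; blasting",
    "G": "Physics",
    "H": "Electricity",
}


class CpcSymbol(NamedTuple):
    section: str      # e.g. "G"
    cpc_class: str    # e.g. "G06"
    subclass: str     # e.g. "G06T"
    main_group: str   # e.g. "11"
    subgroup: str     # e.g. "60"


# Assumed pattern: section letter, two-digit class, subclass letter, group/subgroup.
_CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol into section, class, subclass, main group, and subgroup."""
    match = _CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognised CPC symbol: {symbol!r}")
    section, cls, sub, group, subgroup = match.groups()
    return CpcSymbol(section, section + cls, section + cls + sub, group, subgroup)


primary = parse_cpc("G06T11/60")
print(primary.subclass, CPC_SECTIONS.get(primary.section, "unknown"))
```

Both classifications listed here share the subclass G06T and the main group 11, and differ only in the subgroup (40 versus 60).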

Patent family

Related publications grouped by family.

View patent family 94773213

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12555288B2 cover?: A method, apparatus, and non-transitory computer readable medium for image generation are described. Embodiments of the present disclosure obtain a content input and a style input via a user interface or from a database. The content input includes a target spatial layout and the style input includes a target style. A content encoder of an image processing apparatus encodes the content input to …
Who is the assignee on this patent?: Adobe Inc

Related patents

Exemplar-based object appearance transfer driven by correspondence

System and method for generating images of the same style based on layout

Controlled style-content image generation based on disentangling content and style

Systems and methods for assessing item compatibility

Generating a stylized image or stylized animation by matching semantic features via an appearance guide, a segmentation guide, and/or a temporal guide
