Image generation using a text and image conditioned machine learning model

US12586259B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12586259-B2
Application number	US-202418426763-A
Country	US
Kind code	B2
Filing date	Jan 30, 2024
Priority date	Mar 20, 2023
Publication date	Mar 24, 2026
Grant date	Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, apparatus, non-transitory computer readable medium, and system for image generation include obtaining a text embedding of a text prompt and an image embedding of an image prompt. Some embodiments map the text embedding into a joint embedding space to obtain a joint text embedding and map the image embedding into the joint embedding space to obtain a joint image embedding. Some embodiments generate a synthetic image based on the joint text embedding and the joint image embedding.
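The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the embedding sizes, the hidden width, the random weights, and the helper names `make_mlp` and `apply_mlp` are all hypothetical. The choice of a small multi-layer perceptron follows claims 12 and 13, which name an MLP architecture for the mapping networks. The sketch only shows two separate networks projecting embeddings of different sizes into one shared joint space.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, out_dim, rng):
    """Build random weights for a two-layer MLP (untrained, for illustration only)."""
    def matrix(rows, cols):
        scale = 1.0 / math.sqrt(cols)
        return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]
    return {"w1": matrix(hidden_dim, in_dim), "w2": matrix(out_dim, hidden_dim)}


def apply_mlp(mlp, vector):
    """Linear layer, ReLU, linear layer."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, vector))) for row in mlp["w1"]]
    return [sum(w * h for w, h in zip(row, hidden)) for row in mlp["w2"]]


rng = random.Random(0)

# Hypothetical sizes: the two source spaces differ, the joint space is shared.
TEXT_DIM, IMAGE_DIM, JOINT_DIM = 8, 4, 8

text_mapper = make_mlp(TEXT_DIM, 16, JOINT_DIM, rng)    # stand-in for the text mapping network
image_mapper = make_mlp(IMAGE_DIM, 16, JOINT_DIM, rng)  # stand-in for the image mapping network

# Stand-ins for the outputs of a text encoder and an image encoder.
text_embedding = [rng.random() for _ in range(TEXT_DIM)]
image_embedding = [rng.random() for _ in range(IMAGE_DIM)]

joint_text = apply_mlp(text_mapper, text_embedding)
joint_image = apply_mlp(image_mapper, image_embedding)

# Both vectors now live in the same joint space and could condition a generator.
print(len(joint_text), len(joint_image))
```

In a real system the two mappers would be trained, and the joint embeddings would be passed to an image generation model rather than printed.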

First claim

Opening claim text (preview).

What is claimed is:

1. A method for image generation, comprising: obtaining a text embedding of a text prompt in a text embedding space and an image embedding of an image prompt in an image embedding space; mapping, using a text mapping network, the text embedding from the text embedding space into a joint embedding space to obtain a joint text embedding; mapping, using an image mapping network, the image embedding from the image embedding space into the joint embedding space to obtain a joint image embedding; and generating, using an image generation model, a synthetic image based on the joint text embedding and the joint image embedding.
2. The method of claim 1, further comprising: generating, using a generative adversarial network, a high-resolution version of the synthetic image.
3. The method of claim 1, wherein obtaining the text embedding and the image embedding comprises: encoding the text prompt with a text encoder to obtain the text embedding; and encoding the image prompt with an image encoder to obtain the image embedding.
4. The method of claim 1, further comprising: concatenating the joint text embedding and the joint image embedding to obtain a combined embedding.
5. The method of claim 4, wherein: the text embedding comprises n text tokens, where n is greater than one, the image embedding comprises a single image token, and the combined embedding comprises n+1 combined tokens.
6. The method of claim 5, wherein: each of the n text tokens has a dimensionality greater than the single image token.
7. The method of claim 5, wherein: each of the n+1 combined tokens has a same dimensionality as the n text tokens.
8. The method of claim 1, further comprising: learning a default text embedding for a null text prompt.
9. The method of claim 1, further comprising: learning a default image embedding for a null image prompt.
10. A system for image generation, comprising: one or more processors; one or more memory components coupled with the one or more processors; a text mapping network comprising text mapping parameters, the text mapping network trained to map a text embedding from a text embedding space into a joint embedding space to obtain a joint text embedding; an image mapping network comprising image mapping parameters, the image mapping network trained to map an image embedding from an image embedding space into the joint embedding space to obtain a joint image embedding; and an image generation model comprising image generation parameters, the image generation model trained to generate a synthetic image based on the joint text embedding and the joint image embedding.
11. The system of claim 10, the system further comprising: a generative adversarial network (GAN) comprising GAN parameters, the GAN trained to generate a high-resolution version of the synthetic image.
12. The system of claim 10, wherein: the text mapping network comprises a multi-layer perceptron (MLP) architecture.
13. The system of claim 10, wherein: the image mapping network comprises a multi-layer perceptron (MLP) architecture.
14. The system of claim 10, the system further comprising: a text encoder comprising text encoding parameters, the text encoder trained to encode a text prompt to obtain the text embedding.
15. The system of claim 14, wherein: the text encoder is configured to learn a default text embedding for a null text prompt.
16. The system of claim 10, the system further comprising: an image encoder comprising image encoding parameters, the image encoder trained to encode an image prompt to obtain the image embedding.
17. The system of claim 16, wherein: the image encoder is configured to learn a default image embedding for a null image prompt.
18. A non-transitory computer readable medium storing instructions that, when executed by a processor, cause the processor to: obtain a text embedding of a text prompt in a text embedding space and an image embedding of an image prompt in an image embedding space; map, using a text mapping network, the text embedding from the text embedding space into a joint embedding space to obtain a joint text embedding; map, using an image mapping network, the image embedding from the image embedding space into the joint embedding space to obtain a joint image embedding; and generate, using an image generation model, a synthetic image based on the joint text embedding and the joint image embedding.
19. The non-transitory computer readable medium of claim 18, wherein the instructions further cause the processor to: generate a high-resolution version of the synthetic image using a generative adversarial network (GAN).
20. The non-transitory computer readable medium of claim 18, wherein the instructions further cause the processor to: concatenate the joint text embedding and the joint image embedding to obtain a combined embedding.
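The token counts in claims 4, 5, and 7, and the null-prompt defaults in claims 8 and 9, can be made concrete with a short sketch. Everything named here is an assumption made for illustration: the sizes, the zero-valued defaults (the claims describe learned defaults), and the helpers `combine` and `build_condition`. For brevity the defaults are substituted after mapping into the joint space, which the claims do not specify.

```python
# Hypothetical sizes: n text tokens (n > 1) and a shared joint dimensionality.
N_TEXT_TOKENS, JOINT_DIM = 5, 8

# Stand-ins for learned defaults used when a prompt is missing (claims 8 and 9).
# Zeros are placeholders; in the claims these values are learned.
DEFAULT_TEXT_TOKENS = [[0.0] * JOINT_DIM for _ in range(N_TEXT_TOKENS)]
DEFAULT_IMAGE_TOKEN = [0.0] * JOINT_DIM


def combine(joint_text_tokens, joint_image_token):
    """Concatenate n joint text tokens with one joint image token (claims 4 and 5)."""
    dim = len(joint_image_token)
    if any(len(token) != dim for token in joint_text_tokens):
        raise ValueError("all tokens must share the joint dimensionality")
    return list(joint_text_tokens) + [joint_image_token]


def build_condition(joint_text_tokens=None, joint_image_token=None):
    """Fall back to the default embedding for whichever prompt is absent."""
    text = DEFAULT_TEXT_TOKENS if joint_text_tokens is None else joint_text_tokens
    image = DEFAULT_IMAGE_TOKEN if joint_image_token is None else joint_image_token
    return combine(text, image)


# Text-only prompt: the image slot is filled by the default image token.
text_tokens = [[float(i)] * JOINT_DIM for i in range(N_TEXT_TOKENS)]
combined = build_condition(joint_text_tokens=text_tokens)

print(len(combined), len(combined[0]))  # n + 1 tokens, each of the joint dimensionality
```

Keeping every combined token at one dimensionality, as claim 7 requires, is what lets a single conditioning sequence carry both modalities.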

Assignees

Adobe Inc

Inventors

Classifications

G06F40/284
Lexical analysis, e.g. tokenisation or collocates · CPC title
G06T2207/20081
Training; Learning · CPC title
G06T2207/20084
Artificial neural networks [ANN] · CPC title
G06F40/40
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title
G06T11/00 (Primary)
Two-dimensional [2D] image generation · CPC title

Patent family

Related publications grouped by family.

Patent family 92802827

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12586259B2 cover?: A method, apparatus, non-transitory computer readable medium, and system for image generation include obtaining a text embedding of a text prompt and an image embedding of an image prompt. Some embodiments map the text embedding into a joint embedding space to obtain a joint text embedding and map the image embedding into the joint embedding space to obtain a joint image embedding. Some embodim…
Who is the assignee on this patent?: Adobe Inc
What technology area does this patent fall under?: Primary CPC classification G06T11/00. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 24, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).