What technology area does this patent fall under?

Primary CPC classification G06T5/50. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 18, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Semantic-aware initial latent code selection for text-guided image editing and generation

US12254597B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12254597-B2
Application number	US-202217709221-A
Country	US
Kind code	B2
Filing date	Mar 30, 2022
Priority date	Mar 30, 2022
Publication date	Mar 18, 2025
Grant date	Mar 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An item recommendation system receives a set of recommendable items and a request to select, from the set of recommendable items, a contrast group. The item recommendation system selects a contrast group from the set of recommendable items by applying an image modification model to the set of recommendable items. The image modification model includes an item selection model configured to determine an unbiased conversion rate for each item of the set of recommendable items and select a recommended item from the set of recommendable items having a greatest unbiased conversion rate. The image modification model includes a contrast group selection model configured to select, for the recommended item, a contrast group comprising the recommended item and one or more contrast items. The item recommendation system transmits the contrast group responsive to the request.

First claim

Opening claim text (preview).

What is claimed is:
1. A method, comprising: receiving, by a request module, an input text and a request for a blended image; generating, by a contrastive language-image pre-training (“CLIP”) module for the input text, an input text CLIP code; selecting, by an initial latent code selection module, an initial latent code from among a set of latent codes, the selection based on a the initial latent code having a corresponding CLIP code with a greatest semantic similarity to the input text CLIP code; generating, by a latent code blending module, a blended image latent code by blending the initial latent code with an input image latent code determined for an input image; and generating, by a latent code generator module, the blended image from the blended image latent code; and transmitting, by the request module responsive to the request, the blended image.
2. The method of claim 1, wherein the input text specifies target features for modifying the input image.
3. The method of claim 1, wherein each latent code of the set of latent codes has a corresponding CLIP code.
4. The method of claim 3, wherein each latent code of the set of latent codes and its corresponding CLIP code is generated from an image of a set of images.
5. The method of claim 1, wherein the latent code generator module is further configured to generate the input image latent code based on the input image.
6. The method of claim 1, wherein the latent code blending module comprises a StyleGAN synthesis network and wherein the latent code generator module comprises a StyleGAN encoder.
7. The method of claim 1, wherein the initial latent code comprises a first set of layers, wherein the input image latent code comprises a second set of layers, wherein each layer of the first set of layers corresponds to a respective layer of the second set of layers, and wherein blending the initial latent code with the input image latent code comprises, blending each layer of the first set of layers with the corresponding respective layer of the second set of layers.
8. A system comprising: a request module configured to receive an input text, an input image, and a request for a blended image; a contrastive language-image pre-training (“CLIP”) module configured to generate, for the input text, an input text CLIP code; an initial latent code selection module configured to select an initial latent code from among a set of latent codes, the selection based on the initial latent code having a corresponding CLIP code with a greatest semantic similarity to the input text CLIP code; a latent code blending module configured to generate a blended image latent code by blending the initial latent code with an input image latent code determined for the input image; and a latent code generator module configured to generate the blended image from the blended image latent code, wherein the request module is further configured to transmit the blended image responsive to the request.
9. The system of claim 8, wherein the input text specifies target features for modifying the input image.
10. The system of claim 8, wherein each latent code of the set of latent codes has a corresponding CLIP code.
11. The system of claim 10, wherein each latent code of the set of latent codes and its corresponding CLIP code is generated from an image of a set of images.
12. The system of claim 8, wherein the latent code generator module is further configured to generate the input image latent code based on the input image.
13. The system of claim 8, wherein the latent code blending module comprises a StyleGAN synthesis network and wherein the latent code generator module comprises a StyleGAN encoder.
14. The system of claim 8, wherein the initial latent code comprises a first set of scales, wherein the input image latent code comprises a second set of scales, wherein each scale of the first set of scales corresponds to a respective scale of the second set of scales, and wherein blending the initial latent code with the input image latent code comprises, blending each scale of the first set of scales with the corresponding respective scale of the second set of scales.
15. A non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations comprising: receiving an input image, an input text, and a request for a blended image, wherein the input text specifies target features for modifying the input image; generating the blended image by applying an image modification model to an input image and the input text, wherein the image modification model comprises: a contrastive language-image pre-training (“CLIP”) model configured to generate, for the input text, an input text CLIP code; an initial latent code selection model configured to select an initial latent code from among a set of latent codes, the selection based on the initial latent code having a corresponding CLIP code with a greatest semantic similarity to the input text CLIP code; a latent code blending model configured to generate a blended image latent code by blending the initial latent code with an input image latent code determined for the input image; and a latent code generator model configured to generate the blended image from the blended image latent code; and transmitting, responsive to the request, the blended image.
16. The non-transitory computer-readable medium of claim 15, wherein each latent code of the set of latent codes has a corresponding CLIP code.
17. The non-transitory computer-readable medium of claim 16, wherein each latent code of the set of latent codes has a corresponding CLIP code.
18. The non-transitory computer-readable medium of claim 15, wherein the latent code generator model is further configured to generate the input image latent code based on the input image.
19. The non-transitory computer-readable medium of claim 15, wherein the latent code blending model comprises a StyleGAN synthesis network and wherein the latent code generator model comprises a StyleGAN encoder.
20. The non-transitory computer-readable medium of claim 15, wherein the initial latent code comprises a first set of layers, wherein the input image latent code comprises a second set of layers, wherein each layer of the first set of layers corresponds to a respective layer of the second set of layers, and wherein blending the initial latent code with the input image latent code comprises, blending, each layer of the first set of layers with the corresponding respective layer of the second set of layers.
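
Illustrative sketch (not part of the patent text). The selection and blending steps recited in claims 1 and 7 can be pictured with a small Python example. Several things here are assumptions made only for illustration: cosine similarity stands in for the claimed "semantic similarity", per-layer linear interpolation stands in for "blending", and the function names, the blend weights, and the toy vectors are all hypothetical. The CLIP encoder and the StyleGAN networks are not modeled; short lists of floats take the place of real CLIP codes and latent codes.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_initial_latent(text_clip_code, candidates):
    """Return the latent code whose paired CLIP code best matches the text.

    `candidates` is a list of (latent_code, clip_code) pairs, mirroring the
    claimed set of latent codes that each have a corresponding CLIP code.
    """
    best_latent, _ = max(
        candidates,
        key=lambda pair: cosine_similarity(pair[1], text_clip_code),
    )
    return best_latent


def blend_latents(initial_latent, image_latent, weights):
    """Blend two latent codes layer by layer (assumed linear interpolation).

    Each latent code is a list of layers and each layer is a list of floats.
    weights[i] is the hypothetical share of the initial latent code in layer i.
    """
    if not (len(initial_latent) == len(image_latent) == len(weights)):
        raise ValueError("latent codes and weights must have the same number of layers")
    return [
        [w * a + (1.0 - w) * b for a, b in zip(layer_a, layer_b)]
        for layer_a, layer_b, w in zip(initial_latent, image_latent, weights)
    ]


# Toy data: two candidate latent codes, each with two layers of two values.
text_code = [1.0, 0.0]
candidates = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, 1.0]),  # CLIP code orthogonal to the text
    ([[2.0, 4.0], [6.0, 8.0]], [0.9, 0.1]),  # CLIP code close to the text
]
image_latent = [[0.0, 0.0], [0.0, 0.0]]

initial = select_initial_latent(text_code, candidates)
blended = blend_latents(initial, image_latent, [0.5, 1.0])
print(initial)  # [[2.0, 4.0], [6.0, 8.0]]
print(blended)  # [[1.0, 2.0], [6.0, 8.0]]
```

In this sketch the second candidate is chosen because its CLIP code points almost the same way as the text code, and the per-layer weights let one layer mix both codes while another takes the initial latent code unchanged. In the claimed method the blended latent code would then be passed to a generator to produce the blended image, a step omitted here.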

Assignees

Adobe Inc

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06T2207/20081
Training; Learning · CPC title
G06T2207/20084
Artificial neural networks [ANN] · CPC title
G06T2207/20221
Image fusion; Image merging · CPC title
G06T2200/24
involving graphical user interfaces [GUIs] · CPC title

Patent family

Related publications grouped by family.

View patent family 88193135

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12254597B2 cover?: An item recommendation system receives a set of recommendable items and a request to select, from the set of recommendable items, a contrast group. The item recommendation system selects a contrast group from the set of recommendable items by applying an image modification model to the set of recommendable items. The image modification model includes an item selection model configured to determi…
Who is the assignee on this patent?: Adobe Inc