Method, electronic device, and computer program product for generating image

US12586368B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12586368-B2
Application number: US-202318502704-A
Country: US
Kind code: B2
Filing date: Nov 6, 2023
Priority date: Oct 13, 2023
Publication date: Mar 24, 2026
Grant date: Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for generating an image. The method includes acquiring a semantic segmentation graph by performing semantic segmentation on a source image. The method further includes acquiring a key word for describing a feature of a to-be-generated target image. The method further includes transforming the semantic segmentation graph by using the key word so as to acquire a transformed semantic segmentation graph. The method further includes generating the target image based on the transformed semantic segmentation graph. According to the method of embodiments of the present disclosure, a semantic segmentation graph of a source image and a key word can be used to generate a target image, so as to make the generated target image have a target feature and have semantic consistency with the source image, thereby generating a high-quality target image.
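The four steps in the abstract can be read as a simple data flow. The sketch below is only a toy illustration of that flow, not the patented implementation: the threshold-based `segment`, the `KEYWORD_RULES` label-rewriting table, the `PALETTE`, and all function names are hypothetical stand-ins for what the claims describe as trained neural network models.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]      # toy grayscale image: rows of pixel intensities
SegGraph = List[List[str]]   # per-pixel class labels

def segment(image: Image) -> SegGraph:
    """Toy segmentation (assumption): bright pixels are 'sky', the rest 'ground'."""
    return [["sky" if px >= 128 else "ground" for px in row] for row in image]

# Hypothetical rules describing which labels a key word rewrites.
KEYWORD_RULES: Dict[str, Dict[str, str]] = {
    "snowy": {"ground": "snow"},
    "night": {"sky": "night_sky"},
}

def transform(graph: SegGraph, key_word: str) -> SegGraph:
    """Rewrite labels according to the key word; unknown key words change nothing."""
    rules = KEYWORD_RULES.get(key_word, {})
    return [[rules.get(label, label) for label in row] for row in graph]

# Hypothetical rendering palette: one intensity per label.
PALETTE: Dict[str, int] = {"sky": 200, "ground": 60, "snow": 250, "night_sky": 20}

def generate(graph: SegGraph) -> Image:
    """Toy generator (assumption): paint every pixel with its label's intensity."""
    return [[PALETTE[label] for label in row] for row in graph]

def generate_target(source: Image, key_word: str) -> Tuple[Image, SegGraph]:
    """Run the four steps: segment, take the key word, transform, generate."""
    graph = segment(source)
    transformed = transform(graph, key_word)
    return generate(transformed), transformed

source = [[200, 210], [40, 50]]
target, annotation = generate_target(source, "snowy")
print(annotation)  # [['sky', 'sky'], ['snow', 'snow']]
print(target)      # [[200, 200], [250, 250]]
```

Returning the transformed graph alongside the target image mirrors the data-augmentation use in claim 3, where the transformed semantic segmentation graph serves as annotation information for the generated image.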

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating an image, comprising: acquiring a semantic segmentation graph by performing semantic segmentation on a source image; acquiring a key word for describing a feature of a to-be-generated target image; transforming the semantic segmentation graph by using the key word so as to acquire a transformed semantic segmentation graph; and generating the target image based on the transformed semantic segmentation graph; wherein generating the target image based on the transformed semantic segmentation graph comprises determining the target image based on (i) comparison to an additional image using a first threshold and (ii) comparison to the source image using a second threshold.

2. The method according to claim 1, wherein the source image is an image in a training dataset, and the training dataset is used for training of a predetermined semantic segmentation model.

3. The method according to claim 2, further comprising: including the target image and the transformed semantic segmentation graph in the training dataset to enhance the training dataset, wherein the transformed semantic segmentation graph is used as annotation information of the target image.

4. The method according to claim 2, further comprising: mapping the source image and the key word to a predetermined feature space, wherein in the predetermined feature space, a distance between matched images and key words is less than a first predetermined distance, and a distance between mismatched images and key words is greater than a second predetermined distance; and wherein generating the target image based on the transformed semantic segmentation graph comprises that: when the generated target image is mapped to the predetermined feature space, a distance between the target image and the key word is less than the first predetermined distance, and a distance between the target image as well as the key word and the source image is greater than the second predetermined distance.

5. The method according to claim 4, wherein generating the target image based on the transformed semantic segmentation graph further comprises: making a difference between the generated target image and a real-world image less than a predetermined difference threshold, and making a similarity between the generated target image and the source image greater than a predetermined similarity threshold.

6. The method according to claim 5, wherein the method is executed by using a trained neural network model.

7. The method according to claim 6, wherein the trained neural network model comprises a first subnetwork model, a second subnetwork model, and a third subnetwork model, the first subnetwork model is used to map the source image and the key word to the predetermined feature space, the second subnetwork model is used to acquire the semantic segmentation graph by performing semantic segmentation on the source image, and the third subnetwork model is used to transform the semantic segmentation graph by using the key word so as to acquire the transformed semantic segmentation graph and generate the target image.

8. The method according to claim 7, further comprising: acquiring the first subnetwork model by training a first neural network model and a second neural network model, wherein the first neural network model is used to map an image to an image feature space, and the second neural network model is used to map a key word to a word feature space; acquiring a trained semantic segmentation model as the second subnetwork model, wherein the trained semantic segmentation model is different from the predetermined semantic segmentation model; and acquiring the third subnetwork model by training a third neural network model, and the third neural network model is based on a generative adversarial network (GAN) architecture.

9. The method according to claim 8, wherein training the first neural network model and the second neural network model comprises: performing joint training on the first neural network model and the second neural network model, so as to configure the trained first neural network model and second neural network model to map an input image and an input key word together to the predetermined feature space.

10. The method according to claim 8, wherein the third neural network model comprises a generator model and a discriminator model, the generator model is used to generate an output image based on an input semantic segmentation graph and an input key word, and the discriminator model is used to determine whether an image is a real-world image.

11. The method according to claim 10, wherein training the third neural network model comprises: performing joint training on the generator model and the discriminator model, so as to cause the trained discriminator model to determine an image having a difference from a real-world image less than the predetermined difference threshold as a real-world image, and cause the trained generator model to generate an output image meeting a predetermined condition, wherein the predetermined condition comprises that the output image is determined by the trained discriminator model as a real-world image.

12. The method according to claim 11, wherein the predetermined condition further comprises that: in the predetermined feature space, a distance between the output image and the input key word is less than the first predetermined distance; in the predetermined feature space, a distance between the output image as well as the input key word and an input image is greater than the second predetermined distance; and a similarity between the output image and the input image is greater than the predetermined similarity threshold.

13. The method according to claim 1, wherein the source image comprises at least one image, and the target image comprises at least one image corresponding to the source image.

14. An electronic device, comprising: at least one processor; and memory coupled to the at least one processor and having instructions stored thereon, wherein the instructions, when executed by the at least one processor, cause the electronic device to perform actions comprising: acquiring a semantic segmentation graph by performing semantic segmentation on a source image; acquiring a key word for describing a feature of a to-be-generated target image; transforming the semantic segmentation graph by using the key word so as to acquire a transformed semantic segmentation graph; and generating the target image based on the transformed semantic segmentation graph; wherein generating the target image based on the transformed semantic segmentation graph comprises determining the target image based on (i) comparison to an additional image using a first threshold and (ii) comparison to the source image using a second threshold.

15. The electronic device according to claim 14, wherein the source image is an image in a training dataset, and the training dataset is used for training of a predetermined semantic segmentation model.

16. The electronic device according to claim 15, wherein the actions further comprise: including the target image and the transformed semantic segmentation graph in the training dataset to enhance the training dataset, wherein the transformed semantic segmentation graph is used as annotation information of the target image.

17. The electronic device according to claim 15, wherein the actions further comprise: mapping the source image and the key word to a pre
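The threshold conditions in claims 4 and 5 can be pictured as an acceptance check on a candidate target image. The following is a minimal sketch under stated assumptions: the claims do not name a distance metric, so Euclidean distance is assumed; the phrase "the target image as well as the key word and the source image" is read here as requiring both the target image and the key word to be far from the source image; and the `Thresholds` fields, the `accept_target` function, and the precomputed `realism_difference` and `source_similarity` scores are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Thresholds:
    first_distance: float   # upper bound on target-to-key-word distance (claim 4)
    second_distance: float  # lower bound on distance to the source image (claim 4)
    max_difference: float   # upper bound on difference from a real-world image (claim 5)
    min_similarity: float   # lower bound on similarity to the source image (claim 5)

def accept_target(
    target_vec: Sequence[float],
    key_word_vec: Sequence[float],
    source_vec: Sequence[float],
    realism_difference: float,
    source_similarity: float,
    t: Thresholds,
) -> bool:
    """Return True only if every threshold condition holds (Euclidean metric assumed)."""
    matches_key_word = math.dist(target_vec, key_word_vec) < t.first_distance
    # Assumed reading: both the target and the key word must be far from the source.
    far_from_source = (
        math.dist(target_vec, source_vec) > t.second_distance
        and math.dist(key_word_vec, source_vec) > t.second_distance
    )
    looks_real = realism_difference < t.max_difference
    keeps_structure = source_similarity > t.min_similarity
    return matches_key_word and far_from_source and looks_real and keeps_structure

limits = Thresholds(first_distance=1.0, second_distance=2.0,
                    max_difference=0.2, min_similarity=0.7)

# Target close to the key word and far from the source: accepted.
ok = accept_target((0.0, 0.0), (0.5, 0.0), (3.0, 0.0), 0.1, 0.8, limits)
# Target too far from the key word: rejected.
bad = accept_target((2.5, 0.0), (0.5, 0.0), (3.0, 0.0), 0.1, 0.8, limits)
print(ok, bad)  # True False
```

The feature-space distance and the similarity score are kept as separate inputs because the claims treat them as separate measures: the target must move away from the source in the shared image and key word feature space while staying similar to the source by the predetermined similarity measure.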

Assignees

Dell Products Lp

Inventors

Classifications

  • Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • using neural networks · CPC title

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • G06T11/00 (Primary)

    Two-dimensional [2D] image generation · CPC title

  • G06V10/86 (Primary)

    using syntactic or structural representations of the image or video pattern, e.g. symbolic string recognition; using graph matching · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12586368B2 cover?
Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for generating an image. The method includes acquiring a semantic segmentation graph by performing semantic segmentation on a source image. The method further includes acquiring a key word for describing a feature of a to-be-generated target image. The method further includes transform…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06T11/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 24, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).