Method of training image generation model, and method of generating image

US12406472B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12406472-B2
Application number  US-202218086556-A
Country             US
Kind code           B2
Filing date         Dec 21, 2022
Priority date       Dec 23, 2021
Publication date    Sep 2, 2025
Grant date          Sep 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of training an image generation model, and a method of generating an image. A specific implementation solution includes: acquiring a first image sample and a first text sample matched with the first image sample; performing an enhancement on at least one type of sample in the first image sample and the first text sample according to a predetermined knowledge graph, so as to obtain at least one type of sample in a second image sample obtained by the enhancement and a second text sample obtained by the enhancement; and training the image generation model according to a training set selected from a first training set, a second training set or a third training set, until the image generation model converges.
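
The abstract's enhancement step and the three candidate training sets can be sketched as follows. This is a minimal illustration, not the patented implementation: the toy `KNOWLEDGE_GRAPH`, the `ImageSample` type, and the helpers `enhance_text`, `enhance_image`, and `build_training_sets` are all hypothetical names, and an image is represented only by a path plus entity labels.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge graph: entity name -> knowledge text.
KNOWLEDGE_GRAPH = {
    "corgi": "a short-legged herding dog",
    "tulip": "a spring-blooming bulb flower",
}


@dataclass(frozen=True)
class ImageSample:
    path: str        # stand-in for the image data
    entities: tuple  # entity labels assumed to be detected in the image


def enhance_text(text, graph):
    """Attach matched knowledge to each entity mention in the text."""
    out = []
    for word in text.split():
        key = word.lower().strip(".,")
        out.append(f"{word} ({graph[key]})" if key in graph else word)
    return " ".join(out)


def enhance_image(image, graph):
    """Update the image's entity information with matched knowledge."""
    enriched = tuple(
        f"{e}: {graph[e]}" if e in graph else e for e in image.entities
    )
    return ImageSample(image.path, enriched)


def build_training_sets(image1, text1, graph):
    """Return the first, second, and third training pairs named in the abstract."""
    image2 = enhance_image(image1, graph)
    text2 = enhance_text(text1, graph)
    return {
        "first": (image2, text1),   # enhanced image, original text
        "second": (image1, text2),  # original image, enhanced text
        "third": (image2, text2),   # enhanced image, enhanced text
    }


sets = build_training_sets(
    ImageSample("corgi.jpg", ("corgi",)), "A corgi on grass", KNOWLEDGE_GRAPH
)
print(sets["second"][1])
```

Training would then proceed on whichever of the three sets is selected, until the model converges; the training loop itself is outside this sketch.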

First claim

Opening claim text (preview).

What is claimed is:

1. A method of training an image generation model, the method comprising: acquiring a first image sample and a first text sample matched with the first image sample; performing an enhancement on at least one type of sample in the first image sample and the first text sample according to a predetermined knowledge graph, so as to obtain at least one type of sample in a second image sample obtained by the enhancement and a second text sample obtained by the enhancement; and training the image generation model according to a training set selected from a first training set, a second training set or a third training set, until the image generation model converges, wherein the first training set comprises the second image sample and the first text sample, the second training set comprises the first image sample and the second text sample, and the third training set comprises the second image sample and the second text sample.

2. The method according to claim 1, wherein the performing an enhancement on at least one type of sample in the first image sample and the first text sample according to a predetermined knowledge graph, so as to obtain at least one type of sample in an enhanced second image sample and an enhanced second text sample comprises: acquiring a first entity information of the first image sample, and acquiring first knowledge data matched with the first entity information from the knowledge graph; and updating the first entity information of the first image sample according to the first knowledge data, so as to obtain the second image sample.

3. The method according to claim 1, wherein the performing an enhancement on at least one type of sample in the first image sample and the first text sample according to a predetermined knowledge graph, so as to obtain at least one type of sample in an enhanced second image sample and an enhanced second text sample comprises: acquiring a second entity information of the first text sample, and acquiring second knowledge data matched with the second entity information from the knowledge graph; and updating the second entity information of the first text sample according to the second knowledge data, so as to obtain the second text sample.

4. The method according to claim 1, wherein the performing an enhancement on at least one type of sample in the first image sample and the first text sample according to a predetermined knowledge graph, so as to obtain at least one type of sample in an enhanced second image sample and an enhanced second text sample comprises: performing an expression enhancement on the first text sample; acquiring a third entity information of the first text sample obtained by the expression enhancement, and acquiring third knowledge data matched with the third entity information from the knowledge graph; and updating the third entity information of the first text sample according to the third knowledge data, so as to obtain the second text sample.

5. The method according to claim 1, wherein the training the image generation model according to a training set selected from a first training set, a second training set or a third training set comprises: aggregating at least one type of sample selected from an image sample and a text sample in the training set, so as to obtain at least one type of sample group selected from an aggregated image sample group and an aggregated text sample group, wherein the image sample is the first image sample or the second image sample, and the text sample is the first text sample or the second text sample; updating, for each specified sample in the sample group, a matching relationship of the specified sample according to a text sample or an image sample matched with other sample in the sample group except the specified sample; and training the image generation model according to an updated training set.

6. The method according to claim 5, wherein the aggregating at least one type of sample selected from an image sample and a text sample in the training set comprises determining, for the training set, a plurality of image samples having a same entity information, and aggregating the plurality of image samples into one image sample group.

7. The method according to claim 6, wherein the determining a plurality of image samples having a same entity information comprises: determining whether an entity quantity of each image sample is greater than a predetermined entity quantity threshold or not; and determining the plurality of image samples having the same entity information from each image sample having the entity quantity greater than the entity quantity threshold.

8. The method according to claim 5, wherein the aggregating at least one type of samples selected from an image sample and a text sample in the training set comprises determining, for the training set, a plurality of text samples having a same entity information, and aggregating the plurality of text samples into one text sample group.

9. The method according to claim 8, wherein the determining a plurality of text samples having a same entity information comprises: determining whether an entity quantity of each text sample is greater than a predetermined entity quantity threshold or not; and determining the plurality of text samples having the same entity information from each text sample having the entity quantity greater than the entity quantity threshold.

10. The method according to claim 5, wherein the updating a matching relationship of the specified sample according to a text sample or an image sample matched with other sample in the sample group except the specified sample comprises: randomly selecting a first specified number of text sample or image sample from the text sample or the image sample matched with the other sample as an enhancement sample; and establishing a matching relationship between the enhancement sample and the specified sample.

11. The method according to claim 5, wherein the updating a matching relationship of the specified sample according to a text sample or an image sample matched with other sample in the sample group except the specified sample comprises: determining a matching degree between the text sample or the image sample matched with the other sample and the specified sample; selecting a text sample or an image sample having a matching degree greater than a predetermined matching degree threshold from the text sample or the image sample corresponding to the other sample as an enhancement sample; and establishing a matching relationship between the enhancement sample and the specified sample.

12. The method according to claim 5, wherein the training the image generation model according to an updated training set comprises: determining a sample pair from each sample pair having a matching relationship in the updated training set, wherein a similarity between the entity information of the image sample and the entity information of the text sample in the sample pair is greater than a predetermined similarity threshold, and training the image generation model according to each sample pair corresponding to a similarity greater than the similarity threshold.

13. The method according to claim 12, wherein the training the image generation model according to each sample pair corresponding to a similarity greater than the similarity threshold comprises: determining a sample pair from each sample pair corresponding to a similarity greater than the similarity threshold, and training the image generation model according to the sample pair, wherein the entity quantity in the image sample in the sample pair is greater th
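
The aggregation and matching-update steps of claims 5, 6 and 10 can be illustrated with a small sketch. It assumes, purely for illustration, that each image sample is identified by a string id, that its entity information is a set of labels, and that "same entity information" means identical label sets; `aggregate_by_entities` and `update_matches` are hypothetical helper names.

```python
import random
from collections import defaultdict


def aggregate_by_entities(image_entities):
    """Group image ids whose entity label sets are identical (cf. claim 6)."""
    groups = defaultdict(list)
    for image_id, entities in image_entities.items():
        groups[frozenset(entities)].append(image_id)
    # A group needs a plurality of samples, so singletons are dropped.
    return [ids for ids in groups.values() if len(ids) > 1]


def update_matches(groups, matches, k, rng):
    """For each image, borrow up to k texts matched with the other
    images in its group, chosen at random (cf. claim 10)."""
    updated = {img: list(texts) for img, texts in matches.items()}
    for group in groups:
        for img in group:
            own = matches.get(img, [])
            candidates = [
                text
                for other in group
                if other != img
                for text in matches.get(other, [])
                if text not in own
            ]
            chosen = rng.sample(candidates, min(k, len(candidates)))
            updated.setdefault(img, []).extend(chosen)
    return updated


image_entities = {
    "a": {"dog", "ball"},
    "b": {"ball", "dog"},
    "c": {"cat"},
}
matches = {
    "a": ["a dog with a ball"],
    "b": ["dog chasing ball"],
    "c": ["a cat"],
}
groups = aggregate_by_entities(image_entities)
updated = update_matches(groups, matches, k=1, rng=random.Random(0))
print(updated)
```

Claim 11 describes an alternative to the random choice: keeping only candidates whose matching degree with the specified sample exceeds a threshold. That variant would replace the `rng.sample` call with a scored filter.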

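The pair-filtering step of claim 12 keeps only image-text pairs whose entity information is sufficiently similar. The claim text shown here does not name a similarity measure, so the sketch below assumes Jaccard similarity over entity label sets; `jaccard` and `filter_pairs` are hypothetical names.

```python
def jaccard(a, b):
    """Jaccard similarity of two label collections (an assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_pairs(pairs, threshold):
    """Keep (image_entities, text_entities) pairs whose similarity
    is strictly greater than the threshold."""
    return [
        (img_e, txt_e)
        for img_e, txt_e in pairs
        if jaccard(img_e, txt_e) > threshold
    ]


pairs = [
    ({"dog", "ball"}, {"dog", "ball", "grass"}),  # similarity 2/3
    ({"cat"}, {"car"}),                           # similarity 0
]
kept = filter_pairs(pairs, threshold=0.5)
print(len(kept))
```

Only the retained pairs would be passed on to training in this reading of the claim.
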
Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • Proximity, similarity or dissimilarity measures · CPC title

  • Active pattern-learning, e.g. online learning of image or video features · CPC title

  • G06V10/771 (Primary)

    Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title

  • G06V10/774 (Primary)

    Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12406472B2 cover?
A method of training an image generation model, and a method of generating an image. A specific implementation solution includes: acquiring a first image sample and a first text sample matched with the first image sample; performing an enhancement on at least one type of sample in the first image sample and the first text sample according to a predetermined knowledge graph, so as to obtain at l…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/771. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 2, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).