Method, electronic device, and computer program product for training image processing model

US12586152B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12586152-B2
Application number: US-202318130022-A
Country: US
Kind code: B2
Filing date: Apr 3, 2023
Priority date: Feb 28, 2023
Publication date: Mar 24, 2026
Grant date: Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for training an image processing model. The method in an illustrative embodiment includes: obtaining a folding weight of a folded convolutional layer of a pre-trained generator by performing a folding operation on a plurality of weights of a plurality of convolutional layers of the pre-trained generator. The method further includes: embedding the pre-trained generator into the image processing model. The method further includes: training the image processing model using a plurality of pairs of sample images, wherein at least one pair of sample images of the plurality of pairs of sample images includes a first sample image having a first resolution and a second sample image having a second resolution, and wherein the first resolution is less than the second resolution.
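
As a rough illustration of the folding operation, the sketch below assumes the convolutional layers are 1x1 convolutions applied back to back with no nonlinearity between them, so each layer acts as a matrix on one pixel's channel vector. Under that assumption, multiplying the layer weights yields a single folded weight, and adding a residual weight (taken here to be an identity matrix standing in for a skip path) folds a residual connection as well. The names fold_weights, matmul, and matvec and the layer sizes are hypothetical and do not come from the patent.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(w, v):
    """Apply weight matrix w to channel vector v (a 1x1 conv at one pixel)."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def fold_weights(weights, residual_weight=None):
    """Fold a chain of layer weights (applied in list order) into one weight.

    Assumption: layers are purely linear, so later layers multiply on the left.
    """
    folded = weights[0]
    for w in weights[1:]:
        folded = matmul(w, folded)
    if residual_weight is not None:
        # Summation of the residual weight and the multiplied weight.
        folded = [[f + r for f, r in zip(f_row, r_row)]
                  for f_row, r_row in zip(folded, residual_weight)]
    return folded


def close(u, v):
    """Element-wise comparison with a small tolerance for rounding."""
    return all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
               for a, b in zip(u, v))


random.seed(0)


def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


x = [random.gauss(0, 1) for _ in range(4)]  # 4 input channels at one pixel
w1 = rand_matrix(16, 4)                     # expand 4 -> 16 channels
w2 = rand_matrix(4, 16)                     # project 16 -> 4 channels

# Two layers applied in sequence vs. one folded layer.
y_chain = matvec(w2, matvec(w1, x))
y_folded = matvec(fold_weights([w1, w2]), x)
chain_matches = close(y_chain, y_folded)

# Residual variant: output = x + layers(x), folded as (I + W2 W1) x.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
y_residual = [xi + yi for xi, yi in zip(x, y_chain)]
y_residual_folded = matvec(
    fold_weights([w1, w2], residual_weight=identity), x)
residual_matches = close(y_residual, y_residual_folded)
```

In this sketch the folded layer holds a 4x4 weight in place of the 16x4 and 4x16 weights, which is why folding reduces the work done at inference time while leaving the output unchanged up to rounding.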

First claim

Opening claim text (preview).

What is claimed is:

1. A training method for an image processing model, the method comprising: embedding a pre-trained generator into the image processing model; and training the image processing model using a plurality of pairs of sample images, wherein at least one pair of sample images of the plurality of pairs of sample images comprises a first sample image having a first resolution and a second sample image having a second resolution, and wherein the first resolution is less than the second resolution; wherein training the image processing model using a plurality of pairs of sample images comprises: inputting the first sample image to an encoder in the image processing model; obtaining mapped sample image features by mapping sample image features output by the encoder; inputting the mapped sample image features to the pre-trained generator to obtain a first output image; and training the image processing model based on the first sample image, the first output image, and the second sample image; wherein training the image processing model based on the first sample image, the first output image, and the second sample image further comprises: determining an adversarial loss function for the image processing model using a discriminator corresponding to the pre-trained generator based on the first output image and the second sample image; determining at least one of (i) a first loss function for the image processing model based on the first sample image and the second sample image, and (ii) a second loss function for the image processing model using the pre-trained generator and the discriminator based on the first sample image and the second sample image; and constructing a weighted loss function using the adversarial loss function and the at least one of the first loss function and the second loss function to train the image processing model.

2. The method according to claim 1, further comprising: extracting features of a pre-training sample image; performing a mapping operation on the extracted image features to obtain mapped pre-training sample image features; inputting the mapped pre-training sample image features to an initial generator, the initial generator comprising a plurality of convolutional layers; inputting pre-training noise to the initial generator, wherein the noise is cascaded with an output of at least one of the plurality of convolutional layers in the initial generator, and the initial generator generates a pre-training output image; and pre-training the initial generator based on the pre-training sample image and the pre-training output image to obtain the pre-trained generator.

3. The method according to claim 2, wherein the initial generator further comprises a first linear block and a second linear block for scaling up the pre-training output image, the plurality of convolutional layers being disposed between the first linear block and the second linear block.

4. The method according to claim 1, wherein the size of the second sample image is greater than that of the first sample image.

5. The method according to claim 1, wherein the image processing model comprises a U-network.

6. The method according to claim 1, wherein the sample image features are output from a fully connected layer of the encoder, and wherein the pre-trained generator comprises a plurality of generator blocks, the training method further comprising: inputting outputs of a plurality of layers preceding the fully connected layer in the encoder to the plurality of generator blocks in the pre-trained generator, respectively.

7. The method according to claim 1, further comprising obtaining a folding weight of a folded convolutional layer of the pre-trained generator by performing a folding operation on a plurality of weights of a plurality of convolutional layers of the pre-trained generator.

8. The method according to claim 7, wherein obtaining the folding weight of the folded convolutional layer of the pre-trained generator comprises: performing a multiplication operation on the plurality of weights of the plurality of convolutional layers of the pre-trained generator; and using the result of the multiplication operation as the folding weight of the folded convolutional layer of the pre-trained generator.

9. The method according to claim 7, wherein obtaining the folding weight of the folded convolutional layer of the pre-trained generator comprises: performing a multiplication operation on the plurality of weights of the plurality of convolutional layers of the pre-trained generator to obtain a first weight; performing a summation operation on a residual weight of the pre-trained generator and the first weight to obtain a second weight; and using the second weight as the folding weight of the folded convolutional layer of the pre-trained generator.

10. An electronic device, comprising: at least one processor; and at least one memory, the at least one memory being coupled to the at least one processor and storing instructions for execution by the at least one processor, wherein the instructions, when executed by the at least one processor, cause the electronic device to perform actions comprising: embedding a pre-trained generator into an image processing model; and training the image processing model using a plurality of pairs of sample images, wherein at least one pair of sample images of the plurality of pairs of sample images comprises a first sample image having a first resolution and a second sample image having a second resolution, and wherein the first resolution is less than the second resolution; wherein training the image processing model using a plurality of pairs of sample images comprises: inputting the first sample image to an encoder in the image processing model; obtaining mapped sample image features by mapping sample image features output by the encoder; inputting the mapped sample image features to the pre-trained generator to obtain a first output image; and training the image processing model based on the first sample image, the first output image, and the second sample image; wherein training the image processing model based on the first sample image, the first output image, and the second sample image further comprises: determining an adversarial loss function for the image processing model using a discriminator corresponding to the pre-trained generator based on the first output image and the second sample image; determining at least one of (i) a first loss function for the image processing model based on the first sample image and the second sample image, and (ii) a second loss function for the image processing model using the pre-trained generator and the discriminator based on the first sample image and the second sample image; and constructing a weighted loss function using the adversarial loss function and the at least one of the first loss function and the second loss function to train the image processing model.

11. The electronic device according to claim 10, wherein the actions further comprise: extracting features of a pre-training sample image; performing a mapping operation on the extracted image features to obtain mapped pre-training sample image features; inputting the mapped pre-training sample image features to an initial generator, the initial generator comprising a plurality of convolutional layers; inputting pre-training noise to the initial generator, wherein the noise is cascaded with an output of at least one of the plurality of convolutional layers in the initial generator, and the initial generator generates a pre-training output image; and pre-training the initial
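
The weighted loss construction recited in claims 1 and 10 can be sketched as follows. This is a minimal illustration only: the claims do not state how the individual loss terms are computed or what the weights are, so the sketch treats each loss as an already computed scalar, and the function name weighted_loss, the weight parameters, and the sample values are all hypothetical.

```python
import math


def weighted_loss(adversarial, first=None, second=None,
                  w_adv=1.0, w_first=1.0, w_second=1.0):
    """Combine an adversarial loss with at least one of two further losses.

    All arguments are plain floats; the weights are hypothetical
    hyperparameters, not values taken from the patent.
    """
    if first is None and second is None:
        # The claims require at least one of the first and second losses.
        raise ValueError("provide the first loss, the second loss, or both")
    total = w_adv * adversarial
    if first is not None:
        total += w_first * first
    if second is not None:
        total += w_second * second
    return total


# Hypothetical scalar loss values for one training step.
loss_with_first = weighted_loss(0.5, first=0.25, w_adv=0.5, w_first=1.0)
loss_with_both = weighted_loss(0.5, first=0.25, second=1.0,
                               w_adv=0.5, w_first=1.0, w_second=0.25)
```

Passing None for a term simply leaves it out of the sum, which mirrors the "at least one of" wording; a training loop would minimize the returned value with respect to the model parameters.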

Assignees

Inventors

Classifications

  • G06N3/0464 · Primary

    Convolutional networks [CNN, ConvNet] · CPC title

  • Adversarial learning · CPC title

  • Backpropagation, e.g. using gradient descent · CPC title

  • Non-supervised learning, e.g. competitive learning · CPC title

  • Probabilistic or stochastic networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12586152B2 cover?
Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for training an image processing model. The method in an illustrative embodiment includes: obtaining a folding weight of a folded convolutional layer of a pre-trained generator by performing a folding operation on a plurality of weights of a plurality of convolutional layers of the pre…
Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06N3/0464. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 24, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).