Video enhancement method and apparatus, and electronic device and storage medium

US12190472B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12190472-B2
Application number: US-202117630784-A
Country: US
Kind code: B2
Filing date: Mar 10, 2021
Priority date: Apr 30, 2020
Publication date: Jan 7, 2025
Grant date: Jan 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A video enhancement method and apparatus, an electronic device, and a storage medium are described. The method comprises: extracting features from M frames of images, so as to obtain at least one first-scale image feature (S310); for each first-scale image feature, performing N-level down-sampling processing on the first-scale image feature, so as to obtain a second-scale image feature (S320); performing N-level up-sampling processing on the second-scale image feature, so as to obtain a third-scale image feature (S330), wherein the input of ith-level up-sampling processing is an image feature obtained after performing superimposition processing on the output of (N+1−i)th-level down-sampling processing and the output of (i−1)th-level up-sampling processing, and the multiple of jth-level up-sampling is the same as the multiple of (N+1−j)th-level down-sampling; and performing superimposition processing on the third-scale image feature and the first-scale image feature.
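The index arithmetic in the abstract (up-sampling level i receives the output of down-sampling level N+1−i superimposed on the output of up-sampling level i−1) can be easier to follow in code. The following is a minimal sketch only, not the patented model: it assumes a single-channel 2-D feature map, uses average pooling and nearest-neighbour repetition as stand-ins for whatever learned layers the model actually uses, treats "superimposition" as element-wise addition, and uses the same factor of 2 at every level. The function names `downsample`, `upsample`, and `enhance_feature` are hypothetical.

```python
import numpy as np


def downsample(x, factor=2):
    """Average-pool a 2-D feature map by `factor` (stand-in for a learned layer)."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def upsample(x, factor=2):
    """Nearest-neighbour up-sampling by `factor` (stand-in for a learned layer)."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)


def enhance_feature(first_scale, n_levels=4, factor=2):
    """Run N down-sampling levels, then N up-sampling levels with skip inputs."""
    # N-level down-sampling; down_outputs[k - 1] holds the output of level k.
    down_outputs = []
    x = first_scale
    for _ in range(n_levels):
        x = downsample(x, factor)
        down_outputs.append(x)
    second_scale = down_outputs[-1]

    # Level 1 of up-sampling takes the second-scale feature directly.
    up = upsample(second_scale, factor)
    for i in range(2, n_levels + 1):
        # Input of level i: output of down-sampling level (N + 1 - i),
        # superimposed (assumed: added) on the output of up-sampling level (i - 1).
        skip = down_outputs[(n_levels + 1 - i) - 1]
        up = upsample(skip + up, factor)
    third_scale = up

    # Final superimposition of the third-scale and first-scale features.
    return third_scale + first_scale


feature = np.ones((16, 16))
enhanced = enhance_feature(feature, n_levels=4)
print(enhanced.shape)
```

Because every up-sampling level uses the same factor as its mirrored down-sampling level, the two tensors being added always have matching shapes, and the third-scale feature ends up at the resolution of the first-scale feature. The input height and width must be divisible by `factor ** n_levels` for this toy pooling to work.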

First claim

Opening claim text (preview).

What is claimed is:

1. A video enhancement method, comprising: inputting M frames of images into a pre-established video processing model to obtain an enhanced image of at least one of the M frames of images, where M is an integer greater than 1, by: extracting features from the M frames of images to obtain at least one first-scale image feature; and for each first-scale image feature, performing the following: performing N-level down-sampling processing on the first-scale image feature to obtain a second-scale image feature, where N is an integer greater than 1; performing N-level up-sampling processing on the second-scale image feature to obtain a third-scale image feature, wherein an input of first-level up-sampling processing is the second-scale image feature, an input of ith-level up-sampling processing is an image feature obtained after performing superimposition processing on an output of (N+1−i)th-level down-sampling processing and an output of (i−1)th-level up-sampling processing, and a magnification of jth-level up-sampling processing is the same as a minification of (N+1−j)th-level down-sampling processing, where i is an integer between 2 and N, and j is an integer between 1 and N; and performing the superimposition processing on the third-scale image feature and the first-scale image feature to obtain an enhanced image corresponding to the first-scale image feature.

2. The method according to claim 1, wherein: the pre-established video processing model is obtained by training an original video processing model through a target loss; the original video processing model is configured to perform video enhancement processing on an video input to the original video processing model; and the target loss comprises multi-level scale loss, and each-level scale loss of the multi-level scale loss is loss of each level of up-sampling processing in the N-level up-sampling processing.

3. The method according to claim 2, wherein: the loss of each level of up-sampling processing is loss between a first image and a second image, and the first image is obtained by inputting M frames of sample images into the original video processing model for a corresponding level of up-sampling processing; and the second image is a target image of each level of up-sampling processing, and a resolution of the first image is the same as that of the second image.

4. The method according to claim 2, wherein training the original video processing model to obtain the pre-established video processing model comprises: acquiring multiple groups of M frames of sample images and at least one frame of enhanced sample image corresponding to each group of M frames of sample images; for each group of M frames of sample images, extracting features from the group of M frames of sample images to obtain at least one first-scale sample image feature; for each first-scale sample image feature, performing the following procedures: performing the N-level down-sampling processing on the first-scale sample image feature to obtain a second-scale sample image feature; performing the N-level up-sampling processing on the second-scale sample image feature to obtain a predicted output image corresponding to each level of up-sampling; for each level of up-sampling, using a difference between a target output image corresponding to each level of up-sampling and a predicted output image corresponding to the level of up-sampling as loss of the level of up-sampling; wherein a target output image corresponding to ith-level up-sampling is an input of (N+1−i)th-level down-sampling processing on an enhanced sample image corresponding to the group of M frames of sample images; and using a sum of the loss of each level of up-sampling as the target loss, and updating a network parameter value in the original video processing model according to the target loss.

5. The method according to claim 4, wherein each group of M frames of sample images corresponds to one frame of enhanced sample image, and the one frame of enhanced sample image is specifically an enhanced image corresponding to an intermediate frame of sample image of the group of M frames of sample images, where M is an odd number greater than 1.

6. The method according to claim 1, wherein a value of M is 3, 5, or 7.

7. The method according to claim 6, wherein before the inputting the M frames of images into the pre-established video processing model, the method further comprises: acquiring L frames of images in a video to be processed; adding frames of images respectively before a first frame of image and after the last frame of image of the L frames of images to obtain L+M−1 frames of images; dividing the L+M−1 frames of images into L groups of M frames of images, where L is an integer greater than M; and wherein, for each group of M frames of images, performing the step of inputting the M frames of images into the pre-established video processing model to obtain the enhanced image of the at least one of the M frames of images.

8. The method according to claim 1, wherein the performing the superimposition processing on the third-scale image feature and the first-scale image feature to obtain the enhanced image corresponding to the first-scale image feature comprises: performing the superimposition processing on the third-scale image feature and the first-scale image feature to obtain a superimposed feature; and converting the superimposed feature into an image feature with three channels to obtain the enhanced image corresponding to the first-scale image feature.

9. The method according to claim 1, wherein the performing the superimposition processing on the third-scale image feature and the first-scale image feature to obtain the enhanced image corresponding to the first-scale image feature comprises: performing super-resolution processing after performing the superimposition processing on the third-scale image feature and the first-scale image feature to obtain a super-resolution image corresponding to the first-scale image feature.

10. The method according to claim 1, wherein a value of N is 4.

11. A video enhancement apparatus, comprising: an image enhancement processor configured to: input M frames of images into a pre-established video processing model to obtain an enhanced image of at least one of the M frames of images, where M is an integer greater than 1; extract features from the M frames of images to obtain at least one first-scale image feature; for each first-scale image feature, perform the following: performing N-level down-sampling processing on the first-scale image feature to obtain a second-scale image feature, where N is an integer greater than 1; performing N-level up-sampling processing on the second-scale image feature to obtain a third-scale image feature, wherein an input of first-level up-sampling processing is the second-scale image feature, an input of ith-level up-sampling processing is an image feature obtained after performing superimposition processing on an output of (N+1−i)th-level down-sampling processing and an output of (i−1)th-level up-sampling processing, and a magnification of jth-level up-sampling processing is the same as a minification of (N+1−j)th-level down-sampling processing, where i is an integer between 2 and N, and j is an integer between 1 and N; and performing the superimposition processing on the third-scale image feature and the first-scale image feature to obtain an enhanced image corresponding to the first-scale image feature.

12. The apparatus according to claim 11, wherein: the pre-established video processing model is obtained by training an original video processing mo
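The frame-grouping step in claim 7 (L input frames, padded to L+M−1 frames, then divided into L groups of M) reads like a sliding window in which each original frame is the middle frame of one group. The sketch below illustrates that reading under stated assumptions: the claim preview does not say which frames are added or how many go on each side, so this example assumes (M−1)/2 copies of the first frame are prepended and (M−1)/2 copies of the last frame are appended. The function name `group_frames` is hypothetical.

```python
def group_frames(frames, m=3):
    """Pad L frames to L + M - 1 frames and split them into L windows of M frames."""
    if m < 3 or m % 2 == 0:
        raise ValueError("this sketch assumes an odd M such as 3, 5, or 7")
    if len(frames) <= m:
        raise ValueError("claim 7 requires L to be greater than M")

    # Assumption: pad by repeating the edge frames, (M - 1) / 2 on each side.
    pad = (m - 1) // 2
    padded = [frames[0]] * pad + list(frames) + [frames[-1]] * pad

    # One window per original frame; window k is centred on frames[k].
    return [padded[k:k + m] for k in range(len(frames))]


groups = group_frames(["f0", "f1", "f2", "f3", "f4"], m=3)
for g in groups:
    print(g)
```

With this grouping, each group could be passed to the model to produce the enhanced version of its middle frame, which lines up with claim 5's use of the intermediate frame as the training target.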

Assignees

Boe Technology Group Co Ltd

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • using two or more images, e.g. averaging or subtraction · CPC title

  • using neural networks · CPC title

  • using machine learning, e.g. neural networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12190472B2 cover?
A video enhancement method and apparatus, an electronic device, and a storage medium are described. The method comprises: extracting features from M frames of images, so as to obtain at least one first-scale image feature (S310); for each first-scale image feature, performing N-level down-sampling processing on the first-scale image feature, so as to obtain a second-scale image feature (S320…
Who is the assignee on this patent?
Boe Technology Group Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T5/70. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 7, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).