Multi-stage multi-reference bootstrapping for video super-resolution

US12148123B2 · US · B2

Patent metadata
Publication number: US-12148123-B2
Application number: US-202017607486-A
Country: US
Kind code: B2
Filing date: Apr 28, 2020
Priority date: May 3, 2019
Publication date: Nov 19, 2024
Grant date: Nov 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An embodiment method includes performing first convolutional filtering on a first tensor constructed using a current frame and reference frames (or digital world reference images) of the current frame in a video, to generate a first estimated image of the current frame having a higher resolution than an image of the current frame. The method also includes performing second convolutional filtering on a second tensor constructed using the first estimated image and estimated reference images of the reference frames, to generate a second estimated image of the current frame having a higher resolution than the image of the current frame. The estimated reference images of the reference frames are reconstructed high resolution images of the reference images.
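
The two stages in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the helper names (`conv3x3`, `pixel_shuffle`, `super_resolve`), the single 3x3 convolution per stage, the pixel-shuffle upscaling, the random untrained weights, and the nearest-neighbour upsampling used as a stand-in for the estimated reference images.

```python
import numpy as np

def conv3x3(x, w):
    """Same-size 3x3 convolution. x: (C_in, H, W); w: (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + wd]  # (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def pixel_shuffle(x, r):
    """Rearrange r*r channels of size (H, W) into one (H*r, W*r) image."""
    _, h, w = x.shape
    return x.reshape(r, r, h, w).transpose(2, 0, 3, 1).reshape(h * r, w * r)

def super_resolve(current, refs, est_refs, w1, w2, scale):
    # Stage 1: first tensor from the current frame and its reference frames.
    first_tensor = np.stack([current, *refs])               # (1+N, H, W)
    first_est = pixel_shuffle(conv3x3(first_tensor, w1), scale)
    # Stage 2: second tensor from the first estimate and the
    # higher-resolution estimated reference images.
    second_tensor = np.stack([first_est, *est_refs])        # (1+N, sH, sW)
    second_est = conv3x3(second_tensor, w2)[0]
    return first_est, second_est

# Hypothetical inputs: random frames and untrained placeholder weights.
rng = np.random.default_rng(0)
scale, n_refs, h, w = 2, 2, 4, 6
current = rng.random((h, w))
refs = [rng.random((h, w)) for _ in range(n_refs)]
# Stand-in for reconstructed high-resolution reference images.
est_refs = [np.kron(r, np.ones((scale, scale))) for r in refs]
w1 = rng.standard_normal((scale * scale, 1 + n_refs, 3, 3)) * 0.1
w2 = rng.standard_normal((1, 1 + n_refs, 3, 3)) * 0.1

first_est, second_est = super_resolve(current, refs, est_refs, w1, w2, scale)
print(first_est.shape, second_est.shape)  # (8, 12) (8, 12)
```

In a real system each stage would presumably be a trained multi-layer network; the sketch only shows how the two tensors are assembled and how the first estimate feeds the second stage.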

First claim

Opening claim text (preview).

What is claimed:

1. A computer-implemented method, comprising: obtaining a current frame and a plurality of reference frames of the current frame in a video; constructing a first tensor, wherein the first tensor is a first feature map constructed using the current frame and the plurality of reference frames; performing first convolutional filtering on the first tensor, to generate a first estimated image of the current frame; obtaining estimated reference images of the plurality of reference frames, the estimated reference images having a higher resolution than images of the plurality of reference frames; and performing second convolutional filtering on a second tensor constructed using the first estimated image of the current frame and the estimated reference images of the plurality of reference frames, to generate a second estimated image of the current frame.

2. The computer-implemented method of claim 1, wherein the plurality of reference frames of the current frame comprises preceding frames of the current frame.

3. The computer-implemented method of claim 1, wherein the plurality of reference frames of the current frame comprises frames preceding the current frame and frames subsequent to the current frame.

4. The computer-implemented method of claim 1, the constructing the first tensor comprising: before the performing the first convolutional filtering: determining an expansion region in a reference frame of the plurality of reference frames, the expansion region corresponding to a region in the current frame, and the expansion region in the reference frame comprising an enlarged scene of the region in the current frame; assigning a utility score to each pixel of the reference frame based on whether or not each pixel of the reference frame belongs to the expansion region, thereby generating a utility mask of the reference frame, the utility mask comprising a set of utility scores for pixels of the reference frame; and constructing the first tensor using the current frame, the plurality of reference frames, and the utility mask of the reference frame.

5. The computer-implemented method of claim 4, further comprising: generating a scene flow using the current frame and the plurality of reference frames, the scene flow comprising images of the plurality of reference frames that are motion compensated based on an image of the current frame; and generating a flow map for each of the plurality of reference frames, wherein the determining the expansion region in the reference frame is based on the scene flow and the flow map.

6. The computer-implemented method of claim 5, wherein the generating the scene flow comprises: generating the scene flow using the current frame, the plurality of reference frames, and a digital world reference image of the current frame.

7. The computer-implemented method of claim 6, wherein the digital world reference image is obtained from a digital world image database.

8. The computer-implemented method of claim 7, further comprising: obtaining visual positioning system (VPS) information of the current frame; and searching for the digital world reference image in the digital world image database according to the VPS information.

9. The computer-implemented method of claim 6, further comprising: resizing the digital world reference image so that the digital world reference image has a same size as the image of the current frame.

10. The computer-implemented method of claim 4, further comprising: generating a corresponding utility mask for each of the plurality of reference frames; and constructing the first tensor using the current frame, the plurality of reference frames and utility masks of the plurality of reference frames.

11. The computer-implemented method of claim 10, wherein the constructing the first tensor comprises: ordering the current frame and the plurality of reference frames according to a sequence of the current frame and the plurality of reference frames; and ordering the utility masks of the plurality of reference frames according to the sequence.

12. The computer-implemented method of claim 10, wherein the constructing the first tensor comprises: multiplying values of pixels of each of the plurality of reference frames and the corresponding utility mask of each of the plurality of reference frames.

13. The computer-implemented method of claim 1, wherein the first estimated image or the second estimated image has a larger size than an image of the current frame.

14. The computer-implemented method claim 1, further comprising: before the performing the second convolutional filtering: determining an expansion region in an estimated reference image of the plurality of reference frames, the expansion region corresponding to a region in the first estimated image, and the expansion region in the estimated reference image comprising an enlarged scene of the region in the first estimated image; assigning a utility score to each pixel of the estimated reference image based on whether or not each pixel of the estimated reference image belongs to the expansion region, thereby generating a utility mask of the estimated reference image, the utility mask of the estimated reference image comprising a set of utility scores for pixels of the estimated reference image; and constructing the second tensor using the first estimated image, the estimated reference images of the plurality of reference frames and the utility mask of the estimated reference image.

15. The computer-implemented method of claim 14, further comprising: generating a scene flow using the first estimated image and the estimated reference images of the plurality of reference frames, the scene flow comprising images of the estimated reference images that are motion compensated based on the first estimated image; and generating a flow map for each of the estimated reference images; and wherein the determining the expansion region in the estimated reference image is based on the scene flow and the flow map.

16. The computer-implemented method of claim 14, further comprising: generating a respective utility mask for each of the estimated reference images.

17. The computer-implemented method of claim 16, further comprising: constructing the second tensor using the first estimated image, the estimated reference images of the plurality of reference frames and utility masks of the estimated reference images.

18. The computer-implemented method of claim 17, further comprising: performing convolutional filtering on the estimated first image and the estimated reference images of the plurality of reference frames, thereby generating a second feature map of the estimated first image and the estimated reference images, wherein the constructing the second tensor comprises: constructing the second tensor using the second feature map, the utility masks of the estimated reference images, and the first tensor.

19. A computer-implemented method, comprising: obtaining a current frame and a plurality of reference frames of the current frame in a video; determining an expansion region in a reference frame of the plurality of reference frames, the expansion region corresponding to a region in the current frame, and the expansion region in the reference frame comprising an enlarged scene relative to the region in the current frame; assigning a utility score to each pixel of the reference frame based on whether or not each pixel of the reference frame belongs to the expansion region, thereby g
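
Claims 4 and 10 to 12 describe scoring each reference-frame pixel by whether it lies in an expansion region, then building the first tensor from the frames and the resulting utility masks. The sketch below is one possible reading, not the patented procedure. In particular, detecting the expansion region as "positive divergence of a flow map" is a stand-in heuristic chosen for illustration, the binary 0/1 scores are an assumption, and the function names `utility_mask_from_flow` and `build_first_tensor` are hypothetical.

```python
import numpy as np

def utility_mask_from_flow(flow, threshold=0.0):
    """flow: (2, H, W) holding per-pixel (dy, dx) from the current frame to
    a reference frame. Assumed heuristic (not from the claims): where the
    flow diverges, the reference shows the region enlarged, so those pixels
    get utility score 1 and all others get 0."""
    divergence = np.gradient(flow[0], axis=0) + np.gradient(flow[1], axis=1)
    return (divergence > threshold).astype(np.float64)

def build_first_tensor(frames, masks, multiply=False):
    """frames: (H, W) arrays in temporal order, current frame included.
    masks: list aligned with frames; None marks the current frame."""
    if multiply:
        # Claim 12 style: weight each reference frame by its utility mask.
        weighted = [f if m is None else f * m for f, m in zip(frames, masks)]
        return np.stack(weighted)
    # Claim 11 style: frames in sequence order, then masks in the same order.
    ref_masks = [m for m in masks if m is not None]
    return np.stack([*frames, *ref_masks])

# Hypothetical example: the previous frame is a zoomed-in view, the next
# frame is static relative to the current frame.
h, w = 4, 5
yy, xx = np.mgrid[0:h, 0:w]
zoom_flow = np.stack([0.1 * (yy - 1.5), 0.1 * (xx - 2.0)])  # diverging flow
static_flow = np.zeros((2, h, w))                           # no motion

rng = np.random.default_rng(1)
prev_frame, current, next_frame = (rng.random((h, w)) for _ in range(3))
frames = [prev_frame, current, next_frame]
masks = [utility_mask_from_flow(zoom_flow), None,
         utility_mask_from_flow(static_flow)]

stacked = build_first_tensor(frames, masks)
weighted = build_first_tensor(frames, masks, multiply=True)
print(stacked.shape, weighted.shape)  # (5, 4, 5) (3, 4, 5)
```

With these inputs the zoomed previous frame is kept in full and the static next frame is zeroed out in the multiplied variant, which matches the intuition that only reference pixels showing an enlarged view of the scene carry extra detail.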

Assignees

  • Huawei Tech Co Ltd

Inventors

Classifications

  • Artificial neural networks [ANN] · CPC title

  • Video; Image sequence · CPC title

  • using two or more images, e.g. averaging or subtraction · CPC title

  • using local operators · CPC title

  • involving reference images or patches · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12148123B2 cover?
An embodiment method includes performing first convolutional filtering on a first tensor constructed using a current frame and reference frames (or digital world reference images) of the current frame in a video, to generate a first estimated image of the current frame having a higher resolution than an image of the current frame. The method also includes performing second convolutional filteri…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T3/4053. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 19, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).