Method, device, and medium for generating super-resolution video

US11778223B2 · US · B2

Patent metadata
Publication number: US-11778223-B2
Application number: US-202117406845-A
Country: US
Kind code: B2
Filing date: Aug 19, 2021
Priority date: Aug 19, 2021
Publication date: Oct 3, 2023
Grant date: Oct 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, device and computer-readable medium for generating a super-resolution version of a compressed video stream. By leveraging the motion information and residual information in compressed video streams, described examples are able to skip the time-consuming motion-estimation step for most frames and make the most use of the SR results of key frames. A key frame SR module generates SR versions of I-frames and other key frames of a compressed video stream using techniques similar to existing multi-frame approaches to VSR. A non-key frame SR module generates SR version of the non-key inter frames between these key frames by making use of motion information and residual information used to encode the inter frames in the compressed video stream.
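The split between the two modules described in the abstract can be sketched as a simple routing step. This is a minimal illustration, not the patented implementation: the `DecodedFrame` type, the `"I"`/`"P"` labels, the `is_key` flag, and the module names are all hypothetical, and the rule that each non-key frame refers to the immediately preceding frame is an assumption drawn from the chained references in claim 1.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DecodedFrame:
    index: int
    frame_type: str        # "I" = intra-coded, "P" = inter-coded (assumed labels)
    is_key: bool = False   # hypothetical flag: inter frame treated as a key frame


def plan_sr_path(frames: List[DecodedFrame]) -> List[Tuple[int, str, Optional[int]]]:
    """Return (frame index, SR module name, reference frame index) per frame.

    Key frames (I-frames and flagged inter frames) go to the multi-frame
    key frame module. Every other frame goes to the non-key frame module,
    which reuses the SR result of the preceding frame together with the
    motion and residual information already present in the stream.
    """
    plan = []
    for pos, frame in enumerate(frames):
        if frame.frame_type == "I" or frame.is_key:
            plan.append((frame.index, "key_frame_sr", None))
        else:
            # Assumption: the reference is the immediately preceding frame.
            ref = frames[pos - 1].index if pos > 0 else None
            plan.append((frame.index, "non_key_frame_sr", ref))
    return plan


gop = [
    DecodedFrame(0, "I"),
    DecodedFrame(1, "P"),
    DecodedFrame(2, "P"),
    DecodedFrame(3, "P", is_key=True),
    DecodedFrame(4, "P"),
]
plan = plan_sr_path(gop)
for entry in plan:
    print(entry)
```

In this sketch only frames 0 and 3 would pay for the expensive multi-frame path; frames 1, 2, and 4 reuse an earlier SR result, which is where the abstract's claimed saving in motion estimation would come from.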

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for generating a super-resolution version of a compressed video stream, the method comprising: obtaining at least a portion of the compressed video stream comprising a plurality of frame encodings representative of a temporal sequence of frames, the plurality of frame encodings comprising: an intra-coded frame (I-frame) encoding representative of an I-frame; and a first inter frame encoding representative of a first inter frame subsequent to the I-frame in the temporal sequence, comprising: motion information of the first inter frame relative to the I-frame; and residual information of the first inter frame relative to the I-frame; decoding the I-frame encoding to generate the I-frame; decoding the first inter frame encoding to generate: the first inter frame; the motion information of the first inter frame; and the residual information of the first inter frame; processing the I-frame to generate a super-resolution version of the I-frame; and generating a super-resolution version of the first inter frame by processing: the first inter frame; the motion information of the first inter frame; the residual information of the first inter frame; the I-frame; and the super-resolution version of the I-frame, wherein the plurality of frame encodings further comprises a second inter frame encoding representative of a second inter frame subsequent to the first inter frame in the temporal sequence, comprising: motion information of the second inter frame relative to the first inter frame; and residual information of the second inter frame relative to the first inter frame; the method further comprising: decoding the second inter frame encoding to generate: the second inter frame; the motion information of the second inter frame; and the residual information of the second inter frame; and generating a super-resolution version of the second inter frame by processing: the second inter frame; the motion information of the second inter frame; the residual information of the second inter frame; the first inter frame; and the super-resolution version of the first inter frame, and wherein processing the I-frame to generate a super-resolution version of the I-frame comprises: generating a super-resolution version of the I-frame by processing: the I-frame; the first inter frame; and an additional frame decoded from the compressed video stream, the additional frame being prior to the temporal sequence of frames.

2. The method of claim 1, wherein: the plurality of frame encodings further comprises a further inter frame encoding representative of a further inter frame subsequent to the second inter frame in the temporal sequence; further comprising: decoding the further inter frame encoding to generate: the further inter frame; identifying the further inter frame as a key frame; and generating a super-resolution version of the further inter frame by processing: the further inter frame; at least one frame, prior to the further inter frame in the temporal sequence, decoded from the compressed video stream; and at least one frame, subsequent to the further inter frame in the temporal sequence, decoded from the compressed video stream.

3. The method of claim 1, wherein the super-resolution version of the first inter frame is generated by: processing the I-frame, the first inter frame, the motion information of the first inter frame, and the residual information of the first inter frame to generate a refined motion map; and processing the refined motion map, the super-resolution version of the I-frame, and the residual information of the first inter frame to generate the super-resolution version of the first inter frame.

4. The method of claim 3, wherein the refined motion map is generated by: warping the I-frame using the motion information of the first inter frame to generate a warped I-frame; concatenating the first inter frame, the motion information of the first inter frame, the residual information of the first inter frame, and the warped I-frame to generate a concatenated tensor; processing the concatenated tensor using a first convolution layer of a MV refining convolutional neural network (CNN); processing the output of the first convolution layer using a first residual dense block of the MV refining CNN; processing the output of the first residual dense block using one or more inter convolution layers and one or more inter residual dense blocks of the MV refining CNN to generate a MV refining CNN output tensor; reshaping the MV refining CNN output tensor using a pixel shuffling operation to generate a reshaped MV refining tensor; up-sampling the motion information of the first inter frame to generate up-sampled motion information; and processing the reshaped MV refining tensor and the up-sampled motion information to generate the refined motion map.

5. The method of claim 3, wherein processing the refined motion map, the super-resolution version of the I-frame, and the residual information of the first inter frame to generate the super-resolution version of the first inter frame comprises: processing the refined motion map and the super-resolution version of the I-frame to generate a warped high-frequency feature map; processing the first inter frame to generate a feature map of the first inter frame; processing the feature map of the first inter frame, the warped high-frequency feature map, and the residual information of the first inter frame to generate a fused feature map; and processing the fused feature map and the first inter frame to generate the super-resolution version of the first inter frame.

6. The method of claim 5, wherein the warped high-frequency feature map is generated by: processing the super-resolution version of the I-frame using one or more convolution layers to generate a HF feature tensor; warping the HF feature tensor using the refined motion map to generate a warped HF feature tensor; and reshaping the warped HF feature tensor using a pixel unshuffling operation to generate the warped high-frequency feature map.

7. The method of claim 3, wherein processing the refined motion map, the super-resolution version of the I-frame, and the residual information of the first inter frame to generate the super-resolution version of the first inter frame comprises: warping the super-resolution version of the I-frame using the refined motion map to generate a warped super-resolution reference frame; up-sampling the residual information of the first inter frame to generate up-sampled residual information; and processing the warped super-resolution reference frame and the up-sampled residual information to generate the super-resolution version of the first inter frame.

8. A device, comprising: a processor; and a memory storing instructions which, when executed by the processor, cause the device to generate a super-resolution version of a compressed video stream by: obtaining at least a portion of the compressed video stream comprising a plurality of frame encodings representative of a temporal sequence of frames, the plurality of frame encodings comprising: an intra-coded frame (I-frame) encoding representative of an I-frame; and a first inter frame encoding representative of a first inter frame subsequent to the I-frame in the temporal sequence, comprising: motion information of the first inter frame relative to the I-frame; and residual information of the first inter frame relative to the I-frame; decoding the I-frame encoding to generate the I-frame; decoding the first inter frame encoding to generate: the first inter frame; the motion information of the first inter frame; and the res
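The three steps recited in claim 7 (warp the SR reference, up-sample the residual, combine the two) can be illustrated on a tiny grayscale example. This is a sketch under stated assumptions rather than the patent's method: the claim does not say how warping, up-sampling, or the final combination is done, so nearest-neighbour up-sampling, integer per-pixel backward warping with border clamping, and plain addition are all assumptions made here, and every function name is hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]
# Assumed layout: one integer (dy, dx) offset per high-resolution pixel.
MotionMap = List[List[Tuple[int, int]]]


def upsample_nearest(img: Image, scale: int) -> Image:
    """Nearest-neighbour up-sampling by an integer factor (assumed method)."""
    height, width = len(img), len(img[0])
    return [[img[y // scale][x // scale] for x in range(width * scale)]
            for y in range(height * scale)]


def warp(ref: Image, motion: MotionMap) -> Image:
    """Backward warp: each output pixel fetches ref[y + dy][x + dx], clamped."""
    height, width = len(ref), len(ref[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = motion[y][x]
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            row.append(ref[src_y][src_x])
        out.append(row)
    return out


def sr_inter_frame(sr_ref: Image, refined_motion: MotionMap,
                   lr_residual: Image, scale: int) -> Image:
    """Warp the SR reference, up-sample the residual, and add them."""
    warped = warp(sr_ref, refined_motion)
    up_residual = upsample_nearest(lr_residual, scale)
    return [[w + r for w, r in zip(w_row, r_row)]
            for w_row, r_row in zip(warped, up_residual)]


# 4x4 SR reference, content moving one pixel to the right, 2x2 residual.
sr_ref = [[float(4 * y + x) for x in range(4)] for y in range(4)]
refined_motion = [[(0, -1)] * 4 for _ in range(4)]
lr_residual = [[1.0, 0.0], [0.0, 2.0]]

result = sr_inter_frame(sr_ref, refined_motion, lr_residual, scale=2)
for row in result:
    print(row)
```

Because the motion map is taken as given, no motion search runs in this path; the only per-pixel work is a lookup and an addition. In the claimed method the refined motion map itself comes from the MV refining CNN of claim 4, which this sketch does not model.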

Assignees

Inventors

Classifications

  • H04N19/51 (Primary)

    Motion estimation or motion compensation · CPC title

  • Incoming video signal characteristics or properties · CPC title

  • the unit being bits, e.g. of the compressed video stream · CPC title

  • G06T3/4053 (Primary)

    based on super-resolution, i.e. the output image resolution being higher than the sensor resolution · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Liu Wentao, Yu Yuanhao, Wang Yang, and 4 more
What technology area does this patent fall under?
Primary CPC classification H04N19/51. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Oct 3, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).