Method, apparatus, and device for video frame interpolation

US11354541B2 · US · B2

Patent metadata
Publication number: US-11354541-B2
Application number: US-201916626409-A
Country: US
Kind code: B2
Filing date: Mar 7, 2019
Priority date: Mar 1, 2019
Publication date: Jun 7, 2022
Grant date: Jun 7, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present specification discloses a method, apparatus, and device for video frame interpolation. The method of an embodiment of the present specification comprises: acquiring a video frame training sample, wherein the video frame training sample includes an even number of consecutive video frames and a first key frame, and the first key frame is an intermediate frame of the even number of consecutive video frames; constructing a pyramid deep learning model, wherein each level of the pyramid deep learning model being used to generate intermediate frames of different resolutions has a plurality of convolutional neural network layers; inputting the even number of consecutive video frames to the pyramid deep learning model to generate a second key frame; modifying the pyramid deep learning model according to the second key frame and the first key frame to generate a modified pyramid deep learning model; and inputting a plurality of video frames to be processed into the modified pyramid deep learning model to generate an intermediate frame of the plurality of video frames. The invention fully exploits the spatio-temporal domain information between multi-frame video frames, and adopts a pyramid refinement strategy to effectively estimate the motion information and the occlusion region, thereby greatly improving the quality of the intermediate frame.
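The training procedure in the abstract (generate a second key frame, compare it with the first key frame, modify the model) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the "model" is a per-frame weighted sum rather than a pyramid of convolutional layers, the difference measure is a mean squared error, and the names `predict` and `train` are hypothetical. It is not the patented implementation.

```python
def predict(frames, weights):
    """Toy stand-in for the model: a weighted sum of the input frames."""
    n = len(frames[0])
    return [sum(w * f[x] for w, f in zip(weights, frames)) for x in range(n)]


def train(frames, first_key_frame, weights, lr=0.05, steps=2000):
    """Adjust weights so the generated key frame approaches the real one.

    Assumption: plain gradient descent on the mean squared difference
    between the generated (second) and ground-truth (first) key frame.
    """
    n = len(first_key_frame)
    for _ in range(steps):
        second_key_frame = predict(frames, weights)
        diff = [p - t for p, t in zip(second_key_frame, first_key_frame)]
        # d(loss)/d(w_i) = (2 / n) * sum_x diff[x] * frame_i[x]
        grads = [2.0 / n * sum(d * f[x] for x, d in enumerate(diff))
                 for f in frames]
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# An even number (two) of consecutive 1-D "frames" and the real middle frame.
frames = [[0.0, 1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0]]
first_key_frame = [1.0, 2.0, 3.0, 4.0]
trained = train(frames, first_key_frame, weights=[1.0, 0.0])
```

After training, `predict(frames, trained)` should lie very close to `first_key_frame`; the trained model would then be applied to frames whose middle frame is unknown.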

First claim

Opening claim text (preview).

We claim:

1. A method for video frame interpolation, comprising: acquiring a video frame training sample, wherein the video frame training sample includes an even number of consecutive video frames and a first key frame, and the first key frame is an intermediate frame of the even number of consecutive video frames; constructing a pyramid deep learning model, wherein each level of the pyramid deep learning model being used to generate intermediate frames of different resolutions has a plurality of convolutional neural network layers; from a lower level to an upper level, the resolution is gradually increased, and video frame parameters of a lower level resolution are used for the calculation of the intermediate frame of a higher resolution; inputting the even number of consecutive video frames to the pyramid deep learning model to generate a second key frame; modifying the pyramid deep learning model according to the second key frame and the first key frame to generate a modified pyramid deep learning model; inputting a plurality of video frames to be processed into the modified pyramid deep learning model to generate an intermediate frame of the plurality of video frames.

2. The method according to claim 1, the modifying the pyramid deep learning model according to the second key frame and the first key frame comprises: extracting a first characteristic parameter of the first key frame; extracting a second characteristic parameter of the second key frame; generating a difference result between the first key frame and the second key frame according to the first feature parameter and the second feature parameter; adjusting weight parameters of the pyramid deep learning model according to the difference result.

3. A method for video frame interpolation, comprising: acquiring a video frame training sample, wherein the video frame training sample includes an even number of consecutive video frames and a first key frame, and the first key frame is an intermediate frame of the even number of consecutive video frames; constructing a pyramid deep learning model, wherein each level of the pyramid deep learning model being used to generate intermediate frames of different resolutions has a plurality of convolutional neural network layers; inputting the even number of consecutive video frames to the pyramid deep learning model to generate a second key frame; modifying the pyramid deep learning model according to the second key frame and the first key frame to generate a modified pyramid deep learning model; inputting a plurality of video frames to be processed into the modified pyramid deep learning model to generate an intermediate frame of the plurality of video frames; wherein the inputting the even number of consecutive video frames to the pyramid deep learning model comprises: determining a first resolution of a video frame inputted to the first level of the pyramid deep learning model according to a preset rule; processing the even number of consecutive video frames according to the first resolution; inputting the processed even number of consecutive video frames to the first level of the pyramid deep learning model to generate an optical flow set and an occlusion mask set of the intermediate frame to each video frame of the processed even number of consecutive video frames; generating a calculated intermediate frame of the first level according to the optical flow set and the occlusion mask set; modifying parameters of the first level of the pyramid deep learning model according to the calculated intermediate frame of the first level and the real intermediate frame with the resolution of the first level.

4. A method for video frame interpolation, comprising: acquiring a video frame training sample, wherein the video frame training sample includes an even number of consecutive video frames and a first key frame, and the first key frame is an intermediate frame of the even number of consecutive video frames; constructing a pyramid deep learning model, wherein each level of the pyramid deep learning model being used to generate intermediate frames of different resolutions has a plurality of convolutional neural network layers; inputting the even number of consecutive video frames to the pyramid deep learning model to generate a second key frame; modifying the pyramid deep learning model according to the second key frame and the first key frame to generate a modified pyramid deep learning model; inputting a plurality of video frames to be processed into the modified pyramid deep learning model to generate an intermediate frame of the plurality of video frames; wherein the inputting the even number of consecutive video frames to the pyramid deep learning model comprises: determining a second resolution of the video frame inputted to the K-th level of the pyramid deep learning model according to a preset rule, wherein a resolution of the video frame inputted to the K-th level is higher than a resolution of a video frame inputted to the (K-1)th level, the resolution of the last inputted video frame of the pyramid deep learning model is the original resolution of the even number of consecutive video frames, and K is a natural number greater than or equal to 2; processing the even number of consecutive video frames according to the second resolution to generate a video frame inputted to the K-th level; interpolation of each optical stream in the optical flow set generated by the (K-1)th level by upsampling by 2 times to generate a first optical flow set; processing the video frame inputted to the K-th level by using each optical flow in the first optical flow set to generate a first warped image set; generating a residual flow set and a occlusion mask set of the K-th level according to the first optical flow set and the first warped image set; generating an optical flow set of the K-th level according to the first optical flow set and the residual flow set; generating a calculated intermediate frame of the K-th level according to the optical flow set of the K-th level and the occlusion mask set of the K-th level; modifying parameters of the first level to the K-th level of the pyramid deep learning model according to the calculated intermediate frame of the K-th level and the real intermediate frame with the resolution of the K-th level.

5. The method according to claim 4, the generating a calculated intermediate frame of the K-th level according to the optical flow set of the K-th level and the occlusion mask set of the K-th level comprises: generating a second warped image set through warping the inputted video frames by optical flow set at the K-th level; generating a calculated intermediate frame of the K-th level according to the second warped image set and the occlusion mask set of the K-th level.

6. The method according to claim 5, the generating a calculated intermediate frame of the K-th level according to the second warped image set and the occlusion mask set of the K-th level comprises: the calculated intermediate frame of the K-th level is calculated by the following formula: $I_{t,k} = \sum_{i=1}^{4} M_{k,i} \otimes$ …
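The level-K steps of claims 4 to 6 (upsample the coarser flows by 2, add residual flows, warp the input frames, blend with occlusion masks) can be sketched on 1-D signals. This is a minimal illustration under stated assumptions, not the patented implementation: the residual flows and masks are passed in by hand where the patent generates them with convolutional layers; the upsampled flow values are doubled along with the sample count, a common coarse-to-fine convention the claim text does not spell out; and because the claim 6 formula is cut off after the ⊗ in this preview, the blend assumes the missing factor is the warped frame, in line with claim 5. Two frames are used instead of four for brevity, and all function names are hypothetical.

```python
def upsample_flow_x2(flow):
    """Nearest-neighbour 2x upsampling of a 1-D flow field.

    Assumption: displacements are doubled too, since one coarse sample
    spans two samples at the finer level.
    """
    out = []
    for d in flow:
        out.extend([2.0 * d, 2.0 * d])
    return out


def warp(frame, flow):
    """Backward warp: out[x] samples frame at x + flow[x] (linear, clamped)."""
    n = len(frame)
    out = []
    for x in range(n):
        pos = min(max(x + flow[x], 0.0), n - 1.0)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1.0 - frac) * frame[lo] + frac * frame[hi])
    return out


def blend(warped_frames, masks):
    """Per-sample sum over frames of mask * warped frame (assumed form)."""
    n = len(warped_frames[0])
    return [sum(m[x] * w[x] for m, w in zip(masks, warped_frames))
            for x in range(n)]


def refine_level(frames, coarse_flows, residual_flows, masks):
    """One level-K step: returns the calculated intermediate frame and flows."""
    first_flows = [upsample_flow_x2(f) for f in coarse_flows]
    flows = [[u + r for u, r in zip(uf, rf)]
             for uf, rf in zip(first_flows, residual_flows)]
    warped = [warp(fr, fl) for fr, fl in zip(frames, flows)]
    return blend(warped, masks), flows


# Two frames of a ramp moving by 2 samples; the true middle frame is [1, 2, 3, 4].
frames = [[0.0, 1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0]]
coarse_flows = [[0.25, 0.25], [-0.5, -0.5]]       # from level K-1 (length 2)
residual_flows = [[0.5] * 4, [0.0] * 4]           # stand-in for CNN output
masks = [[1.0, 0.5, 0.5, 0.0], [0.0, 0.5, 0.5, 1.0]]  # down-weight clamped borders
intermediate, level_flows = refine_level(frames, coarse_flows, residual_flows, masks)
```

The masks here zero out the border samples where a warp had to clamp, which is the role an occlusion mask plays: it selects, per position, the input frame in which that content is actually visible.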

Assignees

Univ Peking Shenzhen Graduate School

Inventors

Classifications

  • G06T3/4007 (Primary)

    based on interpolation, e.g. bilinear interpolation (image demosaicing G06T3/4015; edge-driven or edge-based scaling G06T3/403) · CPC title

  • characterised by the process organisation or structure, e.g. boosting cascade · CPC title

  • based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Activation functions · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11354541B2 cover?
The present specification discloses a method, apparatus, and device for video frame interpolation. The method of embodiment of the present specification comprises: acquiring a video frame training sample, wherein the video frame training sample includes an even number of consecutive video frames and a first key frame, and the first key frame is an intermediate frame of the even number of consec…
Who is the assignee on this patent?
Univ Peking Shenzhen Graduate School
What technology area does this patent fall under?
Primary CPC classification G06T3/4007. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 7, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).