Method and apparatus for generating video intermediate frame

US12185023B2 · US · B2

Patent metadata
Publication number: US-12185023-B2
Application number: US-202318206459-A
Country: US
Kind code: B2
Filing date: Jun 6, 2023
Priority date: Jun 14, 2022
Publication date: Dec 31, 2024
Grant date: Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating a video intermediate frame, including obtaining a target video frame pair; constructing an image pyramid for each video frame in the target video frame pair; and generating an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an order of the image pyramid from a high layer to a low layer based on the image pyramid, wherein the generating of the intermediate frame of the target video frame pair comprising: repairing a bidirectional optical flow corresponding to a previous layer using the bidirectional optical flow estimation model, and repairing a previous intermediate frame corresponding to the previous layer using the pixel synthesis model.
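
The coarse-to-fine recursion described in the abstract can be sketched as follows. This is only an illustrative outline under stated assumptions, not the patented implementation: the pyramid is built by 2x2 average pooling, `refine_flow` and `refine_frame` are hypothetical placeholders standing in for the learned bidirectional optical flow estimation model and pixel synthesis model, and doubling the upsampled flow between layers is one plausible reading of carrying a result from the previous layer to the current one.

```python
import numpy as np

def downsample2x(img):
    """Halve height and width by averaging each 2x2 block (H and W must be even)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample2x(arr):
    """Double height and width by nearest-neighbour repetition."""
    return arr.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    """Return [full-res, half-res, ...]; index 0 is the lowest (finest) layer."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid

def refine_flow(img0, img1, flow_init):
    """Hypothetical placeholder for the learned flow model: returns its input unchanged."""
    return flow_init

def refine_frame(img0, img1, flow, frame_init):
    """Hypothetical placeholder for the learned synthesis model: averages the two frames."""
    return 0.5 * (img0 + img1)

def interpolate(frame0, frame1, levels=3):
    """Run the same two models on every layer, from the highest layer to the lowest."""
    pyr0 = build_pyramid(frame0, levels)
    pyr1 = build_pyramid(frame1, levels)
    flow, mid = None, None
    for level in reversed(range(levels)):  # highest (coarsest) layer first
        img0, img1 = pyr0[level], pyr1[level]
        h, w, _ = img0.shape
        if flow is None:
            # Nothing to repair yet: start from zero flow and an empty frame.
            flow_init = np.zeros((h, w, 4))  # 2 channels per flow direction
            mid_init = np.zeros_like(img0)
        else:
            # Assumption: pixel displacements double when resolution doubles.
            flow_init = upsample2x(flow) * 2.0
            mid_init = upsample2x(mid)
        flow = refine_flow(img0, img1, flow_init)       # repair previous flow
        mid = refine_frame(img0, img1, flow, mid_init)  # repair previous frame
    return mid, flow
```

Calling `interpolate(frame0, frame1)` with two `(H, W, 3)` arrays whose height and width are divisible by `2 ** (levels - 1)` returns the full-resolution intermediate frame together with the final bidirectional flow, which mirrors the option in claim 7 of outputting the flow after the lowest layer has been processed.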

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for generating a video intermediate frame, comprising: obtaining a target video frame pair; constructing an image pyramid for each video frame in the target video frame pair; and generating an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an order of the image pyramid from a high layer to a low layer based on the image pyramid, wherein the generating of the intermediate frame of the target video frame pair comprises: repairing a bidirectional optical flow corresponding to a previous layer using the bidirectional optical flow estimation model, and repairing a previous intermediate frame corresponding to the previous layer using the pixel synthesis model.

2. The method of claim 1, wherein the generating of the intermediate frame of the target video frame pair further comprising: generating a first number of pixel-level feature maps having different resolutions for an image of a current layer in each image pyramid using a feature coding network, in order to provide the pixel-level feature maps to the bidirectional optical flow estimation model and the pixel synthesis model.

3. The method of claim 2, wherein the first number is greater than or equal to 3, wherein the feature coding network comprises a convolutional network having at least a second number of down samplings, and wherein the second number is equal to the first number minus one.

4. The method of claim 1, wherein the repairing of bidirectional optical flow corresponding to the previous layer comprising: inputting a pixel-level feature map corresponding to an image of a current layer and the bidirectional optical flow corresponding to the previous layer into the bidirectional optical flow estimation model, wherein the pixel-level feature map comprises a feature map output by convolution of a last layer of a feature coding network as a result of the image of the current layer being input to the feature coding network, and wherein the bidirectional optical flow comprises an optical flow from each video frame to the intermediate frame.

5. The method of claim 4, wherein the repairing of the bidirectional optical flow comprising: linearly weighting the bidirectional optical flow corresponding to the previous layer to obtain an initial estimation value of a bidirectional optical flow corresponding to the current layer; based on the initial estimation value, performing forward-warping on the pixel-level feature map corresponding to each image of the current layer using a forward-warping layer of the bidirectional optical flow estimation model; based on a forward-warped feature map obtained by the forward-warping, constructing a partial cost volume using a cost volume layer of the bidirectional optical flow estimation model; performing channel stacking based on the initial estimation value, the forward-warped feature map, the partial cost volume, and a convolutional neural network (CNN) feature of the bidirectional optical flow corresponding to the previous layer; inputting a result of the channel stacking into an optical flow estimation layer of the bidirectional optical flow estimation model; and performing optical flow estimation to obtain a bidirectional optical flow repairing result corresponding to the current layer.

6. The method of claim 1, wherein repairing the previous intermediate frame comprises: linearly weighting the repaired bidirectional optical flow; for each video frame, performing forward-warping for an image of a current layer in the video frame and a context feature of the image using a forward-warping layer of the pixel synthesis model based on the linearly weighted optical flow corresponding to the video frame, wherein the context feature includes a feature map output by a feature coding network before each down sampling and a feature map output by convolution of a last layer after the image of the current layer in the video frame is input to the feature coding network for processing; and inputting a result of the forward-warping and the previous intermediate frame to a pixel synthesis network of the pixel synthesis model to obtain an intermediate frame repairing result corresponding to the current layer.

7. The method of claim 1, further comprising: after the intermediate frame is obtained based on an image of the lowest layer in the image pyramid, outputting the bidirectional optical flow.

8. The method of claim 2, wherein the feature coding network is shared by the bidirectional optical flow estimation model and the pixel synthesis model.

9. The method of claim 1, wherein the generated video intermediate frame is used for single-frame video frame interpolation or multi-frame video frame interpolation.

10. An apparatus for generating a video intermediate frame, comprising: at least one processor; and a memory configured to store instructions which, when executed by the at least one processor, cause the at least one processor to: obtain a target video frame pair; construct an image pyramid for each video frame in the target video frame pair; and generate an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an order of the image pyramid from a high layer to a low layer based on pyramid, wherein the at least one processor configured, when generating the intermediate frame of the target video frame pair, to: repairing a bidirectional optical flow corresponding to a previous layer by using the bidirectional optical flow estimation model, and repairing a previous intermediate frame corresponding to the previous layer by using the pixel synthesis model.

11. The apparatus of claim 10, wherein the at least one processor further configured, when generating the intermediate frame of the target video frame pair, to: generate a first number of pixel-level feature maps having different resolutions for an image of a current layer in each image pyramid using a feature coding network, in order to provide the pixel-level feature maps to the bidirectional optical flow estimation model and the pixel synthesis model.

12. The apparatus of claim 11, wherein the first number is greater than or equal to 3, wherein the feature coding network comprises a convolutional network having at least a second number of down samplings, and wherein the second number is equal to the first number minus one.

13. The apparatus of claim 10, wherein the at least one processor configured, when repairing bidirectional optical flow corresponding to the previous layer, to: input a pixel-level feature map corresponding to an image of a current layer and the bidirectional optical flow corresponding to the previous layer into the bidirectional optical flow estimation model, wherein the pixel-level feature map comprises a feature map output by convolution of a last layer of a feature coding network as a result of the image of the current layer being input to the feature coding network, and wherein the bidirectional optical flow comprises an optical flow from each video frame to the intermediate frame.

14. The apparatus of claim 13, wherein the at least one processor further configured, when repairing the bidirectional optical flow, to: linearly weigh the bidirectional optical flow corresponding to the previous layer to obtain an initial estimation value of a bidirectional optical flow corresponding to the current layer; base
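
The flow-repair steps listed in claim 5 (forward-warping, partial cost volume, channel stacking) can be illustrated with a small NumPy sketch. The details here are assumptions, since the claim text does not fix them: `forward_warp` uses nearest-neighbour splatting with averaging of collisions, `partial_cost_volume` is a local correlation over a small search window, and both function names are hypothetical. The CNN feature of the previous layer's flow and the learned optical flow estimation layer are left out.

```python
import numpy as np

def forward_warp(feat, flow):
    """Splat each source pixel to round(position + flow); average any collisions.

    feat: (H, W, C) feature map; flow: (H, W, 2) holding (dx, dy) in pixels.
    Target pixels that receive no source pixel stay zero.
    """
    h, w, c = feat.shape
    ys, xs = np.mgrid[0:h, 0:w]
    tx = np.rint(xs + flow[..., 0]).astype(int)
    ty = np.rint(ys + flow[..., 1]).astype(int)
    valid = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    out = np.zeros((h, w, c))
    count = np.zeros((h, w, 1))
    # np.add.at accumulates correctly when several sources hit the same target.
    np.add.at(out, (ty[valid], tx[valid]), feat[valid])
    np.add.at(count, (ty[valid], tx[valid]), 1.0)
    return out / np.maximum(count, 1.0)

def partial_cost_volume(f0, f1, radius=1):
    """Correlate f0 with f1 over a (2r+1) x (2r+1) window of displacements.

    Returns (H, W, (2r+1)**2); each channel is the channel-mean dot product
    of f0 with one shifted copy of f1 (zero padding at the borders).
    """
    h, w, _ = f0.shape
    padded = np.pad(f1, ((radius, radius), (radius, radius), (0, 0)))
    costs = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[dy:dy + h, dx:dx + w]
            costs.append((f0 * shifted).mean(axis=-1))
    return np.stack(costs, axis=-1)

def stack_flow_inputs(feat0, feat1, flow0, flow1, radius=1):
    """Channel-stack the initial flows, warped features, and cost volume."""
    warped0 = forward_warp(feat0, flow0)  # frame 0 features moved toward time t
    warped1 = forward_warp(feat1, flow1)  # frame 1 features moved toward time t
    cost = partial_cost_volume(warped0, warped1, radius)
    return np.concatenate([flow0, flow1, warped0, warped1, cost], axis=-1)
```

For feature maps with `C` channels and `radius=1`, `stack_flow_inputs` yields a tensor with `4 + 2 * C + 9` channels, which is the kind of input a flow estimation network would then consume. Nearest-neighbour splatting is used only because it is short; a differentiable variant would be needed for training.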

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • H04N7/0127 · Primary

    by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter · CPC title

  • Artificial neural networks [ANN] · CPC title

  • Video; Image sequence · CPC title

  • G06T7/246 · Primary

    using feature-based methods, e.g. the tracking of corners or segments · CPC title

  • Combinations of networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12185023B2 cover?
A method for generating a video intermediate frame, including obtaining a target video frame pair; constructing an image pyramid for each video frame in the target video frame pair; and generating an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an ord…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04N7/0127. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Dec 31, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).