What technology area does this patent fall under?

Primary CPC classification H04N7/0127. Mapped technology areas include Electricity.

When was this patent published?

Publication date Dec 31, 2024 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus for generating video intermediate frame

US12185023B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12185023-B2
Application number	US-202318206459-A
Country	US
Kind code	B2
Filing date	Jun 6, 2023
Priority date	Jun 14, 2022
Publication date	Dec 31, 2024
Grant date	Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating a video intermediate frame, including obtaining a target video frame pair; constructing an image pyramid for each video frame in the target video frame pair; and generating an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an order of the image pyramid from a high layer to a low layer based on the image pyramid, wherein the generating of the intermediate frame of the target video frame pair comprising: repairing a bidirectional optical flow corresponding to a previous layer using the bidirectional optical flow estimation model, and repairing a previous intermediate frame corresponding to the previous layer using the pixel synthesis model.
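The coarse-to-fine loop described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the "high layer" is the coarsest pyramid image, builds the pyramid by 2x2 averaging, and uses hypothetical placeholder functions (`toy_refine_flow`, `toy_synthesize`) where the abstract refers to the bidirectional optical flow estimation model and the pixel synthesis model. All names below are illustrative and do not come from the patent.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]                 # grayscale image as rows of floats
Flow = List[List[Tuple[float, float]]]    # per-pixel (dx, dy) displacement
BiFlow = Tuple[Flow, Flow]                # one flow field per input frame


def downsample2x(img: Image) -> Image:
    """Halve resolution by averaging each 2x2 block (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [full resolution, half, quarter, ...]; index 0 is the lowest layer."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def toy_refine_flow(img0: Image, img1: Image, prev: Optional[BiFlow]) -> BiFlow:
    """Placeholder for the flow model: ignores its inputs and returns zero flow."""
    zero = [[(0.0, 0.0) for _ in row] for row in img0]
    return zero, [list(row) for row in zero]


def toy_synthesize(img0: Image, img1: Image, flow: BiFlow,
                   prev_mid: Optional[Image]) -> Image:
    """Placeholder for the synthesis model: plain average of the two frames."""
    return [[(a + b) / 2.0 for a, b in zip(r0, r1)] for r0, r1 in zip(img0, img1)]


def interpolate_pair(
    frame0: Image,
    frame1: Image,
    levels: int,
    refine_flow: Callable[[Image, Image, Optional[BiFlow]], BiFlow] = toy_refine_flow,
    synthesize: Callable[[Image, Image, BiFlow, Optional[Image]], Image] = toy_synthesize,
) -> Tuple[Image, BiFlow]:
    """Apply the same two models at every layer, from coarsest to finest."""
    pyr0 = build_pyramid(frame0, levels)
    pyr1 = build_pyramid(frame1, levels)
    flow: Optional[BiFlow] = None   # no previous-layer flow at the coarsest layer
    mid: Optional[Image] = None     # no previous intermediate frame either
    for level in reversed(range(levels)):
        # Each layer refines the flow and the frame handed down by the layer above.
        flow = refine_flow(pyr0[level], pyr1[level], flow)
        mid = synthesize(pyr0[level], pyr1[level], flow, mid)
    return mid, flow


frame0 = [[0.0] * 4 for _ in range(4)]
frame1 = [[2.0] * 4 for _ in range(4)]
mid_frame, bi_flow = interpolate_pair(frame0, frame1, levels=3)
print(len(mid_frame), len(mid_frame[0]))
```

Passing the models in as callables keeps the loop itself independent of how the flow and synthesis steps are realized, which mirrors the "recursive calling" of the same two models at every layer.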

First claim

Opening claim text (preview).

The invention claimed is:
1. A method for generating a video intermediate frame, comprising: obtaining a target video frame pair; constructing an image pyramid for each video frame in the target video frame pair; and generating an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an order of the image pyramid from a high layer to a low layer based on the image pyramid, wherein the generating of the intermediate frame of the target video frame pair comprises: repairing a bidirectional optical flow corresponding to a previous layer using the bidirectional optical flow estimation model, and repairing a previous intermediate frame corresponding to the previous layer using the pixel synthesis model.
2. The method of claim 1, wherein the generating of the intermediate frame of the target video frame pair further comprising: generating a first number of pixel-level feature maps having different resolutions for an image of a current layer in each image pyramid using a feature coding network, in order to provide the pixel-level feature maps to the bidirectional optical flow estimation model and the pixel synthesis model.
3. The method of claim 2, wherein the first number is greater than or equal to 3, wherein the feature coding network comprises a convolutional network having at least a second number of down samplings, and wherein the second number is equal to the first number minus one.
4. The method of claim 1, wherein the repairing of bidirectional optical flow corresponding to the previous layer comprising: inputting a pixel-level feature map corresponding to an image of a current layer and the bidirectional optical flow corresponding to the previous layer into the bidirectional optical flow estimation model, wherein the pixel-level feature map comprises a feature map output by convolution of a last layer of a feature coding network as a result of the image of the current layer being input to the feature coding network, and wherein the bidirectional optical flow comprises an optical flow from each video frame to the intermediate frame.
5. The method of claim 4, wherein the repairing of the bidirectional optical flow comprising: linearly weighting the bidirectional optical flow corresponding to the previous layer to obtain an initial estimation value of a bidirectional optical flow corresponding to the current layer; based on the initial estimation value, performing forward-warping on the pixel-level feature map corresponding to each image of the current layer using a forward-warping layer of the bidirectional optical flow estimation model; based on a forward-warped feature map obtained by the forward-warping, constructing a partial cost volume using a cost volume layer of the bidirectional optical flow estimation model; performing channel stacking based on the initial estimation value, the forward-warped feature map, the partial cost volume, and a convolutional neural network (CNN) feature of the bidirectional optical flow corresponding to the previous layer; inputting a result of the channel stacking into an optical flow estimation layer of the bidirectional optical flow estimation model; and performing optical flow estimation to obtain a bidirectional optical flow repairing result corresponding to the current layer.
6. The method of claim 1, wherein repairing the previous intermediate frame comprises: linearly weighting the repaired bidirectional optical flow; for each video frame, performing forward-warping for an image of a current layer in the video frame and a context feature of the image using a forward-warping layer of the pixel synthesis model based on the linearly weighted optical flow corresponding to the video frame, wherein the context feature includes a feature map output by a feature coding network before each down sampling and a feature map output by convolution of a last layer after the image of the current layer in the video frame is input to the feature coding network for processing; and inputting a result of the forward-warping and the previous intermediate frame to a pixel synthesis network of the pixel synthesis model to obtain an intermediate frame repairing result corresponding to the current layer.
7. The method of claim 1, further comprising: after the intermediate frame is obtained based on an image of the lowest layer in the image pyramid, outputting the bidirectional optical flow.
8. The method of claim 2, wherein the feature coding network is shared by the bidirectional optical flow estimation model and the pixel synthesis model.
9. The method of claim 1, wherein the generated video intermediate frame is used for single-frame video frame interpolation or multi-frame video frame interpolation.
10. An apparatus for generating a video intermediate frame, comprising: at least one processor; and a memory configured to store instructions which, when executed by the at least one processor, cause the at least one processor to: obtain a target video frame pair; construct an image pyramid for each video frame in the target video frame pair; and generate an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an order of the image pyramid from a high layer to a low layer based on pyramid, wherein the at least one processor configured, when generating the intermediate frame of the target video frame pair, to: repairing a bidirectional optical flow corresponding to a previous layer by using the bidirectional optical flow estimation model, and repairing a previous intermediate frame corresponding to the previous layer by using the pixel synthesis model.
11. The apparatus of claim 10, wherein the at least one processor further configured, when generating the intermediate frame of the target video frame pair, to: generate a first number of pixel-level feature maps having different resolutions for an image of a current layer in each image pyramid using a feature coding network, in order to provide the pixel-level feature maps to the bidirectional optical flow estimation model and the pixel synthesis model.
12. The apparatus of claim 11, wherein the first number is greater than or equal to 3, wherein the feature coding network comprises a convolutional network having at least a second number of down samplings, and wherein the second number is equal to the first number minus one.
13. The apparatus of claim 10, wherein the at least one processor configured, when repairing bidirectional optical flow corresponding to the previous layer, to: input a pixel-level feature map corresponding to an image of a current layer and the bidirectional optical flow corresponding to the previous layer into the bidirectional optical flow estimation model, wherein the pixel-level feature map comprises a feature map output by convolution of a last layer of a feature coding network as a result of the image of the current layer being input to the feature coding network, and wherein the bidirectional optical flow comprises an optical flow from each video frame to the intermediate frame.
14. The apparatus of claim 13, wherein the at least one processor further configured, when repairing the bidirectional optical flow, to: linearly weigh the bidirectional optical flow corresponding to the previous layer to obtain an initial estimation value of a bidirectional optical flow corresponding to the current layer; base
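Three of the steps named in claim 5 (forward-warping, building a partial cost volume, and channel stacking) can be illustrated on a one-dimensional toy signal. This is a hedged sketch under simplifying assumptions that are not taken from the patent: features are scalar values along a single row, forward-warping rounds each target to the nearest sample and averages collisions, the cost is a plain product within a fixed search radius, and the CNN feature of the previous-layer flow is omitted from the stack. The function names are hypothetical.

```python
from typing import List


def forward_warp_1d(values: List[float], flow: List[float]) -> List[float]:
    """Push each source sample to round(x + flow[x]); average collisions, leave holes at 0."""
    n = len(values)
    total = [0.0] * n
    count = [0] * n
    for x, (v, f) in enumerate(zip(values, flow)):
        target = int(round(x + f))
        if 0 <= target < n:          # samples pushed outside the row are dropped
            total[target] += v
            count[target] += 1
    return [t / c if c else 0.0 for t, c in zip(total, count)]


def partial_cost_volume_1d(feat_a: List[float], feat_b: List[float],
                           radius: int) -> List[List[float]]:
    """Correlate feat_a[x] with feat_b[x + d] for d in [-radius, radius] only."""
    n = len(feat_a)
    volume = []
    for x in range(n):
        costs = []
        for d in range(-radius, radius + 1):
            j = x + d
            costs.append(feat_a[x] * feat_b[j] if 0 <= j < n else 0.0)
        volume.append(costs)
    return volume


def stack_channels(flow0: List[float], flow1: List[float],
                   warped0: List[float], warped1: List[float],
                   volume: List[List[float]]) -> List[List[float]]:
    """Concatenate, per position, the flow estimates, warped features, and costs."""
    return [
        [f0, f1, w0, w1, *costs]
        for f0, f1, w0, w1, costs in zip(flow0, flow1, warped0, warped1, volume)
    ]


# Toy inputs: one feature row per frame and an initial flow toward the middle frame.
feat0 = [1.0, 2.0, 3.0, 4.0]
feat1 = [1.0, 2.0, 3.0, 4.0]
flow0 = [1.0, 1.0, 1.0, 1.0]      # frame 0 content moves one sample to the right
flow1 = [-1.0, -1.0, -1.0, -1.0]  # frame 1 content moves one sample to the left

warped0 = forward_warp_1d(feat0, flow0)
warped1 = forward_warp_1d(feat1, flow1)
volume = partial_cost_volume_1d(warped0, warped1, radius=1)
stacked = stack_channels(flow0, flow1, warped0, warped1, volume)
print(len(stacked), len(stacked[0]))
```

The volume is "partial" in the sense that each position is compared only against a small neighbourhood rather than every position in the other feature row, so its size grows with the search radius instead of the row length.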

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

H04N7/0127 · Primary
by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter · CPC title
G06T2207/20084
Artificial neural networks [ANN] · CPC title
G06T2207/10016
Video; Image sequence · CPC title
G06T7/246 · Primary
using feature-based methods, e.g. the tracking of corners or segments · CPC title
G06N3/045
Combinations of networks · CPC title

Patent family

Related publications grouped by family.

View patent family 83200833

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12185023B2 cover?: A method for generating a video intermediate frame, including obtaining a target video frame pair; constructing an image pyramid for each video frame in the target video frame pair; and generating an intermediate frame of the target video frame pair by using a bidirectional optical flow estimation model and a pixel synthesis model in a layer-by-layer recursive calling manner according to an ord…
Who is the assignee on this patent?: Samsung Electronics Co Ltd

Related patents

Method, apparatus, and device for video frame interpolation

Feature pyramid warping for video frame interpolation

Frame interpolation with multi-scale deep loss functions and generative adversarial networks

Systems and methods for multi-frame video frame interpolation

Optical flow tracking device and method

System and method for optical flow estimation

Generating synthetic video frames using optical flow
