US12469150B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12469150-B2 |
| Application number | US-202117995702-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 14, 2021 |
| Priority date | Apr 17, 2020 |
| Publication date | Nov 11, 2025 |
| Grant date | Nov 11, 2025 |
Official abstract text for this publication.
Systems and methods are provided for optical flow estimation. In one embodiment, a method comprises estimating, with a first neural network, an optical flow between two image frames, wherein a resolution of the optical flow is lower than a resolution of the two image frames, and upsampling, with a second neural network, the optical flow to the resolution of the two image frames. In this way, the speed of optical flow estimation may be improved by reducing the amount of pixels being processed by a deep neural network, while the use of another deep neural network for guided upsampling of the optical flow estimate helps maintain the accuracy of the final output.
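The speed argument in the abstract comes down to pixel-count arithmetic, which a minimal sketch can make concrete. The function name `pixel_reduction` and the scale factor of 4 are illustrative assumptions; the excerpt does not state the factors actually used.

```python
def pixel_reduction(height, width, factor):
    """Return (low_h, low_w, ratio) when each side is divided by `factor`.

    `factor` is a hypothetical per-side scale factor, not a value taken
    from the patent text.
    """
    if factor < 1 or height % factor or width % factor:
        raise ValueError("frame size must be divisible by a positive factor")
    low_h, low_w = height // factor, width // factor
    # Ratio of full-resolution pixels to low-resolution pixels.
    ratio = (height * width) / (low_h * low_w)
    return low_h, low_w, ratio


if __name__ == "__main__":
    # A 1080x1920 frame reduced 4x per side leaves 1/16 of the pixels.
    print(pixel_reduction(1080, 1920, 4))  # (270, 480, 16.0)
```

Under these assumptions, the first network would see one sixteenth of the pixels, and the second network is responsible for recovering the full-resolution flow.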
Opening claim text (preview).
The invention claimed is:
1. A method comprising: estimating, with a first neural network, an optical flow between two image frames, wherein a resolution of the optical flow is lower than a resolution of the two image frames; upsampling, with a second neural network, the optical flow to the resolution of the two image frames; and acquiring the two image frames at the resolution of the two image frames, downsampling the two image frames to a lower resolution, and inputting the two downsampled image frames to the first neural network to estimate the optical flow, wherein the resolution of the optical flow estimated by the first neural network is lower than the lower resolution of the two downsampled image frames, the method further comprising performing downscale shuffling of the two image frames to obtain downshuffled image frames with a spatial resolution equal to the resolution of the optical flow estimated by the first neural network.
2. The method of claim 1, further comprising inputting the optical flow estimated by the first neural network and the downshuffled image frames to the second neural network to upsample the optical flow to the resolution of the two image frames.
3. The method of claim 2, further comprising correlating, with the second neural network, features of the optical flow estimated by the first neural network with high-frequency information of the downshuffled image frames.
4. The method of claim 3, further comprising learning, with a plurality of residual dense blocks of the second neural network, local and global features of the correlated features, and fusing the learned local and global features into fused features.
5. The method of claim 4, wherein upsampling the optical flow comprises iteratively upsampling the fused features to the resolution of the two image frames.
6. A method, comprising: receiving two input images at a first resolution; downsampling the two input images to a second resolution lower than the first resolution; generating, with a first neural network, a low-resolution optical flow at a third resolution for the two downsampled input images, the third resolution lower than the second resolution; downscale shuffling the two input images to generate downshuffled images with a spatial resolution equal to the third resolution; generating, with a second neural network, an optical flow at the first resolution based on the low-resolution optical flow and the downshuffled images; and outputting the optical flow.
7. The method of claim 6, further comprising inputting the low-resolution optical flow and the downshuffled images to the second neural network, and extracting, with the second neural network, features corresponding to shallow correlations between the low-resolution optical flow and the downshuffled images.
8. The method of claim 7, further comprising extracting, with a plurality of residual dense blocks of the second neural network, additional features from the extracted features, and densely fusing the additional extracted features with extracted features.
9. The method of claim 8, wherein generating the optical flow at the first resolution comprises iteratively upsampling the densely fused features to obtain the optical flow at the first resolution.
10. The method of claim 9, further comprising iteratively upsampling the densely fused features with a step size of two to obtain the optical flow at the first resolution.
11. The method of claim 6, further comprising pre-training the first neural network with input images at the first resolution.
12. A system comprising: a video source configured to acquire video comprising a sequence of image frames; and a computing device communicatively coupled to the video source and configured with instructions stored in non-transitory memory that when executed cause the computing device to: estimate, with a first neural network, an optical flow between two consecutive image frames in the sequence of image frames, wherein a resolution of the optical flow is lower than a resolution of the two consecutive image frames; and upsample, with a second neural network, the optical flow to the resolution of the two image frames, wherein the computing device is further configured with instructions in the non-transitory memory that when executed cause the computing device to acquire the two image frames at the resolution of the two image frames, downsample the two image frames to a lower resolution, and input the two downsampled image frames to the first neural network to estimate the optical flow, and wherein the resolution of the optical flow estimated by the first neural network is lower than the lower resolution of the two downsampled image frames, and wherein the computing device is further configured with instructions in the non-transitory memory that when executed cause the computing device to perform downscale shuffling of the two image frames to obtain downshuffled image frames with a spatial resolution equal to the resolution of the optical flow estimated by the first neural network.
13. The system of claim 12, wherein the computing device is further configured with instructions in the non-transitory memory that when executed cause the computing device to input the optical flow estimated by the first neural network and the downshuffled image frames to the second neural network to upsample the optical flow to the resolution of the two image frames.
14. The system of claim 13, wherein the computing device is further configured with instructions in the non-transitory memory that when executed cause the computing device to correlate, with the second neural network, features of the optical flow estimated by the first neural network with high-frequency information of the downshuffled image frames.
15. The system of claim 14, wherein the computing device is further configured with instructions in the non-transitory memory that when executed cause the computing device to learn, with a plurality of residual dense blocks of the second neural network, local and global features of the correlated features, and fuse, with the second neural network, the learned local and global features into fused features.
16. The system of claim 15, wherein the computing device is further configured with instructions in the non-transitory memory that when executed cause the computing device to upsample the optical flow by iteratively upsampling the fused features to the resolution of the two image frames.
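Two operations recited in the claims lend themselves to a short illustration: the downscale shuffling of claims 1, 6 and 12, and the iterative upsampling with a step size of two of claim 10. The sketch below assumes that "downscale shuffling" corresponds to the rearrangement commonly known as space-to-depth (or pixel unshuffle), which moves each r×r spatial block into the channel dimension so that no pixel values are discarded. The function names, the channel ordering, and the nested-list image layout are hypothetical choices made for illustration, not details taken from the patent.

```python
def downshuffle(image, r):
    """Space-to-depth on an H x W x C nested list (assumed interpretation).

    Returns an (H/r) x (W/r) x (C*r*r) nested list. Every input value is
    kept; spatial detail is moved into channels instead of being averaged.
    """
    h, w, c = len(image), len(image[0]), len(image[0][0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    out = []
    for y in range(h // r):
        row = []
        for x in range(w // r):
            pixel = []
            # Channel order (channel, then block row, then block column)
            # is an arbitrary illustrative choice.
            for ch in range(c):
                for dy in range(r):
                    for dx in range(r):
                        pixel.append(image[y * r + dy][x * r + dx][ch])
            row.append(pixel)
        out.append(row)
    return out


def upsample_schedule(low, full, step=2):
    """List the (height, width) sizes visited when scaling by `step`."""
    if step < 2:
        raise ValueError("step must be at least 2")
    h, w = low
    sizes = [(h, w)]
    while (h, w) != tuple(full):
        h, w = h * step, w * step
        if h > full[0] or w > full[1]:
            raise ValueError("full size is not reachable from low size")
        sizes.append((h, w))
    return sizes


if __name__ == "__main__":
    # 4x4 single-channel image holding the values 0..15.
    img = [[[y * 4 + x] for x in range(4)] for y in range(4)]
    small = downshuffle(img, 2)
    print(small[0][0])                          # [0, 1, 4, 5]
    print(upsample_schedule((2, 2), (8, 8)))    # [(2, 2), (4, 4), (8, 8)]
```

Under this reading, the downshuffled frames share the spatial size of the low-resolution flow while still carrying the full-resolution pixel values, which is consistent with the claims' reference to "high-frequency information of the downshuffled image frames".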
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title
in video content (extracting overlay text G06V20/62; video retrieval G06F16/70; processing of video elementary streams in video servers H04N21/234; processing of video elementary streams in video clients H04N21/44) · CPC title
Motion-based segmentation · CPC title
using gradient-based methods · CPC title