What technology area does this patent fall under?

Primary CPC classification H04N7/0127. Mapped technology areas include Electricity.

When was this patent published?

Publication date: Oct 11, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Frame interpolation via adaptive convolution and adaptive separable convolution

US11468318B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11468318-B2
Application number	US-201816495029-A
Country	US
Kind code	B2
Filing date	Mar 16, 2018
Priority date	Mar 17, 2017
Publication date	Oct 11, 2022
Grant date	Oct 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer-readable media for context-aware synthesis for video frame interpolation are provided. A convolutional neural network (ConvNet) may, given two input video or image frames, interpolate a frame temporally in the middle of the two input frames by combining motion estimation and pixel synthesis into a single step and formulating pixel interpolation as a local convolution over patches in the input images. The ConvNet may estimate a convolution kernel based on a first receptive field patch of a first input image frame and a second receptive field patch of a second input image frame. The ConvNet may then convolve the convolutional kernel over a first pixel patch of the first input image frame and a second pixel patch of the second input image frame to obtain color data of an output pixel of the interpolation frame. Other embodiments may be described and/or claimed.
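The final step described in the abstract, applying estimated kernels to two pixel patches to obtain one output value, can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `interpolate_pixel`, the 3x3 patch size, the single color channel, and the hand-written kernel values are all assumptions made for illustration. In the patent the kernels are estimated by the ConvNet from the receptive field patches.

```python
def interpolate_pixel(patch1, patch2, kernel1, kernel2):
    """Return the weighted sum of two pixel patches under their kernels.

    All four arguments are equally sized 2D lists of floats
    (one color channel). Hypothetical helper, for illustration only.
    """
    total = 0.0
    for p_row1, k_row1, p_row2, k_row2 in zip(patch1, kernel1, patch2, kernel2):
        for p1, k1, p2, k2 in zip(p_row1, k_row1, p_row2, k_row2):
            total += k1 * p1 + k2 * p2
    return total


# 3x3 pixel patches centered on the output pixel's coordinate in each frame.
patch1 = [[10.0, 20.0, 30.0],
          [40.0, 50.0, 60.0],
          [70.0, 80.0, 90.0]]
patch2 = [[15.0, 25.0, 35.0],
          [45.0, 55.0, 65.0],
          [75.0, 85.0, 95.0]]

# Hand-written stand-ins for the kernels a trained network would estimate.
kernel1 = [[0.0, 0.0, 0.0],
           [0.5, 0.0, 0.0],   # all weight on the left neighbor in frame 1
           [0.0, 0.0, 0.0]]
kernel2 = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.5],   # all weight on the right neighbor in frame 2
           [0.0, 0.0, 0.0]]

pixel = interpolate_pixel(patch1, patch2, kernel1, kernel2)
print(pixel)  # 0.5 * 40.0 + 0.5 * 65.0
```

Because the kernels differ for every output pixel, a full frame would repeat this weighted sum once per pixel and per color channel, each time with kernels estimated for that location.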

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer system comprising: processor circuitry communicatively coupled with memory circuitry, the memory circuitry to store program code of a convolutional neural network (ConvNet) and the processor circuitry is to operate the ConvNet to: obtain, as an input, a first image frame and a second image frame; estimate a pair of spatially-adaptive convolutional kernels to generate an individual output pixel based on a first receptive field patch of the first image frame and a second receptive field patch of the second image frame, wherein the estimation of the pair of spatially-adaptive convolutional kernels includes generation of a pair of kernel matrices, the pair of kernel matrices including a first kernel matrix for a first pixel patch of the first image frame and a second kernel matrix for a second pixel patch of the second image frame; convolve the pair of spatially-adaptive convolutional kernels over the first pixel patch of the first image frame and the second pixel patch of the second image frame to obtain a color of the individual output pixel; and generate and output an interpolation frame with the individual output pixel having the obtained color.

2. The computer system of claim 1, wherein the processor circuitry is to operate the ConvNet to: produce the output pixel in the interpolation frame co-centered at same locations as the first receptive field patch in the first input image and the second receptive field patch in the second input image.

3. The computer system of claim 2, wherein the first receptive field patch is centered around a pixel coordinate of the individual output pixel in the first image frame, and the second receptive field patch is centered around the pixel coordinate of the individual output pixel in the second image frame, and wherein the first pixel patch is centered within the first receptive field patch and the second pixel patch is centered within the second receptive field patch.

4. The computer system of claim 1, wherein the ConvNet comprises: an input layer comprising raw pixel data of a plurality of input image frames, wherein the first image frame and the second image frame are among the plurality of input image frames; a plurality of convolutional layers comprising a corresponding one of a plurality of estimated kernels; a plurality of down-convolutional layers instead of one or more max-pooling layers, wherein individual down-convolutional layers of the plurality of down-convolutional layers are disposed between two convolutional layers of the plurality of convolutional layers; and an output layer comprising a feature map, wherein the feature map is a data structure that is representative of output pixels and corresponding obtained colors of the output pixels.

5. The computer system of claim 1, wherein the ConvNet comprises: a contracting component comprising a first plurality of convolution layers and a plurality of pooling layers, wherein one or more convolution layers of the first plurality of convolution layers are grouped with a corresponding one of the plurality of pooling layers; an expanding component comprises a second plurality of convolution layers and a plurality of upsampling layers, wherein one or more convolution layers of the second plurality of convolution layers are grouped with a corresponding one of the plurality of upsampling layers; and a plurality of subnetworks, wherein each subnetwork of the plurality of subnetworks comprises a set of convolution layers and an upsampling layer.

6. The computer system of claim 5, wherein the processor circuitry is to operate the ConvNet to: operate each subnetwork to estimate a corresponding one dimensional kernel for each pixel in the interpolation frame, wherein each of the corresponding one dimensional kernels is part of a pair of one dimensional kernels, and each pair of one dimensional kernels is used to compute a two dimensional kernel.

7. The computer system of claim 5, wherein the processor circuitry is to operate the ConvNet to: operate the contracting component to extract features from the first and second image frames; and operate the expanding component to perform dense predictions on the extracted features.

8. The computer system of claim 5, wherein the processor circuitry is to: operate each of the plurality of upsampling layers to perform a corresponding transposed convolution operation, a sub-pixel convolution operation, a nearest-neighbor operation, or a bilinear interpolation operation; and operate each of the plurality of pooling layers to perform a downsampling operation.

9. The computer system of claim 1, wherein: each of the first kernel matrix and the second kernel matrix include a set of non-zero matrix values, locations of the non-zero matrix values indicate a motion, and the non-zero values are interpolation coefficients to combine pixel colors of the first and second pixel patches to generate the interpolation frame.

10. One or more non-transitory computer-readable media (NTCRM) including instructions of a convolutional neural network (ConvNet) wherein execution of the instructions by one or more processors is to cause a computer system to: obtain, as an input, a first image frame and a second image frame; estimate a spatially-adaptive convolutional kernel based on a first receptive field patch of the first image frame and a second receptive field patch of the second image frame, wherein, to estimate of the pair of spatially-adaptive convolutional kernels, execution of the instructions is to cause the computer system to generate a pair of kernel matrices, the pair of kernel matrices including a first kernel matrix for a first pixel patch of the first image frame and a second kernel matrix for a second pixel patch of the second image frame; convolve the pair of spatially-adaptive convolutional kernels over the first pixel patch of the first image frame and the second pixel patch of the second image frame to obtain a color of an output pixel for an interpolation frame; and generate and output the interpolation frame with the output pixel having the obtained color.

11. The one or more NTCRM of claim 10, wherein execution of the instructions is to cause the computer system to: output of the output pixel in the interpolation frame co-centered at a same location as the first receptive field patch and the second receptive field patch in the first input image and the second input image, respectively.

12. The one or more NTCRM of claim 11, wherein the first receptive field patch and the second receptive field patch are centered in the input image frame, and wherein the first pixel patch is centered within the first receptive field patch and the second pixel patch is centered within the second receptive field patch.

13. The one or more NTCRM of claim 10, wherein the ConvNet comprises: an input layer comprising raw pixel data of a plurality of input image frames, wherein the first image frame and the second image frame are among the plurality of input image frames; a plurality of layers comprising a corresponding one of a plurality of convolutional layers, pooling layers, and/or Batch Normalization layers; a plurality of down-convolutional layers instead of one or more max-pooling layers, wherein the down-convolutional layers are disposed between some convolutional layers of the plurality of convolutional layers; and an output layer comprising a feature map comprising kernels that are used to produce the color of the output pixel.

14. The one or more NTCRM of claim 10, wherein the ConvNet comprises: a contracting component comprising a first p
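Claim 6 states that each pair of one dimensional kernels is used to compute a two dimensional kernel, but the preview text does not spell out the operation. The sketch below assumes the usual separable construction, an outer product of a vertical and a horizontal 1D kernel. The function name `outer_product` and the kernel values are hypothetical and chosen only for illustration.

```python
def outer_product(vertical, horizontal):
    """Build a 2D kernel whose entry (i, j) is vertical[i] * horizontal[j]."""
    return [[v * h for h in horizontal] for v in vertical]


# Hand-written stand-ins for one pair of estimated 1D kernels.
vertical = [0.0, 1.0, 0.0]       # weights across rows
horizontal = [0.25, 0.5, 0.25]   # weights across columns

kernel_2d = outer_product(vertical, horizontal)
for row in kernel_2d:
    print(row)

# An n x n kernel has n * n entries; a pair of 1D kernels has only 2 * n.
n = len(vertical)
print(n * n, 2 * n)
```

Under this assumption, each subnetwork only has to predict 2n values per kernel instead of n * n, which is the usual motivation for a separable formulation.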

Assignees

Univ Portland State

Inventors

Classifications

G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
H04N19/587
involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence · CPC title
G06T3/4046
using neural networks · CPC title

Patent family

Related publications grouped by family.

View patent family 63522622

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11468318B2 cover?

Systems, methods, and computer-readable media for context-aware synthesis for video frame interpolation are provided. A convolutional neural network (ConvNet) may, given two input video or image frames, interpolate a frame temporally in the middle of the two input frames by combining motion estimation and pixel synthesis into a single step and formulating pixel interpolation as a local convolu…

Who is the assignee on this patent?

Univ Portland State

Related patents

Deep convolutional neural networks for crack detection from image data

Information processing apparatus, information processing method, and program

Convolution operation apparatus

Three-dimensional (3d) convolution with 3d batch normalization
