Deep convolutional neural networks for crack detection from image data
US-2019147283-A1 · May 16, 2019 · US
US11468318B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11468318-B2 |
| Application number | US-201816495029-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 16, 2018 |
| Priority date | Mar 17, 2017 |
| Publication date | Oct 11, 2022 |
| Grant date | Oct 11, 2022 |
Official abstract text for this publication.
Systems, methods, and computer-readable media for context-aware synthesis for video frame interpolation are provided. A convolutional neural network (ConvNet) may, given two input video or image frames, interpolate a frame temporally in the middle of the two input frames by combining motion estimation and pixel synthesis into a single step and formulating pixel interpolation as a local convolution over patches in the input images. The ConvNet may estimate a convolution kernel based on a first receptive field patch of a first input image frame and a second receptive field patch of a second input image frame. The ConvNet may then convolve the convolutional kernel over a first pixel patch of the first input image frame and a second pixel patch of the second input image frame to obtain color data of an output pixel of the interpolation frame. Other embodiments may be described and/or claimed.
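The local convolution described in the abstract can be illustrated with a minimal sketch. It assumes the kernel pair for one output pixel is already available (no network is modeled here), and the function name `interpolate_pixel`, the 3x3 patch size, and the sample values are hypothetical choices for illustration only, not taken from the patent.

```python
import numpy as np

def interpolate_pixel(patch1, patch2, k1, k2):
    """Color of one output pixel from two co-located pixel patches.

    patch1, patch2: (n, n, 3) pixel patches from the first and second frame.
    k1, k2: (n, n) kernels for this pixel. In the patent these are estimated
            by the ConvNet; here they are simply passed in (assumption).
    """
    # Weighted sum over each patch, applied to every color channel at once.
    return (np.tensordot(k1, patch1, axes=([0, 1], [0, 1]))
            + np.tensordot(k2, patch2, axes=([0, 1], [0, 1])))

# Toy example: a red pixel sits left of center in frame 1
# and right of center in frame 2.
patch1 = np.zeros((3, 3, 3))
patch1[1, 0] = [1.0, 0.0, 0.0]
patch2 = np.zeros((3, 3, 3))
patch2[1, 2] = [1.0, 0.0, 0.0]

# The positions of the non-zero kernel entries follow the moving pixel,
# and their values (0.5 each) blend the two sampled colors.
k1 = np.zeros((3, 3))
k1[1, 0] = 0.5
k2 = np.zeros((3, 3))
k2[1, 2] = 0.5

color = interpolate_pixel(patch1, patch2, k1, k2)
```

In this toy case the output pixel at the patch center takes the red color, which is one way to read the statement that motion estimation and pixel synthesis are combined into a single step.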
Opening claim text (preview).
The invention claimed is:

1. A computer system comprising: processor circuitry communicatively coupled with memory circuitry, the memory circuitry to store program code of a convolutional neural network (ConvNet) and the processor circuitry is to operate the ConvNet to: obtain, as an input, a first image frame and a second image frame; estimate a pair of spatially-adaptive convolutional kernels to generate an individual output pixel based on a first receptive field patch of the first image frame and a second receptive field patch of the second image frame, wherein the estimation of the pair of spatially-adaptive convolutional kernels includes generation of a pair of kernel matrices, the pair of kernel matrices including a first kernel matrix for a first pixel patch of the first image frame and a second kernel matrix for a second pixel patch of the second image frame; convolve the pair of spatially-adaptive convolutional kernels over the first pixel patch of the first image frame and the second pixel patch of the second image frame to obtain a color of the individual output pixel; and generate and output an interpolation frame with the individual output pixel having the obtained color.

2. The computer system of claim 1, wherein the processor circuitry is to operate the ConvNet to: produce the output pixel in the interpolation frame co-centered at same locations as the first receptive field patch in the first input image and the second receptive field patch in the second input image.

3. The computer system of claim 2, wherein the first receptive field patch is centered around a pixel coordinate of the individual output pixel in the first image frame, and the second receptive field patch is centered around the pixel coordinate of the individual output pixel in the second image frame, and wherein the first pixel patch is centered within the first receptive field patch and the second pixel patch is centered within the second receptive field patch.

4. The computer system of claim 1, wherein the ConvNet comprises: an input layer comprising raw pixel data of a plurality of input image frames, wherein the first image frame and the second image frame are among the plurality of input image frames; a plurality of convolutional layers comprising a corresponding one of a plurality of estimated kernels; a plurality of down-convolutional layers instead of one or more max-pooling layers, wherein individual down-convolutional layers of the plurality of down-convolutional layers are disposed between two convolutional layers of the plurality of convolutional layers; and an output layer comprising a feature map, wherein the feature map is a data structure that is representative of output pixels and corresponding obtained colors of the output pixels.

5. The computer system of claim 1, wherein the ConvNet comprises: a contracting component comprising a first plurality of convolution layers and a plurality of pooling layers, wherein one or more convolution layers of the first plurality of convolution layers are grouped with a corresponding one of the plurality of pooling layers; an expanding component comprises a second plurality of convolution layers and a plurality of upsampling layers, wherein one or more convolution layers of the second plurality of convolution layers are grouped with a corresponding one of the plurality of upsampling layers; and a plurality of subnetworks, wherein each subnetwork of the plurality of subnetworks comprises a set of convolution layers and an upsampling layer.

6. The computer system of claim 5, wherein the processor circuitry is to operate the ConvNet to: operate each subnetwork to estimate a corresponding one dimensional kernel for each pixel in the interpolation frame, wherein each of the corresponding one dimensional kernels is part of a pair of one dimensional kernels, and each pair of one dimensional kernels is used to compute a two dimensional kernel.

7. The computer system of claim 5, wherein the processor circuitry is to operate the ConvNet to: operate the contracting component to extract features from the first and second image frames; and operate the expanding component to perform dense predictions on the extracted features.

8. The computer system of claim 5, wherein the processor circuitry is to: operate each of the plurality of upsampling layers to perform a corresponding transposed convolution operation, a sub-pixel convolution operation, a nearest-neighbor operation, or a bilinear interpolation operation; and operate each of the plurality of pooling layers to perform a downsampling operation.

9. The computer system of claim 1, wherein: each of the first kernel matrix and the second kernel matrix include a set of non-zero matrix values, locations of the non-zero matrix values indicate a motion, and the non-zero values are interpolation coefficients to combine pixel colors of the first and second pixel patches to generate the interpolation frame.

10. One or more non-transitory computer-readable media (NTCRM) including instructions of a convolutional neural network (ConvNet) wherein execution of the instructions by one or more processors is to cause a computer system to: obtain, as an input, a first image frame and a second image frame; estimate a spatially-adaptive convolutional kernel based on a first receptive field patch of the first image frame and a second receptive field patch of the second image frame, wherein, to estimate of the pair of spatially-adaptive convolutional kernels, execution of the instructions is to cause the computer system to generate a pair of kernel matrices, the pair of kernel matrices including a first kernel matrix for a first pixel patch of the first image frame and a second kernel matrix for a second pixel patch of the second image frame; convolve the pair of spatially-adaptive convolutional kernels over the first pixel patch of the first image frame and the second pixel patch of the second image frame to obtain a color of an output pixel for an interpolation frame; and generate and output the interpolation frame with the output pixel having the obtained color.

11. The one or more NTCRM of claim 10, wherein execution of the instructions is to cause the computer system to: output of the output pixel in the interpolation frame co-centered at a same location as the first receptive field patch and the second receptive field patch in the first input image and the second input image, respectively.

12. The one or more NTCRM of claim 11, wherein the first receptive field patch and the second receptive field patch are centered in the input image frame, and wherein the first pixel patch is centered within the first receptive field patch and the second pixel patch is centered within the second receptive field patch.

13. The one or more NTCRM of claim 10, wherein the ConvNet comprises: an input layer comprising raw pixel data of a plurality of input image frames, wherein the first image frame and the second image frame are among the plurality of input image frames; a plurality of layers comprising a corresponding one of a plurality of convolutional layers, pooling layers, and/or Batch Normalization layers; a plurality of down-convolutional layers instead of one or more max-pooling layers, wherein the down-convolutional layers are disposed between some convolutional layers of the plurality of convolutional layers; and an output layer comprising a feature map comprising kernels that are used to produce the color of the output pixel.

14. The one or more NTCRM of claim 10, wherein the ConvNet comprises: a contracting component comprising a first p
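
Claim 6 states that each pair of one dimensional kernels is used to compute a two dimensional kernel, without spelling out how in the text shown here. The sketch below assumes the common separable construction, an outer product of a vertical and a horizontal 1D kernel. The function name `kernel_2d` and the sample values are hypothetical.

```python
import numpy as np

def kernel_2d(k_vertical, k_horizontal):
    """Build an (n, n) kernel from two length-n 1D kernels.

    Assumption: the 2D kernel is the outer product of the pair, so
    K[i, j] = k_vertical[i] * k_horizontal[j].
    """
    return np.outer(k_vertical, k_horizontal)

# Hypothetical 1D kernels for one output pixel.
kv = np.array([0.0, 1.0, 0.0])       # selects the middle row
kh = np.array([0.25, 0.5, 0.25])     # blends three neighbors along that row

K = kernel_2d(kv, kh)

# Values that must be estimated per pixel and per frame:
n = kv.size
params_separable = 2 * n   # two 1D kernels
params_full = n * n        # one full 2D kernel
```

Under this assumption, a pair of 1D kernels needs 2n estimated values per pixel instead of n squared, which is the usual motivation for a separable formulation.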
Auto-encoder networks; Encoder-decoder networks · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence · CPC title
using neural networks · CPC title