Depth-guided video inpainting for autonomous driving

US11282164B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11282164-B2
Application number | US-202016770904-A
Country | US
Kind code | B2
Filing date | May 26, 2020
Priority date | May 26, 2020
Publication date | Mar 22, 2022
Grant date | Mar 22, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting region. For each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, the method further maps the target pixel within the target inpainting region of the first image frame to a candidate pixel in a second image frame. The method further determines a candidate color to fill the target pixel. The method further performs Poisson image editing on the first image frame to achieve color consistency at a boundary and between inside and outside of the target inpainting region of the first image frame. For each pixel in the target inpainting region of the first image frame, the method further traces the pixel into neighboring frames and replaces an original color of the pixel with an average of colors sampled from the neighboring frames.
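
The depth-guided mapping step in the abstract can be illustrated with a minimal Python sketch. It assumes a pinhole camera with one intrinsic matrix `K` shared by both frames and a known rigid transform `T_first_to_second` between the two camera poses; neither is specified on this page. The function names are hypothetical, and the color lookup uses plain nearest-neighbor sampling as a simplification, not the neighbor-warping and energy-function selection recited in the claims.

```python
import numpy as np

def map_pixel_to_frame(u, v, depth, K, T_first_to_second):
    """Map pixel (u, v) of the first frame, with known depth, into the second frame.

    K is an assumed 3x3 pinhole intrinsic matrix shared by both frames.
    T_first_to_second is an assumed 4x4 rigid transform taking points from
    first-camera coordinates to second-camera coordinates.
    Returns (u2, v2) as floats, or None if the point is behind the second camera.
    """
    # Back-project the pixel to a 3D point in first-camera coordinates.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    point_first = ray * depth
    # Move the point into second-camera coordinates.
    R, t = T_first_to_second[:3, :3], T_first_to_second[:3, 3]
    point_second = R @ point_first + t
    if point_second[2] <= 0:
        return None
    # Project onto the second image plane.
    proj = K @ point_second
    return proj[0] / proj[2], proj[1] / proj[2]

def fill_target_region(first_img, second_img, depth_map, mask, K, T_first_to_second):
    """Fill masked pixels of first_img with colors looked up in second_img.

    Simplification: nearest-neighbor sampling of a single candidate pixel.
    Pixels that map outside the second frame are left unchanged.
    """
    out = first_img.copy()
    h, w = second_img.shape[:2]
    for v, u in zip(*np.nonzero(mask)):
        hit = map_pixel_to_frame(u, v, depth_map[v, u], K, T_first_to_second)
        if hit is None:
            continue
        u2, v2 = int(round(hit[0])), int(round(hit[1]))
        if 0 <= u2 < w and 0 <= v2 < h:
            out[v, u] = second_img[v2, u2]
    return out
```

In this sketch `depth_map` plays the role of the depth map obtained by projecting the object-free 3D map onto the first image frame, and `mask` marks the target inpainting region.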

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method of video inpainting, comprising: receiving a plurality of sensor data sets comprising depth frames and image frames; for each depth frame, removing one or more objects from the depth frame thereby producing a plurality of resulting depth frames without the one or more objects; stitching the plurality of resulting depth frames into a three-dimensional (3D) map; refining a camera pose of a first image frame having a target inpainting region; projecting the 3D map onto the first image frame to generate a corresponding depth map; and for each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, mapping the target pixel within the target inpainting region of the first image frame to a candidate pixel in a second image frame included in the image frames, and determining a candidate color to fill the target pixel; wherein determining the candidate color to fill the target pixel comprises warping neighboring pixels around the candidate pixel into the second image frame by depth values of the neighboring pixels to sample expected colors to fill the target pixel; wherein warping the neighboring pixels around the candidate pixel into the second image frame comprises: computing an energy function based on a set of pixels in the target inpainting region and a set of labels corresponding to indices of candidate colors in a color space, and incorporating boundary smoothness constraint into a data cost based on respective expected colors of the neighboring pixels.

2. The method of claim 1, further comprising performing Poisson image editing on the first image frame to achieve color consistency between inside and outside of the target inpainting region of the first image frame.

3. The method of claim 1, wherein each depth frame includes point clouds representing one or more objects and a background of a scene.

4. The method of claim 2, further comprising for each pixel in the target inpainting region of the first image frame, tracing the pixel into neighboring frames and replacing an original color of the pixel with an average of colors sampled from the neighboring frames.

5. The method of claim 1, wherein the second image frame is temporally close to the first image frame.

6. The method of claim 1, wherein the second image frame is a previous frame from the first image frame or a subsequent frame from the first image frame.

7. The method of claim 2, wherein performing Poisson image editing on the first image frame comprises calculating a minimization function as follows:

$$\min_{f} \iint_{\Omega} \lVert \Delta f - v \rVert \quad \text{with} \quad f|_{\partial\Omega} = f^{*}|_{\partial\Omega},$$

wherein $\Omega$ is the target inpainting region with boundary $\partial\Omega$, $f^{*}$ is a color function of the first image frame, $f$ is a color function of the target inpainting region within the first image frame, $\Delta\cdot = [\partial\cdot/\partial x,\ \partial\cdot/\partial y]$ is a gradient operator, and $v$ is a desired color gradient defined over $\Omega$.

8. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations, the operations comprising: receiving a plurality of sensor data sets comprising depth frames and image frames; for each depth frame, removing one or more objects from the depth frame thereby producing a plurality of resulting depth frames without the one or more objects; stitching the plurality of resulting depth frames into a three-dimensional (3D) map; refining a camera pose of a first image frame having a target inpainting region; projecting the 3D map onto the first image frame to generate a corresponding depth map; and for each target pixel within the target inpainting region of the first image frame, based on the corresponding depth map, mapping the target pixel within the target inpainting region of the first image frame to a candidate pixel in a second image frame included in the image frames, and determining a candidate color to fill the target pixel; wherein determining the candidate color to fill the target pixel comprises warping neighboring pixels around the candidate pixel into the second image frame by depth values of the neighboring pixels to sample expected colors to fill the target pixel; wherein warping the neighboring pixels around the candidate pixel into the second image frame comprises: computing an energy function based on a set of pixels in the target inpainting region and a set of labels corresponding to indices of candidate colors in a color space, and incorporating boundary smoothness constraint into a data cost based on respective expected colors of the neighboring pixels.

9. The non-transitory machine-readable medium of claim 8, wherein the operations further comprise performing Poisson image editing on the first image frame to achieve color consistency between inside and outside of the target inpainting region of the first image frame.

10. The non-transitory machine-readable medium of claim 8, wherein each depth frame includes point clouds representing one or more objects and a background of a scene.

11. The non-transitory machine-readable medium of claim 9, wherein the operations further comprise for each pixel in the target inpainting region of the first image frame, tracing the pixel into neighboring frames and replacing an original color of the pixel with an average of colors sampled from the neighboring frames.

12. The non-transitory machine-readable medium of claim 8, wherein the second image frame is temporally close to the first image frame.

13. The non-transitory machine-readable medium of claim 8, wherein the second image frame is a previous frame fr
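
The Poisson image editing step named in claims 2, 7, and 9 can be sketched for a single color channel as follows. This is an illustrative assumption rather than the patent's implementation: it solves the discrete Poisson equation that arises from the usual squared-norm form of the minimization, uses a 4-neighbor stencil with Gauss-Seidel sweeps, and takes the desired gradient v from a hypothetical `guide` image (for example, the colors filled in from another frame). The function name and parameters are made up for illustration.

```python
import numpy as np

def poisson_blend(target, guide, mask, iters=500):
    """Blend one channel so the masked region follows the gradients of `guide`
    while matching `target` on the region boundary.

    target: 2D array; supplies the boundary values outside the mask.
    guide:  2D array; its pixel differences act as the desired gradient field.
    mask:   2D bool array, True inside the region; assumed not to touch the
            image border so every masked pixel has four neighbors.
    """
    g = guide.astype(float)
    f = target.astype(float).copy()
    f[mask] = g[mask]  # initial guess inside the region
    ys, xs = np.nonzero(mask)
    for _ in range(iters):
        # Gauss-Seidel sweep over the discrete Poisson equation:
        # 4*f[p] - sum(f[q]) = sum(g[p] - g[q]) over the 4 neighbors q of p.
        for y, x in zip(ys, xs):
            acc = 0.0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                acc += f[ny, nx] + (g[y, x] - g[ny, nx])
            f[y, x] = acc / 4.0
    return f
```

A color image would be handled by calling the function once per channel. For large regions a sparse linear solver would be a more practical choice than pixel-by-pixel sweeps; the loop form is used here only because it mirrors the equation directly.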

Assignees

Inventors

Classifications

  • Image mosaicing, e.g. composing plane images from plane sub-images

  • Traffic on road, railway or crossing

  • from multiple images

  • Artificial neural networks [ANN]

  • Range image; Depth image; 3D point clouds

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11282164B2 cover?
Systems and methods of video inpainting for autonomous driving are disclosed. For example, the method stitches a multiplicity of depth frames into a 3D map, where one or more objects in the depth frames have previously been removed. The method further projects the 3D map onto a first image frame to generate a corresponding depth map, where the first image frame includes a target inpainting regi…
Who is the assignee on this patent?
Baidu USA LLC; Baidu Com Times Tech Beijing Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T3/0093. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 22, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).