Method and apparatus for depth estimation of monocular image, and storage medium

US11443445B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11443445-B2
Application number  US-202016830363-A
Country             US
Kind code           B2
Filing date         Mar 26, 2020
Priority date       Jul 27, 2018
Publication date    Sep 13, 2022
Grant date          Sep 13, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for depth estimation of a monocular image, and a storage medium are provided. The method includes: obtaining, through a depth estimation neural network, a global feature of a monocular image according to absolute features of preset regions and relative features among the preset regions in the monocular image; and obtaining a predicted depth map of the monocular image according to the global feature, and the absolute features of preset regions and relative features among the preset regions in the monocular image.
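
The data flow in the abstract can be illustrated with a small NumPy sketch. This is not the patented network: the region grid size, the channel counts, the random weights, and the choice of a dot product as the operation relating regions are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def relative_features(absolute: np.ndarray) -> np.ndarray:
    """Pairwise dot products between region feature vectors.

    absolute has shape (num_regions, channels); the result has shape
    (num_regions, num_regions). The dot product is an assumed choice.
    """
    return absolute @ absolute.T

def global_feature(absolute, relative, weights, bias):
    """Combine absolute and relative features with one fully connected layer."""
    combined = np.concatenate([absolute.ravel(), relative.ravel()])
    return np.maximum(weights @ combined + bias, 0.0)  # ReLU activation

def predict_depth(absolute, relative, global_vec, head_weights):
    """Predict one depth value per region from all three feature types."""
    num_regions = absolute.shape[0]
    tiled = np.tile(global_vec, (num_regions, 1))  # same global vector for every region
    per_region = np.concatenate([absolute, relative, tiled], axis=1)
    return per_region @ head_weights  # shape (num_regions,)

# Hypothetical sizes: a 4 x 6 grid of preset regions, 8 channels, 16-dim global feature.
rng = np.random.default_rng(0)
H, W, C, G = 4, 6, 8, 16
absolute = rng.standard_normal((H * W, C))  # stand-in for extracted region features
relative = relative_features(absolute)
fc_w = rng.standard_normal((G, absolute.size + relative.size)) * 0.01
fc_b = np.zeros(G)
g = global_feature(absolute, relative, fc_w, fc_b)
head_w = rng.standard_normal(C + H * W + G) * 0.01
depth_map = predict_depth(absolute, relative, g, head_w).reshape(H, W)
```

With untrained random weights the values in `depth_map` are meaningless; the sketch only shows how the global feature is derived from the absolute and relative features and then used together with them to produce a coarse depth map.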

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method for depth estimation of a monocular image, comprising: obtaining, through a depth estimation neural network, a global feature of a monocular image according to absolute features of preset regions and relative features among the preset regions in the monocular image; and obtaining a predicted depth map of the monocular image according to the global feature, the absolute features of preset regions and the relative features among the preset regions in the monocular image.

2. The method according to claim 1, wherein before the obtaining a global feature of a monocular image according to absolute features of preset regions and relative features among the preset regions in the monocular image, the method further comprises: performing, through a first neural network, feature extraction on the monocular image to obtain features of preset regions in the monocular image, and taking the features of the preset regions as the absolute features of the preset regions in the monocular image; and obtaining the relative features among the preset regions in the monocular image according to the absolute features of the preset regions in the monocular image.

3. The method according to claim 2, wherein the obtaining relative features among the preset regions in the monocular image according to the absolute features of the preset regions in the monocular image comprises: performing, through an association layer, a vector operation on the absolute features of the preset regions in the monocular image to obtain the relative features among the preset regions in the monocular image.

4. The method according to claim 2, wherein before performing, through a first neural network, feature extraction on the monocular image, the method further comprises: performing downsampling on the monocular image to obtain a monocular image having a preset dimension, wherein a dimension of the monocular image is a multiple of the preset dimension.

5. The method according to claim 1, wherein the obtaining a global feature of a monocular image according to absolute features of preset regions and relative features among the preset regions in the monocular image comprises: obtaining, through a full connection layer, the global feature of the monocular image by combining the absolute features of the preset regions and the relative features among the preset regions in the monocular image.

6. The method according to claim 1, wherein the obtaining a predicted depth map of the monocular image according to the global feature and the absolute features of preset regions and relative features among the preset regions in the monocular image comprises: performing, through a depth estimator, depth estimation according to the global feature, the absolute features of the preset regions and the relative features among the preset regions in the monocular image to obtain the predicted depth map of the monocular image.

7. The method according to claim 1, wherein after the obtaining a predicted depth map of the monocular image according to the global feature, the absolute features of preset regions and relative features among the preset regions in the monocular image, the method further comprises: performing optimization on the predicted depth map according to a longitudinal variation law of depth information of the monocular image to obtain a target depth map of the monocular image.

8. The method according to claim 7, wherein the performing optimization on the predicted depth map according to a longitudinal variation law of depth information of the monocular image to obtain a target depth map of the monocular image comprises: performing residual estimation on the predicted depth map according to the longitudinal variation law of depth information of the monocular image to obtain a residual plot of the predicted depth map; and performing optimization on the predicted depth map according to the residual plot to obtain the target depth map of the monocular image.

9. The method according to claim 8, wherein the performing residual estimation on the predicted depth map according to a longitudinal variation law of depth information of the monocular image to obtain a residual plot of the predicted depth map comprises: performing, through a residual estimation network, residual estimation on the predicted depth map according to the longitudinal variation law of depth information of the monocular image to obtain a residual plot of the predicted depth map; and the performing optimization on the predicted depth map according to the residual plot to obtain a target depth map of the monocular image comprises performing a pixel-by-pixel superposition operation on the residual plot and the predicted depth map to obtain the target depth map of the monocular image.

10. The method according to claim 7, wherein before the performing optimization on the predicted depth map according to a longitudinal variation law of depth information of the monocular image to obtain a target depth map of the monocular image, the method further comprises: obtaining the longitudinal variation law of depth information of the monocular image according to the predicted depth map.

11. The method according to claim 10, wherein the obtaining a longitudinal variation law of depth information of the monocular image according to the predicted depth map comprises: performing, through a longitudinal pooling layer, processing on the predicted depth map to obtain the longitudinal variation law of depth information of the monocular image.

12. The method according to claim 7, wherein the performing optimization on the predicted depth map according to a longitudinal variation law of depth information of the monocular image comprises: performing a preset number of upsamplings on the predicted depth map, obtaining the longitudinal variation law of depth information according to a predicted depth map, obtained by each upsampling, having a dimension sequentially increased by a multiple, and performing optimization on the predicted depth map, obtained by each upsampling, having a dimension sequentially increased by a multiple according to the obtained longitudinal variation law of the depth information to obtain an optimized target depth map; wherein the optimized target depth map obtained by each of the upsamplings other than a last upsampling is taken as a predicted depth map of a next upsampling, the optimized target depth map obtained by the last upsampling is taken as the target depth map of the monocular image, and the target depth map has the same dimension as the monocular image.

13. The method according to claim 1, wherein the depth estimation neural network comprises an association layer, a full connection layer, and a depth estimator, and is obtained by training the depth estimation neural network by using a sparse depth map and a dense depth map obtained by stereo matching for binocular images as marking data.

14. An apparatus for the depth estimation of a monocular image, comprising: one or more processors; and a memory, configured to store instructions executable by the one or more processors, wherein the one or more processors are configured to: obtain, through a depth estimation neural network, a global feature of a monocular image according to absolute features of preset regions and relative features among the preset regions in the monocular image, and obtain a predicted depth map of the monocular image according to the global feature and the absolute features of preset regions and the relative features among the preset regions in the monocular image.
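
The refinement stage described in claims 7 to 12 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: longitudinal pooling is modelled as average pooling with a k x 1 (column-shaped) window, the trained residual estimation network is replaced by a hand-written stand-in, upsampling is nearest-neighbour by a factor of 2, and all function names and parameters are hypothetical.

```python
import numpy as np

def longitudinal_pool(depth: np.ndarray, k: int = 3) -> np.ndarray:
    """Average each pixel with its vertical neighbours (k x 1 window, k odd)."""
    pad = k // 2
    padded = np.pad(depth, ((pad, pad), (0, 0)), mode="edge")
    rows = depth.shape[0]
    # Stack the k vertically shifted copies and average them.
    return np.mean([padded[i:i + rows] for i in range(k)], axis=0)

def estimate_residual(depth, pooled, alpha=0.5):
    """Stand-in for a trained residual estimation network (assumption)."""
    return alpha * (pooled - depth)

def refine(depth, k=3):
    """One optimization step: pool, estimate residual, superpose."""
    pooled = longitudinal_pool(depth, k)
    residual = estimate_residual(depth, pooled)
    return depth + residual  # pixel-by-pixel superposition

def upsample2x(depth):
    """Nearest-neighbour upsampling by a factor of 2 in each dimension."""
    return depth.repeat(2, axis=0).repeat(2, axis=1)

def multiscale_refine(predicted, num_upsamplings=3):
    """Upsample and refine repeatedly; each output feeds the next step."""
    current = predicted
    for _ in range(num_upsamplings):
        current = refine(upsample2x(current))
    return current

# Hypothetical coarse prediction: depth decreases from the top row to the bottom row.
predicted = np.linspace(10.0, 1.0, 8)[:, None] * np.ones((8, 16))
target = multiscale_refine(predicted, num_upsamplings=3)
```

Here an 8 x 16 coarse map becomes a 64 x 128 target map after three upsamplings, which would match an input image of that size in the sense of claim 12. A real system would learn the residual estimator rather than use the fixed blend shown here.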

Assignees

Shenzhen Sensetime Technology Co Ltd

Inventors

Classifications

  • Combinations of networks

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning

  • Convolutional networks [CNN, ConvNet]

  • Auto-encoder networks; Encoder-decoder networks

  • Supervised learning

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11443445B2 cover?
A method and apparatus for depth estimation of a monocular image, and a storage medium are provided. The method includes: obtaining, through a depth estimation neural network, a global feature of a monocular image according to absolute features of preset regions and relative features among the preset regions in the monocular image; and obtaining a predicted depth map of the monocular image acco…
Who is the assignee on this patent?
Shenzhen Sensetime Technology Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/50. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 13, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).