Self-supervised depth estimation method and system
US-2021183083-A1 · Jun 17, 2021 · US
US11798180B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11798180-B2 |
| Application number | US-202117186436-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 26, 2021 |
| Priority date | Feb 26, 2021 |
| Publication date | Oct 24, 2023 |
| Grant date | Oct 24, 2023 |
Official abstract text for this publication.
This disclosure describes one or more implementations of a depth prediction system that generates accurate depth images from single input digital images. In one or more implementations, the depth prediction system enforces different sets of loss functions across mix-data sources to generate a multi-branch architecture depth prediction model. For instance, in one or more implementations, the depth prediction model utilizes different data sources having different granularities of ground truth depth data to robustly train a depth prediction model. Further, given the different ground truth depth data granularities from the different data sources, the depth prediction model enforces different combinations of loss functions including an image-level normalized regression loss function and/or a pair-wise normal loss among other loss functions.
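The idea of applying different loss functions to different data sources can be illustrated with a short sketch. This is not the patented implementation. The trimmed mean and standard deviation normalization, the random pair sampling, the source names, and all function names below are assumptions made for illustration. The first loss normalizes each ground truth depth map into a common space before regression, in the spirit of an image-level normalized regression loss. The second is a plain pair-wise ranking loss that stands in for a ranking-based loss on data that only supports ordinal depth comparisons.

```python
import numpy as np


def normalize_depth(depth, trim=0.1):
    """Shift and scale one depth map into a common space.

    Assumption: a trimmed mean and standard deviation are used as the
    per-image statistics; the source does not specify the statistic.
    """
    values = np.sort(depth.ravel())
    k = int(values.size * trim)
    kept = values[k:values.size - k]  # drop the smallest and largest values
    return (depth - kept.mean()) / (kept.std() + 1e-8)


def ilnr_style_loss(pred, gt):
    """Mean absolute error against the per-image normalized ground truth."""
    return float(np.mean(np.abs(pred - normalize_depth(gt))))


def ranking_loss(pred, gt, num_pairs=256, seed=0):
    """Pair-wise ranking loss on randomly sampled pixel pairs.

    Assumption: uniform random sampling, a simplified stand-in for any
    structure-guided sampling scheme.
    """
    rng = np.random.default_rng(seed)
    p, g = pred.ravel(), gt.ravel()
    i = rng.integers(0, p.size, num_pairs)
    j = rng.integers(0, p.size, num_pairs)
    order = np.sign(g[i] - g[j])  # +1 or -1; 0 marks a tie
    keep = order != 0
    if not keep.any():
        return 0.0
    diff = p[i][keep] - p[j][keep]
    # log(1 + exp(-order * diff)), computed in a numerically stable way
    return float(np.mean(np.logaddexp(0.0, -order[keep] * diff)))


# Hypothetical source names mapped to the losses enforced for each source.
LOSSES_BY_SOURCE = {
    "calibrated_stereo": [ilnr_style_loss],
    "uncalibrated_stereo": [ranking_loss],
}


def batch_loss(pred, gt, source):
    """Sum the losses configured for the data source of this sample."""
    return sum(fn(pred, gt) for fn in LOSSES_BY_SOURCE[source])
```

Because the ground truth is normalized per image, rescaling or shifting it by a positive factor leaves the first loss essentially unchanged, which is what lets sources with different depth scales share one training objective.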
Opening claim text (preview).
What is claimed is:
1. A method comprising: generating, utilizing a depth prediction machine-learning model, a first set of predicted depth images from a first set of ground truth digital images having a first source and a second set of predicted depth images from a second set of ground truth digital images having a second source; determining a first measure of loss from the first set of predicted depth images utilizing a first loss function by utilizing an image-level normalized regression loss function or a pair-wise normal loss function; determining a second measure of loss from the second set of predicted depth images utilizing a second loss function that differs from the first loss function; and tuning the depth prediction machine-learning model utilizing the first measure of loss and the second measure of loss.
2. The method of claim 1, further comprising: generating the first set of predicted depth images by generating a set of depth maps via a first output path of the depth prediction machine-learning model; and generating the second set of predicted depth images by generating a set of inverse depth maps via a second output path of the depth prediction machine-learning model.
3. The method of claim 2, wherein the first set of ground truth digital images comprises a first set of digital images and a set of ground truth depth maps, and wherein the second set of ground truth digital images comprises a second set of digital images and a set of ground truth inverse depth maps; and further comprising: determining the first measure of loss by comparing the set of depth maps to the set of ground truth depth maps from the first set of ground truth digital images utilizing the first loss function; and determining the second measure of loss by comparing the set of inverse depth maps to the set of ground truth inverse depth maps from the second set of ground truth digital images utilizing the second loss function.
4. The method of claim 1, further comprising generating the first measure of loss utilizing the first loss function by utilizing an image-level normalized regression loss function that transforms the first set of ground truth digital images into a common depth space.
5. The method of claim 1, further comprising generating the second measure of loss utilizing a set of loss functions comprising the image-level normalized regression loss function, the pair-wise normal loss function, and a multi-scale gradient loss function.
6. The method of claim 1, wherein the first set of ground truth digital images having the first source comprises LiDAR depth data having scale and shift measurements; and further comprising determining a pair-wise normal loss from the first set of predicted depth images by utilizing a pair-wise normal loss function to compare depth data in the first set of predicted depth images to the LiDAR depth data from the first set of ground truth digital images of the first set of ground truth digital images.
7. The method of claim 1, wherein the second set of ground truth digital images having the second source comprises calibrated stereo-image-based depth data; and further comprising determining an image-level normalized regression loss from the second set of predicted depth images by utilizing an image-level normalized regression loss function to compare depth data in the second set of predicted depth images to the calibrated stereo-image-based depth data from the second set of ground truth digital images of the second set of ground truth digital images.
8. The method of claim 1, further comprising building the depth prediction machine-learning model by: generating, utilizing the depth prediction machine-learning model, a third set of predicted depth images from a third set of ground truth digital images having a third source, wherein the third set of ground truth digital images comprises uncalibrated stereo-image-based depth data generated at the third source; determining a third measure of loss from the third set of predicted depth images utilizing a third loss function that determines a structure-guided ranking loss by comparing the third set of predicted depth images to the third set of ground truth digital images; and tuning the depth prediction machine-learning model utilizing the third measure of loss.
9. The method of claim 1, further comprising: identifying a query digital image; and generating a depth map from the query digital image utilizing the depth prediction machine-learning model.
10. A system comprising: a memory component; and one or more processing devices coupled to the memory component, the one or more processing devices to perform operations comprising: generating, utilizing a depth prediction machine-learning model, a first set of predicted depth images from a first set of ground truth digital images having a first source and a second set of predicted depth images from a second set of ground truth digital images having a second source; determining a first measure of loss from the first set of predicted depth images utilizing a first loss function by utilizing an image-level normalized regression loss function or a pair-wise normal loss function; determining a second measure of loss from the second set of predicted depth images utilizing a second loss function that differs from the first loss function; and tuning the depth prediction machine-learning model utilizing the first measure of loss and the second measure of loss.
11. The system of claim 10, wherein the operations further comprise: generating the first set of predicted depth images by generating a set of depth maps via a first output path of the depth prediction machine-learning model; and generating the second set of predicted depth images by generating a set of inverse depth maps via a second output path of the depth prediction machine-learning model.
12. The system of claim 11, wherein the first set of ground truth digital images comprises a first set of digital images and a set of ground truth depth maps, and wherein the second set of ground truth digital images comprises a second set of digital images and a set of ground truth inverse depth maps; and further comprising: determining the first measure of loss by comparing the set of depth maps to the set of ground truth depth maps from the first set of ground truth digital images utilizing the first loss function; and determining the second measure of loss by comparing the set of inverse depth maps to the set of ground truth inverse depth maps from the second set of ground truth digital images utilizing the second loss function.
13. The system of claim 10, wherein the operations further comprise generating the second measure of loss utilizing a set of loss functions comprising the image-level normalized regression loss function, the pair-wise normal loss function, and a multi-scale gradient loss function.
14. The system of claim 10, wherein the first set of ground truth digital images having the first source comprises LiDAR depth data having scale and shift measurements; and further comprising determining a pair-wise normal loss from the first set of predicted depth images by utilizing a pair-wise normal loss function to compare depth data in the first set of predicted depth images to the LiDAR depth data from the first set of ground truth digital images of the first set of ground truth digital images.
15. The system of claim 10, wherein the second set of ground truth digital images having the second source comprises calibrated stereo-image-based depth data; and further comprising determining an image-level normali
Depth or shape recovery · CPC title
Edge detection · CPC title
involving probabilistic approaches, e.g. Markov random field [MRF] modelling · CPC title
Determination of transform parameters for the alignment of images, i.e. image registration · CPC title
from laser ranging, e.g. using interferometry; from the projection of structured light · CPC title