Edge-guided ranking loss for monocular depth prediction
US-2021256717-A1 · Aug 19, 2021 · US
US12456046B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12456046-B2 |
| Application number | US-202016852944-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 20, 2020 |
| Priority date | Apr 20, 2020 |
| Publication date | Oct 28, 2025 |
| Grant date | Oct 28, 2025 |
Official abstract text for this publication.
Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.
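The abstract names the loss terms but this excerpt gives no formulas for them. As a rough illustration only, the following NumPy sketch combines a disparity term, a gradient term, and an occlusion term into a single scalar. The function names (`finite_diff`, `combined_loss`), the L1 and binary cross-entropy forms, and the weights `w_grad` and `w_occ` are all assumptions chosen for illustration, not details taken from the patent.

```python
import numpy as np


def finite_diff(disp):
    """Forward differences of a 2-D disparity map along x (columns) and y (rows)."""
    gx = disp[:, 1:] - disp[:, :-1]
    gy = disp[1:, :] - disp[:-1, :]
    return gx, gy


def combined_loss(pred_disp, gt_disp, pred_occ, gt_occ,
                  w_grad=1.0, w_occ=1.0, eps=1e-7):
    """Hypothetical weighted sum of disparity, gradient, and occlusion terms.

    pred_occ holds occlusion probabilities in [0, 1]; gt_occ holds 0/1 labels.
    """
    # Disparity term (assumed form): mean absolute error.
    disp_loss = np.mean(np.abs(pred_disp - gt_disp))

    # Gradient term (assumed form): compare gradient maps in two
    # orthogonal directions against the ground-truth gradient maps.
    pgx, pgy = finite_diff(pred_disp)
    ggx, ggy = finite_diff(gt_disp)
    grad_loss = np.mean(np.abs(pgx - ggx)) + np.mean(np.abs(pgy - ggy))

    # Occlusion term (assumed form): binary cross-entropy.
    p = np.clip(pred_occ, eps, 1.0 - eps)
    occ_loss = -np.mean(gt_occ * np.log(p) + (1.0 - gt_occ) * np.log(1.0 - p))

    return disp_loss + w_grad * grad_loss + w_occ * occ_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(0.0, 64.0, size=(4, 6))
    occ = (rng.uniform(size=(4, 6)) > 0.8).astype(float)
    print(combined_loss(gt + 0.5, gt, np.full_like(occ, 0.5), occ))
```

In this sketch a constant offset in the predicted disparity is penalized only by the disparity term, because the finite differences cancel the offset. That is one reason a gradient term is usually paired with a direct disparity term rather than used alone.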
Opening claim text (preview).
What is claimed is:
1. One or more processors, comprising: circuitry to use one or more neural networks to generate one or more disparity maps, and to determine depth of one or more objects in one or more stereoscopic images, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.
2. The one or more processors of claim 1, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.
3. The one or more processors of claim 2, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.
4. The one or more processors of claim 1, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.
5. The one or more processors of claim 1, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.
6. The one or more processors of claim 1, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.
7. A system comprising: one or more processors to use one or more neural networks to generate one or more disparity maps, and to determine depth of one or more objects in one or more stereoscopic images, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.
8. The system of claim 7, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.
9. The system of claim 8, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.
10. The system of claim 7, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.
11. The system of claim 7, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.
12. The system of claim 7, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.
13. A method comprising: generating one or more disparity maps using one or more neural networks; and determining depth of one or more objects in one or more stereoscopic images using the one or more neural networks, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.
14. The method of claim 13, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.
15. The method of claim 14, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.
16. The method of claim 13, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.
17. The method of claim 13, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.
18. The method of claim 13, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.
19. A non-transitory machine-readable medium having stored thereon a set of instructions, when performed by one or more processors, cause the one or more processors to at least: generate one or more disparity maps using one or more neural networks; and determine depth of one or more objects in one or more stereoscopic images using the one or more neural networks, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.
20. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of the disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.
21. The non-transitory machine-readable medium of claim 20, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.
22. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.
23. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.
24. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.
25. A distance determination system, comprising: a camera to capture one or more stereoscopic images; one or more processors to use one or more neural networks to generate one or more disparity maps, and to determine depth of one or more objects in one or more stereoscopic images based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps; and memory for storing network parameters for the one or more neural networks.
26. The distance determination system of claim 25, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of the disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.
27. The dist
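The claims preview refers to occlusion maps obtained using the rectified geometry of the stereo camera, but does not say how. One common way to derive an occlusion mask from a left-referenced disparity map is sketched below, on the assumption that in a rectified pair a left-image pixel at column x corresponds to column x - d in the right image. The function name `occlusion_from_disparity` and the nearest-pixel rounding are hypothetical choices for illustration; nothing here is stated to be the patented method.

```python
import numpy as np


def occlusion_from_disparity(disp_left):
    """Mark left-image pixels that are not visible in the right image.

    Assumes rectified geometry: left pixel (y, x) with disparity d maps to
    right pixel (y, x - d). A pixel is flagged as occluded when its target
    falls outside the image, or when another pixel on the same row with a
    larger disparity (a closer surface) lands on the same right-image column.
    """
    h, w = disp_left.shape
    occ = np.zeros((h, w), dtype=bool)
    xs = np.arange(w)
    for y in range(h):
        # Right-image column hit by each left pixel, rounded to nearest pixel.
        xr = np.rint(xs - disp_left[y]).astype(int)
        inside = (xr >= 0) & (xr < w)
        occ[y, ~inside] = True

        # Largest disparity landing on each right-image column.
        best = np.full(w, -np.inf)
        for x in xs[inside]:
            best[xr[x]] = max(best[xr[x]], disp_left[y, x])

        # A pixel loses to any closer surface that maps to the same column.
        for x in xs[inside]:
            if disp_left[y, x] < best[xr[x]]:
                occ[y, x] = True
    return occ


if __name__ == "__main__":
    # One row: background at disparity 0 with a foreground strip at disparity 2.
    row = np.array([[0.0, 0.0, 2.0, 2.0, 0.0, 0.0]])
    print(occlusion_from_disparity(row))
```

A mask like this could serve as a ground-truth target for an occlusion term during training, provided ground-truth disparity is available. The explicit loops keep the sketch readable and are not meant to be efficient.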
Architecture, e.g. interconnection topology · CPC title
Depth or disparity estimation from stereoscopic image signals · CPC title
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title
Stereoscopic video; Stereoscopic image sequence · CPC title