Distance determinations using one or more neural networks

US12456046B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12456046-B2
Application number  US-202016852944-A
Country             US
Kind code           B2
Filing date         Apr 20, 2020
Priority date       Apr 20, 2020
Publication date    Oct 28, 2025
Grant date          Oct 28, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.
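The abstract names the loss terms but this page gives no formulas for them. The sketch below shows one plausible way to combine a disparity term, a gradient term, and an occlusion term in NumPy. The specific choices are assumptions made for illustration, not the patent's method: an L1 penalty for disparity and for its gradients, binary cross-entropy for the occlusion map, and the hypothetical weights `w_grad` and `w_occ`.

```python
import numpy as np

def disparity_gradients(disp):
    """Forward differences of a disparity map along x (columns) and y (rows)."""
    disp = np.asarray(disp, dtype=float)
    gx = np.zeros_like(disp)
    gy = np.zeros_like(disp)
    gx[:, :-1] = disp[:, 1:] - disp[:, :-1]   # horizontal gradient map
    gy[:-1, :] = disp[1:, :] - disp[:-1, :]   # vertical gradient map
    return gx, gy

def combined_loss(pred_disp, gt_disp, pred_occ, gt_occ,
                  w_grad=1.0, w_occ=1.0, eps=1e-7):
    """Weighted sum of disparity, gradient, and occlusion losses.

    w_grad, w_occ, and the per-term formulas are illustrative assumptions.
    """
    pred_disp = np.asarray(pred_disp, dtype=float)
    gt_disp = np.asarray(gt_disp, dtype=float)

    # Disparity loss: mean absolute error against ground-truth disparity.
    disp_loss = np.mean(np.abs(pred_disp - gt_disp))

    # Gradient loss: compare the pair of orthogonal gradient maps
    # against the gradient maps of the ground truth.
    pgx, pgy = disparity_gradients(pred_disp)
    ggx, ggy = disparity_gradients(gt_disp)
    grad_loss = np.mean(np.abs(pgx - ggx)) + np.mean(np.abs(pgy - ggy))

    # Occlusion loss: binary cross-entropy between the predicted occlusion
    # probabilities and a binary ground-truth occlusion mask.
    p = np.clip(np.asarray(pred_occ, dtype=float), eps, 1.0 - eps)
    t = np.asarray(gt_occ, dtype=float)
    occ_loss = -np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

    return disp_loss + w_grad * grad_loss + w_occ * occ_loss
```

In a training loop this scalar would be minimized with respect to the network parameters; an autodiff framework would replace NumPy there, but the arithmetic of the three terms would be the same.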

First claim

Opening claim text (preview).

What is claimed is:

1. One or more processors, comprising: circuitry to use one or more neural networks to generate one or more disparity maps, and to determine depth of one or more objects in one or more stereoscopic images, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.

2. The one or more processors of claim 1, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.

3. The one or more processors of claim 2, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.

4. The one or more processors of claim 1, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.

5. The one or more processors of claim 1, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.

6. The one or more processors of claim 1, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.

7. A system comprising: one or more processors to use one or more neural networks to generate one or more disparity maps, and to determine depth of one or more objects in one or more stereoscopic images, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.

8. The system of claim 7, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.

9. The system of claim 8, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.

10. The system of claim 7, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.

11. The system of claim 7, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.

12. The system of claim 7, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.

13. A method comprising: generating one or more disparity maps using one or more neural networks; and determining depth of one or more objects in one or more stereoscopic images using the one or more neural networks, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.

14. The method of claim 13, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.

15. The method of claim 14, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.

16. The method of claim 13, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.

17. The method of claim 13, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.

18. The method of claim 13, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.

19. A non-transitory machine-readable medium having stored thereon a set of instructions, when performed by one or more processors, cause the one or more processors to at least: generate one or more disparity maps using one or more neural networks; and determine depth of one or more objects in one or more stereoscopic images using the one or more neural networks, based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps.

20. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of the disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.

21. The non-transitory machine-readable medium of claim 20, wherein the one or more disparity maps comprises a pair of gradient maps in orthogonal directions with respect to the disparity, and wherein the pair of gradient maps are compared against ground truth gradient maps for determining the loss of the one or more gradients of the disparity.

22. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks include one or more pixel-adaptive convolution (PAC) layers to utilize one or more smoothness priors for estimating disparity.

23. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks generate one or more occlusion maps for the one or more stereoscopic images using rectified geometry of a camera that captured the one or more stereoscopic images.

24. The non-transitory machine-readable medium of claim 19, wherein the one or more neural networks utilize network parameters determined by minimizing one or more losses of the combination.

25. A distance determination system, comprising: a camera to capture one or more stereoscopic images; one or more processors to use one or more neural networks to generate one or more disparity maps, and to determine depth of one or more objects in one or more stereoscopic images based, at least in part, on a combination of gradient loss, occlusion loss, and disparity loss calculated using the one or more disparity maps; and memory for storing network parameters for the one or more neural networks.

26. The distance determination system of claim 25, wherein the one or more neural networks are trained by minimizing the disparity loss, the gradient loss corresponding to one or more gradients of the disparity, and the occlusion loss corresponding to one or more occlusions in the one or more stereoscopic images.

27. The dist
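The claims refer to determining depth from disparity maps and to occlusion maps derived from rectified camera geometry, without spelling out the computation in this preview. As a rough illustration only, the sketch below uses the standard pinhole stereo relation Z = f * B / d and a simple forward-projection visibility test. The function names, the parameters `focal_px` and `baseline_m`, and the tie-breaking rule are assumptions for this example and are not taken from the patent.

```python
import numpy as np

def disparity_to_depth(disp, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified pair; non-positive disparity maps to inf."""
    disp = np.asarray(disp, dtype=float)
    depth = np.full(disp.shape, np.inf)
    valid = disp > 0
    depth[valid] = focal_px * baseline_m / disp[valid]
    return depth

def occlusion_from_disparity(left_disp):
    """Flag left-image pixels that are not visible in the right image.

    In a rectified pair, left pixel (y, x) with disparity d lands at
    (y, x - d) in the right image. If several left pixels land on the same
    right column, only the one with the largest disparity (the nearest
    surface) is visible; the others are marked occluded. Pixels that project
    outside the right image are also marked.
    """
    disp = np.asarray(left_disp, dtype=float)
    h, w = disp.shape
    occ = np.zeros((h, w), dtype=bool)
    for y in range(h):
        target = np.rint(np.arange(w) - disp[y]).astype(int)
        inside = (target >= 0) & (target < w)
        occ[y, ~inside] = True
        # Largest disparity that lands on each right-image column.
        best = np.full(w, -np.inf)
        for x in np.flatnonzero(inside):
            best[target[x]] = max(best[target[x]], disp[y, x])
        # A pixel is hidden if a nearer surface claims the same column.
        for x in np.flatnonzero(inside):
            if disp[y, x] < best[target[x]]:
                occ[y, x] = True
    return occ
```

A mask produced this way from ground-truth disparity could serve as a training target for a predicted occlusion map, though whether the patented network does so is not stated on this page.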

Assignees

Nvidia Corp

Inventors

Classifications

  • Architecture, e.g. interconnection topology · CPC title

  • Depth or disparity estimation from stereoscopic image signals · CPC title

  • Artificial neural networks [ANN] · CPC title

  • Training; Learning · CPC title

  • Stereoscopic video; Stereoscopic image sequence · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12456046B2 cover?
Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.

Who is the assignee on this patent?
Nvidia Corp

What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?
Publication date Oct 28, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).