System and methods for depth estimation

US12530788B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12530788-B2
Application number  US-202017786065-A
Country             US
Kind code           B2
Filing date         Dec 24, 2020
Priority date       Dec 27, 2019
Publication date    Jan 20, 2026
Grant date          Jan 20, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system includes a computing device. The computing device is configured to perform a set of functions. The set of functions includes receiving an image, wherein the image comprises a two-dimensional array of data. The set of functions includes extracting, by a two-dimensional neural network, a plurality of two-dimensional features from the two-dimensional array of data. The set of functions includes generating a linear combination of the plurality of two-dimensional features to form a single three-dimensional input feature. The set of functions includes extracting, by a three-dimensional neural network, a plurality of three-dimensional features from the single three-dimensional input feature. The set of functions includes determining a two-dimensional depth map. The two-dimensional depth map contains depth information corresponding to the plurality of three-dimensional features.
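The abstract does not say how the linear combination is parameterized, so the following is only a minimal sketch of one possible reading: each depth slice of the three-dimensional input feature is a weighted sum of the two-dimensional feature maps. The function name `combine_features`, the `weights` matrix, and the example values are hypothetical and do not come from the patent.

```python
def combine_features(feature_maps, weights):
    """Stack linear combinations of 2D feature maps into one 3D volume.

    feature_maps: list of K maps, each a list of H rows with W values.
    weights: list of D rows, each with K coefficients (assumed, not
             specified by the patent).
    Returns volume[d][y][x] = sum_k weights[d][k] * feature_maps[k][y][x].
    """
    k = len(feature_maps)
    h = len(feature_maps[0])
    w = len(feature_maps[0][0])
    if any(len(row) != k for row in weights):
        raise ValueError("each weight row needs one coefficient per feature map")
    return [
        [
            [sum(row[i] * feature_maps[i][y][x] for i in range(k)) for x in range(w)]
            for y in range(h)
        ]
        for row in weights
    ]


# Two toy 2x2 feature maps standing in for the 2D network's output.
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]
# Three depth slices: first map only, an even blend, second map only.
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
volume = combine_features(feature_maps, weights)
```

In this sketch the resulting `volume` has shape depth x height x width, which is the kind of input a three-dimensional convolutional network could then filter along all three axes.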

First claim

Opening claim text (preview).

What is claimed is:

1. A system for determining a two-dimensional depth map from a single monocular image, the system comprising: a computing device, wherein the computing device is configured to perform a set of functions comprising: receiving a single monocular image, wherein the image comprises a two-dimensional array of data; extracting, by a two-dimensional neural network, a plurality of two-dimensional features from the two-dimensional array of data; generating a linear combination of the plurality of two-dimensional features to form a single three-dimensional input feature; extracting, by a three-dimensional neural network, a plurality of three-dimensional features from the single three-dimensional input feature; and determining a two-dimensional depth map, wherein the two-dimensional depth map contains depth information corresponding to the plurality of three-dimensional features.

2. The system of claim 1, wherein the computing device is a first computing device of a plurality of computing devices, and wherein the two-dimensional neural network and the three-dimensional neural network correspond to at least a second computing device of the plurality of computing devices.

3. The system of claim 1, wherein the two-dimensional neural network comprises a two-dimensional convolutional neural network, wherein the three-dimensional neural network comprises a three-dimensional convolutional neural network, and wherein extracting the plurality of two-dimensional features comprises using the two-dimensional convolutional neural network as a two-dimensional filter that operates in two directions within the two-dimensional array of data to output the plurality of two-dimensional features.

4. The system of claim 3, the set of functions further comprising: prior to extracting the plurality of two-dimensional features, training the two-dimensional convolutional neural network using a plurality of images representing objects such that different nodes within the two-dimensional convolutional neural network operate to output different types of two-dimensional features corresponding to different objects.

5. The system of claim 1, wherein generating the linear combination of the plurality of two-dimensional features to form the single three-dimensional input feature comprises: classifying a two-dimensional feature of the plurality of two-dimensional features in accordance with an object associated with training the two-dimensional convolutional neural network; and generating the linear combination of the plurality of two-dimensional features based on classifying the two-dimensional feature.

6. The system of claim 1, wherein the two-dimensional neural network comprises a two-dimensional convolutional neural network, wherein the three-dimensional neural network comprises a three-dimensional convolutional neural network, and wherein extracting the plurality of three-dimensional features comprises using the three-dimensional convolutional neural network as a three-dimensional filter that operates in three directions within the three-dimensional input feature to output the plurality of three-dimensional features.

7. The system of claim 1, wherein the two-dimensional neural network comprises a two-dimensional convolutional neural network, wherein the three-dimensional neural network comprises a three-dimensional convolutional neural network, and wherein extracting the plurality of three-dimensional features from the single three-dimensional input feature comprises extracting a plurality of sets of voxels, wherein each voxel indicates a level of opaqueness.

8. A method for determining a two-dimensional depth map from a single monocular image, the method comprising: receiving a single monocular image, wherein the image comprises a two-dimensional array of data; extracting, by a two-dimensional neural network, a plurality of two-dimensional features from the two-dimensional array of data; generating a linear combination of the plurality of two-dimensional features to form a single three-dimensional input feature; extracting, by a three-dimensional neural network, a plurality of three-dimensional features from the single three-dimensional input feature; and determining a two-dimensional depth map, wherein the two-dimensional depth map contains depth information corresponding to the plurality of three-dimensional features.

9. The method of claim 8, further comprising: determining, based on the plurality of three-dimensional features extracted by the three-dimensional neural network, a three-dimensional array of voxels, each voxel of the array indicating a respective level of opaqueness, and wherein determining a two-dimensional depth map comprises, for a plurality of pixels of the two-dimensional depth map, determining respective distances between a capture device location and a respective closest opaque voxel of the array of voxels along a respective different path from the capture device location.

10. The method of claim 8, wherein each two-dimensional feature extracted by the two-dimensional neural network represents a respective different objects in the image.

11. The method of claim 10, wherein generating the linear combination of the plurality of two-dimensional features to form the single three-dimensional input feature comprises ordering the two-dimensional features based on overlap between the respective different objects in the image.

12. The method of claim 8, wherein the two-dimensional neural network comprises a two-dimensional convolutional neural network, wherein the three-dimensional neural network comprises a three-dimensional convolutional neural network, and wherein extracting the plurality of two-dimensional features comprises using the two-dimensional convolutional neural network as a two-dimensional filter that operates in two directions within the two-dimensional array of data to output the plurality of two-dimensional features.

13. The method of claim 12, further comprising: prior to extracting the plurality of two-dimensional features, training the two-dimensional convolutional neural network using a plurality of images representing objects such that different nodes within the two-dimensional convolutional neural network operate to output different types of two-dimensional features corresponding to different objects.

14. The method of claim 13, wherein generating the linear combination of the plurality of two-dimensional features to form the single three-dimensional input feature comprises: classifying a two-dimensional feature of the plurality of two-dimensional features in accordance with an object associated with training the two-dimensional convolutional neural network; and generating the linear combination of the plurality of two-dimensional features based on classifying the two-dimensional feature.

15. The method of claim 8, wherein the two-dimensional neural network comprises a two-dimensional convolutional neural network, wherein the three-dimensional neural network comprises a three-dimensional convolutional neural network, and wherein extracting the plurality of three-dimensional features comprises using the three-dimensional convolutional neural network as a three-dimensional filter that operates in three directions within the three-dimensional input feature to output the plurality of three-dimensional features.

16. The method of claim 8, wherein extracting the plurality of three-dimensional features from the single three-dimensional input feature comprises extracting a plurality of sets of voxels, wherein ea
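Claim 9 derives each depth-map pixel from the distance between the capture device and the closest opaque voxel along that pixel's path. A minimal sketch of that idea follows, under simplifying assumptions that are not in the patent: rays are parallel and run straight along the depth axis (an orthographic camera rather than paths fanning out from a single location), a voxel counts as opaque when its value meets a hypothetical `threshold`, and distance is measured to the voxel center. The function name and parameters are illustrative only.

```python
import math


def depth_map_from_voxels(opacity, threshold=0.5, voxel_size=1.0):
    """Return a 2D depth map from a voxel grid indexed as opacity[d][y][x].

    For each pixel (y, x), walk the depth axis from the camera outward and
    record the distance to the center of the first voxel whose opacity is
    at least `threshold`. Pixels with no opaque voxel get math.inf.
    """
    num_slices = len(opacity)
    height = len(opacity[0])
    width = len(opacity[0][0])
    depth = [[math.inf] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for d in range(num_slices):
                if opacity[d][y][x] >= threshold:
                    depth[y][x] = (d + 0.5) * voxel_size
                    break  # the closest opaque voxel wins
    return depth


# Toy grid: 3 depth slices, 1 row, 3 columns.
opacity = [
    [[0.0, 0.9, 0.0]],  # slice nearest the camera
    [[0.0, 0.1, 0.0]],
    [[0.8, 0.2, 0.1]],  # slice farthest from the camera
]
depth = depth_map_from_voxels(opacity)
```

A perspective version would instead cast one ray per pixel from the capture device location and step through the grid along that ray; the nearest-opaque-voxel rule stays the same.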

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12530788B2 cover?
A system includes a computing device. The computing device is configured to perform a set of functions. The set of functions includes receiving an image, wherein the image comprises a two-dimensional array of data. The set of functions includes extracting, by a two-dimensional neural network, a plurality of two-dimensional features from the two-dimensional array of data. The set of functions in…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06T7/50. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 20, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).