What technology area does this patent fall under?

Primary CPC classification G06T7/593. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Real-time stereo matching using a hierarchical iterative refinement network

US11810313B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11810313-B2
Application number	US-202117249095-A
Country	US
Kind code	B2
Filing date	Feb 19, 2021
Priority date	Feb 21, 2020
Publication date	Nov 7, 2023
Grant date	Nov 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to an aspect, a real-time active stereo system includes a capture system configured to capture stereo data, where the stereo data includes a first input image and a second input image, and a depth sensing computing system configured to predict a depth map. The depth sensing computing system includes a feature extractor configured to extract features from the first and second images at a plurality of resolutions, an initialization engine configured to generate a plurality of depth estimations, where each of the plurality of depth estimations corresponds to a different resolution, and a propagation engine configured to iteratively refine the plurality of depth estimations based on image warping and spatial propagation.
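The multi-resolution initialization described in the abstract can be illustrated with a toy sketch. Everything here is an assumption for illustration only: the names `downsample`, `build_pyramid`, `init_disparity`, and `init_all_levels` are hypothetical, single scanlines of raw intensities stand in for the learned features, and an absolute-difference cost with winner-take-all selection stands in for the patent's matching. It shows only the idea of producing one disparity estimate per resolution.

```python
from typing import List


def downsample(row: List[float]) -> List[float]:
    """Halve the resolution by averaging adjacent pairs (stand-in for learned features)."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row) - 1, 2)]


def build_pyramid(row: List[float], levels: int) -> List[List[float]]:
    """Return `levels` versions of the row; index 0 is the finest resolution."""
    pyramid = [row]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def init_disparity(left: List[float], right: List[float], max_disp: int) -> List[int]:
    """Winner-take-all: for each left pixel x, pick d minimizing |left[x] - right[x - d]|."""
    disp = []
    for x in range(len(left)):
        best_d, best_cost = 0, float("inf")
        for d in range(0, min(max_disp, x) + 1):
            cost = abs(left[x] - right[x - d])
            if cost < best_cost:  # strict comparison keeps the smallest d on ties
                best_d, best_cost = d, cost
        disp.append(best_d)
    return disp


def init_all_levels(left: List[float], right: List[float],
                    levels: int, max_disp: int) -> List[List[int]]:
    """One disparity estimate per resolution; the search range shrinks with the image."""
    lefts = build_pyramid(left, levels)
    rights = build_pyramid(right, levels)
    return [init_disparity(l, r, max_disp // (2 ** k))
            for k, (l, r) in enumerate(zip(lefts, rights))]


right_row = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
left_row = [0.0, 0.0, 0.0, 1.0, 4.0, 9.0, 16.0, 25.0]  # right_row shifted by 2 pixels
estimates = init_all_levels(left_row, right_row, levels=3, max_disp=3)
```

In this toy input the finest-level estimate recovers the 2-pixel shift wherever the shifted pixel is visible. A refinement stage would then start from the coarsest estimate and work toward the finest.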

First claim

Opening claim text (preview).

What is claimed is:

1. A real-time active stereo system comprising: a capture system configured to capture stereo data, the stereo data including a first input image and a second input image; and a depth sensing computing system configured to predict a depth map, the depth sensing computing system including: a feature extractor configured to extract features from the first and second input images at a plurality of resolutions; an initialization engine configured to generate a plurality of depth estimations, each of the plurality of depth estimations corresponding to a different resolution and including a three-dimensional (3D) slanted plane hypothesis for a region of a respective depth estimation, the 3D slanted plane hypothesis including a disparity value and a location of a slanted plane; and a propagation engine configured to iteratively refine the plurality of depth estimations based on image warping and spatial propagation.

2. The real-time active stereo system of claim 1, wherein the initialization engine is configured to predict a first depth estimation based on a matching of the features from the first and second input images at a first resolution, the initialization engine configured to predict a second depth estimation based on a matching of the features from the first and second input images at a second resolution.

3. The real-time active stereo system of claim 2, wherein the propagation engine is configured to predict, via a first iteration, a refined first depth estimation using the first depth estimation from the initialization engine and the features at the first resolution from the feature extractor, the propagation engine configured to predict, via a second iteration, a refined second depth estimation based on the refined first depth estimation from the first iteration, and the second depth estimation from the initialization engine, the refined second depth estimation being used in a subsequent iteration or as a basis for the depth map.

4. The real-time active stereo system of claim 1, wherein the initialization engine includes a region feature extractor configured to extract first per-region features using features from the first input image and extract second per-region features using features from the second input image, the initialization engine including a matching engine configured to generate a depth estimation based on a matching of the first per-region features with the second per-region features.

5. The real-time active stereo system of claim 1, wherein the 3D slanted plane hypothesis includes a feature descriptor that represents information about the slanted plane.

6. The real-time active stereo system of claim 5, further comprising: a neural network configured to generate the feature descriptor based on costs per region.

7. The real-time active stereo system of claim 1, wherein the propagation engine includes a warping module configured to generate warped features by warping features of the first input image using a depth estimation received from the initialization engine, a matching engine configured to compute a local cost volume based on a matching of the warped features with features from the second input image, and a convolutional neural network (CNN) module configured to generate a refined depth estimation based on plane hypotheses of the depth estimation and the local cost volume.

8. The real-time active stereo system of claim 7, wherein the CNN module includes one or more residual blocks configured to apply one or more dilation convolutions.

9. A method for real-time stereo matching comprising: extracting, by a feature extractor, features from a first input image and a second input image at a plurality of resolutions including a first resolution and a second resolution; and generating, by an initialization engine, a plurality of depth estimations at the plurality of resolutions, including: predicting a first depth estimation based on a matching of the features from the first and second input images at the first resolution, the first depth estimation including a three-dimensional (3D) slanted plane hypothesis for each region of a respective depth estimation, the 3D slanted plane hypothesis including a disparity value and a location of a slanted plane; and predicting a second depth estimation based on a matching of the features from the first and second input images at the second resolution; and iteratively refining, by a propagation engine, the plurality of depth estimations based on image warping and spatial propagation, including: predicting, via a first iteration, a refined first depth estimation using the first depth estimation and the features at the first resolution; and predicting, via a second iteration, a refined second depth estimation based on the refined first depth estimation from the first iteration and the second depth estimation, the refined second depth estimation being used in a subsequent iteration or as a basis for a depth map.

10. The method of claim 9, wherein the 3D slanted plane hypothesis includes a feature descriptor that represents information about the slanted plane.

11. The method of claim 9, wherein the predicting the first depth estimation includes: extracting, by at least one first convolutional block, first per-region features for each image region using features of the first input image at the first resolution; extracting, by at least one second convolutional block, second per-region features for each image region using features of the second input image at the first resolution; and selecting, by a matching engine, the 3D slanted plane hypothesis for each region having a disparity value with a lowest cost.

12. The method of claim 11, further comprising: constructing a 3D cost volume based on costs per region, wherein the 3D slanted plane hypothesis is selected based on the costs per region, wherein the 3D cost volume is not stored or used by the propagation engine.

13. The method of claim 12, wherein the 3D slanted plane hypothesis includes a feature descriptor that describes information about a slanted plane, further comprising: generating, by a neural network, the feature descriptor based on the costs per region and at least one of the first per-region features or the second per-region features.

14. The method of claim 11, wherein the at least one first convolutional block includes a convolutional block having a stride value that is different from a convolutional block of the at least one second convolutional block.

15. The method of claim 9, wherein the predicting the refined first depth estimation includes: generating warped features by warping features from the first input image at the first resolution using the first depth estimation; computing a local cost volume based on a matching of the warped features with features of the second input image at the first resolution; obtaining an augmented depth estimation based on the local cost volume and the first depth estimation; and predicting, by a convolution neural network (CNN) module, the refined first depth estimation using the augmented depth estimation.

16. The method of claim 15, wherein computing the local cost volume includes: displacing disparities in a respective region by an offset value; and computing costs for the respective region.

17. The method of claim 15, wherein the CNN module includes a plurality of residual blocks including a first residual block and a second residual block, at least one of the first residual block or the second residual block defining one or more dilated convolutions.
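The warping and local cost volume steps recited in claims 15 and 16 can be illustrated with a small sketch. It is not the patented implementation, and several choices are assumptions made for illustration: the plane is parameterized as a center disparity `d` plus gradients `dx` and `dy` (the claims only recite "a disparity value and a location of a slanted plane"), the sign convention for warping is arbitrary, a single scanline of raw values stands in for learned features, and the names `PlaneHypothesis`, `plane_disparity`, `sample_linear`, and `local_costs` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PlaneHypothesis:
    d: float   # disparity at the region center
    dx: float  # assumed: disparity gradient along x
    dy: float  # assumed: disparity gradient along y


def plane_disparity(h: PlaneHypothesis, cx: float, cy: float, x: float, y: float) -> float:
    """Disparity the slanted plane assigns to pixel (x, y) in a region centered at (cx, cy)."""
    return h.d + h.dx * (x - cx) + h.dy * (y - cy)


def sample_linear(row: Sequence[float], x: float) -> float:
    """Linearly interpolate `row` at fractional position x, clamping to the borders."""
    x = min(max(x, 0.0), len(row) - 1.0)
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    t = x - x0
    return (1.0 - t) * row[x0] + t * row[x1]


def local_costs(first_row: Sequence[float], second_row: Sequence[float],
                h: PlaneHypothesis, cx: float, xs: Sequence[int],
                offsets: Sequence[float] = (-1.0, 0.0, 1.0)) -> List[float]:
    """For each offset, displace the plane's disparity, warp the first row,
    and sum absolute differences against the second row over the region."""
    costs = []
    for off in offsets:
        total = 0.0
        for x in xs:
            d = plane_disparity(h, cx, 0.0, x, 0.0) + off
            warped = sample_linear(first_row, x + d)  # assumed sign convention
            total += abs(second_row[x] - warped)
        costs.append(total)
    return costs


first = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
second = [4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 49.0, 49.0]  # first shifted by 2, border padded
hyp = PlaneHypothesis(d=1.0, dx=0.0, dy=0.0)               # deliberately off by one pixel
costs = local_costs(first, second, hyp, cx=2.0, xs=[1, 2, 3])
```

Here the cost is lowest at the +1 offset, which is the kind of signal a refinement network could use to nudge the hypothesis toward the true disparity of 2. Because only a few offsets around the current hypothesis are evaluated, the cost volume stays small and local.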

Assignees

Google LLC

Inventors

Classifications

G06T7/593 · Primary
from stereo images · CPC title
G06T3/0093
Physics · mapped topic
G06T3/40
Scaling of whole images or parts thereof, e.g. expanding or contracting · CPC title
G06T5/30
Erosion or dilatation, e.g. thinning · CPC title
H04N13/20
Image signal generators · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 74871856

External sources

Related patents

Depth from motion for augmented reality for handheld user devices

System and method for active stereo depth sensing

Learning-based matching for active stereo systems

Hierarchical disparity hypothesis generation with slanted support windows

System and method for active stereo depth sensing