Stereo depth estimation
US-12169943-B2 · Dec 17, 2024 · US
US2021358155A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021358155-A1 |
| Application number | US-202015931541-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 13, 2020 |
| Priority date | May 13, 2020 |
| Publication date | Nov 18, 2021 |
| Grant date | — |
Official abstract text for this publication.
Systems and methods are provided for performing temporally consistent depth map generation by implementing acts of obtaining a first stereo pair of images of a scene associated with a first timepoint and a first pose, generating a first depth map of the scene based on the first stereo pair of images, obtaining a second stereo pair of images of the scene associated with a second timepoint and a second pose, generating a reprojected first depth map by reprojecting the first depth map to align the first depth map with the second stereo pair of images, and generating a second depth map that corresponds to the second stereo pair of images using the reprojected first depth map.
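The reprojection step in the abstract can be illustrated with a small sketch. This is not the patent's implementation: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, poses given as camera-to-world rotation and translation, nearest-pixel splatting, and a simple z-buffer so that the closest surface wins when two points land on the same pixel. The function name `reproject_depth` and the use of `None` for missing depth are illustrative choices only.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]
DepthMap = List[List[Optional[float]]]  # None marks a pixel with no depth


def mat_vec(m: Mat3, v: Vec3) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def transpose(m: Mat3) -> List[List[float]]:
    """Transpose a 3x3 matrix (the inverse of a rotation matrix)."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def reproject_depth(depth: DepthMap, fx: float, fy: float, cx: float, cy: float,
                    r1: Mat3, t1: Vec3, r2: Mat3, t2: Vec3) -> DepthMap:
    """Reproject a depth map from pose 1 into the image plane of pose 2.

    Assumption: poses are camera-to-world, X_world = R * X_cam + t.
    """
    h, w = len(depth), len(depth[0])
    out: DepthMap = [[None] * w for _ in range(h)]
    r2_inv = transpose(r2)
    for v in range(h):
        for u in range(w):
            z = depth[v][u]
            if z is None:
                continue
            # Unproject the pixel to a 3D point in the first camera frame.
            p_cam1 = ((u - cx) * z / fx, (v - cy) * z / fy, z)
            # Move the point to world coordinates, then into the second camera frame.
            rotated = mat_vec(r1, p_cam1)
            p_world = (rotated[0] + t1[0], rotated[1] + t1[1], rotated[2] + t1[2])
            offset = (p_world[0] - t2[0], p_world[1] - t2[1], p_world[2] - t2[2])
            x2, y2, z2 = mat_vec(r2_inv, offset)
            if z2 <= 0:
                continue  # point is behind the second camera
            # Project into the second image and keep the nearest surface per pixel.
            u2 = round(fx * x2 / z2 + cx)
            v2 = round(fy * y2 / z2 + cy)
            if 0 <= u2 < w and 0 <= v2 < h:
                current = out[v2][u2]
                if current is None or z2 < current:
                    out[v2][u2] = z2
    return out
```

With identical poses the output equals the input. When the second pose is translated, pixels shift according to their depth and newly exposed pixels stay `None`; a real pipeline would need some hole-filling or validity policy there, which the abstract does not detail.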
Claim text.
What is claimed is:
1. A system for generating temporally consistent depth maps, comprising: one or more processors; and one or more hardware storage devices having stored computer-executable instructions that are operable, when executed by the one or more processors, to cause the system to: obtain a first stereo pair of images of a scene captured at a first timepoint and with a first pose associated with the system; generate a first depth map of the scene based on the first stereo pair of images; obtain a second stereo pair of images of the scene, the second stereo pair of images being captured at a second timepoint and with a second pose associated with the system; generate a reprojected first depth map by reprojecting the first depth map to align the first depth map with the second stereo pair of images; and generate a second depth map that corresponds to the second stereo pair of images by performing stereo matching on the second stereo pair of images using the reprojected first depth map, thereby improving temporal consistency of the first depth map with the second depth map, and the overall depth map quality.
2. The system of claim 1, further comprising: a stereo pair of cameras, wherein the stereo pair of cameras captures the first stereo pair of images and the second stereo pair of images.
3. The system of claim 1, further comprising: a head tracking system, comprising: at least one head tracking camera; an accelerometer; a gyroscope; and a magnetometer, wherein the first pose associated with the system and the second pose associated with the system are based on measurements obtained by the head tracking system.
4. The system of claim 1, wherein the first pose associated with the system is different than the second pose associated with the system.
5. The system of claim 1, wherein generating the second depth map using the reprojected first depth map includes implementing a temporal consistency term into a cost function for performing stereo matching on the second stereo pair of images.
6. The system of claim 5, wherein the temporal consistency term applies a cost bonus for pixels of the second depth map that share a same or similar disparity value with corresponding pixels of the reprojected first depth map.
7. The system of claim 1, wherein the first stereo pair of images is a downsampled first stereo pair of images and the first depth map of the scene is based on the downsampled first stereo pair of images, and wherein the second stereo pair of images is a downsampled second stereo pair of images and the second depth map is generated by performing stereo matching on the downsampled second stereo pair of images.
8. The system of claim 1, wherein the first depth map of the scene is one of a plurality of first depth maps of the scene, each of the plurality of first depth maps having a different image size.
9. The system of claim 8, wherein the second stereo pair of images is one of a plurality of second stereo pairs of images, the second stereo pair of images having a lowest image size of the plurality of second stereo pairs of images.
10. The system of claim 9, wherein the computer-executable instructions are further operable to cause the system to: generate an upsampled second depth map by applying an edge-preserving filter to the second depth map, wherein the edge-preserving filter utilizes the second depth map, at least one of the plurality of first depth maps, and at least one of the plurality of second stereo pairs of images to generate the upsampled second depth map.
11. The system of claim 10, wherein the edge-preserving filter is a joint bilateral filter.
12. The system of claim 1, wherein the computer-executable instructions are further operable, when executed by the one or more processors, to cause the system to: reproject depth points based on the second depth map to correspond to a user perspective.
13. A method for generating temporally consistent depth maps, comprising: obtaining a first stereo pair of images of a scene captured at a first timepoint and with a first pose associated with a computer system; generating a first depth map of the scene based on the first stereo pair of images; obtaining a second stereo pair of images of the scene, the second stereo pair of images being captured at a second timepoint and with a second pose associated with the computer system; generating a reprojected first depth map by reprojecting the first depth map to align the first depth map with the second stereo pair of images; and generating a second depth map that corresponds to the second stereo pair of images by performing stereo matching on the second stereo pair of images using the reprojected first depth map, thereby improving temporal consistency of the first depth map with the second depth map, and the overall depth map quality.
14. The method of claim 13, wherein the first pose associated with the computer system is different than the second pose associated with the computer system.
15. The method of claim 13, wherein generating the second depth map using the reprojected first depth map includes implementing a temporal consistency term into a cost function for performing stereo matching on the second stereo pair of images.
16. The method of claim 15, wherein the temporal consistency term applies a cost bonus for pixels of the second depth map that share a same or similar disparity value with corresponding pixels of the reprojected first depth map.
17. The method of claim 13, wherein the first depth map of the scene is one of a plurality of first depth maps of the scene, each of the plurality of first depth maps having a different image size.
18. The method of claim 17, wherein the second stereo pair of images is one of a plurality of second stereo pairs of images, the second stereo pair of images having a lowest image size of the plurality of second stereo pairs of images.
19. The method of claim 18, further comprising: generating an upsampled second depth map by applying an edge-preserving filter to the second depth map, wherein the edge-preserving filter utilizes the second depth map, at least one of the plurality of first depth maps, and at least one of the plurality of second stereo pairs of images to generate the upsampled second depth map.
20. One or more hardware storage devices having stored thereon computer-executable instructions, the computer-executable instructions being executable by one or more processors of a computer system to cause the computer system to: obtain a first stereo pair of images of a scene captured at a first timepoint and with a first pose associated with the computer system; generate a first depth map of the scene based on the first stereo pair of images; obtain a second stereo pair of images of the scene, the second stereo pair of images being captured at a second timepoint and with a second pose associated with the computer system; generate a reprojected first depth map by reprojecting the first depth map to align the first depth map with the second stereo pair of images; and generate a second depth map that corresponds to the second stereo pair of images by performing stereo matching on the second stereo pair of images using the reprojected first depth map, thereby improving temporal consistency of the first depth map with the second depth map, and the overall depth map quality.
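The temporal consistency term of claims 5, 6, 15, and 16 can be sketched on a single scanline. The claims do not name a matching cost or an optimizer, so the following is an assumption-laden illustration: it uses an absolute-difference cost with winner-take-all selection, assumes the reprojected first depth map has already been converted to per-pixel disparities (`prior`), and subtracts a hypothetical `bonus` from the cost of any candidate disparity within `tolerance` of that prior. The names `match_scanline`, `bonus`, and `tolerance` are not from the patent.

```python
from typing import List, Optional, Sequence


def match_scanline(left: Sequence[float], right: Sequence[float], max_disp: int,
                   prior: Optional[Sequence[Optional[int]]] = None,
                   bonus: float = 0.0, tolerance: int = 1) -> List[int]:
    """Pick one disparity per pixel of a rectified scanline.

    Convention assumed here: left pixel x matches right pixel x - d.
    """
    result: List[int] = []
    for x in range(len(left)):
        best_d, best_cost = 0, float("inf")
        # Only disparities that keep x - d inside the right image are candidates.
        for d in range(min(max_disp, x) + 1):
            cost = abs(left[x] - right[x - d])  # data term (illustrative choice)
            # Temporal consistency term: reward agreement with the previous frame.
            if prior is not None and prior[x] is not None and abs(d - prior[x]) <= tolerance:
                cost -= bonus
            if cost < best_cost:
                best_d, best_cost = d, cost
        result.append(best_d)
    return result
```

On a textureless scanline every disparity has the same data cost, so the prior decides the result; where the image evidence is strong, a small bonus is outweighed and the current frame wins. How large the bonus should be, and how "similar" disparities are defined, is left open by the claims.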
Edge-driven scaling; Edge-based scaling · CPC title
from stereo images · CPC title
Stereoscopic video; Stereoscopic image sequence · CPC title
Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title
Adjusting depth or disparity · CPC title