Depth from motion for augmented reality for handheld user devices

US11145075B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11145075-B2
Application numberUS-201916767401-A
CountryUS
Kind codeB2
Filing dateOct 4, 2019
Priority dateOct 4, 2018
Publication dateOct 12, 2021
Grant dateOct 12, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A handheld user device includes a monocular camera to capture a feed of images of a local scene and a processor to select, from the feed, a keyframe and perform, for a first image from the feed, stereo matching using the first image, the keyframe, and a relative pose based on a pose associated with the first image and a pose associated with the keyframe to generate a sparse disparity map representing disparities between the first image and the keyframe. The processor further is to determine a dense depth map from the disparity map using a bilateral solver algorithm, and process a viewfinder image generated from a second image of the feed with occlusion rendering based on the depth map to incorporate one or more virtual objects into the viewfinder image to generate an AR viewfinder image. Further, the processor is to provide the AR viewfinder image for display.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for providing an augmented reality (AR) experience at a handheld user device, the method comprising: capturing, via the handheld user device, a feed of images of a local scene; selecting, from the feed, a keyframe; performing, for a first image from the feed of images, stereo matching using the first image, the keyframe, and a relative pose based on a pose associated with the first image and a pose associated with the keyframe to generate a sparse disparity map representing disparities between the first image and the keyframe; determining a dense depth map from the disparity map using a bilateral solver algorithm; processing a viewfinder image generated from a second image of the feed with occlusion rendering based on the depth map to incorporate one or more virtual objects into the viewfinder image to generate an AR viewfinder image; and displaying, at the handheld user device, the AR viewfinder image. 2. The method of claim 1 , further comprising: polar rectifying the keyframe and the first image; and wherein performing stereo matching comprising performing stereo matching using polar rectified representations of the keyframe and the first image. 3. The method of either of claim 1 , wherein determining the depth map comprises: generating a sparse depth map from the disparity map using triangulation; applying the bilateral solver algorithm to the sparse depth map to generate a bilateral grid of depths; and slicing the bilateral grid of depths with the second image to generate the depth map. 4. The method of claim 3 , wherein the bilateral solver algorithm comprises a planar bilateral solver algorithm that is based on plane-fitting each pixel in the sparse depth map. 5. The method of claim 1 , wherein selecting the keyframe comprises: selecting the keyframe based on minimization of a cost function that implements at least one of: a baseline distance between two candidate frames, a time difference between the capture of two candidate frames, a fractional overlap of image areas of two candidate frames based on viewing frustums, and a measured error of pose-tracking statistics for two candidate frames. 6. The method of claim 1 , wherein selecting the keyframe, performing stereo matching, determining the depth map, and processing the viewfinder image are performed in real-time by a central processing unit (CPU) of the handheld user device. 7. A handheld user device, comprising: a monocular camera to capture a feed of images of a local scene; a display panel; a memory to store a software application; and a processor coupled to the memory and to the monocular camera, wherein the processor is to execute instructions of the software application to: select, from the feed, a keyframe; perform, for a first image from the feed of images, stereo matching using the first image, the keyframe, and a relative pose based on a pose associated with the first image and a pose associated with the keyframe to generate a sparse disparity map representing disparities between the first image and the keyframe; determine a dense depth map from the disparity map using a bilateral solver algorithm; process a viewfinder image generated from a second image of the feed with occlusion rendering based on the depth map to incorporate one or more virtual objects into the viewfinder image to generate an AR viewfinder image; and provide the AR viewfinder image for display at the display panel. 8. The handheld user device of claim 7 , wherein the processor is to execute instructions of the software application further to: polar rectify the keyframe and the first image; and the stereo matching uses polar rectified representations of the keyframe and the first image. 9. The handheld user device of claim 7 , wherein the processor is to determine the depth map by: generating a sparse depth map from the disparity map using triangulation; applying the bilateral solver algorithm to the sparse depth map to generate a bilateral grid of depths; and slicing the bilateral grid of depths with the second image to generate the depth map. 10. The handheld user device of claim 9 , wherein the bilateral solver algorithm comprises a planar bilateral solver algorithm that is based on plane-fitting each pixel in the sparse depth map. 11. The handheld user device of claim 7 , wherein the processor is to select the keyframe based on minimization of a cost function that implements at least one of: a baseline distance between two candidate frames; a time difference between the capture of two candidate frames; a fractional overlap of image areas of two candidate frames based on viewing frustums; and a measured error of pose-tracking statistics for two candidate frames. 12. The handheld user device of claim 7 , wherein the processor is a central processing unit (CPU). 13. The handheld user device of claim 12 , wherein the CPU is to determine the depth map and process the viewfinder image in real-time. 14. The handheld user device of claim 7 , wherein the handheld user device is one of a compute-enabled cellular phone, a tablet computer, and a portable gaming device. 15. A non-transitory computer-readable storage medium storing a set of executable instructions, the set of executable instructions configured to manipulate a processor of a handheld user device to: select a keyframe from a feed of images captured by a monocular camera; perform, for a first image from the feed of images, stereo matching using the first image, the keyframe, and a relative pose based on a pose associated with the first image and a pose associated with the keyframe to generate a sparse disparity map representing disparities between the first image and the keyframe; determine a dense depth map from the disparity map using a bilateral solver algorithm; process a viewfinder image generated from a second image of the feed with occlusion rendering based on the depth map to incorporate one or more virtual objects into the viewfinder image to generate an AR viewfinder image; and provide the AR viewfinder image for display at a display panel of the handheld user device. 16. The non-transitory computer-readable storage medium of claim 15 , wherein the executable instructions are configured to manipulate the processor further to: polar rectify the keyframe and the first image; and the stereo matching uses polar rectified representations of the keyframe and the first image. 17. The non-transitory computer-readable storage medium of claim 15 , wherein the executable instructions are to manipulate the processor to determine the depth map by: generating a sparse depth map from the disparity map using triangulation; applying the bilateral solver algorithm to the sparse depth map to generate a bilateral grid of depths; and slicing the bilateral grid of depths with the second image to generate the depth map. 18. The non-transitory computer-readable storage medium of claim 17 , wherein the bilateral solver algorithm comprises a planar bilateral solver algorithm that is based on plane-fitting each pixel in the sparse depth map. 19. The non-transitory computer-readable storage medium of claim 15 , wherein the processor is to select the keyframe based on minimization of a cost function that implements at least one of: a baseline distance between two candidate frames; a time difference between the capture of two candidate frames; a fractional overlap of image areas of two candidate frames based on viewing frustums; and a measured error of pose-tracking statistics fo

Assignees

Inventors

Classifications

  • G06T7/579Primary

    from motion · CPC title

  • from focus · CPC title

  • Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11145075B2 cover?
A handheld user device includes a monocular camera to capture a feed of images of a local scene and a processor to select, from the feed, a keyframe and perform, for a first image from the feed, stereo matching using the first image, the keyframe, and a relative pose based on a pose associated with the first image and a pose associated with the keyframe to generate a sparse disparity map repres…
Who is the assignee on this patent?
Google Llc
What technology area does this patent fall under?
Primary CPC classification G06T7/579. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 12 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).