Online coupled camera pose estimation and dense reconstruction from video

US9483703B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9483703-B2
Application number  US-201414120370-A
Country             US
Kind code           B2
Filing date         May 14, 2014
Priority date       May 14, 2013
Publication date    Nov 1, 2016
Grant date          Nov 1, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A product may receive each image in a stream of video image of a scene, and before processing the next image, generate information indicative of the position and orientation of an image capture device that captured the image at the time of capturing the image. The product may do so by identifying distinguishable image feature points in the image; determining a coordinate for each identified image feature point; and for each identified image feature point, attempting to identify one or more distinguishable model feature points in a three dimensional (3D) model of at least a portion of the scene that appears likely to correspond to the identified image feature point. Thereafter, the product may find each of the following that, in combination, produce a consistent projection transformation of the 3D model onto the image: a subset of the identified image feature points for which one or more corresponding model feature points were identified; and, for each image feature point that has multiple likely corresponding model feature points, one of the corresponding model feature points. The product may update a 3D model of at least a portion of the scene following the receipt of each video image and before processing the next video image base on the generated information indicative of the position and orientation of the image capture device at the time of capturing the received image. The product may display the updated 3D model after each update to the model.
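
As a rough illustration of the consistency test the abstract describes, the sketch below scores one candidate camera pose. It projects each candidate model feature point into the image and, for every image feature point, keeps at most one model point whose projection lands close enough. The pinhole camera model, the focal length and image center values, the pixel tolerance, and the function names are all assumptions made for illustration. The abstract does not specify them, nor does it say how candidate poses are generated (a hypothesize-and-test loop such as RANSAC would be one common choice).

```python
import math


def project(point, pose, focal=500.0, center=(320.0, 240.0)):
    """Project a 3D point into the image with a simple pinhole camera.

    pose is (R, t): R is a 3x3 rotation as nested lists, t a 3-element
    translation. focal and center are illustrative values, not from the patent.
    Returns None when the point lies behind the camera.
    """
    R, t = pose
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        return None
    return (focal * cam[0] / cam[2] + center[0],
            focal * cam[1] / cam[2] + center[1])


def consistent_matches(candidates, pose, tol=2.0):
    """Select the matches that agree with one candidate pose.

    candidates maps each image feature point (u, v) to a list of 3D model
    feature points that look like plausible matches. For each image point,
    keep the single model point with the smallest reprojection error,
    provided that error is within tol pixels; otherwise drop the image point.
    """
    chosen = {}
    for pixel, model_points in candidates.items():
        best = None  # (error, model_point)
        for model_point in model_points:
            projected = project(model_point, pose)
            if projected is None:
                continue
            error = math.dist(projected, pixel)
            if error <= tol and (best is None or error < best[0]):
                best = (error, model_point)
        if best is not None:
            chosen[pixel] = best[1]
    return chosen
```

Image points with no candidate inside the tolerance drop out of the returned mapping, which corresponds to the "subset" in the abstract. A fuller pipeline would compare such mappings across many candidate poses and keep the pose with the most support.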

First claim

Opening claim text (preview).

The invention claimed is:

1. A product comprising a non-transitory, tangible, computer-readable storage medium containing a program of instructions that causes a computer system running the program of instructions to cause at least the following to occur: receive a stream of video images of a scene, each image having been captured by an image capture device while located at a particular position and having a particular orientation, at least two of the images having been captured by the image capture device while at different locations; after receiving each image and before processing the next image, generate information indicative of the position and orientation of the image capture device at the time of capturing the image and update a three dimensional (3D) model by performing at least the following: identifying distinguishable image feature points in the image; for each identified image feature point, attempting to identify one or more distinguishable model feature points in a three dimensional (3D) model of at least a portion of the scene that appears likely to correspond to the identified image feature point, where the correspondence is determined by a matching algorithm that performs at least the following: back-projects the feature point in the three dimensional (3D) model onto the previously-received image; finds an estimated pixel location on the current image using dense optical flow; and searches near the estimate pixel location to find the matched image feature point; finding each of the following that, in combination, produce a consistent projection transformation of the 3D model onto the image: a subset of identified image feature points for which one or more corresponding model feature points were identified; and for each image feature point that has multiple likely corresponding model feature points, one of the corresponding model feature points; and updating the three dimensional (3D) model by using the projection transformation of the current image to estimate geometry information.

2. The product of claim 1 wherein the product has a configuration that uses information from one or more inertial sensors to do the finding step.

3. The product of claim 1 wherein the product has a configuration that displays the updated 3D model after each update to the model.

4. The product of claim 1 wherein the product has a configuration that identifies a virtual ground plane of the scene and estimates an orientation of a normal to the virtual ground plane and a position of the virtual ground plane.

5. The product of claim 4 wherein the product has a configuration that produces a 2.5-dimensional digital surface model (DSM) and that includes information indicative of the altitude of components in the DSM above the virtual ground plane.

6. The product of claim 4 wherein the product has a configuration that rectifies images regarding the virtual ground plane to filter out parallax from camera motion and computes optical flow between rectified images.

7. The product of claim 1 wherein the product has a configuration that infers dense three dimensional (3D) geometric information about the scene based on at least a portion of the stream of video images and the information indicative of the position and orientation of the image capture device at the time of capturing at least two of the received video images.

8. The product of claim 7 wherein the product has a configuration that: identifies a virtual ground plane of the scene and estimates an orientation of a normal to the virtual ground plane and a position of the virtual ground plane; and infers the dense 3D geometric information by estimating a height map of values that represent altitudes above the virtual ground plane.

9. The product of claim 7 wherein the product has a configuration that produces a dense 3D model of the scene based on the dense 3D geometric information.

10. The product of claim 9 wherein the product has a configuration that: produces a 2.5-dimensional digital surface model (DSM) and that includes information indicative of the altitude of components in the DSM above the virtual ground plane; and produces a dense 3D polygon model based on the dense 3D geometric information using a volumetric reconstruction method with the volume size being based on the 2.5-dimensional digital surface model.
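
The matching step recited in claim 1 (back-project a model feature point onto the previous image, carry that location into the current image with dense optical flow, then search nearby for a detected feature point) might be sketched as follows. This is a minimal sketch under stated assumptions: the flow field layout, the nearest-neighbor search, the search radius, and the name `flow_guided_match` are hypothetical choices for illustration and are not taken from the claim text.

```python
import math


def flow_guided_match(prev_pixel, flow, features, radius=3.0):
    """Find the current-image feature that matches a model feature point.

    prev_pixel: (u, v) location where the model feature point back-projects
        onto the previously received image.
    flow: dense optical flow from the previous image to the current image,
        stored as flow[row][col] = (du, dv). This layout is an assumption.
    features: (u, v) feature points detected in the current image.
    radius: search radius in pixels around the flow-predicted location
        (illustrative value).
    Returns the nearest feature within the radius, or None if there is none.
    """
    u, v = prev_pixel
    row, col = int(round(v)), int(round(u))
    if not (0 <= row < len(flow) and 0 <= col < len(flow[0])):
        return None  # the model point falls outside the previous image

    # Estimated pixel location on the current image.
    du, dv = flow[row][col]
    estimate = (u + du, v + dv)

    # Search near the estimate for the closest detected feature point.
    best, best_dist = None, radius
    for feature in features:
        dist = math.dist(feature, estimate)
        if dist <= best_dist:
            best, best_dist = feature, dist
    return best
```

For example, with a uniform flow of (2.0, 1.0) pixels, a model point that back-projects to (1.0, 1.0) is predicted at (3.0, 2.0), and only detected features within the radius of that location are considered. Restricting the search this way keeps the number of candidate correspondences per point small before the consistency check on the projection transformation.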

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9483703B2 cover?
A product may receive each image in a stream of video image of a scene, and before processing the next image, generate information indicative of the position and orientation of an image capture device that captured the image at the time of capturing the image. The product may do so by identifying distinguishable image feature points in the image; determining a coordinate for each identified ima…
Who is the assignee on this patent?
Medioni Gerard, Kang Zhuoliang, Univ Southern California
What technology area does this patent fall under?
Primary CPC classification G06V20/64. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 1, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).