Real-time three-dimensional reconstruction of a scene from a single camera

US9779508B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9779508-B2
Application number: US-201414226732-A
Country: US
Kind code: B2
Filing date: Mar 26, 2014
Priority date: Mar 26, 2014
Publication date: Oct 3, 2017
Grant date: Oct 3, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A combination of three computational components may provide memory and computational efficiency while producing results with little latency, e.g., output can begin with the second frame of video being processed. Memory usage may be reduced by maintaining key frames of video and pose information for each frame of video. Additionally, only one global volumetric structure may be maintained for the frames of video being processed. To be computationally efficient, only depth information may be computed from each frame. Through fusion of multiple depth maps from different frames into a single volumetric structure, errors may average out over several frames, leading to a final output with high quality.
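The claim that errors "may average out over several frames" can be illustrated with a minimal sketch. It assumes a one-dimensional volume (voxels along a single camera ray), a truncated signed distance value per voxel, and a uniform running average with weight 1 per frame. The names `fuse_depths`, `surface_depth`, and the `trunc` parameter are hypothetical and do not come from the patent, which does not specify the update rule.

```python
def fuse_depths(depths, voxel_centers, trunc=0.2):
    """Fuse per-frame depth samples along one ray into a single 1-D volume.

    Each voxel keeps a running average of truncated signed distances
    (positive in front of the observed surface, negative behind it).
    """
    tsdf = [0.0] * len(voxel_centers)
    weight = [0.0] * len(voxel_centers)
    for depth in depths:
        for i, z in enumerate(voxel_centers):
            sdf = depth - z
            if sdf < -trunc:
                continue  # voxel lies well behind the surface: not observed
            value = min(1.0, sdf / trunc)  # clamp far-in-front voxels to 1
            # Running average: every frame contributes with weight 1.
            tsdf[i] = (tsdf[i] * weight[i] + value) / (weight[i] + 1.0)
            weight[i] += 1.0
    return tsdf, weight


def surface_depth(tsdf, weight, voxel_centers):
    """Locate the zero crossing of the fused field by linear interpolation."""
    for i in range(len(tsdf) - 1):
        observed = weight[i] > 0 and weight[i + 1] > 0
        if observed and tsdf[i] > 0 >= tsdf[i + 1]:
            t = tsdf[i] / (tsdf[i] - tsdf[i + 1])
            return voxel_centers[i] + t * (voxel_centers[i + 1] - voxel_centers[i])
    return None  # no surface found along this ray


if __name__ == "__main__":
    centers = [i * 0.05 for i in range(81)]            # voxels from 0 m to 4 m
    noisy = [1.97, 2.03, 2.01, 1.99, 2.02, 1.98]       # made-up samples around 2 m
    print(surface_depth(*fuse_depths(noisy[:1], centers), centers))
    print(surface_depth(*fuse_depths(noisy, centers), centers))
```

With the made-up samples above, a single frame places the surface at its own noisy reading, while the fused volume places it near the mean of all readings. A real system would fuse full depth maps into a three-dimensional volume using the pose of each frame.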

First claim

Opening claim text (preview).

What is claimed is:

1. A device comprising: a camera tracking circuit comprising an input that receives a sequence of images from a single camera at an input rate, the camera tracking circuit being operative to process each current image from the sequence of images at the input rate, and comprising a first output providing a pose for the current image, and a second output providing the current image for storing in a storage designating the current image as a key frame based on the pose for the image and poses for other stored key frames from the storage, and a third output providing an indication of one or more key frames other than the current image from the storage selected for the current image, wherein the pose, the indication of the one or more selected key frames and the designation of the current image as a key frame, are output successively at the input rate for each current image; a depth map estimation circuit having a first input that receives the current image from the sequence of images, and the indication of the one or more selected key frames for the current image, and a second input receiving the selected one or more key frames from the storage, and having an output providing a depth map for the current image based on the current image and the one or more selected key frames; and a volumetric fusion circuit having an input receiving the depth map for the current image and an output providing a three-dimensional model as a fusion of depth maps received from the depth map estimation circuit for the sequence of images; wherein the depth map estimation circuit and the volumetric fusion circuit further process inputs and provide outputs successively for each current image at the input rate.

2. The device of claim 1, wherein the camera tracking circuit, the depth map estimation circuit, and the volumetric fusion circuit are implemented within a single general purpose integrated circuit.

3. The device of claim 1, wherein the depth map for an image comprises a measure of depth for each pixel in the current image.

4. The device of claim 1, wherein the pose for the current image comprises rotation and translation of the camera with respect to a fixed coordinate system.

5. The device of claim 1, wherein the three-dimensional model is defined in a virtual volume, wherein an initial pose of the camera is defined at a point in the virtual volume.

6. The device of claim 1, wherein the three-dimensional model is defined by representing a surface using a signed distance field in the virtual volume.

7. The device of claim 1, wherein the volumetric fusion circuit outputs the three-dimensional model after processing as few as the first two images from the sequence of images.

8. A process for generating a three-dimensional model from a sequence of images from a single camera, the process comprising: receiving the sequence of images from the camera into memory at an input rate, to successively provide at the input rate a current image in memory for processing; successively processing the current image in the memory at the input rate by repeating the following steps for each current image: determining, using logic circuitry, a pose for the current image; determining, using logic circuitry, whether the current image is to be designated as a key frame based on the pose of the current image and poses for other stored key frames from storage; in response to determining that the current image is to be designated as a key frame, storing, the current image and the pose for the current image in the storage as a key frame; selecting, using logic circuitry, a key frame other than the current image, from among the other key frames stored in the storage from the sequence of images; computing, using logic circuitry, a depth map for the current image using the current image and the selected key frame for the current image; merging, in memory, the depth map computed for the current image into a volumetric representation of a scene represented in the sequence of images; and outputting, at the input rate, the volumetric representation of the scene from the memory.

9. The process of claim 8, further comprising determining a scale of depth by a sensor.

10. The process of claim 8, wherein the depth map determined for the current image comprises measures of depth for pixels in the current image.

11. The process of claim 8, wherein the pose for the current image comprises rotation and translation of the camera with respect to a fixed coordinate system.

12. The process of claim 8, wherein the three-dimensional model is defined in a virtual volume, wherein an initial pose of the camera is defined at a point in the virtual volume.

13. The process of claim 8, wherein the three-dimensional model is defined by representing a surface using a signed distance field in the virtual volume.

14. The process of claim 8, wherein outputting the three-dimensional model begins after processing as few as a first two images from the sequence of images.

15. A computer program product, comprising: a computer storage device; computer program instructions stored in the computer storage device that when read from the storage device and processed by a processor of a computer instruct the computer to perform a process for generating a three-dimensional model from a sequence of images from a single camera, the process comprising: receiving the sequence of images from the camera into memory at an input rate, to successively provide at the input rate a current image in memory for processing; successively processing the current image in the memory at the input rate by repeating the following steps for each current image: determining, using logic circuitry, a pose for the current image; determining, using logic circuitry, whether the current image is to be designated as a key frame based on the pose of the current image and poses for other stored key frames from storage; in response to determining that the current image is to be designated as a key frame, storing, the current image and the pose for the current image in the storage as a key frame; selecting, using logic circuitry, a key frame other than the current image, from among the other key frames stored in the storage from the sequence of images; computing, using logic circuitry, a depth map for the current image using the current image and the selected key frame for the current image; merging, in memory, the depth map computed for the current image into a volumetric representation of a scene represented in the sequence of images; and outputting, at the input rate, the volumetric representation of the scene from the memory.

16. The computer program product of claim 15, wherein the processor is housed in a mobile device that incorporates the camera.

17. The computer program product of claim 15, wherein the depth map for the current image comprises measures of depth for pixels in the current image.

18. The computer program product of claim 15, wherein the pose for the current image comprises rotation and translation of the camera with respect to a fixed coordinate system.

19. The computer program product of claim 15, wherein the three-dimensional model is defined in a virtual volume, wherein an initial pose of the camera is defined at a point in the virtual volume.

20. The computer program product of claim 15, wherein the three-dimensional model is defined by representing a surface using a signed distance field in the virtual volume.
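The per-image tracking steps recited in claims 1 and 8 (designate the current image as a key frame based on its pose and the stored key-frame poses, then select a key frame other than the current image) can be sketched as follows. The claims do not state the designation or selection criteria, so the thresholds `min_baseline` and `min_angle`, the nearest-translation selection rule, and all function names here are assumptions made purely for illustration. A pose is modeled as a rotation matrix plus a translation, as in claims 4, 11 and 18.

```python
import math
from dataclasses import dataclass

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


@dataclass(frozen=True)
class Pose:
    rotation: tuple     # 3x3 rotation matrix as nested tuples
    translation: tuple  # (x, y, z) in the fixed coordinate system


def baseline(a, b):
    """Distance between the two camera positions."""
    return math.dist(a.translation, b.translation)


def rotation_angle(a, b):
    """Angle (radians) of the relative rotation, from trace(Ra^T Rb)."""
    trace = sum(a.rotation[i][j] * b.rotation[i][j]
                for i in range(3) for j in range(3))
    return math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))


def is_key_frame(pose, storage, min_baseline=0.1, min_angle=0.26):
    """Assumed rule: the view must differ enough from every stored key frame."""
    return all(baseline(pose, kf) > min_baseline
               or rotation_angle(pose, kf) > min_angle
               for _, kf in storage)


def select_key_frame(index, pose, storage):
    """Assumed rule: pick the closest stored key frame other than the current image."""
    others = [(i, kf) for i, kf in storage if i != index]
    if not others:
        return None  # first image: nothing to pair with, so no depth map yet
    return min(others, key=lambda item: baseline(pose, item[1]))[0]


def track_frame(index, pose, storage):
    """Return (designated_as_key_frame, index_of_selected_key_frame)."""
    designated = is_key_frame(pose, storage)
    if designated:
        storage.append((index, pose))
    return designated, select_key_frame(index, pose, storage)


if __name__ == "__main__":
    storage = []
    for i, x in enumerate([0.0, 0.02, 0.12, 0.13]):
        print(i, track_frame(i, Pose(IDENTITY, (x, 0.0, 0.0)), storage))
```

In this sketch the first image always becomes a key frame and has no partner, so a depth map can first be computed for the second image, which is consistent with the abstract's remark that output can begin with the second frame.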

Assignees

Microsoft Corp, Microsoft Technology Licensing LLC

Inventors

Classifications

  • G06T7/337 (Primary)

    involving reference images or patches · CPC title

  • G06T17/00 (Primary)

    Three-dimensional [3D] modelling for computer graphics · CPC title

  • Video; Image sequence · CPC title

  • from stereo images · CPC title

  • Camera pose · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9779508B2 cover?
A combination of three computational components may provide memory and computational efficiency while producing results with little latency, e.g., output can begin with the second frame of video being processed. Memory usage may be reduced by maintaining key frames of video and pose information for each frame of video. Additionally, only one global volumetric structure may be maintained for the…
Who is the assignee on this patent?
Microsoft Corp, Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06T7/337. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 3, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).