Conversion of monoscopic visual content to stereoscopic 3D
US-9111350-B1 · Aug 18, 2015 · US
US9779508B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9779508-B2 |
| Application number | US-201414226732-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 26, 2014 |
| Priority date | Mar 26, 2014 |
| Publication date | Oct 3, 2017 |
| Grant date | Oct 3, 2017 |
Official abstract text for this publication.
A combination of three computational components may provide memory and computational efficiency while producing results with little latency, e.g., output can begin with the second frame of video being processed. Memory usage may be reduced by maintaining key frames of video and pose information for each frame of video. Additionally, only one global volumetric structure may be maintained for the frames of video being processed. To be computationally efficient, only depth information may be computed from each frame. Through fusion of multiple depth maps from different frames into a single volumetric structure, errors may average out over several frames, leading to a final output with high quality.
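The claim that errors average out when several depth maps are fused into one volumetric structure can be illustrated with a small sketch. This is not the patent's implementation: it assumes a truncated signed distance value per voxel updated by a running weighted average, and the names `fuse`, `truncated_sdf`, and `simulate`, the noise level, and the truncation distance are all hypothetical choices made for illustration.

```python
import random


def truncated_sdf(voxel_depth, measured_depth, truncation):
    """Signed distance from a voxel to the observed surface along the
    camera ray, clamped to [-truncation, +truncation] (assumed convention:
    positive means the voxel lies in front of the surface)."""
    distance = measured_depth - voxel_depth
    return max(-truncation, min(truncation, distance))


def fuse(tsdf, weight, sdf_observation, observation_weight=1.0):
    """Merge one new observation into a voxel using a running weighted
    average. Returns the updated (tsdf, weight) pair."""
    new_weight = weight + observation_weight
    new_tsdf = (tsdf * weight + sdf_observation * observation_weight) / new_weight
    return new_tsdf, new_weight


def simulate(num_frames, true_depth=2.0, voxel_depth=1.95,
             noise=0.02, truncation=0.1, seed=0):
    """Fuse num_frames noisy depth measurements of one surface point into
    a single voxel and return the fused signed distance."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    tsdf, weight = 0.0, 0.0
    for _ in range(num_frames):
        # Hypothetical per-frame depth estimate with Gaussian error.
        measured = true_depth + rng.gauss(0.0, noise)
        observation = truncated_sdf(voxel_depth, measured, truncation)
        tsdf, weight = fuse(tsdf, weight, observation)
    return tsdf


if __name__ == "__main__":
    exact = 2.0 - 1.95  # noise-free signed distance for this voxel
    for frames in (1, 10, 200):
        error = abs(simulate(frames) - exact)
        print(f"{frames:4d} frames: fused error = {error:.5f}")
```

Only the fused value and its weight are stored per voxel, so memory does not grow with the number of frames, which matches the abstract's point about maintaining a single global volumetric structure.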
Opening claim text (preview).
What is claimed is:
1. A device comprising: a camera tracking circuit comprising an input that receives a sequence of images from a single camera at an input rate, the camera tracking circuit being operative to process each current image from the sequence of images at the input rate, and comprising a first output providing a pose for the current image, and a second output providing the current image for storing in a storage designating the current image as a key frame based on the pose for the image and poses for other stored key frames from the storage, and a third output providing an indication of one or more key frames other than the current image from the storage selected for the current image, wherein the pose, the indication of the one or more selected key frames and the designation of the current image as a key frame, are output successively at the input rate for each current image; a depth map estimation circuit having a first input that receives the current image from the sequence of images, and the indication of the one or more selected key frames for the current image, and a second input receiving the selected one or more key frames from the storage, and having an output providing a depth map for the current image based on the current image and the one or more selected key frames; and a volumetric fusion circuit having an input receiving the depth map for the current image and an output providing a three-dimensional model as a fusion of depth maps received from the depth map estimation circuit for the sequence of images; wherein the depth map estimation circuit and the volumetric fusion circuit further process inputs and provide outputs successively for each current image at the input rate.
2. The device of claim 1, wherein the camera tracking circuit, the depth map estimation circuit, and the volumetric fusion circuit are implemented within a single general purpose integrated circuit.
3. The device of claim 1, wherein the depth map for an image comprises a measure of depth for each pixel in the current image.
4. The device of claim 1, wherein the pose for the current image comprises rotation and translation of the camera with respect to a fixed coordinate system.
5. The device of claim 1, wherein the three-dimensional model is defined in a virtual volume, wherein an initial pose of the camera is defined at a point in the virtual volume.
6. The device of claim 1, wherein the three-dimensional model is defined by representing a surface using a signed distance field in the virtual volume.
7. The device of claim 1, wherein the volumetric fusion circuit outputs the three-dimensional model after processing as few as the first two images from the sequence of images.
8. A process for generating a three-dimensional model from a sequence of images from a single camera, the process comprising: receiving the sequence of images from the camera into memory at an input rate, to successively provide at the input rate a current image in memory for processing; successively processing the current image in the memory at the input rate by repeating the following steps for each current image: determining, using logic circuitry, a pose for the current image; determining, using logic circuitry, whether the current image is to be designated as a key frame based on the pose of the current image and poses for other stored key frames from storage; in response to determining that the current image is to be designated as a key frame, storing, the current image and the pose for the current image in the storage as a key frame; selecting, using logic circuitry, a key frame other than the current image, from among the other key frames stored in the storage from the sequence of images; computing, using logic circuitry, a depth map for the current image using the current image and the selected key frame for the current image; merging, in memory, the depth map computed for the current image into a volumetric representation of a scene represented in the sequence of images; and outputting, at the input rate, the volumetric representation of the scene from the memory.
9. The process of claim 8, further comprising determining a scale of depth by a sensor.
10. The process of claim 8, wherein the depth map determined for the current image comprises measures of depth for pixels in the current image.
11. The process of claim 8, wherein the pose for the current image comprises rotation and translation of the camera with respect to a fixed coordinate system.
12. The process of claim 8, wherein the three-dimensional model is defined in a virtual volume, wherein an initial pose of the camera is defined at a point in the virtual volume.
13. The process of claim 8, wherein the three-dimensional model is defined by representing a surface using a signed distance field in the virtual volume.
14. The process of claim 8, wherein outputting the three-dimensional model begins after processing as few as a first two images from the sequence of images.
15. A computer program product, comprising: a computer storage device; computer program instructions stored in the computer storage device that when read from the storage device and processed by a processor of a computer instruct the computer to perform a process for generating a three-dimensional model from a sequence of images from a single camera, the process comprising: receiving the sequence of images from the camera into memory at an input rate, to successively provide at the input rate a current image in memory for processing; successively processing the current image in the memory at the input rate by repeating the following steps for each current image: determining, using logic circuitry, a pose for the current image; determining, using logic circuitry, whether the current image is to be designated as a key frame based on the pose of the current image and poses for other stored key frames from storage; in response to determining that the current image is to be designated as a key frame, storing, the current image and the pose for the current image in the storage as a key frame; selecting, using logic circuitry, a key frame other than the current image, from among the other key frames stored in the storage from the sequence of images; computing, using logic circuitry, a depth map for the current image using the current image and the selected key frame for the current image; merging, in memory, the depth map computed for the current image into a volumetric representation of a scene represented in the sequence of images; and outputting, at the input rate, the volumetric representation of the scene from the memory.
16. The computer program product of claim 15, wherein the processor is housed in a mobile device that incorporates the camera.
17. The computer program product of claim 15, wherein the depth map for the current image comprises measures of depth for pixels in the current image.
18. The computer program product of claim 15, wherein the pose for the current image comprises rotation and translation of the camera with respect to a fixed coordinate system.
19. The computer program product of claim 15, wherein the three-dimensional model is defined in a virtual volume, wherein an initial pose of the camera is defined at a point in the virtual volume.
20. The computer program product of claim 15, wherein the three-dimensional model is defined by representing a surface using a signed distance field in the virtual volume.
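The per-image steps recited in claims 8 and 15 (designating key frames from poses, then selecting a stored key frame other than the current image) can be sketched as follows. The claims do not state the decision rule, so everything specific here is an assumption: the `Pose` type, the helper names, the distance and angle thresholds, and the nearest-with-minimum-baseline selection are hypothetical and serve only to show the control flow.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    position: tuple  # camera centre (x, y, z) in a fixed coordinate system
    forward: tuple   # unit viewing direction, standing in for the rotation


def translation_distance(a, b):
    """Euclidean distance between two camera centres."""
    return math.dist(a.position, b.position)


def view_angle(a, b):
    """Angle in radians between two viewing directions."""
    dot = sum(x * y for x, y in zip(a.forward, b.forward))
    return math.acos(max(-1.0, min(1.0, dot)))


def is_key_frame(pose, key_poses, min_dist=0.1, min_angle=math.radians(10)):
    """Assumed rule: a frame becomes a key frame when it differs enough,
    in position or viewing direction, from every stored key frame."""
    return all(
        translation_distance(pose, k) > min_dist or view_angle(pose, k) > min_angle
        for k in key_poses
    )


def select_key_frame(pose, key_poses, min_baseline=0.02):
    """Assumed rule: pick the closest stored key frame whose baseline is
    still wide enough for stereo matching. Returns an index or None."""
    candidates = [
        (translation_distance(pose, k), index)
        for index, k in enumerate(key_poses)
        if translation_distance(pose, k) >= min_baseline
    ]
    return min(candidates)[1] if candidates else None


def process(poses):
    """Run the per-image loop and record (designated, selected_index)."""
    key_poses, log = [], []
    for pose in poses:
        # Select before storing, so the selected key frame is never the
        # current image itself.
        selected = select_key_frame(pose, key_poses)
        designated = is_key_frame(pose, key_poses)
        if designated:
            key_poses.append(pose)
        log.append((designated, selected))
    return log


if __name__ == "__main__":
    # Camera sliding along x in 6 cm steps while looking down +z.
    track = [Pose((0.06 * i, 0.0, 0.0), (0.0, 0.0, 1.0)) for i in range(4)]
    print(process(track))
```

In this sketch the first image has no stored key frame to pair with, so a depth map can first be computed for the second image, which is consistent with claims 7 and 14 stating that output may begin after as few as the first two images.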