Image processing apparatus and image processing method
US-2019230280-A1 · Jul 25, 2019 · US
US11044398B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11044398-B2 |
| Application number | US-201916583176-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 25, 2019 |
| Priority date | Sep 28, 2018 |
| Publication date | Jun 22, 2021 |
| Grant date | Jun 22, 2021 |
Official abstract text for this publication.
A light field panorama system in which a user holding a mobile device performs a gesture to capture images of a scene from different positions. Additional information, for example position and orientation information, may also be captured. The images and information may be processed to determine metadata including the relative positions of the images and depth information for the images. The images and metadata may be stored as a light field panorama. The light field panorama may be processed by a rendering engine to render different 3D views of the scene to allow a viewer to explore the scene from different positions and angles with six degrees of freedom. Using a rendering and viewing system such as a mobile device or head-mounted display, the viewer may see behind or over objects in the scene, zoom in or out on the scene, or view different parts of the scene.
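The storage layout the abstract describes (images plus per-image position and depth metadata) can be pictured with a small container type. The following is a minimal Python sketch under stated assumptions, not the patent's file format. The split into a primary layer and occlusion layers follows claims 8 and 19; every class and field name (`CapturedImage`, `Layer`, `LightFieldPanorama`, `position`, `orientation`) and the use of nested lists for pixel and depth grids are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

RGB = Tuple[int, int, int]
Vec3 = Tuple[float, float, float]


@dataclass
class CapturedImage:
    """One image of the panorama (hypothetical layout, not from the patent)."""
    pixels: List[List[RGB]]    # rows of RGB pixel data
    depth: List[List[float]]   # per-pixel depth, same grid shape as pixels
    position: Vec3             # camera position relative to the scene
    orientation: Vec3          # camera orientation; the angle convention is an assumption

    def __post_init__(self) -> None:
        # The depth map must line up with the pixel grid row by row.
        same_rows = len(self.pixels) == len(self.depth)
        if not same_rows or any(len(p) != len(d) for p, d in zip(self.pixels, self.depth)):
            raise ValueError("depth map must match the pixel grid")


@dataclass
class Layer:
    """A group of images; used for the primary layer and for each occlusion layer."""
    images: List[CapturedImage] = field(default_factory=list)


@dataclass
class LightFieldPanorama:
    """Primary layer plus zero or more occlusion layers, as listed in claims 8 and 19."""
    primary: Layer = field(default_factory=Layer)
    occlusion: List[Layer] = field(default_factory=list)

    def image_count(self) -> int:
        # Total number of images across all layers.
        return len(self.primary.images) + sum(len(layer.images) for layer in self.occlusion)
```

Keeping depth and position alongside each image, rather than in a separate index, mirrors the claim language that each image carries its own pixel data, depth data, and position metadata.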
Opening claim text (preview).
What is claimed is:
1. A system, comprising: a mobile device comprising a camera, wherein the mobile device is configured to capture a plurality of images of a scene from different positions during a gesture made with the mobile device; one or more processors that implement a processing pipeline configured to: determine relative camera positions of the images with respect to the scene; compute depth information for the images based at least in part on the determined relative camera positions of the images; and generate a three-dimensional light field panorama of the scene based on the images and the depth information.
2. The system as recited in claim 1, wherein, to determine camera positions of the images, the processing pipeline is configured to: identify feature points in the images; correlate the feature points across the images; and compute the camera positions of the images based at least in part on relative disparity between the feature points in different images.
3. The system as recited in claim 1, wherein the mobile device is configured to capture motion and position data for the images captured during the gesture, and wherein the processing pipeline is configured to compute the camera positions for the images based at least in part on the motion and position data captured for the images.
4. The system as recited in claim 1, wherein, to compute depth information for the images based at least in part on the camera position of the images, the processing pipeline is configured to: determine pixel disparity between the images; determine distance between the images; and determine the depth information for each image based at least in part on the pixel disparity and the distance between the images.
5. The system as recited in claim 1, further comprising: a viewing device comprising at least one display screen; and one or more processors that implement a rendering engine configured to iteratively perform: determine a current perspective of the device with respect to the scene captured in the light field panorama based at least in part on a current position of the viewing device; and render a view of the scene captured in the light field panorama from the current perspective for display on the at least one display screen of the viewing device.
6. The system as recited in claim 5, wherein the viewing device is one of a mobile device, a head-mounted display, a television, a computer monitor, or a display wall.
7. The system as recited in claim 1, wherein the mobile device is one of a smartphone, a tablet device, or a pad device.
8. The system as recited in claim 1, wherein the light-field panorama comprises: a primary layer; and one or more occlusion layers; wherein each layer includes one or more images, wherein each image comprises: pixel data for the image and depth data for the image; and metadata including position information for the image with respect to the scene and other ones of the images.
9. The system as recited in claim 1, wherein the processing pipeline is implemented on the mobile device.
10. The system as recited in claim 1, wherein the processing pipeline is implemented on one or more devices of a network-based service.
11. The system as recited in claim 1, wherein the processing pipeline is distributed between the mobile device and a network-based service.
12. A method, comprising: capturing, by a camera of a mobile device during a gesture made with the mobile device, a plurality of images of a scene from different positions; performing, by a processing pipeline implemented by one or more processors: determining relative camera positions of the images with respect to the scene; computing depth information for the images based at least in part on the determined relative camera positions of the images; and generating a three-dimensional light field panorama of the scene based on the images and the depth information.
13. The method as recited in claim 12, wherein determining camera positions of the images comprises: identifying feature points in the images; correlating the feature points across the images; and computing the camera positions for the images based at least in part on relative disparity between the feature points in different images.
14. The method as recited in claim 12, further comprising: capturing motion and position data for the images captured during the gesture; and computing the camera positions for the images based at least in part on the motion and position data captured for the images.
15. The method as recited in claim 12, wherein computing depth information for the images based at least in part on the position of the images comprises: determining pixel disparity between the images; determining distance between the images; and determining the depth information for each image based at least in part on the pixel disparity and the distance between the images.
16. The method as recited in claim 12, further comprising performing, by a rendering engine implemented by one or more processors: determining current perspectives of a viewer with respect to the scene captured in the light field panorama based at least in part on current positions of a viewing device as the viewing device is translated or rotated by the viewer; and rendering views of the scene captured in the light field panorama from the current perspective for display on at least one display screen of the viewing device.
17. The method as recited in claim 16, wherein the viewing device is one of a mobile device, a head-mounted display, a television, a computer monitor, or a display wall.
18. The method as recited in claim 12, wherein the mobile device is one of a smartphone, a tablet device, or a pad device.
19. The method as recited in claim 12, wherein the light-field panorama comprises: a primary layer; and one or more occlusion layers; wherein each layer includes one or more images, wherein each image comprises: pixel data for the image and depth data for the image; and metadata including position information for the image with respect to the scene and other ones of the images.
20. A system, comprising: a mobile device comprising a camera and one or more processors that implement a camera application configured to: capture a plurality of images of a scene from different positions during a gesture made with the mobile device; capture camera position and orientation data for the images from motion and position sensors of the mobile device; one or more processors that implement a real-time engine configured to, during capture of the images: determine low-resolution depth data for the images; integrate the images into a model of the scene, wherein the model is a low-resolution representation of the scene being captured, wherein each image in the model includes high-resolution pixel data, the determined low-resolution depth data, and the camera position and orientation data for the image; convert the low-resolution representation of the scene into visual feedback; and provide the visual feedback to the camera application for presentation as a preview via a user interface on the mobile device.
21. The system as recited in claim 20, further comprising one or more processors that implement a post-processing engine configured to, after capture of the images: receive the images from the model; upscale the depth data for the images to high-resolution; perform global bundle adjustment of the images at
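Claims 4 and 15 recite computing depth from the pixel disparity and the distance between images, without giving a formula. One common way to relate those quantities is the rectified pinhole stereo relation, depth = focal length × baseline / disparity. The sketch below assumes that relation; it is not stated in the claims. The focal length in pixels is an extra input the claims do not mention, and the function and parameter names (`depth_from_disparity`, `depth_map`, `baseline_m`, `focal_px`) are hypothetical.

```python
import math
from typing import List


def depth_from_disparity(disparity_px: float, baseline_m: float, focal_px: float) -> float:
    """Depth in meters for one pixel, assuming a rectified pinhole stereo pair.

    disparity_px: horizontal pixel shift of the same scene point between two images
    baseline_m:   distance between the two camera positions, in meters
    focal_px:     focal length in pixels (an assumption; not named in the claims)
    """
    if baseline_m <= 0 or focal_px <= 0:
        raise ValueError("baseline and focal length must be positive")
    if disparity_px <= 0:
        # No measurable shift: treat the point as infinitely far away.
        return math.inf
    return focal_px * baseline_m / disparity_px


def depth_map(disparities: List[List[float]], baseline_m: float, focal_px: float) -> List[List[float]]:
    """Apply the per-pixel relation to a whole disparity grid."""
    return [[depth_from_disparity(d, baseline_m, focal_px) for d in row] for row in disparities]


if __name__ == "__main__":
    # Two pixels seen from positions 0.5 m apart with an 800 px focal length.
    print(depth_map([[20.0, 40.0]], baseline_m=0.5, focal_px=800.0))
```

Under this relation a larger disparity means a closer point, which is why nearby objects appear to shift more than distant ones as the device moves through the gesture.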
for achieving an enlarged field of view, e.g. panoramic image capture · CPC title
Determining parameters from multiple pictures (depth or shape recovery from multiple images G06T7/55; stereo camera calibration G06T7/85) · CPC title
using two or more images, e.g. averaging or subtraction · CPC title
Image acquisition · CPC title
involving computational photography · CPC title