Depth Map Calculation in a Stereo Camera System
US-2017069097-A1 · Mar 9, 2017 · US
US11113832B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11113832-B2 |
| Application number | US-201716759808-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 3, 2017 |
| Priority date | Nov 3, 2017 |
| Publication date | Sep 7, 2021 |
| Grant date | Sep 7, 2021 |
Official abstract text for this publication.
Example embodiments allow for training of artificial neural networks (ANNs) to generate depth maps based on images. The ANNs are trained based on a plurality of sets of images, where each set of images represents a single scene and the images in such a set of images differ with respect to image aperture and/or focal distance. An untrained ANN generates a depth map based on one or more images in a set of images. This depth map is used to generate, using the image(s) in the set, a predicted image that corresponds, with respect to image aperture and/or focal distance, to one of the images in the set. Differences between the predicted image and the corresponding image are used to update the ANN. ANNs trained in this manner are especially suited for generating depth maps used to perform simulated image blur on small-aperture images.
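The training loop in the abstract (predict depth from a source image, render a predicted image with a different depth-of-field, compare it with the captured counterpart, update the model) can be illustrated with a deliberately tiny 1-D toy. This is a sketch under stated assumptions, not the patented implementation: the two-parameter linear `predict_depth` stands in for the ANN, `render_shallow` is a hypothetical blend-with-blur stand-in for an aperture rendering function, and central-difference gradients replace backpropagation. All function and variable names are invented for illustration.

```python
# Toy 1-D sketch of the training loop described in the abstract.
# All names, the linear "network", and the blur model are hypothetical.

def box_blur(signal, radius=2):
    """Box-filter a 1-D signal, clamping indices at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def predict_depth(params, features):
    """Stand-in for the ANN: a two-parameter linear map from feature to depth."""
    a, b = params
    return [a * f + b for f in features]


def render_shallow(source, depth, focus=0.0):
    """Predict a shallow depth-of-field image from the source and a depth map.

    Each pixel blends sharp and blurred values; the blur weight grows with
    the squared distance of its depth from the focal plane (capped at 1).
    """
    blurred = box_blur(source)
    out = []
    for s, bl, d in zip(source, blurred, depth):
        w = min((d - focus) ** 2, 1.0)
        out.append((1.0 - w) * s + w * bl)
    return out


def loss(params, source, features, target):
    """Mean squared difference between the predicted and target images."""
    predicted = render_shallow(source, predict_depth(params, features))
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def train(source, features, target, params, lr=1.0, steps=300, eps=1e-5):
    """Gradient descent; central differences stand in for backpropagation."""
    params = list(params)
    for _ in range(steps):
        grads = []
        for k in range(len(params)):
            up, down = params[:], params[:]
            up[k] += eps
            down[k] -= eps
            grads.append((loss(up, source, features, target)
                          - loss(down, source, features, target)) / (2 * eps))
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# Synthetic training pair: a sharp source and a target rendered from a
# known depth ramp (in practice the target would be a captured image).
source = [float(i % 2) for i in range(32)]
features = [i / 31 for i in range(32)]
true_params = [0.8, -0.2]
target = render_shallow(source, predict_depth(true_params, features))

initial_params = [0.1, 0.1]
trained_params = train(source, features, target, initial_params)
print("loss before:", loss(initial_params, source, features, target))
print("loss after: ", loss(trained_params, source, features, target))
```

The point of the toy is that no ground-truth depth is ever supplied: the only supervision is the image-space difference between the rendered prediction and the target, which is the idea the abstract describes.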
Opening claim text (preview).
We claim:
1. A method comprising: obtaining a plurality of images, wherein a set of at least two images of the plurality of images describe a common scene, wherein the set of at least two images comprises a source image having a first depth-of-field and a target image having a second depth-of-field, wherein the second depth-of-field is less than the first depth-of-field; determining, using an artificial neural network, a depth map for the common scene based on the source image; determining, based on the determined depth map for the common scene, a predicted image based on the source image such that the predicted image has a depth-of-field corresponding to the second depth-of-field; determining a difference between the predicted image and the target image; updating the artificial neural network based on the determined difference; obtaining an image of a scene of interest; and using the updated artificial neural network to generate a depth map for the scene of interest based on the image of the scene of interest.
2. The method of claim 1, wherein determining, based on the first depth map, a predicted image based on the source image comprises using a differentiable aperture rendering function.
3. The method of claim 2, wherein the using the differentiable aperture rendering function to determine the predicted image comprises: determining an estimated light field based on the source image; and based on the first depth map, shearing and projecting the estimated light field to determine the predicted image.
4. The method of claim 1, wherein determining, using the artificial neural network, the depth map for the common scene based on the source image comprises: determining, using the artificial neural network, a set of depth values based on the source image, wherein each depth value of the set of depth values corresponds to a respective location within the source image; and upsampling the set of depth values to generate the depth map for the common scene.
5. The method of claim 4, wherein upsampling the set of depth values to generate the depth map for the common scene comprises using a bilateral method to upsample the set of depth values based on the source image.
6. The method of claim 1, wherein the target image is a first target image, wherein the predicted image is a first predicted image, wherein the determined difference is a first determined difference, wherein the set of at least two images further comprises a second target image having a third depth-of-field, wherein the third depth-of-field differs from the second depth-of-field and is less than the first depth-of-field, the method further comprising: determining, based on the determined depth map for the common scene, a second predicted image based on the source image such that the second predicted image has a depth-of-field corresponding to the third depth-of-field; and determining a second difference between the second predicted image and the second target image, wherein updating the artificial neural network comprises updating the neural network based on the second difference.
7. The method of claim 1, wherein obtaining the set of at least two images that describe the common scene comprises: capturing, using a light field camera, a light field from the common scene; generating the source image based on the captured light field such that the source image has the first depth-of-field; and generating the target image based on the captured light field such that the target image has the second depth-of-field.
8. The method of claim 1, wherein obtaining the set of at least two images that describe the common scene comprises: capturing, using a camera set to a first aperture setting, the source image; and capturing, using the camera set to a second aperture setting, the target image, wherein the second aperture setting is wider than the first aperture setting.
9. The method of claim 1, wherein obtaining the image of the scene of interest comprises operating a cell phone to capture the image of the scene of interest, the method further comprising: transmitting, from a server to the cell phone, an indication of the updated artificial neural network, wherein using the updated artificial neural network to generate the depth map for the scene of interest based on the image of the scene of interest comprises a processor of the cell phone using the updated artificial neural network to generate the depth map for the scene of interest.
10. The method of claim 1, further comprising: performing image processing on the image of the scene of interest based on the determined depth map for the scene of interest.
11. The method of claim 1, wherein the artificial neural network is a convolutional neural network.
12. A method comprising: obtaining, by a system, a plurality of images, wherein a set of at least two images of the plurality of images describe a common scene, wherein the set of at least two images comprises a source image having a first depth-of-field and a target image having a second depth-of-field, wherein the second depth-of-field is less than the first depth-of-field; determining, by the system using an artificial neural network, a depth map for the common scene based on the source image; determining, by the system based on the determined depth map for the common scene, a predicted image based on the source image such that the predicted image has a depth-of-field corresponding to the second depth-of-field; determining, by the system, a difference between the predicted image and the target image; updating, by the system, the artificial neural network based on the determined difference; and transmitting, from the system to a remote device, an indication of the updated artificial neural network.
13. The method of claim 12, wherein determining, based on the first depth map, a predicted image based on the source image comprises using a differentiable aperture rendering function.
14. The method of claim 13, wherein the using the differentiable aperture rendering function to determine the predicted image comprises: determining an estimated light field based on the source image; and based on the first depth map, shearing and projecting the estimated light field to determine the predicted image.
15. The method of claim 12, wherein the target image is a first target image, wherein the predicted image is a first predicted image, wherein the determined difference is a first determined difference, wherein the set of at least two images further comprises a second target image having a third depth-of-field, wherein the third depth-of-field differs from the second depth-of-field and is less than the first depth-of-field, the method further comprising: determining, based on the determined depth map for the common scene, a second predicted image based on the source image such that the second predicted image has a depth-of-field corresponding to the third depth-of-field; and determining a second difference between the second predicted image and the second target image, wherein updating the artificial neural network comprises updating the neural network based on the second difference.
16. The method of claim 12, wherein the artificial neural network is a convolutional neural network.
17. A method comprising: obtaining a plurality of images of a scene, wherein the images each have a shallow depth-of-field and differ with respect to focal distance; determining, using an artificial neural network, a depth map for the scene based on the plurality of images; determining, based on the plurality
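Claims 4 and 5 describe having the network output a coarse set of depth values and then upsampling them with "a bilateral method" guided by the source image. The claim text shown here does not say which bilateral method is meant. As one possible reading, the sketch below uses joint bilateral upsampling on a 1-D signal: each full-resolution output is a weighted average of the coarse depth samples, with weights that fall off with spatial distance and with intensity difference in the guide image. The function name, the parameters `sigma_s` and `sigma_r`, and the sample layout (coarse sample `k` sits at full-resolution index `k * factor`) are assumptions made for illustration.

```python
import math


def joint_bilateral_upsample(depth_lo, guide, factor, sigma_s=1.0, sigma_r=0.1):
    """Upsample coarse depth samples to the resolution of `guide`.

    Assumes coarse sample k sits at full-resolution index k * factor,
    so (len(depth_lo) - 1) * factor must be a valid index into `guide`.
    """
    out = []
    for i, g in enumerate(guide):
        num = 0.0
        den = 0.0
        for k, d in enumerate(depth_lo):
            # Spatial weight, measured in coarse-grid units.
            ws = math.exp(-((i / factor - k) ** 2) / (2 * sigma_s ** 2))
            # Range weight: compare guide intensity at the output pixel
            # with guide intensity where the coarse sample was taken.
            wr = math.exp(-((g - guide[k * factor]) ** 2) / (2 * sigma_r ** 2))
            num += ws * wr * d
            den += ws * wr
        out.append(num / den)
    return out


# A step edge in the guide image between indices 3 and 4, and a coarse
# depth map (one sample every 2 pixels) with the same discontinuity.
guide = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
depth_lo = [1.0, 1.0, 5.0, 5.0]
depth_hi = joint_bilateral_upsample(depth_lo, guide, factor=2)
print(depth_hi)
```

Because the range weight suppresses coarse samples that lie across an intensity edge, the upsampled depth stays sharp at the edge in the guide image rather than being smeared the way plain linear interpolation would smear it.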
from light fields, e.g. from plenoptic cameras · CPC title
using electromagnetic waves other than radio waves · CPC title
Validation; Performance evaluation; Active pattern learning techniques · CPC title
Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title
Training; Learning · CPC title