Reconstructing three-dimensional scenes portrayed in digital images utilizing point cloud machine-learning models
US-2022277514-A1 · Sep 1, 2022 · US
US12175708B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12175708-B2 |
| Application number | US-202217692357-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 11, 2022 |
| Priority date | Sep 13, 2021 |
| Publication date | Dec 24, 2024 |
| Grant date | Dec 24, 2024 |
Official abstract text for this publication.
Systems and methods described herein relate to self-supervised learning of camera intrinsic parameters from a sequence of images. One embodiment produces a depth map from a current image frame captured by a camera; generates a point cloud from the depth map using a differentiable unprojection operation; produces a camera pose estimate from the current image frame and a context image frame; produces a warped point cloud based on the camera pose estimate; generates a warped image frame from the warped point cloud using a differentiable projection operation; compares the warped image frame with the context image frame to produce a self-supervised photometric loss; updates a set of estimated camera intrinsic parameters on a per-image-sequence basis using one or more gradients from the self-supervised photometric loss; and generates, based on a converged set of learned camera intrinsic parameters, a rectified image frame from an image frame captured by the camera.
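The unproject, warp, and project steps named in the abstract can be sketched for a single pixel under the simplest of the camera models the claims list, the pinhole model. This is a minimal illustration, not the patented implementation: the `Pinhole` class, the function names, and the numeric values are all hypothetical, the photometric loss is assumed to be a plain mean absolute difference, and the code is ordinary Python without the automatic differentiation a real training pipeline would need.

```python
from dataclasses import dataclass


@dataclass
class Pinhole:
    """Hypothetical container for pinhole intrinsics, in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def unproject(cam, u, v, depth):
    """Lift pixel (u, v) with a depth value to a 3D point in the camera frame."""
    x = (u - cam.cx) / cam.fx * depth
    y = (v - cam.cy) / cam.fy * depth
    return (x, y, depth)


def transform(point, rotation, translation):
    """Apply a rigid motion (3x3 rotation as nested lists, then translation)."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(cam, point):
    """Project a 3D camera-frame point back to pixel coordinates."""
    x, y, z = point
    return (cam.fx * x / z + cam.cx, cam.fy * y / z + cam.cy)


def photometric_l1(warped, context):
    """Mean absolute intensity difference of two equally sized grayscale images."""
    diffs = [abs(a - b) for row_w, row_c in zip(warped, context)
             for a, b in zip(row_w, row_c)]
    return sum(diffs) / len(diffs)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Assumed example values: a 640x480-style camera and a small sideways motion.
cam = Pinhole(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
point = unproject(cam, 420.0, 240.0, 2.0)             # one-point "point cloud"
moved = transform(point, IDENTITY, (0.1, 0.0, 0.0))   # warped by the pose estimate
u2, v2 = project(cam, moved)                          # where the pixel lands
```

Because both `unproject` and `project` depend on `fx`, `fy`, `cx`, and `cy`, an error in those values changes where each pixel lands and therefore changes the loss, which is what lets a gradient of the loss drive the intrinsics.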
Opening claim text (preview).
What is claimed is:
1. A system for self-supervised learning of camera intrinsic parameters from a sequence of images, the system comprising: a processor; and a memory storing computer-readable instructions that, when executed by the processor, cause the processor to: produce a depth map from a current image frame captured by a camera; generate a point cloud from the depth map using a differentiable unprojection operation based on estimated camera intrinsic parameters of a parametric camera model; process the current image frame and a context image frame captured by the camera to produce a camera pose estimate; produce a warped point cloud based on the camera pose estimate; generate a warped image frame from the warped point cloud using a differentiable projection operation based on the estimated camera intrinsic parameters; compare the warped image frame with the context image frame to produce a self-supervised photometric loss; update the estimated camera intrinsic parameters on a per-image-sequence basis using a gradient from the self-supervised photometric loss; and generate, based on learned camera intrinsic parameters to which the estimated camera intrinsic parameters have converged according to predetermined convergence criteria, a rectified image frame that corrects distortion in an image frame captured by the camera, wherein the convergence criteria include at least updating until a change in the estimated camera intrinsic parameters from iteration to iteration falls below a predetermined threshold.
2. The system of claim 1, wherein the computer-readable instructions include further instructions that, when executed by the processor, cause the processor to control operation of a robot based, at least in part, on the rectified image frame.
3. The system of claim 2, wherein the robot is one of a manually driven vehicle, an autonomous vehicle, an indoor robot, and an aerial drone.
4. The system of claim 1, wherein the parametric camera model is one of a pinhole camera model, a Unified Camera Model, an Extended Unified Camera Model, and a Double Sphere Camera Model.
5. The system of claim 1, wherein the computer-readable instructions include further instructions that, when executed by the processor, cause the processor to learn the learned camera intrinsic parameters in response to a perturbation of the camera that changes one or more characteristics of the camera.
6. The system of claim 1, wherein self-supervised depth learning and self-supervised pose learning serve as proxy tasks for learning the learned camera intrinsic parameters.
7. The system of claim 1, wherein a geometry of the camera is one of perspective, fisheye, and catadioptric.
8. A non-transitory computer-readable medium for self-supervised learning of camera intrinsic parameters from a sequence of images and storing instructions that, when executed by a processor, cause the processor to: produce a depth map from a current image frame captured by a camera; generate a point cloud from the depth map using a differentiable unprojection operation based on estimated camera intrinsic parameters of a parametric camera model; process the current image frame and a context image frame captured by the camera to produce a camera pose estimate; produce a warped point cloud based on the camera pose estimate; generate a warped image frame from the warped point cloud using a differentiable projection operation based on the estimated camera intrinsic parameters; compare the warped image frame with the context image frame to produce a self-supervised photometric loss; update the estimated camera intrinsic parameters on a per-image-sequence basis using a gradient from the self-supervised photometric loss; and generate, based on learned camera intrinsic parameters to which the estimated camera intrinsic parameters have converged according to predetermined convergence criteria, a rectified image frame that corrects distortion in an image frame captured by the camera, wherein the convergence criteria include at least updating until a change in the estimated camera intrinsic parameters from iteration to iteration falls below a predetermined threshold.
9. The non-transitory computer-readable medium of claim 8, wherein the instructions include further instructions that, when executed by the processor, cause the processor to control operation of a robot based, at least in part, on the rectified image frame.
10. The non-transitory computer-readable medium of claim 9, wherein the robot is one of a manually driven vehicle, an autonomous vehicle, an indoor robot, and an aerial drone.
11. The non-transitory computer-readable medium of claim 8, wherein the parametric camera model is one of a pinhole camera model, a Unified Camera Model, an Extended Unified Camera Model, and a Double Sphere Camera Model.
12. The non-transitory computer-readable medium of claim 8, wherein the instructions include further instructions that, when executed by the processor, cause the processor to learn the learned camera intrinsic parameters in response to a perturbation of the camera that changes one or more characteristics of the camera.
13. The non-transitory computer-readable medium of claim 8, wherein self-supervised depth learning and self-supervised pose learning serve as proxy tasks for learning the learned camera intrinsic parameters.
14. A method, comprising: producing a depth map from a current image frame captured by a camera; generating a point cloud from the depth map using a differentiable unprojection operation based on estimated camera intrinsic parameters of a parametric camera model; processing the current image frame and a context image frame captured by the camera to produce a camera pose estimate; producing a warped point cloud based on the camera pose estimate; generating a warped image frame from the warped point cloud using a differentiable projection operation based on the estimated camera intrinsic parameters; comparing the warped image frame with the context image frame to produce a self-supervised photometric loss; updating the estimated camera intrinsic parameters on a per-image-sequence basis using a gradient from the self-supervised photometric loss; and generating, based on learned camera intrinsic parameters to which the estimated camera intrinsic parameters have converged according to predetermined convergence criteria, a rectified image frame that corrects distortion in an image frame captured by the camera, wherein the convergence criteria include at least updating until a change in the estimated camera intrinsic parameters from iteration to iteration falls below a predetermined threshold.
15. The method of claim 14, further comprising controlling operation of a robot based, at least in part, on the rectified image frame.
16. The method of claim 15, wherein the robot is one of a manually driven vehicle, an autonomous vehicle, an indoor robot, and an aerial drone.
17. The method of claim 14, wherein the parametric camera model is one of a pinhole camera model, a Unified Camera Model, an Extended Unified Camera Model, and a Double Sphere Camera Model.
18. The method of claim 14, wherein the learned camera intrinsic parameters are learned in response to a perturbation of the camera that changes one or more characteristics of the camera.
19. The method of claim 14, wherein self-supervised depth learning and self-supervised pose learning serve as proxy tasks for learning the learned camera intrinsic parameters.
20. The method of
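The convergence criterion recited in the independent claims (keep updating until the change in the estimated intrinsics from one iteration to the next falls below a threshold) can be illustrated with a toy gradient-descent loop. Everything here is an assumption made for illustration: the function `learn_focal_length` is hypothetical, only a single focal length is learned, the learning rate and threshold are arbitrary, and a squared reprojection error on synthetic points stands in for the photometric loss the claims describe.

```python
def learn_focal_length(ratios, observed_u, cx, f_init,
                       lr=10.0, tol=1e-6, max_iters=1000):
    """Estimate a focal length f by gradient descent.

    ratios[i] is x/z for a 3D point and observed_u[i] its observed pixel
    column, so the pinhole prediction is f * ratios[i] + cx. The loop stops
    once the per-iteration change in f drops below tol.
    """
    f = f_init
    for iteration in range(1, max_iters + 1):
        # Gradient of the mean squared reprojection error with respect to f.
        grad = sum(2.0 * (f * r + cx - u) * r
                   for r, u in zip(ratios, observed_u)) / len(ratios)
        f_new = f - lr * grad
        change = abs(f_new - f)
        f = f_new
        if change < tol:  # change-below-threshold convergence test
            return f, iteration
    return f, max_iters


# Synthetic observations generated with an assumed true focal length.
ratios = [0.1, -0.2, 0.3, 0.05]
true_f, cx = 500.0, 320.0
observed_u = [true_f * r + cx for r in ratios]

f_hat, n_iters = learn_focal_length(ratios, observed_u, cx, f_init=300.0)
print(f_hat, n_iters)
```

Stopping on the parameter change rather than on the loss value means no ground-truth intrinsics are needed to decide when to stop, which fits the self-supervised setting the claims describe.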
from positioning sensors located off-board the vehicle, e.g. from cameras · CPC title
Image warping, e.g. rearranging pixels individually · CPC title
Geometric correction · CPC title
Image sensing, e.g. optical camera · CPC title
UAVs characterised by their flight controls · CPC title