Systems and methods for self-supervised learning of camera intrinsic parameters from a sequence of images

US12175708B2 · US · B2

Patent metadata
Publication number: US-12175708-B2
Application number: US-202217692357-A
Country: US
Kind code: B2
Filing date: Mar 11, 2022
Priority date: Sep 13, 2021
Publication date: Dec 24, 2024
Grant date: Dec 24, 2024

Abstract

Official abstract text for this publication.

Systems and methods described herein relate to self-supervised learning of camera intrinsic parameters from a sequence of images. One embodiment produces a depth map from a current image frame captured by a camera; generates a point cloud from the depth map using a differentiable unprojection operation; produces a camera pose estimate from the current image frame and a context image frame; produces a warped point cloud based on the camera pose estimate; generates a warped image frame from the warped point cloud using a differentiable projection operation; compares the warped image frame with the context image frame to produce a self-supervised photometric loss; updates a set of estimated camera intrinsic parameters on a per-image-sequence basis using one or more gradients from the self-supervised photometric loss; and generates, based on a converged set of learned camera intrinsic parameters, a rectified image frame from an image frame captured by the camera.
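The geometric core of the abstract (unproject a pixel using its depth, move the resulting point by the camera pose estimate, and project it back into an image) can be sketched for a single pixel. This is a minimal illustration that assumes a pinhole camera model; the function names, the tuple layout of the intrinsics (fx, fy, cx, cy), and the example values are hypothetical and are not taken from the patent. A real implementation would operate on whole depth maps inside an automatic-differentiation framework so that gradients can reach the intrinsics.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]

# 3x3 identity rotation, used to represent "no camera rotation".
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def unproject(u: float, v: float, depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with depth Z to a 3D point (pinhole assumption)."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def transform(point: Vec3, rotation: Sequence[Sequence[float]],
              translation: Sequence[float]) -> Vec3:
    """Apply a rigid pose to a point: R @ p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point: Vec3, fx: float, fy: float,
            cx: float, cy: float) -> Tuple[float, float]:
    """Project a 3D point back to pixel coordinates (pinhole assumption)."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def warp_pixel(u: float, v: float, depth: float,
               rotation: Sequence[Sequence[float]],
               translation: Sequence[float],
               intrinsics: Tuple[float, float, float, float]) -> Tuple[float, float]:
    """Unproject, apply the pose estimate, and reproject one pixel."""
    fx, fy, cx, cy = intrinsics
    point = unproject(u, v, depth, fx, fy, cx, cy)
    moved = transform(point, rotation, translation)
    return project(moved, fx, fy, cx, cy)
```

With an identity pose the pixel maps back to itself; a pure sideways translation shifts it horizontally by fx times the translation divided by the depth. Every operation here is a smooth function of the intrinsics, which is what lets a photometric loss computed on the warped image drive updates to them.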

First claim

Opening claim text (preview).

What is claimed is:

1. A system for self-supervised learning of camera intrinsic parameters from a sequence of images, the system comprising: a processor; and a memory storing computer-readable instructions that, when executed by the processor, cause the processor to: produce a depth map from a current image frame captured by a camera; generate a point cloud from the depth map using a differentiable unprojection operation based on estimated camera intrinsic parameters of a parametric camera model; process the current image frame and a context image frame captured by the camera to produce a camera pose estimate; produce a warped point cloud based on the camera pose estimate; generate a warped image frame from the warped point cloud using a differentiable projection operation based on the estimated camera intrinsic parameters; compare the warped image frame with the context image frame to produce a self-supervised photometric loss; update the estimated camera intrinsic parameters on a per-image-sequence basis using a gradient from the self-supervised photometric loss; and generate, based on learned camera intrinsic parameters to which the estimated camera intrinsic parameters have converged according to predetermined convergence criteria, a rectified image frame that corrects distortion in an image frame captured by the camera, wherein the convergence criteria include at least updating until a change in the estimated camera intrinsic parameters from iteration to iteration falls below a predetermined threshold.

2. The system of claim 1, wherein the computer-readable instructions include further instructions that, when executed by the processor, cause the processor to control operation of a robot based, at least in part, on the rectified image frame.

3. The system of claim 2, wherein the robot is one of a manually driven vehicle, an autonomous vehicle, an indoor robot, and an aerial drone.

4. The system of claim 1, wherein the parametric camera model is one of a pinhole camera model, a Unified Camera Model, an Extended Unified Camera Model, and a Double Sphere Camera Model.

5. The system of claim 1, wherein the computer-readable instructions include further instructions that, when executed by the processor, cause the processor to learn the learned camera intrinsic parameters in response to a perturbation of the camera that changes one or more characteristics of the camera.

6. The system of claim 1, wherein self-supervised depth learning and self-supervised pose learning serve as proxy tasks for learning the learned camera intrinsic parameters.

7. The system of claim 1, wherein a geometry of the camera is one of perspective, fisheye, and catadioptric.

8. A non-transitory computer-readable medium for self-supervised learning of camera intrinsic parameters from a sequence of images and storing instructions that, when executed by a processor, cause the processor to: produce a depth map from a current image frame captured by a camera; generate a point cloud from the depth map using a differentiable unprojection operation based on estimated camera intrinsic parameters of a parametric camera model; process the current image frame and a context image frame captured by the camera to produce a camera pose estimate; produce a warped point cloud based on the camera pose estimate; generate a warped image frame from the warped point cloud using a differentiable projection operation based on the estimated camera intrinsic parameters; compare the warped image frame with the context image frame to produce a self-supervised photometric loss; update the estimated camera intrinsic parameters on a per-image-sequence basis using a gradient from the self-supervised photometric loss; and generate, based on learned camera intrinsic parameters to which the estimated camera intrinsic parameters have converged according to predetermined convergence criteria, a rectified image frame that corrects distortion in an image frame captured by the camera, wherein the convergence criteria include at least updating until a change in the estimated camera intrinsic parameters from iteration to iteration falls below a predetermined threshold.

9. The non-transitory computer-readable medium of claim 8, wherein the instructions include further instructions that, when executed by the processor, cause the processor to control operation of a robot based, at least in part, on the rectified image frame.

10. The non-transitory computer-readable medium of claim 9, wherein the robot is one of a manually driven vehicle, an autonomous vehicle, an indoor robot, and an aerial drone.

11. The non-transitory computer-readable medium of claim 8, wherein the parametric camera model is one of a pinhole camera model, a Unified Camera Model, an Extended Unified Camera Model, and a Double Sphere Camera Model.

12. The non-transitory computer-readable medium of claim 8, wherein the instructions include further instructions that, when executed by the processor, cause the processor to learn the learned camera intrinsic parameters in response to a perturbation of the camera that changes one or more characteristics of the camera.

13. The non-transitory computer-readable medium of claim 8, wherein self-supervised depth learning and self-supervised pose learning serve as proxy tasks for learning the learned camera intrinsic parameters.

14. A method, comprising: producing a depth map from a current image frame captured by a camera; generating a point cloud from the depth map using a differentiable unprojection operation based on estimated camera intrinsic parameters of a parametric camera model; processing the current image frame and a context image frame captured by the camera to produce a camera pose estimate; producing a warped point cloud based on the camera pose estimate; generating a warped image frame from the warped point cloud using a differentiable projection operation based on the estimated camera intrinsic parameters; comparing the warped image frame with the context image frame to produce a self-supervised photometric loss; updating the estimated camera intrinsic parameters on a per-image-sequence basis using a gradient from the self-supervised photometric loss; and generating, based on learned camera intrinsic parameters to which the estimated camera intrinsic parameters have converged according to predetermined convergence criteria, a rectified image frame that corrects distortion in an image frame captured by the camera, wherein the convergence criteria include at least updating until a change in the estimated camera intrinsic parameters from iteration to iteration falls below a predetermined threshold.

15. The method of claim 14, further comprising controlling operation of a robot based, at least in part, on the rectified image frame.

16. The method of claim 15, wherein the robot is one of a manually driven vehicle, an autonomous vehicle, an indoor robot, and an aerial drone.

17. The method of claim 14, wherein the parametric camera model is one of a pinhole camera model, a Unified Camera Model, an Extended Unified Camera Model, and a Double Sphere Camera Model.

18. The method of claim 14, wherein the learned camera intrinsic parameters are learned in response to a perturbation of the camera that changes one or more characteristics of the camera.

19. The method of claim 14, wherein self-supervised depth learning and self-supervised pose learning serve as proxy tasks for learning the learned camera intrinsic parameters.

20. The method of
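The convergence criterion recited in the independent claims (keep updating until the change in the estimated intrinsics between iterations falls below a predetermined threshold) can be illustrated with a toy gradient-descent loop. This sketch is not the patented procedure: it learns a single focal length, and it substitutes a simple squared reprojection residual for the self-supervised photometric loss. The function names, learning rate, threshold, and data are all hypothetical choices made for illustration.

```python
from typing import Sequence, Tuple


def loss_and_gradient(f: float, ratios: Sequence[float],
                      observed_u: Sequence[float], cx: float) -> Tuple[float, float]:
    """Mean squared residual and its derivative with respect to f.

    Each ratio is x / z for a 3D point, so the predicted pixel column is
    f * ratio + cx. This residual is a stand-in (assumption) for a
    photometric loss.
    """
    n = len(ratios)
    loss = 0.0
    grad = 0.0
    for r, u_obs in zip(ratios, observed_u):
        residual = f * r + cx - u_obs
        loss += residual ** 2 / n
        grad += 2.0 * residual * r / n
    return loss, grad


def learn_focal_length(ratios: Sequence[float], observed_u: Sequence[float],
                       cx: float, f_init: float,
                       learning_rate: float = 10.0,
                       threshold: float = 1e-6,
                       max_iters: int = 1000) -> Tuple[float, int]:
    """Gradient descent that stops once the per-iteration change in f
    drops below the threshold (or max_iters is reached)."""
    f = f_init
    for iteration in range(1, max_iters + 1):
        _, grad = loss_and_gradient(f, ratios, observed_u, cx)
        f_new = f - learning_rate * grad
        change = abs(f_new - f)
        f = f_new
        if change < threshold:
            return f, iteration
    return f, max_iters


# Synthetic observations generated with a "true" focal length of 500.
ratios = [0.1, -0.2, 0.3, 0.05]
observed = [500.0 * r + 320.0 for r in ratios]
f_learned, iterations = learn_focal_length(ratios, observed, 320.0, f_init=300.0)
```

Starting from 300, the estimate approaches 500 geometrically, and the loop exits as soon as one update moves the parameter by less than the threshold. The learning rate is tuned to this synthetic data and would not transfer to a real photometric objective.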

Assignees

Toyota Res Inst Inc, Toyota Tech Institute At Chicago

Inventors

Classifications

  • from positioning sensors located off-board the vehicle, e.g. from cameras · CPC title

  • Image warping, e.g. rearranging pixels individually · CPC title

  • G06T5/80 (Primary)

    Geometric correction · CPC title

  • Image sensing, e.g. optical camera · CPC title

  • UAVs characterised by their flight controls · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12175708B2 cover?
Systems and methods described herein relate to self-supervised learning of camera intrinsic parameters from a sequence of images. One embodiment produces a depth map from a current image frame captured by a camera; generates a point cloud from the depth map using a differentiable unprojection operation; produces a camera pose estimate from the current image frame and a context image frame; prod…
Who is the assignee on this patent?
Toyota Res Inst Inc, Toyota Tech Institute At Chicago
What technology area does this patent fall under?
Primary CPC classification G06T5/80. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 24, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).