Visual inertial odometry with machine learning depth

US12366590B2 · US · B2

Patent metadata
Publication number: US-12366590-B2
Application number: US-202217648572-A
Country: US
Kind code: B2
Filing date: Jan 21, 2022
Priority date: Jan 21, 2022
Publication date: Jul 22, 2025
Grant date: Jul 22, 2025

Abstract

Official abstract text for this publication.

Disclosed is a method including receiving a depth map estimated using data based on image and data received from a movement sensor as input, generating an alignment parameter based on the depth map, adding the alignment parameter to a pre-calibration state to define a user operational calibration state, generating scale parameters and shift parameters based on features associated with the data received from the image and movement sensor, and calibrating the image and movement sensor based on the user operational calibration state, the scale parameters and the shift parameters.
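The scale and shift parameters mentioned in the abstract can be illustrated with a small sketch. It assumes (this is not stated in the abstract) that the learned depth is only known up to an affine ambiguity, and that a handful of tracked features carry reference depths, so that a scale and a shift can be fitted by ordinary least squares. The function name `fit_scale_shift` and the sample values are hypothetical, and the patent may use a different formulation (for example, inverse depth or a joint optimization with the calibration state).

```python
from typing import Sequence, Tuple


def fit_scale_shift(ml_depth: Sequence[float],
                    ref_depth: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ref ~= scale * ml + shift at sparse feature points.

    Hypothetical helper: `ml_depth` holds learned depth values sampled at
    feature locations, `ref_depth` holds reference depths for the same features.
    """
    n = len(ml_depth)
    if n != len(ref_depth) or n < 2:
        raise ValueError("need at least two matched depth samples")

    mean_x = sum(ml_depth) / n
    mean_y = sum(ref_depth) / n

    # Variance of the learned depth and its covariance with the reference.
    var_x = sum((x - mean_x) ** 2 for x in ml_depth)
    if var_x == 0.0:
        raise ValueError("learned depth samples are all identical")
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(ml_depth, ref_depth))

    scale = cov_xy / var_x
    shift = mean_y - scale * mean_x
    return scale, shift


# Illustrative data only: reference depths built as 2.0 * ml + 0.5.
ml = [0.1, 0.2, 0.4, 0.8]
ref = [2.0 * d + 0.5 for d in ml]
scale, shift = fit_scale_shift(ml, ref)
```

With more than two features the fit is overdetermined, which is what makes a residual per feature available for checks such as outlier rejection.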

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a depth data estimated using data generated based on a single image and data received from a movement sensor as input to a function applied to the data, the depth data including a smaller number of depth values than a number of values corresponding to a pixel of the image; generating an alignment parameter based on the depth data; defining a second calibration state based on a first calibration state and the alignment parameter, wherein the first calibration state is associated with a manufacture of an image and movement sensor and the second calibration state is associated with a user operation of the image and movement sensor; generating scale parameters and shift parameters based on features associated with the data received from the image and movement sensor; and adjusting the image and movement sensor based on the second calibration state, the scale parameters and the shift parameters.

2. The method of claim 1, further comprising: generating a gradient data based on the depth data and the image data, wherein the alignment parameter is generated based on the gradient data.

3. The method of claim 1, wherein the depth data is estimated using a neural network.

4. The method of claim 1, wherein the adjusting of the image and movement sensor includes spatially aligning and temporally aligning data generated by an image sensor with data generated by a movement sensor.

5. The method of claim 4, wherein the second calibration state is a visual inertial (VI) calibration state, the movement sensor is an inertial measurement unit (IMU) sensor, and the VI calibration state is generated based on IMU sensor data, image data generated by the image sensor, and optical flow measurements associated with the image data.

6. The method of claim 1, further comprising: one of estimating, by a neural network, a depth based on the single image or a set of images and movement sensor data or generating, using an image processing operation, the depth based on one of the image or the set of images; storing the depth in a memory; and selecting the stored depth as the received depth data.

7. The method of claim 6, further comprising: generating gravity aligned image data by rotating image data based on IMU sensor data, wherein the estimating of the depth uses the gravity aligned image data as input to the neural network and the gravity aligned image data is rotated before the depth is estimated; or the generating of the depth uses the gravity aligned image data as input to the image processing operation and the generating of the depth includes rotating the gravity aligned image data.

8. The method of claim 7, further comprising: generating an image plane based on the image data; determining a gravity vector based on the IMU sensor data and the image plane; determining a ground plane vector in an opposite direction of the gravity vector; generating gravity vector parameters based on the gravity vector; generating a ground plane based on the ground plane vector; predicting, by the neural network, a surface normal data including a plurality of pixels representing a surface normal direction in a camera frame; determining a triplet of points in the image plane based on the features associated with the image data; generating a frame plane based on a projection of the triplet of points onto the ground plane; determining a frame plane normal bounded by the surface normal data; generating triplet normal parameters based on the frame plane normal; and modifying the scale parameters and the shift parameters based on the gravity vector parameters and the triplet normal parameters.

9. The method of claim 1, wherein the depth data is one of a plurality of depth data associated with a plurality of image frames, the method further comprising: performing outlier rejection based on a residual error across the plurality of depth data.

10. A visual inertial (VI) system comprising: at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the VI system to: receive a depth data estimated using data generated based a single image and data received from a movement sensor as input to a function applied to the data, the depth data including a smaller number of depth values than a number of values corresponding to a pixel of the image; generate an alignment parameter based on the depth data; define a second calibration state based on a first calibration state and the alignment parameter, wherein the first calibration state is associated with a manufacture of an image and movement sensor and the second calibration state is associated with a user operation of the image and movement sensor; generate scale parameters and shift parameters based on features associated with the data received from the image and movement sensor; and adjust the image and movement sensor based on the second calibration state, the scale parameters and the shift parameters.

11. The VI system of claim 10, wherein the computer program code further causes the VI system to: generate a gradient data based on the depth data and the image data, wherein the alignment parameter is generated based on the gradient data.

12. The VI system of claim 10, wherein the depth data is estimated using a neural network.

13. The VI system of claim 10, wherein the adjusting of the image and movement sensor includes spatially aligning and temporally aligning data generated by an image sensor with data generated by a movement sensor.

14. The VI system of claim 13, wherein the second calibration state is a visual inertial (VI) calibration state, the movement sensor is an inertial measurement unit (IMU) sensor, and the VI calibration state is generated based on IMU sensor data, image data generated by the image sensor, and optical flow measurements associated with the image data.

15. The VI system of claim 10, wherein the computer program code further causes the VI system to: one of estimate, by a neural network, a depth based on the single image or a set of images and movement sensor data or generate, using an image processing operation, the depth based on one of the image or the set of images; store the depth in a memory; and select the stored depth as the received depth map data.

16. The VI system of claim 15, wherein the computer program code further causes the VI system to: generate gravity aligned image data by rotating image data based on IMU sensor data, wherein the estimating of the depth uses the gravity aligned image data as input to the neural network and the gravity aligned image data is rotated before the depth is estimated; or the generating of the depth uses the gravity aligned image data as input to the image processing operation and the generating of the depth includes rotating the gravity aligned image data.

17. The VI system of claim 16, wherein the computer program code further causes the VI system to: generate an image plane based on the image data; determine a gravity vector based on the IMU sensor data and the image plane; determine a ground plane vector in an opposite direction of the gravity vector; generate gravity vector parameters based on the gravity vector; generate a ground plane based on the ground plane vector; predict, by the neural network, a surface normal data including a plurality of pixels representing a surface normal direction in a camera frame; determine a triplet
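
Claims 7, 8, 16 and 17 refer to a gravity vector derived from IMU data, a ground plane vector pointing in the opposite direction, and image data rotated to be gravity aligned. The sketch below shows one way those two quantities could be computed. It assumes a camera frame with x to the right, y down and z along the optical axis, and a gravity vector already expressed in that frame; neither convention comes from the claims, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ground_plane_normal(gravity_cam: Sequence[float]) -> Tuple[float, float, float]:
    """Unit vector opposite to gravity, usable as a ground plane normal."""
    gx, gy, gz = gravity_cam
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    if norm == 0.0:
        raise ValueError("gravity vector has zero length")
    return (-gx / norm, -gy / norm, -gz / norm)


def gravity_roll_angle(gravity_cam: Sequence[float]) -> float:
    """Angle in radians between the image 'down' axis (+y) and projected gravity.

    Gravity is projected onto the image plane by dropping its z component.
    An angle of 0 means the image is already gravity aligned under the
    assumed axis convention.
    """
    gx, gy, _ = gravity_cam
    if gx == 0.0 and gy == 0.0:
        # Gravity lies along the optical axis: the roll is undefined.
        raise ValueError("gravity is parallel to the optical axis")
    return math.atan2(gx, gy)


# Illustrative values only: device held upright, then rolled by 90 degrees.
upright = (0.0, 9.81, 0.0)
rolled = (9.81, 0.0, 0.0)
normal_upright = ground_plane_normal(upright)
angle_upright = gravity_roll_angle(upright)
angle_rolled = gravity_roll_angle(rolled)
```

The roll angle is the amount by which an image would need to be rotated about the optical axis to become gravity aligned; the direction in which to apply it depends on the image coordinate convention in use.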

Assignees

Google LLC

Classifications

  • Artificial neural networks [ANN]

  • Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51)

  • with conversion into electric or magnetic values

  • Indicating or recording presence, absence, or direction, of movement (electric switches H01H; counting moving objects G06M7/00)

  • using feature-based methods

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12366590B2 cover?
Disclosed is a method including receiving a depth map estimated using data based on image and data received from a movement sensor as input, generating an alignment parameter based on the depth map, adding the alignment parameter to a pre-calibration state to define a user operational calibration state, generating scale parameters and shift parameters based on features associated with the data …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06T7/80. Mapped technology areas include Physics.
When was this patent published?
Published Jul 22, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).