Self-Supervised Attention Learning For Depth And Motion Estimation

US2022011778A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022011778-A1
Application number	US-202016927270-A
Country	US
Kind code	A1
Filing date	Jul 13, 2020
Priority date	Jul 13, 2020
Publication date	Jan 13, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system includes: a depth module including an encoder and a decoder and configured to: receive a first image from a first time from a camera; and based on the first image, generate a depth map including depths between the camera and objects in the first image; a pose module configured to: generate a first pose of the camera based on the first image; generate a second pose of the camera for a second time based on a second image; and generate a third pose of the camera for a third time based on a third image; and a motion module configured to: determine a first motion of the camera between the second and first times based on the first and second poses; and determine a second motion of the camera between the second and third times based on the second and third poses.
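The abstract says the motion module derives camera motion from pairs of poses but does not say how. One common way to do this, shown here as a minimal sketch and not as the patent's method, is to treat each pose as a 4x4 camera-to-world rigid transform and compose the inverse of one with the other. The function names (`pose_matrix`, `invert_rigid`, `relative_motion`) are hypothetical, and the yaw-only rotation is a simplification of a full 6-degree-of-freedom pose.

```python
import math


def pose_matrix(yaw, tx, ty, tz):
    """Build a 4x4 camera-to-world transform from a yaw angle (radians) and a translation.

    Assumption: rotation about the z axis only, to keep the sketch short.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def invert_rigid(T):
    """Invert a rigid transform: the rotation becomes R^T, the translation -R^T t."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def relative_motion(T_from, T_to):
    """Pose of the camera at T_to expressed in the camera frame at T_from."""
    return matmul(invert_rigid(T_from), T_to)


# Hypothetical poses: the camera faces along world +y and moves 2 units along it.
pose_second_time = pose_matrix(math.pi / 2, 0.0, 0.0, 0.0)  # earlier frame
pose_first_time = pose_matrix(math.pi / 2, 0.0, 2.0, 0.0)   # reference frame
first_motion = relative_motion(pose_second_time, pose_first_time)
# The translation column is approximately 2.0 along the camera's own x axis.
print([row[3] for row in first_motion[:3]])
```

Expressing the motion in the earlier camera's frame keeps it independent of where the world origin sits, which is why the inverse of the earlier pose is applied first.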

First claim

Opening claim text (preview).

What is claimed is:
1. A system, comprising: a depth module including an encoder and a decoder and configured to: receive a first image from a first time from a camera; and based on the first image and using the encoder and the decoder, generate a depth map including depths between the camera and objects in the first image; a pose module configured to: generate a first pose of the camera based on the first image; generate a second pose of the camera for a second time based on a second image received from the camera before the first image; and generate a third pose of the camera for a third time based on a third image received from the camera after the first image; and a motion module configured to: determine a first motion of the camera between the second time and the first time based on the first pose and the second pose; and determine a second motion of the camera between the second time and the third time based on the second pose and the third pose.
2. A vehicle, comprising: the system of claim 1; a propulsion device configured to propel the vehicle; and a control module configured to actuate the propulsion device based on the depth map.
3. The vehicle of claim 2, wherein the vehicle includes the camera and does not include any other cameras.
4. The vehicle of claim 2, wherein the vehicle does not include any radars, any sonar sensors, any laser sensors, or any light detection and ranging (LIDAR) sensors.
5. A vehicle, comprising: the system of claim 1; a propulsion device configured to propel the vehicle; and a control module configured to actuate the propulsion device based on at least one of: the first motion; and the second motion.
6. The system of claim 1 wherein the first, second, and third poses are 6 degree of freedom poses.
7. The system of claim 1 wherein the depth module includes attention mechanisms configured to, based on the first image, generate an attention map including attention coefficients indicative of amounts of attention to attribute to the objects in the first image.
8. The system of claim 7 wherein the attention mechanisms include attention gates.
9. The system of claim 7 wherein the decoder includes the attention mechanisms.
10. The system of claim 9 wherein the encoder does not include any attention mechanisms.
11. The system of claim 9 wherein the decoder includes decoder layers and the attention mechanisms are interleaved with the decoder layers.
12. The system of claim 7 further comprising: a first reconstruction module configured to reconstruct the second image using the attention map to produce a reconstructed second image; a second reconstruction module configured to reconstruct the third image using the attention map to produce a reconstructed third image; and a training module configured to, based on at least one of the reconstructed second image and the reconstructed third image, selectively adjust at least one parameter of at least one of depth module, the pose module, and the motion module.
13. The system of claim 12 wherein the training module is configured to selectively adjust the at least one parameter based on the reconstructed second image, the reconstructed third image, the second image, and the third image.
14. The system of claim 13 wherein the training module is configured to selectively adjust the at least one parameter based on: a first difference between the reconstructed second image and the second image; and a second difference between the reconstructed third image and the third image.
15. The system of claim 12 wherein the training module is configured to jointly train the depth module, the pose module, and the motion module.
16. The system of claim 12 wherein: the first reconstruction module is configured to reconstruct the second image using an image warping algorithm and the attention map; and the second reconstruction module is configured to reconstruct the third image using the image warping algorithm and the attention map.
17. The system of claim 16 wherein the image warping algorithm includes an inverse image warping algorithm.
18. The system of claim 1 wherein the pose module is configured to generate the first, second, and third poses using a PoseNet algorithm.
19. The system of claim 1 wherein the depth module includes a DispNet encoder-decoder network.
20. A method, comprising: receiving a first image from a first time from a camera; based on the first image, generating a depth map including depths between the camera and objects in the first image; generating a first pose of the camera based on the first image; generating a second pose of the camera for a second time based on a second image received from the camera before the first image; and generating a third pose of the camera for a third time based on a third image received from the camera after the first image; determining a first motion of the camera between the second time and the first time based on the first pose and the second pose; and determining a second motion of the camera between the second time and the third time based on the second pose and the third pose.
21. A system, comprising: one or more processors; and memory including code that, when executed by the one or more processors, perform functions including: receiving a first image from a first time from a camera; based on the first image, generating a depth map including depths between the camera and objects in the first image; generating a first pose of the camera based on the first image; generating a second pose of the camera for a second time based on a second image received from the camera before the first image; and generating a third pose of the camera for a third time based on a third image received from the camera after the first image; determining a first motion of the camera between the second time and the first time based on the first pose and the second pose; and determining a second motion of the camera between the second time and the third time based on the second pose and the third pose.
22. A system, comprising: a first means for: receiving a first image from a first time from a camera; and based on the first image, generating a depth map including depths between the camera and objects in the first image; a second means for: generating a first pose of the camera based on the first image; generating a second pose of the camera for a second time based on a second image received from the camera before the first image; and generating a third pose of the camera for a third time based on a third image received from the camera after the first image; and a third means for: determining a first motion of the camera between the second time and the first time based on the first pose and the second pose; and determining a second motion of the camera between the second time and the third time based on the second pose and the third pose.
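Claims 12 to 17 describe reconstructing neighbouring images with an inverse image warping algorithm and an attention map, then adjusting parameters based on the difference between each reconstruction and the real image. The claims do not give formulas, so the following is only a sketch under stated assumptions: a pinhole camera with intrinsics `K = (fx, fy, cx, cy)`, strictly positive depths, nearest-neighbour sampling, and an attention map used as per-pixel weights on an L1 difference. All function names are hypothetical, and the weighting scheme is an illustrative guess at how an attention map could enter the objective, not the patent's stated method.

```python
def project_pixel(u, v, depth, K, T):
    """Map pixel (u, v) of the view being rebuilt into the sampled image's coordinates.

    K = (fx, fy, cx, cy) are assumed pinhole intrinsics; T is a 4x4 rigid
    transform (nested lists) from the rebuilt view's camera frame to the
    sampled view's camera frame. Depth is assumed to be strictly positive.
    """
    fx, fy, cx, cy = K
    # Back-project the pixel to a 3D point in the rebuilt view's camera frame.
    p = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth, 1.0]
    # Move the point into the sampled view's camera frame.
    x, y, z = (sum(T[i][k] * p[k] for k in range(4)) for i in range(3))
    # Project the point onto the sampled view's image plane.
    return fx * x / z + cx, fy * y / z + cy


def inverse_warp(source, depth_map, K, T):
    """Rebuild a view by sampling `source` at the projected location of each pixel."""
    h, w = len(source), len(source[0])
    out = [[0.0] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            us, vs = project_pixel(u, v, depth_map[v][u], K, T)
            ui, vi = round(us), round(vs)
            if 0 <= ui < w and 0 <= vi < h:  # pixels that fall outside stay 0.0
                out[v][u] = source[vi][ui]
    return out


def weighted_l1(reconstructed, target, attention):
    """Attention-weighted mean absolute difference between two images."""
    num = den = 0.0
    for r_row, t_row, a_row in zip(reconstructed, target, attention):
        for r, t, a in zip(r_row, t_row, a_row):
            num += a * abs(r - t)
            den += a
    return num / den if den else 0.0


# Hypothetical 1x3 grayscale example: a sideways shift of one pixel at depth 2.
K = (1.0, 1.0, 0.0, 0.0)
T = [[1.0, 0.0, 0.0, 2.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
rebuilt = inverse_warp([[1.0, 2.0, 3.0]], [[2.0, 2.0, 2.0]], K, T)
loss = weighted_l1(rebuilt, [[2.0, 3.0, 5.0]], [[1.0, 1.0, 0.0]])
print(rebuilt, loss)
```

In the usage lines, the last pixel projects outside the sampled image, and giving it zero attention weight keeps that unobservable region from contributing to the difference.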

Assignees

Naver Corp, Naver Labs Corp

Inventors

Classifications

G06N3/045: Combinations of networks
G06N3/0455: Auto-encoder networks; Encoder-decoder networks
G06N3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
G06N3/0464: Convolutional networks [CNN, ConvNet]
G06T7/50: Depth or shape recovery

Patent family

Related publications grouped by family.

View patent family 79173564

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022011778A1 cover?: A system includes: a depth module including an encoder and a decoder and configured to: receive a first image from a first time from a camera; and based on the first image, generate a depth map including depths between the camera and objects in the first image; a pose module configured to: generate a first pose of the camera based on the first image; generate a second pose of the camera for a s…
Who is the assignee on this patent?: Naver Corp, Naver Labs Corp
What technology area does this patent fall under?: Primary CPC classification G06T7/70. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 13, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Depth estimation based on ego-motion estimation and residual flow estimation

Unsupervised depth prediction neural networks

Method for calibrating a multi-sensor system using an artificial neural network

Unsupervised learning of image depth and ego-motion prediction neural networks

Method and apparatus with ego motion information estimation
