Depth determination for images captured with a moving camera and representing moving features

US11663733B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11663733-B2
Application number	US-202217656165-A
Country	US
Kind code	B2
Filing date	Mar 23, 2022
Priority date	Sep 20, 2019
Publication date	May 30, 2023
Grant date	May 30, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes obtaining a reference image and a target image each representing an environment containing moving features and static features. The method also includes determining an object mask configured to mask out the moving features and preserves the static features in the target image. The method additionally includes determining, based on motion parallax between the reference image and the target image, a static depth image representing depth values of the static features in the target image. The method further includes generating, by way of a machine learning model, a dynamic depth image representing depth values of both the static features and the moving features in the target image. The model is trained to generate the dynamic depth image by determining depth values of at least the moving features based on the target image, the object mask, and the static depth image.
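
As a rough illustration of the masking step in the abstract, the sketch below drops depth values that fall on moving features and keeps those on static features. It assumes a mask convention of 1 for moving pixels and 0 for static pixels, and it uses small hand-written depth values; the function name `apply_object_mask` and all data are hypothetical and do not come from the patent.

```python
from typing import List, Optional

Depth = List[List[Optional[float]]]
Mask = List[List[int]]


def apply_object_mask(parallax_depth: Depth, object_mask: Mask) -> Depth:
    """Keep depth where the mask is 0 (static); drop it where the mask is 1 (moving).

    The 1 = moving / 0 = static convention is an assumption made for this sketch.
    """
    return [
        [None if m == 1 else d for d, m in zip(depth_row, mask_row)]
        for depth_row, mask_row in zip(parallax_depth, object_mask)
    ]


# Hypothetical 2x3 depth image estimated from motion parallax. Values under
# the moving feature are unreliable because the feature moved between frames.
parallax_depth = [[2.0, 2.1, 9.9],
                  [2.0, 8.7, 2.2]]
object_mask = [[0, 0, 1],
               [0, 1, 0]]

static_depth = apply_object_mask(parallax_depth, object_mask)
print(static_depth)
```

In the method described by the abstract, the target image, the object mask, and a static depth image like this one are the inputs from which the machine learning model fills in depth for the masked regions.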

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining a reference image and a target image each representing an environment containing a moving feature and a static feature, wherein the reference image has been captured by a camera at a first time and the target image has been captured by the camera at a second time different from the first time; determining an object mask configured to (i) mask out the moving feature in the target image and (ii) preserve the static feature in the target image; determining, based on one or more of the reference image or the target image, a static depth image that represents depth values of the static features in the target image; and generating, using a machine learning (ML) model and based on (i) the static depth image, (ii) the object mask, and (iii) one or more of the target image or the reference image, a dynamic depth image that represents depth values of both the static features and the moving features in the target image.

2. The computer-implemented method of claim 1, wherein determining the static depth image comprises: processing the one or more of the reference image or the target image by at least one of: (i) a multi view stereo (MVS) algorithm, (ii) a structure from motion (SfM) algorithm, or (iii) a motion parallax algorithm.

3. The computer-implemented method of claim 1, wherein the object mask comprises a binary image that assigns a first value to a region of the target image that contains the moving feature and a second value to a region of the target image that contains the static feature.

4. The computer-implemented method of claim 1, wherein the ML model has been trained using a training process comprising: obtaining a video captured by a camera moving through a training environment that contains (i) a static training feature and (ii) a movable training feature that is fixed in a respective pose while being filmed by the camera; determining a supervised depth image of a scene represented by the video, wherein the supervised depth image is determined based on (i) a training reference image from the video that represent the scene from a first point of view and (ii) a training target image from the video that represent the scene from a second point of view different from the first point of view; and determining one or more parameters of the ML model based on the supervised depth image.

5. The computer-implemented method of claim 4, wherein determining the one or more parameters of the ML model comprises: determining a training object mask configured to (i) mask out the movable training feature in the training target image and (ii) preserve the static training feature in the training target image; determining, based on at least one of the training reference image and the training target image, a training static depth image that represents depth values of the static training feature in the training target image; and generating, using the ML model and based on (i) the training static depth image, (ii) the training object mask, and (iii) one or more of the training target image or the training reference image, a training dynamic depth image that represents depth values of both the static training feature and the movable training feature in the training target image; determining a difference between the training dynamic depth image and the supervised depth image; and adjusting the one or more parameters of the ML model based on the difference.

6. The computer-implemented method of claim 4, wherein the movable training feature comprise a first human, wherein the moving feature comprises a second human, and wherein the object mask comprises a human-shaped region.

7. The computer-implemented method of claim 1, wherein the camera is moving through the environment while capturing the reference image and the target image, wherein the static feature maintains a fixed pose within the environment between the first time and the second time, and wherein a pose of the moving feature within the environment changes between the first time and the second time.

8. The computer-implemented method of claim 1, wherein determining the object mask comprises processing the target image by way of an object instance segmentation algorithm configured to identify the moving feature within the target image and generate a mask region representing the moving feature.

9. The computer-implemented method of claim 1, wherein determining the static depth image comprises: determining an optical flow image based on the reference image and the target image; determining a camera pose associated with the target image; and determining a motion parallax depth image that represents depth values of both the static feature and the moving feature in the target image based on the optical flow image and the camera pose.

10. The computer-implemented method of claim 1, further comprising: determining a confidence map that corresponds to the static depth image and indicates, for each respective pixel within the static depth image, a confidence value associated with the depth value of the respective pixel, wherein the ML model is configured to generate the dynamic depth image further based on the confidence map.

11. The computer-implemented method of claim 10, further comprising: based on the confidence map and prior to providing the static depth image as input to the ML model, removing, from the static depth image, pixels associated with corresponding confidence values that are below a threshold confidence value.

12. The computer-implemented method of claim 10, wherein determining the confidence map comprises: determining a left-right consistency between (i) a forward optical flow field and (ii) a backward optical flow field, each determined based on the target image and the reference image; determining an extent to which the forward optical flow field complies with an epipolar constraint of the reference image and the target image; determining an extent of parallax between respective portions of the target image and the reference image; and determining the confidence map based on (i) the left-right consistency, (ii) the extent to which the forward optical flow field complies with the epipolar constraint, and (iii) the extent of parallax.

13. The computer-implemented method of claim 1, further comprising: applying a focus effect to a selected feature of the target image based on the dynamic depth image.

14. The computer-implemented method of claim 1, further comprising: inserting into the target image a visual representation of an object at a selected position within the environment; determining, based on the dynamic depth image and the selected position, an occlusion between the visual representation of the object and at least one feature of the target image; and rendering the target image to indicate the object, the at least one feature, and the occlusion therebetween.

15. The computer-implemented method of claim 1, wherein the reference image and the target image form part of a video, and wherein the method further comprises: removing from the target image a visual representation of the moving feature; and inpainting, based on other image frames within the video and the dynamic depth image, portions of the environment within the target image that, prior to removal of the moving feature, were occluded by the moving feature and have been exposed by removal of the moving feature.

16. The computer-implemented method of claim 1, wherein the reference image and the target image form part of a video, and wherein the method further com
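
Claims 10 to 12 describe a per-pixel confidence map and the removal of low-confidence pixels from the static depth image. The sketch below illustrates only the left-right consistency ingredient of claim 12 and the thresholding of claim 11; the epipolar and parallax terms are left out. The error formula, the exponential mapping to a confidence value, the threshold, and every name and data value here are assumptions chosen for illustration, not details taken from the patent.

```python
import math
from typing import Dict, Tuple

Pixel = Tuple[int, int]
Flow = Dict[Pixel, Tuple[float, float]]


def left_right_error(forward: Flow, backward: Flow, p: Pixel) -> float:
    """Distance by which following the forward flow, then the backward flow, misses p."""
    fx, fy = forward[p]
    q = (round(p[0] + fx), round(p[1] + fy))  # landing pixel in the other image
    if q not in backward:
        return math.inf  # flow leaves the image: treat as fully inconsistent
    bx, by = backward[q]
    return math.hypot(fx + bx, fy + by)


def confidence(forward: Flow, backward: Flow, p: Pixel, scale: float = 1.0) -> float:
    """Map the consistency error to (0, 1]; the exponential form is an assumption."""
    return math.exp(-left_right_error(forward, backward, p) / scale)


def filter_depth(depth: Dict[Pixel, float], conf: Dict[Pixel, float],
                 threshold: float) -> Dict[Pixel, float]:
    """Drop depth values whose confidence is below the threshold."""
    return {p: d for p, d in depth.items() if conf[p] >= threshold}


# Hypothetical flow fields: pixel (0, 0) round-trips exactly, pixel (1, 0) does not.
forward = {(0, 0): (1.0, 0.0), (1, 0): (1.0, 0.0)}
backward = {(1, 0): (-1.0, 0.0), (2, 0): (0.5, 0.0)}
static_depth = {(0, 0): 2.0, (1, 0): 3.5}

conf = {p: confidence(forward, backward, p) for p in static_depth}
filtered = filter_depth(static_depth, conf, threshold=0.5)
print(filtered)
```

A sparse dictionary keyed by pixel keeps the example short; a real implementation would more likely operate on dense arrays.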

Assignees

Google LLC

Inventors

Classifications

G06T7/73
using feature-based methods · CPC title
G06T7/246
using feature-based methods, e.g. the tracking of corners or segments · CPC title
G06T7/215
Motion-based segmentation · CPC title
G06T2207/30244
Camera pose · CPC title
G06T7/579 · Primary
from motion · CPC title

Patent family

Related publications grouped by family.

View patent family 74881264

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11663733B2 cover?: A method includes obtaining a reference image and a target image each representing an environment containing moving features and static features. The method also includes determining an object mask configured to mask out the moving features and preserves the static features in the target image. The method additionally includes determining, based on motion parallax between the reference image an…
Who is the assignee on this patent?: Google LLC
What technology area does this patent fall under?: Primary CPC classification G06T7/579. Mapped technology areas include Physics.
When was this patent published?: Publication date May 30, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).