What technology area does this patent fall under?

Primary CPC classification G06T7/254. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 22, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Learning rigidity of dynamic scenes for three-dimensional scene flow estimation

US11508076B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11508076-B2
Application number	US-202117156406-A
Country	US
Kind code	B2
Filing date	Jan 22, 2021
Priority date	Aug 16, 2017
Publication date	Nov 22, 2022
Grant date	Nov 22, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
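The separation described in the abstract can be illustrated with a small sketch. It is not the patented network: it assumes the per-pixel 3D points, the camera pose (R, t), and the dynamic/static mask are already available, and it only shows how those pieces would combine into a 3D motion field. The function names, the pose convention (X' = R X + t maps first-frame coordinates into the second camera frame), and the choice to express the flow in the second camera frame are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def apply_pose(rotation: Sequence[Sequence[float]],
               translation: Sequence[float],
               point: Sequence[float]) -> Vec3:
    """Map a 3D point from the first camera frame into the second: X' = R X + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def masked_scene_flow(points_t0: Sequence[Vec3],
                      points_t1: Sequence[Vec3],
                      rotation: Sequence[Sequence[float]],
                      translation: Sequence[float],
                      dynamic_mask: Sequence[bool]) -> List[Vec3]:
    """Return per-point 3D motion that is not explained by the camera pose.

    Points flagged static get zero flow. For points flagged dynamic, the flow
    is the observed position in the second frame minus the position the point
    would have had if only the camera had moved.
    """
    flow: List[Vec3] = []
    for p0, p1, is_dynamic in zip(points_t0, points_t1, dynamic_mask):
        if not is_dynamic:
            flow.append((0.0, 0.0, 0.0))
            continue
        expected = apply_pose(rotation, translation, p0)
        flow.append(tuple(obs - exp for obs, exp in zip(p1, expected)))
    return flow
```

With an identity rotation and a translation of one unit along x, a static point keeps zero flow, while a dynamic point observed half a unit beyond its camera-induced position gets a residual of (0.5, 0.0, 0.0).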

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: receiving color data for a first image and a second image corresponding to a scene in three-dimensional (3D) space, wherein the first image is captured from a first viewpoint and the second image is captured from a second viewpoint; and processing the color data by layers of a neural network model to generate segmentation data comprising: viewpoint pose motion corresponding to a change in pose between the first viewpoint and the second viewpoint; and a mask indicating a portion of the second image where a first object changes position or shape relative to a position or shape of the first object in the first image, wherein the change results from a combination of the viewpoint pose motion and motion within the scene.
2. The computer-implemented method of claim 1, wherein the viewpoint pose motion includes a rotation and translation.
3. The computer-implemented method of claim 1, wherein the first image is captured at a first time and the second image is captured at a second time that is after the first time.
4. The computer-implemented method of claim 1, further comprising refining the viewpoint pose motion based on two-dimensional optical flow data for the first image and the second image.
5. The computer-implemented method of claim 1, further comprising refining the segmentation data based on two-dimensional optical flow data for a sequence of images that includes the first image and the second image.
6. The computer-implemented method of claim 1, further comprising: receiving depth data for the first image and the second image; and processing the depth data with the color data to generate the segmentation data.
7. The computer-implemented method of claim 1, further comprising: processing the first image and the second image to extract depth data; and processing the depth data with the color data to generate the segmentation data.
8. The computer-implemented method of claim 1, further comprising processing the first image and the viewpoint pose motion to generate two-dimensional (2D) viewpoint motion flow data.
9. The computer-implemented method of claim 8, further comprising subtracting the 2D viewpoint motion flow data from 2D optical flow data associated with at least one of the first image and the second image to produce 2D scene flow data.
10. The computer-implemented method of claim 8, wherein the 2D viewpoint motion flow data is associated with a camera.
11. The computer-implemented method of claim 1, wherein the neural network model is included within a machine, robot, or autonomous vehicle.
12. The computer-implemented method of claim 1, wherein the neural network model is included within a system configured to train, test, or certify a machine, robot, or autonomous vehicle.
13. A system, comprising: a processor configured to: receive color data for a first image and a second image corresponding to a scene in three-dimensional (3D) space, wherein the first image is captured from a first viewpoint and the second image is captured from a second viewpoint; and process the color data to generate segmentation data comprising: viewpoint pose motion corresponding to a change in pose between the first viewpoint and the second viewpoint; and a mask indicating a portion of the second image where a first object changes position or shape relative to a position or shape of the first object in the first image, wherein the change results from a combination of the viewpoint pose motion and motion within the scene.
14. The system of claim 13, wherein the viewpoint pose motion includes a rotation and translation.
15. The system of claim 13, wherein the first image is captured at a first time and the second image is captured at a second time that is after the first time.
16. The system of claim 13, wherein the processor is further configured to refine the viewpoint pose motion based on two-dimensional optical flow data for the first image and the second image.
17. The system of claim 13, wherein the processor is further configured to refine the segmentation data based on two-dimensional optical flow data for a sequence of images that includes the first image and the second image.
18. The system of claim 13, wherein the processor is further configured to process the first image and the viewpoint pose motion to generate two-dimensional (2D) viewpoint motion flow data.
19. The system of claim 18, wherein the processor is further configured to subtract the 2D viewpoint motion flow data from 2D optical flow data associated with the first image and the second image to produce 2D scene flow data.
20. The system of claim 18, wherein the 2D viewpoint motion flow data is associated with a camera.
21. A non-transitory, computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to: receive color data for a first image and a second image corresponding to a scene in three-dimensional (3D) space, wherein the first image is captured from a first viewpoint and the second image is captured from a second viewpoint; and process the color data by layers of a neural network model to generate segmentation data comprising: viewpoint pose motion corresponding to a change in pose between the first viewpoint and the second viewpoint; and a mask indicating a portion of the second image where a first object changes position or shape relative to a position or shape of the first object in the first image, wherein the change results from a combination of the viewpoint pose motion and motion within the scene.

Assignees

Nvidia Corp

Inventors

Classifications

G06V20/10
Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title
G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06V20/64
Three-dimensional [3D] objects · CPC title
G06V10/772
Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries · CPC title
G06V10/955
using specific electronic processors · CPC title

Patent family

Related publications grouped by family.

View patent family 65361107

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11508076B2 cover?: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field repre…
Who is the assignee on this patent?: Nvidia Corp
What technology area does this patent fall under?: Primary CPC classification G06T7/254. Mapped technology areas include Physics.
When was this patent published?: Publication date Nov 22, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).