Learning rigidity of dynamic scenes for three-dimensional scene flow estimation

US11508076B2 · US · B2

Patent metadata
Field                 Value
Publication number    US-11508076-B2
Application number    US-202117156406-A
Country               US
Kind code             B2
Filing date           Jan 22, 2021
Priority date         Aug 16, 2017
Publication date      Nov 22, 2022
Grant date            Nov 22, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
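The separation described in the abstract can be illustrated with a small Python sketch. It is not taken from the patent: the function names, the convention that R and t map static points from the first camera frame to the second, and the flat list layout of the dynamic mask are all assumptions made for this example.

```python
def transform(R, t, p):
    """Apply a rigid transform q = R p + t to a 3D point (hypothetical helper)."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def scene_flow_3d(p1, p2, R, t):
    """3D motion of one point with the camera motion removed.

    p1: the point in the first camera frame.
    p2: the same point as observed in the second camera frame.
    R, t: assumed camera pose change, so that a static point obeys p2 = R p1 + t.
    The result is expressed in the second camera frame.
    """
    expected = transform(R, t, p1)  # where the point would be if it were static
    return tuple(p2[i] - expected[i] for i in range(3))


def masked_scene_flow(points1, points2, dynamic_mask, R, t):
    """Keep the residual motion only where the mask marks a point as dynamic."""
    zero = (0.0, 0.0, 0.0)
    return [
        scene_flow_3d(a, b, R, t) if is_dynamic else zero
        for a, b, is_dynamic in zip(points1, points2, dynamic_mask)
    ]
```

In this sketch a static point yields zero residual motion because its displacement is fully explained by the pose change, while a point flagged as dynamic keeps whatever motion the pose change does not account for.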

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving color data for a first image and a second image corresponding to a scene in three-dimensional (3D) space, wherein the first image is captured from a first viewpoint and the second image is captured from a second viewpoint; and processing the color data by layers of a neural network model to generate segmentation data comprising: viewpoint pose motion corresponding to a change in pose between the first viewpoint and the second viewpoint; and a mask indicating a portion of the second image where a first object changes position or shape relative to a position or shape of the first object in the first image, wherein the change results from a combination of the viewpoint pose motion and motion within the scene.

2. The computer-implemented method of claim 1, wherein the viewpoint pose motion includes a rotation and translation.

3. The computer-implemented method of claim 1, wherein the first image is captured at a first time and the second image is captured at a second time that is after the first time.

4. The computer-implemented method of claim 1, further comprising refining the viewpoint pose motion based on two-dimensional optical flow data for the first image and the second image.

5. The computer-implemented method of claim 1, further comprising refining the segmentation data based on two-dimensional optical flow data for a sequence of images that includes the first image and the second image.

6. The computer-implemented method of claim 1, further comprising: receiving depth data for the first image and the second image; and processing the depth data with the color data to generate the segmentation data.

7. The computer-implemented method of claim 1, further comprising: processing the first image and the second image to extract depth data; and processing the depth data with the color data to generate the segmentation data.

8. The computer-implemented method of claim 1, further comprising processing the first image and the viewpoint pose motion to generate two-dimensional (2D) viewpoint motion flow data.

9. The computer-implemented method of claim 8, further comprising subtracting the 2D viewpoint motion flow data from 2D optical flow data associated with at least one of the first image and the second image to produce 2D scene flow data.

10. The computer-implemented method of claim 8, wherein the 2D viewpoint motion flow data is associated with a camera.

11. The computer-implemented method of claim 1, wherein the neural network model is included within a machine, robot, or autonomous vehicle.

12. The computer-implemented method of claim 1, wherein the neural network model is included within a system configured to train, test, or certify a machine, robot, or autonomous vehicle.

13. A system, comprising: a processor configured to: receive color data for a first image and a second image corresponding to a scene in three-dimensional (3D) space, wherein the first image is captured from a first viewpoint and the second image is captured from a second viewpoint; and process the color data to generate segmentation data comprising: viewpoint pose motion corresponding to a change in pose between the first viewpoint and the second viewpoint; and a mask indicating a portion of the second image where a first object changes position or shape relative to a position or shape of the first object in the first image, wherein the change results from a combination of the viewpoint pose motion and motion within the scene.

14. The system of claim 13, wherein the viewpoint pose motion includes a rotation and translation.

15. The system of claim 13, wherein the first image is captured at a first time and the second image is captured at a second time that is after the first time.

16. The system of claim 13, wherein the processor is further configured to refine the viewpoint pose motion based on two-dimensional optical flow data for the first image and the second image.

17. The system of claim 13, wherein the processor is further configured to refine the segmentation data based on two-dimensional optical flow data for a sequence of images that includes the first image and the second image.

18. The system of claim 13, wherein the processor is further configured to process the first image and the viewpoint pose motion to generate two-dimensional (2D) viewpoint motion flow data.

19. The system of claim 18, wherein the processor is further configured to subtract the 2D viewpoint motion flow data from 2D optical flow data associated with the first image and the second image to produce 2D scene flow data.

20. The system of claim 18, wherein the 2D viewpoint motion flow data is associated with a camera.

21. A non-transitory, computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to: receive color data for a first image and a second image corresponding to a scene in three-dimensional (3D) space, wherein the first image is captured from a first viewpoint and the second image is captured from a second viewpoint; and process the color data by layers of a neural network model to generate segmentation data comprising: viewpoint pose motion corresponding to a change in pose between the first viewpoint and the second viewpoint; and a mask indicating a portion of the second image where a first object changes position or shape relative to a position or shape of the first object in the first image, wherein the change results from a combination of the viewpoint pose motion and motion within the scene.
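The flow subtraction recited in claims 8 and 9 (and 18 and 19) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes a pinhole camera with hypothetical intrinsics fx, fy, cx, cy, a per-pixel depth map for the first image, and a pose change (R, t) that maps static points from the first camera frame to the second. The function names are invented for this example.

```python
def viewpoint_motion_flow(depth, R, t, fx, fy, cx, cy):
    """2D flow that camera motion alone would induce at each pixel.

    depth: rows of per-pixel depth values for the first image.
    Returns rows of (du, dv) pixel displacements.
    """
    flow = []
    for v, row in enumerate(depth):
        flow_row = []
        for u, z in enumerate(row):
            # Back-project pixel (u, v) to a 3D point in the first camera frame.
            p = ((u - cx) * z / fx, (v - cy) * z / fy, z)
            # Move the point into the second camera frame: q = R p + t.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            # Re-project into the second image and take the displacement.
            u2 = fx * q[0] / q[2] + cx
            v2 = fy * q[1] / q[2] + cy
            flow_row.append((u2 - u, v2 - v))
        flow.append(flow_row)
    return flow


def subtract_flow(optical_flow, viewpoint_flow):
    """Per-pixel difference: optical flow minus viewpoint motion flow."""
    return [
        [(o[0] - e[0], o[1] - e[1]) for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(optical_flow, viewpoint_flow)
    ]
```

Under these assumptions, pixels on static surfaces end up with a residual close to zero, while pixels on independently moving objects keep a non-zero residual, which corresponds to the 2D scene flow data named in claim 9.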

Assignees

Nvidia Corp

Classifications

  • Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59)

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

  • Three-dimensional [3D] objects

  • Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries

  • Using specific electronic processors

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11508076B2 cover?
A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field repre…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06T7/254. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 22, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).