Weak multi-view supervision for surface mapping estimation

US2022343601A1 · US · A1

Patent metadata
Publication number: US-2022343601-A1
Application number: US-202217659449-A
Country: US
Kind code: A1
Filing date: Apr 15, 2022
Priority date: Apr 21, 2021
Publication date: Oct 27, 2022
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One or more two-dimensional images of a three-dimensional object may be analyzed to estimate a three-dimensional mesh representing the object and a mapping of the two-dimensional images to the three-dimensional mesh. Initially, a correspondence may be determined between the images and a UV representation of a three-dimensional template mesh by training a neural network. Then, the three-dimensional template mesh may be deformed to determine the representation of the object. The process may involve a reprojection loss cycle in which points from the images are mapped onto the UV representation, then onto the three-dimensional template mesh, and then back onto the two-dimensional images.
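
The reprojection cycle in the abstract (image point, then UV location, then template surface point, then back to the image) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: a unit sphere stands in for the template mesh, the spherical parameterization stands in for its UV map, the camera is a simple pinhole model, and the names `uv_to_template`, `project`, and `reprojection_loss` are hypothetical. The UV predictions that a trained network would supply are passed in as a plain array, and occlusion is ignored.

```python
import numpy as np

def uv_to_template(uv):
    """Map UV coordinates in [0, 1]^2 onto a unit sphere (stand-in template surface)."""
    theta = 2.0 * np.pi * uv[:, 0]   # longitude from u
    phi = np.pi * uv[:, 1]           # polar angle from v
    return np.stack([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)], axis=1)

def project(points, R, t, focal):
    """Pinhole projection of 3D points into 2D image coordinates."""
    cam = points @ R.T + t           # world frame -> camera frame
    return focal * cam[:, :2] / cam[:, 2:3]

def reprojection_loss(pixels, predicted_uv, R, t, focal):
    """Mean squared 2D distance between input pixels and their round-trip reprojection."""
    surface = uv_to_template(predicted_uv)      # UV -> 3D template point
    reprojected = project(surface, R, t, focal)  # 3D -> image
    return float(np.mean(np.sum((reprojected - pixels) ** 2, axis=1)))

# Assumed camera: identity rotation, three units in front of the sphere.
R, t, focal = np.eye(3), np.array([0.0, 0.0, 3.0]), 500.0
true_uv = np.array([[0.1, 0.3], [0.4, 0.5], [0.7, 0.6]])
pixels = project(uv_to_template(true_uv), R, t, focal)

loss_exact = reprojection_loss(pixels, true_uv, R, t, focal)
loss_shifted = reprojection_loss(pixels, true_uv + np.array([0.05, 0.0]), R, t, focal)
```

In this toy setup a correct UV prediction closes the cycle with zero loss, while a shifted prediction lands away from the starting pixel, which is the signal a training procedure could minimize.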

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: determining via a processor a correspondence between one or more two-dimensional images of a three-dimensional object and a UV representation of a three-dimensional template mesh of the three-dimensional object by training a neural network, the three-dimensional template mesh including a plurality of points in three-dimensional space and a plurality of edges between the plurality of points; determining via the processor a deformation of the three-dimensional template mesh, the deformation displacing one or more of the plurality of points, wherein the deformation is determined so as to reduce reprojection consistency loss when mapping points from the two-dimensional images back onto the two-dimensional images through both the UV representation and the three-dimensional template mesh; and storing on a storage device a deformed three-dimensional template mesh.

2. The method recited in claim 1, wherein training the neural network comprises predicting, for a first location in a designated one of the two-dimensional images, a corresponding second location in the UV representation.

3. The method recited in claim 2, wherein training the neural network further comprises determining a third location in the three-dimensional template mesh by mapping the second location to the third location via the UV parameterization.

4. The method recited in claim 3, wherein training the neural network further comprises determining a fourth location in the designated two-dimensional image by projecting the third location onto a virtual camera pose associated with the designated two-dimensional image.

5. The method recited in claim 4, wherein training the neural network further comprises determining a reprojection consistency loss value representing a displacement in two-dimensional space between the first location and the fourth location.

6. The method recited in claim 5, wherein training the neural network further comprises updating the neural network based on the reprojection consistency loss value.

7. The method recited in claim 4, the method further comprising: determining the virtual camera pose by analyzing the two-dimensional image to identify a virtual camera position and virtual camera orientation for the two-dimensional image relative to the three-dimensional template mesh.

8. The method recited in claim 4, wherein the one or more two-dimensional images include at least the designated two-dimensional image and a proximate two-dimensional image, the proximate two-dimensional image being captured from a proximate virtual camera pose that is proximate to the virtual camera pose, wherein the reprojection consistency loss value depends in part on a proximate reprojection consistency loss value computed for a corresponding pixel in the proximate two-dimensional image.

9. The method recited in claim 1, wherein training the neural network comprises determining a visibility loss value representing occlusion of a designated portion of the three-dimensional object within a designated one of the two-dimensional images and update the neural network based on the visibility loss value.

10. The method recited in claim 1, the method further comprising: determining an object type corresponding to the three-dimensional object by analyzing one or more of the one or more two-dimensional images; and selecting the three-dimensional template mesh from a plurality of available three-dimensional template meshes, the three-dimensional template mesh corresponding with the object type.

11. The method recited in claim 10, wherein the object type is a vehicle, and wherein the three-dimensional template mesh provides a generic representation of vehicles.

12. The method recited in claim 10, wherein the object type is a vehicle sub-type, and wherein the three-dimensional template mesh provides a generic representation of the vehicle sub-type.

13. A computing system comprising a processor and a storage device, the computing system configured to perform a method comprising: determining via the processor a correspondence between one or more two-dimensional images of a three-dimensional object and a UV representation of a three-dimensional template mesh of the three-dimensional object by training a neural network, the three-dimensional template mesh including a plurality of points in three-dimensional space and a plurality of edges between the plurality of points; determining via the processor a deformation of the three-dimensional template mesh, the deformation displacing one or more of the plurality of points, wherein the deformation is determined so as to reduce reprojection consistency loss when mapping points from the two-dimensional images back onto the two-dimensional images through both the UV representation and the three-dimensional template mesh; and storing on the storage device a deformed three-dimensional template mesh.

14. The computing system recited in claim 13, wherein training the neural network comprises predicting, for a first location in a designated one of the two-dimensional images, a corresponding second location in the UV representation, wherein training the neural network further comprises determining a third location in the three-dimensional template mesh by mapping the second location to the third location via the UV parameterization, wherein training the neural network further comprises determining a fourth location in the designated two-dimensional image by projecting the third location onto a virtual camera pose associated with the designated two-dimensional image, wherein training the neural network further comprises determining a reprojection consistency loss value representing a displacement in two-dimensional space between the first location and the fourth location, wherein training the neural network further comprises updating the neural network based on the reprojection consistency loss value.

15. The computing system recited in claim 14, the method further comprising: determining the virtual camera pose by analyzing the two-dimensional image to identify a virtual camera position and virtual camera orientation for the two-dimensional image relative to the three-dimensional template mesh.

16. The computing system recited in claim 14, wherein the one or more two-dimensional images include at least the designated two-dimensional image and a proximate two-dimensional image, the proximate two-dimensional image being captured from a proximate virtual camera pose that is proximate to the virtual camera pose, wherein the reprojection consistency loss value depends in part on a proximate reprojection consistency loss value computed for a corresponding pixel in the proximate two-dimensional image.

17. The computing system recited in claim 13, wherein training the neural network comprises determining a visibility loss value representing occlusion of a designated portion of the three-dimensional object within a designated one of the two-dimensional images and update the neural network based on the visibility loss value.

18. The computing system recited in claim 13, the method further comprising: determining an object type corresponding to the three-dimensional object by analyzing one or more of the one or more two-dimensional images; and selecting the three-dimensional template mesh from a plurality of available three-dimensional template meshes, the three-dimensional template mesh corresponding with the object type.

19. One or more non-t
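
Claims 1 and 13 recite a deformation that displaces template points so as to reduce reprojection consistency loss. The claims do not specify an optimizer, so the following is only a rough sketch under stated assumptions: a scaled orthographic camera, fixed point-to-pixel correspondences, plain gradient descent, and a small L2 penalty on the offsets. The names `project_ortho`, `mean_reprojection_error`, and `fit_deformation` are hypothetical.

```python
import numpy as np

def project_ortho(points, R, t, scale):
    """Scaled orthographic projection (assumed camera model): rotate, translate, drop depth."""
    cam = points @ R.T + t
    return scale * cam[:, :2]

def mean_reprojection_error(points, observed_2d, R, t, scale):
    """Mean squared 2D distance between projected points and observed pixels."""
    residual = project_ortho(points, R, t, scale) - observed_2d
    return float(np.mean(np.sum(residual ** 2, axis=1)))

def fit_deformation(template, observed_2d, R, t, scale, steps=100, lr=0.5, reg=1e-3):
    """Gradient descent on per-vertex offsets for
    0.5 * sum ||project(template + offsets) - observed||^2 + 0.5 * reg * ||offsets||^2."""
    offsets = np.zeros_like(template)
    for _ in range(steps):
        residual = project_ortho(template + offsets, R, t, scale) - observed_2d
        # Analytic gradient: back-project each 2D residual through the first two camera rows.
        grad = scale * residual @ R[:2, :] + reg * offsets
        offsets -= lr * grad
    return offsets

# Toy template: four vertices, viewed by a camera rotated 30 degrees about the y axis.
template = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
angle = np.deg2rad(30.0)
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t, scale = np.zeros(3), 1.0

# Synthetic observations generated from a known (made-up) deformation.
true_offsets = 0.1 * np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
observed = project_ortho(template + true_offsets, R, t, scale)

loss_before = mean_reprojection_error(template, observed, R, t, scale)
offsets = fit_deformation(template, observed, R, t, scale)
loss_after = mean_reprojection_error(template + offsets, observed, R, t, scale)
```

With one orthographic view the offsets never move along the viewing direction, because that direction does not affect the projection. Additional views from other camera poses would be needed to constrain depth.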

Assignees

Fyusion Inc

Inventors

Classifications

  • G06V20/647 · Primary

    by matching two-dimensional images to three-dimensional objects · CPC title

  • using neural networks · CPC title

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title

  • Vehicle exterior or interior · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Fyusion Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/647. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 27, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).