3D object reconstruction using photometric mesh representation

US11189094B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11189094-B2
Application number: US-202016985402-A
Country: US
Kind code: B2
Filing date: Aug 5, 2020
Priority date: May 24, 2019
Publication date: Nov 30, 2021
Grant date: Nov 30, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for 3D object reconstruction using photometric mesh representations. A decoder is pretrained to transform points sampled from 2D patches of representative objects into 3D polygonal meshes. An image frame of the object is fed into an encoder to get an initial latent code vector. For each frame and camera pair from the sequence, a polygonal mesh is rendered at the given viewpoints. The mesh is optimized by creating a virtual viewpoint, rasterized to obtain a depth map. The 3D mesh projections are aligned by projecting the coordinates corresponding to the polygonal face vertices of the rasterized mesh to both selected viewpoints. The photometric error is determined from RGB pixel intensities sampled from both frames. Gradients from the photometric error are backpropagated into the vertices of the assigned polygonal indices by relating the barycentric coordinates of each image to update the latent code vector.
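The last sentence of the abstract relies on barycentric coordinates to carry gradients from a sampled surface point back to the vertices of the face that contains it. The sketch below is a minimal illustration of that chain rule for a single triangle, not the patented implementation. The function names, the example triangle, and the gradient values are hypothetical, and the further step of propagating through the decoder into the latent code vector is omitted.

```python
# Minimal sketch (assumption: one triangular face, plain Python tuples).
# A point on the face is p = b0*v0 + b1*v1 + b2*v2, so by the chain rule
# dL/dv_i = b_i * dL/dp for each vertex i.

def interpolate(vertices, bary):
    """Return the 3D point with barycentric weights `bary` on the triangle."""
    return tuple(sum(b * v[k] for b, v in zip(bary, vertices)) for k in range(3))

def vertex_gradients(grad_point, bary):
    """Distribute the gradient at the surface point to the three vertices."""
    return [tuple(b * g for g in grad_point) for b in bary]

# Hypothetical example values, chosen only for illustration.
triangle = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
bary = (0.5, 0.25, 0.25)          # weights sum to 1
point = interpolate(triangle, bary)

grad_point = (0.0, 0.0, 2.0)      # assumed dL/dp from a photometric loss
grads = vertex_gradients(grad_point, bary)

print(point)   # interpolated surface point
print(grads)   # one gradient per vertex
```

Because the barycentric weights sum to one, the per-vertex gradients add up to the gradient at the sampled point, so vertices closer to the sample receive a larger share of the update.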

First claim

Opening claim text (preview).

What is claimed is:

1. In a digital medium environment for editing digital images, a computer-implemented method of three-dimensional object reconstruction, the method comprising: transforming a set of two-dimensional (2D) data points representing one or more shape priors into a first set of one or more latent feature vectors representing a shape of a first object; generating a reconstructed representation of the first object in three dimensions based on the first set of latent feature vectors and a second set of latent feature vectors representing a pre-defined shape of a second object; and causing the reconstructed representation of the first object to be output to an output device.

2. The method of claim 1, further comprising generating a polygonal mesh representing the shape of the first object using an object mesh generation neural network trained to transform the set of 2D data points representing the first object into the polygonal mesh.

3. The method of claim 2, wherein generating the polygonal mesh includes: selecting an image of the first object from an image sequence; generating the first set of latent feature vectors by feeding the selected image of the first object into an encoder of the neural network; and rendering the polygonal mesh based on the first set of latent feature vectors.

4. The method of claim 2, further comprising optimizing the polygonal mesh over the second set of latent feature vectors using a photometric objective function.

5. The method of claim 4, wherein optimizing the polygonal mesh includes: selecting a pair of consecutive frames from an image sequence; creating a virtual viewpoint for the selected pair of consecutive frames; calculating photometric error as a difference between pixel intensities sampled from both frames of the pair of consecutive frames; and backpropagating a gradient of the photometric error into vertices of the polygonal mesh.

6. The method of claim 5, wherein the photometric objective function includes applying a parameterized warp function to pixels in at least two images of the image sequence.

7. The method of claim 5, wherein the virtual viewpoint accounts for multi-view geometry associated with first and second camera viewpoints, the method further comprising: rasterizing the polygonal mesh to obtain a depth map from the virtual viewpoint; aligning the rasterized polygonal mesh by projecting a set of three-dimensional (3D) points from the depth map to each of the first and second camera viewpoints; and sampling one or more pixel intensities from the aligned rasterized polygonal mesh.

8. A computer program product including one or more non-transitory machine-readable mediums having instructions encoded thereon that when executed by at least one processor causes a process to be carried out for 3D object reconstruction using photometric mesh representations, the process comprising: transforming a set of two-dimensional (2D) data points representing one or more shape priors into a first set of latent feature vectors representing a shape of a first object; generating a reconstructed representation of the first object in three dimensions based on the first set of latent feature vectors and a second set of latent feature vectors representing a pre-defined shape of a second object using a photometric objective function; and causing the reconstructed representation of the first object to be output to an output device.

9. The computer program product of claim 8, wherein the process further comprises generating a polygonal mesh representing the shape of the first object using an object mesh generation neural network trained to transform the set of two-dimensional (2D) data points representing the first object into the polygonal mesh.

10. The computer program product of claim 9, wherein generating the polygonal mesh includes: selecting an image of the first object from an image sequence; generating the first set of latent feature vectors by feeding the selected image of the first object into an encoder of the neural network; and rendering the polygonal mesh based on the first set of latent feature vectors.

11. The computer program product of claim 9, wherein the process further comprises optimizing the polygonal mesh over the second set of latent feature vectors using the photometric objective function.

12. The computer program product of claim 11, wherein optimizing the polygonal mesh includes: selecting a pair of consecutive frames from an image sequence; creating a virtual viewpoint for the selected pair of consecutive frames, the virtual viewpoint accounting for multi-view geometry associated with first and second camera viewpoints; rasterizing the polygonal mesh to obtain a depth map from the virtual viewpoint; aligning the rasterized polygonal mesh by projecting a set of three-dimensional (3D) points from the depth map to each of the first and second camera viewpoints; sampling one or more pixel intensities from the aligned rasterized polygonal mesh; and backpropagating gradients from a difference between the one or more sampled pixel intensities into vertices of the polygonal mesh.

13. The computer program product of claim 12, wherein the photometric objective function includes applying a parameterized warp function to pixels in at least two images of the image sequence.

14. The computer program product of claim 12, wherein backpropagating gradients from a difference between the one or more sampled pixel intensities into vertices of the polygonal mesh includes backpropagating a gradient of the photometric error into vertices of the polygonal mesh, and wherein the process further comprises calculating the photometric error as the difference between pixel intensities sampled from both frames of the pair of consecutive frames.

15. A system for 3D object reconstruction using photometric mesh representations, the system comprising: a means for generating a polygonal mesh representing a shape of a first object; a means for optimizing the polygonal mesh over a set of latent feature vectors using a photometric objective function to produce a reconstructed representation of the first object in three dimensions, the set of latent feature vectors representing a pre-defined shape of a second object; and a means for causing the reconstructed representation of the first object to be at least one of displayed or printed.

16. The system of claim 15, wherein the means for generating a polygonal mesh is configured to generate the polygonal mesh using an object mesh generation neural network trained to transform a set of two-dimensional (2D) data points representing the first object into the polygonal mesh, the set of 2D data points representing color pixels in at least two images of the first object, the at least two images having different camera poses.

17. The system of claim 16, further comprising a means for training the object mesh generation neural network to transform the set of 2D data points into the polygonal mesh using 3D computer aided drawing (CAD) model renderings.

18. The system of claim 15, wherein the photometric objective function represents, at least in part, a photometric loss contributed by pixels in at least one face of the polygonal mesh.

19. The system of claim 15, wherein the photometric objective function is: ℒ phot ( j )
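Claims 5, 7 and 12 recite projecting 3D points from a depth map into two camera viewpoints, sampling pixel intensities at the projected locations, and taking their difference as the photometric error. The sketch below illustrates that idea only, under loud simplifying assumptions that do not come from the patent: a pinhole camera with translation but no rotation, grayscale instead of RGB intensities, bilinear sampling, and a mean absolute difference as the error. All names and numbers are hypothetical, and the code makes no attempt to reproduce the objective function referenced in claim 19.

```python
from math import floor

def project(point, cam):
    """Pinhole projection (assumption: camera is translated, not rotated)."""
    x, y, z = (p - t for p, t in zip(point, cam["t"]))
    return cam["f"] * x / z + cam["cx"], cam["f"] * y / z + cam["cy"]

def sample_bilinear(image, u, v):
    """Bilinearly sample a grayscale image (list of rows) at pixel (u, v)."""
    h, w = len(image), len(image[0])
    u = min(max(u, 0.0), w - 1.0)          # clamp to the image bounds
    v = min(max(v, 0.0), h - 1.0)
    u0, v0 = min(int(floor(u)), w - 2), min(int(floor(v)), h - 2)
    du, dv = u - u0, v - v0
    top = (1 - du) * image[v0][u0] + du * image[v0][u0 + 1]
    bottom = (1 - du) * image[v0 + 1][u0] + du * image[v0 + 1][u0 + 1]
    return (1 - dv) * top + dv * bottom

def photometric_error(points, image_a, cam_a, image_b, cam_b):
    """Mean absolute intensity difference over the projected 3D points."""
    total = 0.0
    for p in points:
        ia = sample_bilinear(image_a, *project(p, cam_a))
        ib = sample_bilinear(image_b, *project(p, cam_b))
        total += abs(ia - ib)
    return total / len(points)

# Hypothetical two-frame setup: camera B sits one unit to the right of A,
# and the images are horizontal ramps consistent with a surface at depth 2.
cam_a = {"f": 2.0, "cx": 2.0, "cy": 2.0, "t": (0.0, 0.0, 0.0)}
cam_b = {"f": 2.0, "cx": 2.0, "cy": 2.0, "t": (1.0, 0.0, 0.0)}
image_a = [[10 * u for u in range(5)] for _ in range(5)]
image_b = [[10 * (u + 1) for u in range(5)] for _ in range(5)]

err_true = photometric_error([(0.0, 0.0, 2.0)], image_a, cam_a, image_b, cam_b)
err_wrong = photometric_error([(0.0, 0.0, 4.0)], image_a, cam_a, image_b, cam_b)
print(err_true, err_wrong)
```

In this toy setup a point placed at the depth the images were built for projects to matching intensities in both frames, while a point at the wrong depth does not, which is the signal the optimization uses to correct the mesh.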

Assignees

Adobe Inc

Inventors

Classifications

  • Combinations of networks · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Supervised learning · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11189094B2 cover?
Techniques are disclosed for 3D object reconstruction using photometric mesh representations. A decoder is pretrained to transform points sampled from 2D patches of representative objects into 3D polygonal meshes. An image frame of the object is fed into an encoder to get an initial latent code vector. For each frame and camera pair from the sequence, a polygonal mesh is rendered at the given v…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification B33Y50/00. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date Nov 30, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).