Network, system and method for multi-view 3D mesh generation via deformation

US10885707B1 · US · B1

Patent metadata
Publication number: US-10885707-B1
Application number: US-202016882477-A
Country: US
Kind code: B1
Filing date: May 23, 2020
Priority date: Jul 23, 2019
Publication date: Jan 5, 2021
Grant date: Jan 5, 2021

Abstract

Official abstract text for this publication.

A network for generating 3D shape includes a perceptual network and a Graphic Convolutional Network (GCN). The GCN includes a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape. The MDN further comprises at least one MDN unit, which in turn comprises a deformation hypothesis sampling module, a cross-view perceptual feature pooling module and a deformation reasoning module. Systems and methods are also provided.
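The three modules of an MDN unit run in series: sample candidate positions around each vertex, pool image features at those positions, then reason over the pooled features to pick a refined position. The following is a minimal sketch of that data flow only, not the patented implementation. The function names, the six axis-aligned offsets, the softmax normalization, and the toy pooling and scoring functions are all assumptions made for illustration; the claims on this page describe icosahedron-based sampling and a graph-based scoring network instead.

```python
import math


def sample_hypotheses(vertex, offsets):
    """Return the vertex itself followed by its hypothesis positions."""
    return [vertex] + [tuple(v + o for v, o in zip(vertex, off)) for off in offsets]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1 (assumed normalization)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mdn_unit(vertices, offsets, pool_fn, score_fn):
    """One refinement pass: sample -> pool -> reason, applied to every vertex."""
    refined = []
    for vertex in vertices:
        candidates = sample_hypotheses(vertex, offsets)
        features = [pool_fn(p) for p in candidates]
        weights = softmax([score_fn(f) for f in features])
        # The refined coordinate is the weighted sum of the vertex and its hypotheses.
        refined.append(tuple(
            sum(w * p[axis] for w, p in zip(weights, candidates))
            for axis in range(3)
        ))
    return refined


# Hypothetical toy setup: six axis-aligned offsets stand in for the sampled hypotheses.
SCALE = 0.02
OFFSETS = [
    (SCALE, 0.0, 0.0), (-SCALE, 0.0, 0.0),
    (0.0, SCALE, 0.0), (0.0, -SCALE, 0.0),
    (0.0, 0.0, SCALE), (0.0, 0.0, -SCALE),
]


def toy_pool(position):
    """Stand-in for cross-view pooling: the 'feature' is just the position."""
    return position


def toy_score(feature):
    """Stand-in for the scoring network: prefer positions near the unit sphere."""
    radius = math.sqrt(sum(c * c for c in feature))
    return -100.0 * abs(radius - 1.0)


mesh = [(0.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
for _ in range(2):  # two serially connected units, each refining the previous output
    mesh = mdn_unit(mesh, OFFSETS, toy_pool, toy_score)
```

Because each refined vertex is a convex combination of the vertex and its candidates, one pass can move a vertex by at most the sampling scale, which is why chaining several units is useful in this sketch.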

First claim

Opening claim text (preview).

The invention claimed is:

1. A network for generating 3D shape, comprising a perceptual network and a Graphic Convolutional Network (GCN), wherein: the perceptual network is configured to extract geometry features and semantic features from a plurality of input images; the GCN comprises a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape; wherein the coarse shape generation network is so configured as to output a coarse mesh based on the semantic features extracted from the perceptual network and an initial ellipsoid mesh; the MDN comprises at least one MDN unit, which comprises a deformation hypothesis sampling module, a cross-view perceptual feature pooling module and a deformation reasoning module; wherein the deformation hypothesis sampling module is so configured that a set of deformation hypotheses positions are sampled for each vertex of the coarse mesh from its surrounding area; the cross-view perceptual feature pooling module is serial to the deformation hypothesis sampling module, and is so configured as to pool the geometry features for each vertex and its hypotheses positions in a cross-view manner; and the deformation reasoning module is serial to the cross-view perceptual feature pooling module, and is so configured to output a refined mesh based on the pooled geometry features of each vertex and its hypotheses positions.

2. The network of claim 1, wherein the MDN includes more than one serially connected MDN units, and the refined mesh output from a preceding MDN unit is iteratively used as the coarse mesh input to the superseding MDN unit.

3. The network of claim 2, wherein the number of the MDN units is two.

4. The network of claim 1, wherein the perceptual network is preferably a 2D Convolutional Neural Network (CNN).

5. The network of claim 4, wherein the geometry features are extracted from early layers of the 2D CNN.

6. The network of claim 1, wherein the coarse shape generation network is a Pixel2Mesh network.

7. The network of claim 6, wherein the Pixel2Mesh network is equipped with a cross-view perceptual feature pooling layer.

8. The network of claim 1, wherein the set of deformation hypotheses positions are sampled from an icosahedron centered on the vertex.

9. The network of claim 8, wherein the icosahedron is a level-1 icosahedron.

10. The network of claim 8, wherein, the set of deformation hypotheses positions are sampled with a scale of 0.02, as the size of the icosahedron is normalized as 1.

11. The network of claim 1, wherein the cross-view perceptual feature pooling module being so configured as to pool the geometry features for each vertex and its hypotheses positions in a cross-view manner, further includes finding the projections for each vertex and its hypothesis positions in the planes of the plurality of input images and then pooling the geometry features for each vertex and its hypothesis positions.

12. The network of claim 11, wherein the projections of each vertex and its hypothesis position are found in the planes of the plurality of input images by using known camera intrinsics and extrinsics.

13. The network of claim 11, wherein the geometry features of each vertex and its hypothesis positions are pooled four neighboring feature blocks.

14. The network of claim 11, wherein the geometry features of each vertex and its hypothesis positions are pooled by using bilinear interpolation.

15. The network of claim 1, wherein the deformation reasoning module includes a scoring network, and wherein the pooled perceptual features of each vertex and its hypotheses positions are fed into the scoring network.

16. The network of claim 15, wherein a weight for each vertex and its hypotheses positions is estimated in the scoring network, and a weighted sum of the vertex and all of its hypotheses positions for each vertex is calculated based on the weight of each vertex and its hypotheses positions and the pooled geometry features of each vertex and its hypotheses positions.

17. The network of claim 15, wherein the coordinates for each vertex is obtained from the weighted sum of the vertex, and the refined mesh is generated according to the coordinates of each vertex.

18. The network of claim 15, wherein the scoring network is a G-ResNet consisting of six graph residual convolution layers, with each layer added with a Rectifier Linear Unit (ReLU).

19. A system for generating 3D shape, comprising an input device, a processor for processing the input data, and an output device for outputting the processed data; wherein the processor is configured to build a computing model including a perceptual network and a Graphic Convolutional Network (GCN), wherein: the perceptual network is so configured to extract geometry features and semantic features from a plurality of input images; the GCN includes a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape; wherein the coarse shape generation network is so configured as to output a coarse mesh based on the semantic features extracted from the perceptual network and an initial ellipsoid mesh; the MDN comprises at least one MDN unit, which comprises a deformation hypothesis sampling module, a cross-view perceptual feature pooling module and a deformation reasoning module; wherein the deformation hypothesis sampling module is so configured that a set of deformation hypotheses positions are sampled for each vertex of the coarse mesh from its surrounding area; the cross-view perceptual feature pooling module is serial to the deformation hypothesis sampling module, and is so configured as to pool the geometry features for each vertex and its hypotheses positions in a cross-view manner; the deformation reasoning module is serial to the cross-view perceptual feature pooling module, and is so configured to output a refined mesh based on the pooled geometry features of each vertex and its hypotheses positions.
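Claims 11 to 14 describe pooling as a two-step operation: project each 3D position into every input image using the known camera parameters, then read the feature map at the projected location by bilinear interpolation over the four neighbouring cells. The sketch below illustrates those two steps under stated assumptions: a zero-skew pinhole camera, feature maps stored at image resolution as nested lists, and a plain mean to combine the views. The claim text does not say how per-view features are combined, so the mean is purely a placeholder, and all names here are hypothetical.

```python
import math


def project(point, K, R, t):
    """Project a 3D world point to pixel coordinates (u, v).

    K is a 3x3 intrinsic matrix (zero skew assumed); R (3x3 rotation) and
    t (translation) are the extrinsics mapping world to camera coordinates.
    """
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    u = K[0][0] * cam[0] / cam[2] + K[0][2]
    v = K[1][1] * cam[1] / cam[2] + K[1][2]
    return u, v


def bilinear_sample(fmap, u, v):
    """Interpolate an H x W x C feature map at fractional pixel (u, v).

    The result blends the four neighbouring feature cells; coordinates
    outside the map are clamped to its border.
    """
    height, width = len(fmap), len(fmap[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    ax, ay = u - x0, v - y0
    channels = len(fmap[0][0])
    return [
        (1 - ax) * (1 - ay) * fmap[y0][x0][c]
        + ax * (1 - ay) * fmap[y0][x1][c]
        + (1 - ax) * ay * fmap[y1][x0][c]
        + ax * ay * fmap[y1][x1][c]
        for c in range(channels)
    ]


def cross_view_pool(point, views):
    """Sample every view at the projection of `point` and average the results.

    Averaging across views is an assumption made for this sketch.
    """
    samples = [
        bilinear_sample(view["features"], *project(point, view["K"], view["R"], view["t"]))
        for view in views
    ]
    count = len(samples)
    return [sum(s[c] for s in samples) / count for c in range(len(samples[0]))]


# Hypothetical two-view example with 2 x 2 single-channel feature maps.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
VIEWS = [
    {"features": [[[0.0], [1.0]], [[2.0], [3.0]]],
     "K": INTRINSICS, "R": IDENTITY, "t": (0.0, 0.0, 1.0)},
    {"features": [[[3.5], [3.5]], [[3.5], [3.5]]],
     "K": INTRINSICS, "R": IDENTITY, "t": (0.0, 0.0, 1.0)},
]
pooled = cross_view_pool((0.0, 0.0, 0.0), VIEWS)
```

In the claimed network the same pooling would be applied to each vertex and to every one of its hypothesis positions, so the scoring network sees one feature vector per candidate.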

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06T17/00 (Primary)

    Three-dimensional [3D] modelling for computer graphics · CPC title

  • G06T17/205 (Primary)

    Re-meshing · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10885707B1 cover?
A network for generating 3D shape includes a perceptual network and a Graphic Convolutional Network (GCN). The GCN includes a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape. The MDN further comprises at least one MDN unit, which in turn comprises a deformation hypothesis sampling module, a cross-view percep…
Who is the assignee on this patent?
Univ Fudan
What technology area does this patent fall under?
Primary CPC classification G06T17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 5, 2021 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).