US10885707B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10885707-B1 |
| Application number | US-202016882477-A |
| Country | US |
| Kind code | B1 |
| Filing date | May 23, 2020 |
| Priority date | Jul 23, 2019 |
| Publication date | Jan 5, 2021 |
| Grant date | Jan 5, 2021 |
Official abstract text for this publication.
A network for generating 3D shape includes a perceptual network and a Graphic Convolutional Network (GCN). The GCN includes a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape. The MDN further comprises at least one MDN unit, which in turn comprises a deformation hypothesis sampling module, a cross-view perceptual feature pooling module and a deformation reasoning module. Systems and methods are also provided.
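The deformation hypothesis sampling module named in the abstract is narrowed in claims 8 to 10 to sampling from a level-1 icosahedron centered on each vertex, at a scale of 0.02. The following is a minimal sketch of that step, not the patented implementation. It assumes that "level-1" means a regular icosahedron subdivided once (12 corners plus 30 edge midpoints pushed onto the sphere) and that "size normalized as 1" means unit circumradius; the function names are hypothetical.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, used for icosahedron corners


def _normalize(p):
    """Scale a 3D point so it lies on the unit sphere."""
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)


def icosahedron_vertices():
    """Return the 12 corners (0, +-1, +-PHI) and their cyclic permutations."""
    verts = []
    for a in (-1.0, 1.0):
        for b in (-PHI, PHI):
            verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    return verts


def level1_icosahedron():
    """Assumed meaning of 'level-1': corners plus normalized edge midpoints."""
    base = icosahedron_vertices()
    points = [_normalize(v) for v in base]
    for p, q in itertools.combinations(base, 2):
        d2 = sum((a - b) ** 2 for a, b in zip(p, q))
        if abs(d2 - 4.0) < 1e-9:  # edge length is 2 in these coordinates
            mid = tuple((a + b) / 2 for a, b in zip(p, q))
            points.append(_normalize(mid))
    return points


def sample_hypotheses(vertex, scale=0.02):
    """Return hypothesis positions around one mesh vertex.

    The 0.02 default follows claim 10; the unit-radius normalization
    is an assumption of this sketch.
    """
    return [
        tuple(v + scale * o for v, o in zip(vertex, offset))
        for offset in level1_icosahedron()
    ]
```

Under these assumptions each vertex receives a fixed set of candidate positions on a small sphere around it, so the later modules can compare the current position against the same neighbourhood pattern for every vertex.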
Opening claim text (preview).
The invention claimed is:
1. A network for generating 3D shape, comprising a perceptual network and a Graphic Convolutional Network (GCN), wherein: the perceptual network is configured to extract geometry features and semantic features from a plurality of input images; the GCN comprises a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape; wherein the coarse shape generation network is so configured as to output a coarse mesh based on the semantic features extracted from the perceptual network and an initial ellipsoid mesh; the MDN comprises at least one MDN unit, which comprises a deformation hypothesis sampling module, a cross-view perceptual feature pooling module and a deformation reasoning module; wherein the deformation hypothesis sampling module is so configured that a set of deformation hypotheses positions are sampled for each vertex of the coarse mesh from its surrounding area; the cross-view perceptual feature pooling module is serial to the deformation hypothesis sampling module, and is so configured as to pool the geometry features for each vertex and its hypotheses positions in a cross-view manner; and the deformation reasoning module is serial to the cross-view perceptual feature pooling module, and is so configured to output a refined mesh based on the pooled geometry features of each vertex and its hypotheses positions.
2. The network of claim 1, wherein the MDN includes more than one serially connected MDN units, and the refined mesh output from a preceding MDN unit is iteratively used as the coarse mesh input to the superseding MDN unit.
3. The network of claim 2, wherein the number of the MDN units is two.
4. The network of claim 1, wherein the perceptual network is preferably a 2D Convolutional Neural Network (CNN).
5. The network of claim 4, wherein the geometry features are extracted from early layers of the 2D CNN.
6. The network of claim 1, wherein the coarse shape generation network is a Pixel2Mesh network.
7. The network of claim 6, wherein the Pixel2Mesh network is equipped with a cross-view perceptual feature pooling layer.
8. The network of claim 1, wherein the set of deformation hypotheses positions are sampled from an icosahedron centered on the vertex.
9. The network of claim 8, wherein the icosahedron is a level-1 icosahedron.
10. The network of claim 8, wherein, the set of deformation hypotheses positions are sampled with a scale of 0.02, as the size of the icosahedron is normalized as 1.
11. The network of claim 1, wherein the cross-view perceptual feature pooling module being so configured as to pool the geometry features for each vertex and its hypotheses positions in a cross-view manner, further includes finding the projections for each vertex and its hypothesis positions in the planes of the plurality of input images and then pooling the geometry features for each vertex and its hypothesis positions.
12. The network of claim 11, wherein the projections of each vertex and its hypothesis position are found in the planes of the plurality of input images by using known camera intrinsics and extrinsics.
13. The network of claim 11, wherein the geometry features of each vertex and its hypothesis positions are pooled four neighboring feature blocks.
14. The network of claim 11, wherein the geometry features of each vertex and its hypothesis positions are pooled by using bilinear interpolation.
15. The network of claim 1, wherein the deformation reasoning module includes a scoring network, and wherein the pooled perceptual features of each vertex and its hypotheses positions are fed into the scoring network.
16. The network of claim 15, wherein a weight for each vertex and its hypotheses positions is estimated in the scoring network, and a weighted sum of the vertex and all of its hypotheses positions for each vertex is calculated based on the weight of each vertex and its hypotheses positions and the pooled geometry features of each vertex and its hypotheses positions.
17. The network of claim 15, wherein the coordinates for each vertex is obtained from the weighted sum of the vertex, and the refined mesh is generated according to the coordinates of each vertex.
18. The network of claim 15, wherein the scoring network is a G-ResNet consisting of six graph residual convolution layers, with each layer added with a Rectifier Linear Unit (ReLU).
19. A system for generating 3D shape, comprising an input device, a processor for processing the input data, and an output device for outputting the processed data; wherein the processor is configured to build a computing model including a perceptual network and a Graphic Convolutional Network (GCN), wherein: the perceptual network is so configured to extract geometry features and semantic features from a plurality of input images; the GCN includes a coarse shape generation network for generating a coarse shape, and a Multi-View Deformation Network (MDN) for refining the coarse shape; wherein the coarse shape generation network is so configured as to output a coarse mesh based on the semantic features extracted from the perceptual network and an initial ellipsoid mesh; the MDN comprises at least one MDN unit, which comprises a deformation hypothesis sampling module, a cross-view perceptual feature pooling module and a deformation reasoning module; wherein the deformation hypothesis sampling module is so configured that a set of deformation hypotheses positions are sampled for each vertex of the coarse mesh from its surrounding area; the cross-view perceptual feature pooling module is serial to the deformation hypothesis sampling module, and is so configured as to pool the geometry features for each vertex and its hypotheses positions in a cross-view manner; the deformation reasoning module is serial to the cross-view perceptual feature pooling module, and is so configured to output a refined mesh based on the pooled geometry features of each vertex and its hypotheses positions.
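Claims 11 to 17 describe projecting each vertex and its hypothesis positions into every input image with known camera intrinsics and extrinsics, pooling features by bilinear interpolation over neighbouring feature entries, and moving each vertex to a weighted sum of its candidate positions. The sketch below illustrates those steps in plain Python under several assumptions that are not in the claims: a pinhole camera with intrinsics passed as a hypothetical `(fx, fy, cx, cy)` tuple, averaging as the cross-view combination, and a softmax to turn scores into weights. The scores themselves would come from the scoring network, which is not modeled here.

```python
import math


def project(point, K, R, t):
    """Project a 3D world point to pixel coordinates (pinhole model).

    K is an assumed (fx, fy, cx, cy) tuple; R is a 3x3 rotation, t a translation.
    """
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def bilinear(feature_map, u, v):
    """Interpolate a grid of feature vectors at column u, row v.

    Blends the four neighbouring entries; coordinates are clamped to the grid.
    """
    h, w = len(feature_map), len(feature_map[0])
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = u - x0, v - y0
    f00, f01 = feature_map[y0][x0], feature_map[y0][x1]
    f10, f11 = feature_map[y1][x0], feature_map[y1][x1]
    return [
        (1 - ay) * ((1 - ax) * a + ax * b) + ay * ((1 - ax) * c + ax * d)
        for a, b, c, d in zip(f00, f01, f10, f11)
    ]


def pool_cross_view(point, views):
    """Pool features for one 3D position across views.

    views is a list of (feature_map, K, R, t). Averaging across views is an
    assumption of this sketch; the claims do not fix the combination rule.
    """
    samples = [bilinear(fm, *project(point, K, R, t)) for fm, K, R, t in views]
    return [sum(channel) / len(samples) for channel in zip(*samples)]


def refine_vertex(candidates, scores):
    """Return the weighted sum of candidate positions.

    candidates holds the vertex and its hypothesis positions; scores are
    per-candidate values from a scoring network (not modeled here).
    Softmax normalization of the scores is an assumption.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return tuple(
        sum(w * c[k] for w, c in zip(weights, candidates)) for k in range(3)
    )
```

A weighted sum keeps the update differentiable with respect to the scores, which is one plausible reason to prefer it over picking the single best-scoring candidate.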
CPC classification titles:
- Combinations of networks
- Supervised learning
- Convolutional networks [CNN, ConvNet]
- Three-dimensional [3D] modelling for computer graphics
- Re-meshing