Utilizing voxel feature transformations for deep novel view synthesis
US-2021312698-A1 · Oct 7, 2021 · US
US11710287B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11710287-B2 |
| Application number | US-202017309817-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 4, 2020 |
| Priority date | Jun 30, 2020 |
| Publication date | Jul 25, 2023 |
| Grant date | Jul 25, 2023 |
Official abstract text for this publication.
Systems and methods are described for generating a plurality of three-dimensional (3D) proxy geometries of an object, generating, based on the plurality of 3D proxy geometries, a plurality of neural textures of the object, the neural textures defining a plurality of different shapes and appearances representing the object, providing the plurality of neural textures to a neural renderer, receiving, from the neural renderer and based on the plurality of neural textures, a color image and an alpha mask representing an opacity of at least a portion of the object, and generating a composite image based on the pose, the color image, and the alpha mask.
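The final step in the abstract, generating a composite image from the color image and the alpha mask, can be illustrated with a standard "over" blend. This is a minimal sketch, not the patented method: the abstract does not state a compositing formula, and the `composite` function, the `background` input, and the nested-list image layout are all assumptions introduced here for illustration. The role of the pose in compositing is not modeled.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]   # RGB values in [0, 1]
Image = List[List[Pixel]]            # indexed as [row][column]
Mask = List[List[float]]             # per-pixel opacity in [0, 1]


def composite(color: Image, alpha: Mask, background: Image) -> Image:
    """Blend `color` over `background` per pixel.

    Assumed formula (standard "over" operator, not taken from the patent):
        out = alpha * color + (1 - alpha) * background
    """
    if not (len(color) == len(alpha) == len(background)):
        raise ValueError("inputs must have the same height")
    out: Image = []
    for row_c, row_a, row_b in zip(color, alpha, background):
        if not (len(row_c) == len(row_a) == len(row_b)):
            raise ValueError("inputs must have the same width")
        out_row = []
        for c, a, b in zip(row_c, row_a, row_b):
            a = min(1.0, max(0.0, a))  # clamp opacity to [0, 1]
            out_row.append(tuple(a * ci + (1.0 - a) * bi for ci, bi in zip(c, b)))
        out.append(out_row)
    return out
```

With an opacity of 1.0 the rendered color fully replaces the background, and with a partial opacity (for example a transparent lens) the background shows through in proportion to `1 - alpha`.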
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method utilizing at least one processing device to perform operations including: receiving a pose associated with an object in image content; generating a plurality of three-dimensional (3D) proxy geometries of the object, the plurality of 3D proxy geometries being based on a shape of the object; generating, based on the plurality of 3D proxy geometries, a plurality of neural textures of the object, each of the plurality of neural textures defining a plurality of different shapes and appearances representing the object, the plurality of neural textures being configured to reconstruct a hidden portion of the object captured in the image content; providing the plurality of neural textures to a neural renderer, the plurality of neural textures being provided in a stacked formation, the hidden portion being reconstructed based on the stacked formation of the plurality of neural textures; generating, by the neural renderer, transparent layers of the object and surfaces behind the transparent layers of the object based on the plurality of neural textures; receiving, from the neural renderer and based on the plurality of neural textures, a color image and an alpha mask representing an opacity of at least a portion of the object; and generating a composite image based on the pose, the color image, and the alpha mask.

2. The method of claim 1, further comprising: rendering a latent texture onto a target viewpoint based at least in part on the pose associated with the object, wherein each of the plurality of 3D proxy geometries include a geometric approximation of at least a portion of the object and the latent texture of the object mapped to the geometric approximation.

3. The method of claim 1, wherein each of the plurality of 3D proxy geometries encode surface light field associated with the object in the image content, the surface light field including specular reflections associated with the object.

4. The method of claim 1, wherein the plurality of neural textures are based, at least in part, on the pose, each of the plurality of neural textures being generated by: identifying a category of the object; generating a feature map based on the identified category of the object; providing the feature map to a neural network; and generating each of the plurality of neural textures based on a latent code associated with each instance of the identified category and a view associated with the pose.

5. The method of claim 1, wherein at least a portion of the object is a transparent material.

6. The method of claim 1, wherein at least a portion of the object is a reflective material.

7. The method of claim 1, wherein: the image content includes image data including at least a user; and the object includes a pair of eyeglasses.

8. A system comprising: at least one processing device; and a memory storing instructions that when executed cause the system to perform operations including: receiving a pose associated with an object in image content; generating a plurality of three-dimensional (3D) proxy geometries of the object, the plurality of 3D proxy geometries being based on a shape of the object; generating, based on the plurality of 3D proxy geometries, a plurality of neural textures of the object, each of the plurality of neural textures defining a plurality of different shapes and appearances representing the object, the plurality of neural textures being configured to reconstruct a hidden portion of the object captured in the image content; providing the plurality of neural textures to a neural renderer, the plurality of neural textures being provided in a stacked formation, the hidden portion being reconstructed based on the stacked formation of the plurality of neural textures; generating, by the neural renderer, transparent layers of the object and surfaces behind the transparent layers of the object based on the plurality of neural textures; receiving, from the neural renderer and based on the plurality of neural textures, a color image and an alpha mask representing an opacity of at least a portion of the object; and generating a composite image based on the color image and the alpha mask.

9. The system of claim 8, further comprising: rendering a latent texture onto a target viewpoint based at least in part on the pose associated with the object, wherein each of the plurality of 3D proxy geometries include a geometric approximation of at least a portion of the object and the latent texture of the object mapped to the geometric approximation.

10. The system of claim 8, wherein each of the plurality of 3D proxy geometries encode surface light field associated with the object in the image content, the surface light field including specular reflections associated with the object.

11. The system of claim 8, wherein the plurality of neural textures are based, at least in part, on the pose, each of the plurality of neural textures being generated by: identifying a category of the object; generating a feature map based on the identified category of the object; providing the feature map to a neural network; and generating each of the plurality of neural textures based on a latent code associated with each instance of the identified category and a view associated with the pose.

12. The system of claim 11, wherein the neural renderer uses a generative model to reconstruct unseen object instances within the identified category, the reconstruction based on less than four captured views of the object.

13. The system of claim 8, wherein the plurality of 3D proxy geometries are based on geometry interpolation of shapes that construct the object in the image content.

14. A non-transitory, machine-readable medium having instructions stored thereon, the instructions, when executed by a processor, cause a computing device to: receiving a pose associated with an object in image content; generate a plurality of three-dimensional (3D) proxy geometries of the object, the plurality of 3D proxy geometries being based on a shape of the object; generate, based on the plurality of 3D proxy geometries, a plurality of neural textures of the object, each of the plurality of neural textures defining a plurality of different shapes and appearances representing the object, the plurality of neural textures being configured to reconstruct a hidden portion of the object captured in the image content; provide the plurality of neural textures to a neural renderer, the plurality of neural textures being provided in a stacked formation, the hidden portion being reconstructed based on the stacked formation of the plurality of neural textures; generate, by the neural renderer, transparent layers of the object and surfaces behind the transparent layers of the object based on the plurality of neural textures; receive, from the neural renderer and based on the plurality of neural textures, a color image and an alpha mask representing an opacity of at least a portion of the object; and generate a composite image based on the color image and the alpha mask.

15. The machine-readable medium of claim 14, further comprising: rendering a latent texture onto a target viewpoint based at least in part on the pose associated with the object, wherein each of the plurality of 3D texture proxy geometries include a geometric approximation of at least a portion of the object and the latent texture of the object mapped to the geometric approximation.

16. The machine-readable medium of claim 14, wherein the plurality of neural textures are based, at least in p
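
The independent claims recite providing the neural textures to the neural renderer "in a stacked formation". One plausible reading, offered here only as an assumption, is that per-proxy feature maps of equal height and width are concatenated along the channel axis before being passed to the renderer. The sketch below shows that reading; `stack_textures` and the `[row][column][channel]` layout are hypothetical and do not come from the claims.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [row][column][channel]


def stack_textures(textures: List[FeatureMap]) -> FeatureMap:
    """Concatenate feature maps along the channel axis.

    Assumption: "stacked formation" means channel-wise concatenation of
    same-sized maps, one per proxy geometry. The claims do not specify this.
    """
    if not textures:
        raise ValueError("need at least one texture")
    height, width = len(textures[0]), len(textures[0][0])
    for t in textures:
        if len(t) != height or any(len(row) != width for row in t):
            raise ValueError("all textures must share the same height and width")
    # For each pixel, append the channels of every texture in order.
    return [
        [
            [value for t in textures for value in t[y][x]]
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Under this reading, N textures with C channels each yield a single map with N × C channels per pixel, so a renderer sees features from every proxy layer at each location.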
Artificial neural networks [ANN] · CPC title
Shape modification · CPC title
Colour editing, changing, or manipulating; Use of colour codes · CPC title
structured as a network, e.g. client-server architectures · CPC title
Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts · CPC title