Generating images of virtual environments using one or more neural networks

US12387430B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12387430-B2
Application number  US-202017111271-A
Country             US
Kind code           B2
Filing date         Dec 3, 2020
Priority date       Dec 3, 2020
Publication date    Aug 12, 2025
Grant date          Aug 12, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, upon one or more semantic features projected from a three-dimensional environment.
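
As a rough illustration of what "semantic features projected from a three-dimensional environment" could look like, the sketch below casts one ray per pixel through a small labeled voxel grid and records the label of the first non-empty voxel it meets. It assumes an orthographic virtual camera looking along the z axis and integer label ids. The names `EMPTY` and `project_semantic_mask` are hypothetical and do not come from the patent, which does not specify this projection. The neural network that would turn such a mask into an image is left out.

```python
from typing import List

EMPTY = 0  # hypothetical label id meaning "no block at this voxel"


def project_semantic_mask(voxels: List[List[List[int]]]) -> List[List[int]]:
    """Project a labeled voxel grid into a 2D semantic mask.

    voxels[z][y][x] holds an integer semantic label. The (assumed)
    orthographic camera sits in front of z = 0 and looks along +z, so each
    pixel (y, x) takes the label of the first non-empty voxel on its ray.
    """
    depth = len(voxels)
    height = len(voxels[0])
    width = len(voxels[0][0])
    mask = [[EMPTY] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for z in range(depth):
                label = voxels[z][y][x]
                if label != EMPTY:
                    mask[y][x] = label  # nearest visible label wins
                    break
    return mask


# Two depth slices of a 2x2 image plane; labels 1-4 are arbitrary class ids.
grid = [
    [[0, 1], [0, 0]],  # z = 0 (closest to the camera)
    [[2, 3], [0, 4]],  # z = 1
]
mask = project_semantic_mask(grid)
```

Here label 3 is hidden behind label 1, so it does not appear in the mask, and the pixel whose ray meets no block stays `EMPTY`. A perspective camera would replace the straight z walk with a ray march, but the visibility logic would be the same.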

First claim

Opening claim text (preview).

What is claimed is:

1. One or more processors, comprising: circuitry to use one or more neural networks to generate one or more 2D images from one or more voxels representative of one or more objects and encoded semantic information corresponding to the one or more voxels.

2. The one or more processors of claim 1, wherein the circuitry is further to: identify the one or more voxels from a plurality of geometric objects, of one or more object types, used to build a 3D environment.

3. The one or more processors of claim 1, wherein the one or more objects include one or more geometric objects that are blocks having semantic feature data based at least on information associated with one or more voxels corresponding to one or more corners of the blocks, respective instances of the semantic feature data including at least a respective object type and position data within a 3D environment.

4. The one or more processors of claim 1, wherein the circuitry is further to determine a set of semantic features visible from a field of view of a virtual camera, and project the set of semantic features into a 2D representation.

5. The one or more processors of claim 1, wherein the circuitry is further to encode a plurality of semantic features corresponding to the one or more voxels before determining a set of semantic features to be used in a 2D representation.

6. The one or more processors of claim 1, wherein the circuitry is further to generate a semantic segmentation mask from a set of semantic features identified in the one or more voxels, and wherein the one or more neural networks include a generative adversarial network (GAN) for generating the one or more 2D images using the semantic segmentation mask.

7. A system comprising: one or more processors to use one or more neural networks to generate one or more 2D images from one or more voxels representative of one or more objects and encoded semantic information corresponding to the one or more voxels.

8. The system of claim 7, wherein the one or more processors are further to identify the one or more voxels from a plurality of geometric objects, of one or more object types, used to build a 3D environment.

9. The system of claim 7, wherein the one or more objects include one or more geometric objects that are blocks having semantic feature data based, at least in part, on information associated with one or more voxels corresponding to one or more corners of the blocks, respective instances of the semantic feature data including at least a respective object type and position data within a 3D environment.

10. The system of claim 7, wherein the one or more processors are further to use an encoder of the one or more neural networks to determine a set of semantic features visible from a field of view of a virtual camera, and use a generator of the one or more neural networks to project the set of semantic features into a 2D semantic feature representation to be used to generate the one or more 2D images.

11. The system of claim 7, wherein the one or more processors are further to encode a plurality of semantic features before determining a set of semantic features used in the one or more 2D images.

12. The system of claim 7, wherein the one or more processors are further to generate a semantic segmentation mask from a set of semantic features, and wherein the one or more neural networks include a generative adversarial network (GAN) for generating the one or more 2D images using the semantic segmentation mask.

13. A method comprising: using one or more neural networks to generate one or more 2D images from one or more voxels representative of one or more objects and encoded semantic information corresponding to the one or more voxels and provided as input to the one or more neural networks.

14. The method of claim 13, further comprising: identifying the one or more voxels from a plurality of geometric objects, of one or more object types, used to build a 3D environment.

15. The method of claim 13, wherein the one or more objects are blocks having semantic feature data based, at least in part, on information associated with one or more voxels corresponding to one or more corners of the blocks, respective instances of the semantic feature data including at least a respective object type and position data within a 3D environment.

16. The method of claim 13, further comprising: determining a set of semantic features visible from a field of view of a virtual camera, and projecting the set of semantic features into the one or more 2D images.

17. The method of claim 13, further comprising: encoding a plurality of semantic features before determining a set of semantic features to be used in the one or more 2D images.

18. The method of claim 13, further comprising: generating a semantic segmentation mask from a set of semantic features, wherein the one or more neural networks include a generative adversarial network (GAN) for generating the one or more 2D images using the semantic segmentation mask.

19. A non-transitory computer-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: use one or more neural networks to generate one or more 2D images from one or more voxels representative of one or more objects and encoded semantic information corresponding to the one or more voxels.

20. The non-transitory computer-readable medium of claim 19, wherein the instructions if performed further cause the one or more processors to: identify one or more semantic features from a plurality of geometric objects, of one or more object types, used to build a 3D environment.

21. The non-transitory computer-readable medium of claim 19, wherein the one or more objects are blocks having semantic feature data based, at least in part, on information associated with one or more voxels corresponding to one or more corners of the blocks, respective instances of the semantic feature data including at least a respective object type and position data within a 3D environment.

22. The non-transitory computer-readable medium of claim 19, wherein the instructions if performed further cause the one or more processors to: determine a set of semantic features visible from a field of view of a virtual camera, and project the set of semantic features into the one or more 2D images.

23. The non-transitory computer-readable medium of claim 19, wherein the instructions if performed further cause the one or more processors to: encode a plurality of semantic features before determining a set of semantic features to be used in the one or more 2D images.

24. The non-transitory computer-readable medium of claim 19, wherein the instructions if performed further cause the one or more processors to: generate a semantic segmentation mask from a set of semantic features, and wherein the one or more neural networks include a generative adversarial network (GAN) for generating the one or more 2D images using the semantic segmentation mask.

25. An image generation system, comprising: one or more processors to use one or more neural networks to generate one or more 2D images from one or more voxels representative of one or more objects and encoded semantic information corresponding to the one or more voxels; and memory for storing network parameters for the one or more neural networks.
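
Several dependent claims describe blocks whose semantic feature data is tied to the voxels at their corners, with each instance carrying at least an object type and a position in the 3D environment. One possible way to picture that data is sketched below. The `CornerFeature` record, the `block_corner_features` helper, the axis-aligned cubic block, and the string object type are all assumptions made for illustration; the claims do not prescribe a data layout.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class CornerFeature:
    """Hypothetical record: semantic data attached to one block corner."""
    object_type: str                # e.g. "grass" or "water" (illustrative)
    position: Tuple[int, int, int]  # corner coordinates in the 3D environment


def block_corner_features(
    object_type: str,
    origin: Tuple[int, int, int],
    size: int = 1,
) -> List[CornerFeature]:
    """Return one feature per corner of an axis-aligned cubic block.

    `origin` is the corner with the smallest coordinates; the other seven
    corners are offset by `size` along each combination of axes.
    """
    ox, oy, oz = origin
    return [
        CornerFeature(object_type, (ox + dx * size, oy + dy * size, oz + dz * size))
        for dx, dy, dz in product((0, 1), repeat=3)
    ]


# A single unit block of (assumed) type "grass" placed at (2, 0, 5).
features = block_corner_features("grass", (2, 0, 5))
```

Each of the eight entries pairs the block's object type with one corner position, which is the minimum the claims say a semantic feature instance includes.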

Assignees

Nvidia Corp

Inventors

Classifications

  • by matching two-dimensional images to three-dimensional objects · CPC title

  • Edge-based segmentation · CPC title

  • Artificial neural networks [ANN] · CPC title

  • Aligning objects, relative positioning of parts · CPC title

  • G06T15/10 (Primary)

    Geometric effects · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12387430B2 cover?
Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, upon one or more semantic features projected from a three-dimensional environment.
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06T15/10. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 12, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).