3D generation of diverse categories and scenes

US12380640B2 · US · B2

Patent metadata
Field | Value
Publication number | US-12380640-B2
Application number | US-202218071821-A
Country | US
Kind code | B2
Filing date | Nov 30, 2022
Priority date | Nov 30, 2022
Publication date | Aug 5, 2025
Grant date | Aug 5, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A three-dimensional (3D) scene is generated from non-aligned generic camera priors by producing a tri-plane representation for an input scene received in random latent code, obtaining a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions, and volumetrically rendering an image of the input scene from the color and density data to provide a scene having pixel colors and depth values from an arbitrary camera viewpoint. A depth adaptor processes depth values to generate an adapted depth map that bridges domains of rendered and estimated depth maps for the image of the input scene. The adapted depth map, color data, and scene geometry information from an external dataset are provided to a discriminator for selection of a 3D representation of the input scene.
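
The rendering step described in the abstract (pixel colors and depth values obtained from color and density data) can be illustrated with a minimal sketch. It assumes the standard discrete quadrature commonly used for radiance fields: per-sample opacity derived from density, multiplied by the transmittance accumulated along the ray. The abstract does not spell out this discretization, and the function name `render_ray` and its arguments are hypothetical, not taken from the patent.

```python
import math

def render_ray(sigmas, colors, ts):
    """Composite samples along one camera ray (illustrative sketch only).

    sigmas: densities at the sample points (non-negative floats)
    colors: RGB triples at the sample points
    ts:     increasing sample distances from the camera (at least two)
    Returns (rgb, depth) for the pixel that the ray passes through.
    """
    n = len(ts)
    rgb = [0.0, 0.0, 0.0]
    depth = 0.0
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    for i in range(n):
        # Segment length; the last sample reuses the previous spacing (assumption).
        delta = ts[i + 1] - ts[i] if i + 1 < n else ts[i] - ts[i - 1]
        alpha = 1.0 - math.exp(-sigmas[i] * delta)  # opacity of this segment
        weight = transmittance * alpha
        rgb = [acc + weight * c for acc, c in zip(rgb, colors[i])]
        depth += weight * ts[i]  # expected distance at which the ray terminates
        transmittance *= 1.0 - alpha
    return rgb, depth

# A ray that hits an opaque red surface at distance 2.0.
ts = [1.0, 1.5, 2.0, 2.5, 3.0]
sigmas = [0.0, 0.0, 1000.0, 0.0, 0.0]
colors = [[0.0, 0.0, 0.0]] * 2 + [[1.0, 0.0, 0.0]] + [[0.0, 0.0, 0.0]] * 2
rgb, depth = render_ray(sigmas, colors, ts)
```

In this example nearly all of the compositing weight falls on the third sample, so the pixel is red and the rendered depth equals that sample's distance. Empty space (zero density) contributes nothing to either output.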

First claim

Opening claim text (preview).

What is claimed is:

1. A method of generating a three-dimensional (3D) object or scene from non-aligned generic camera priors, the method comprising: producing, by a 3D scene generator, a tri-plane representation for an input scene received in random latent code; obtaining, by a camera generator, a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions of the generic camera priors; volumetrically rendering, by a volume renderer, an image of the input scene from the color and density data to provide a scene having pixel colors and depth values from an arbitrary camera viewpoint; processing, by a depth adaptor, the depth values to generate an adapted depth map that bridges domains of rendered and estimated depth maps for the image of the input scene; providing the adapted depth map and color data to a discriminator; providing external scene geometry information from an external dataset to the discriminator; and selecting, by the discriminator, a 3D representation of the input scene based on the color data, adapted depth map, and external scene geometry information.

2. The method of claim 1, wherein obtaining color and density data from the random latent code and generic camera priors comprises using a shallow 2-layer multi-layer perceptron (MLP) decoder to sample arbitrary camera viewpoints captured from ball-in-sphere camera parameterizations provided to the camera generator, the ball-in-sphere camera parameterization having four additional degrees of freedom including a field of view and pitch, yaw and radius of an inner sphere specifying a look-at point within an outer sphere of the ball-in-sphere camera parameterizations.

3. The method of claim 1, further comprising learning the arbitrary camera viewpoint during training for each input dataset.

4. The method of claim 1, further comprising pushing derivatives of predicted camera parameters with respect to prior camera parameters to either one or minus one to arrive at a camera gradient penalty $\mathcal{L}_{\varphi_i}$:

$$\mathcal{L}_{\varphi_i} = \left| \frac{\partial \varphi_i}{\partial \varphi'_i} \right| + \left| \frac{\partial \varphi_i}{\partial \varphi'_i} \right|^{-1},$$

where $\varphi'_i \in \varphi'$ is a camera sampled from a prior camera distribution and $\varphi_i \in \varphi$ is produced by the camera generator.

5. The method of claim 1, wherein volumetrically rendering the image of the input scene from the color and density data comprises rendering depths $d$ by volumetric rendering as follows:

$$d = \int_{t_n}^{t_f} T(t)\, \sigma(r(t))\, t \, dt,$$

where $t_n$ and $t_f$ are near/far planes, $T(t)$ is accumulated transmittance, and $r(t)$ is a ray.

6. The method of claim 5, wherein volumetrically rendering the image of the input scene from the color and density data further comprises shifting and scaling a depth $d$ from a range of $[t_n, t_f]$ into $[-1, 1]$ to obtain normalized depth $\bar{d}$:

$$\bar{d} = 2 \cdot \frac{d - (t_n + t_f + b)/2}{t_f - t_n - b},$$

where $b \in [0, (t_n + t_f)/2]$ is an additional learnable shift that accounts for empty space in front of a camera.

7. The method of claim 1, wherein processing the depth values comprises producing the adapted depth map as a function of a normalized depth where the depth values are concatenated with RGB color data input and passed to the discriminator.

8. The method of claim 7, wherein processing the depth values comprises using a convolutional network to
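
The formulas in claims 4 and 6 can be checked numerically with a small sketch. It reads the claim 4 penalty as |g| + |g|^-1 with g = ∂φ_i/∂φ′_i, and the claim 6 normalization as 2 · (d − (t_n + t_f + b)/2) / (t_f − t_n − b); both readings are interpretations of the claim text. The function names are hypothetical, and the derivative g is passed in as a plain number here, whereas a training setup would presumably obtain it by automatic differentiation.

```python
import math

def camera_gradient_penalty(g):
    """Claim 4 as read above: |g| + 1/|g|, smallest when g is +1 or -1."""
    a = abs(g)
    if a == 0.0:
        return math.inf  # a collapsed (constant) camera mapping is penalized without bound
    return a + 1.0 / a

def normalize_depth(d, t_n, t_f, b=0.0):
    """Claim 6 as read above: shift and scale depth d toward [-1, 1].

    t_n, t_f: near and far plane distances
    b:        shift for empty space in front of the camera (learnable in the claim;
              a fixed number in this sketch)
    """
    if not 0.0 <= b <= (t_n + t_f) / 2.0:
        raise ValueError("b must lie in [0, (t_n + t_f) / 2]")
    span = t_f - t_n - b
    if span <= 0.0:
        raise ValueError("t_f - t_n - b must be positive")
    return 2.0 * (d - (t_n + t_f + b) / 2.0) / span

# The penalty is symmetric in sign and grows as |g| moves away from 1.
penalties = [camera_gradient_penalty(g) for g in (1.0, -1.0, 0.5, 2.0)]

# With b = 0.5, depth t_n + b maps to -1 and depth t_f maps to +1.
low = normalize_depth(1.5, t_n=1.0, t_f=3.0, b=0.5)
high = normalize_depth(3.0, t_n=1.0, t_f=3.0, b=0.5)
```

Under this reading the shift b effectively moves the near end of the normalized range from t_n to t_n + b, which is consistent with the claim's statement that b accounts for empty space in front of the camera.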

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12380640B2 cover?
A three-dimensional (3D) scene is generated from non-aligned generic camera priors by producing a tri-plane representation for an input scene received in random latent code, obtaining a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions, and volumetrically rendering an image…
Who is the assignee on this patent?
Lee Hsin Ying, Ren Jian, Siarohin Aliaksandr, and 4 more
What technology area does this patent fall under?
Primary CPC classification G06T17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 5, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).