3D modeling based on neural light field

US12450822B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12450822-B2
Application number  US-202418644653-A
Country             US
Kind code           B2
Filing date         Apr 24, 2024
Priority date       Mar 28, 2022
Publication date    Oct 21, 2025
Grant date          Oct 21, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
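
The direct ray-to-pixel mapping described in the abstract can be illustrated with a small sketch. It is not the patented implementation: it assumes a ray is encoded by concatenating a few evenly spaced points along it (as in claims 3, 4 and 6) and that the network is a small residual MLP with random, untrained weights. The names `sample_points`, `make_network` and `predict_pixel`, the `near`/`far` bounds, the layer width and the sigmoid output are all hypothetical choices made for illustration.

```python
import math
import random


def sample_points(origin, direction, near, far, k):
    """Return k evenly spaced 3D points on the ray origin + t * unit(direction)."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]  # normalized ray direction
    depths = [near + (far - near) * i / (k - 1) for i in range(k)]
    return [[o + t * u for o, u in zip(origin, unit)] for t in depths]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Compute weights @ x + bias with plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def make_network(rng, k, width, n_blocks):
    """Untrained residual MLP: input layer, n_blocks residual layers, RGB head."""
    return {
        "k": k,
        "in": init_layer(rng, width, 3 * k),
        "blocks": [init_layer(rng, width, width) for _ in range(n_blocks)],
        "out": init_layer(rng, 3, width),
    }


def predict_pixel(net, origin, direction, near=2.0, far=6.0):
    """Map one ray to one RGB value in a single forward pass."""
    points = sample_points(origin, direction, near, far, net["k"])
    x = [coord for point in points for coord in point]  # concatenated ray input
    h = relu(linear(*net["in"], x))
    for weights, bias in net["blocks"]:
        # Residual connection: add the layer output back onto its input.
        h = [a + r for a, r in zip(h, relu(linear(weights, bias, h)))]
    rgb = linear(*net["out"], h)
    return [1.0 / (1.0 + math.exp(-c)) for c in rgb]  # squash to (0, 1)


net = make_network(random.Random(0), k=8, width=16, n_blocks=2)
pixel = predict_pixel(net, origin=[0.0, 0.0, 0.0], direction=[0.0, 0.0, 2.0])
```

Each ray costs one network evaluation here, regardless of how many points are used to encode it. That is the contrast the abstract draws with approaches that evaluate a network separately at many points along a ray.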

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, by one or more processors, a set of images representing a first view of a scene; applying, by the one or more processors, a machine learning model comprising a neural light field network to the set of images to predict pixel values of a target image, the pixel values of the target image being predicted without processing multiple points along a camera ray directed to a given pixel value; and generating, by the one or more processors, a model of the scene based on the set of images and the target image.

2. The method of claim 1, wherein the machine learning model comprises a deep residual Multi-Layer Perceptron (MLP) network, the target image representing a second view of the scene, and the machine learning model being trained to map a ray origin and direction directly to the given pixel value.

3. The method of claim 1, the set of images comprising 2D images and the model comprising a 3D model, further comprising: selecting a ray origin and direction associated with a second view of the scene; using a given 2D image of the set of 2D images representing the first view of the scene, generating a ray based on the ray origin and direction corresponding to the second view of the scene; and sampling a plurality of points along the ray.

4. The method of claim 3, wherein the plurality of points is spaced evenly along the ray.

5. The method of claim 3, wherein the plurality of points is randomly sampled based on stratified sampling during training of the machine learning model.

6. The method of claim 3, further comprising: concatenating the plurality of points to generate input data; and processing the input data with the machine learning model to predict one of the pixel values of the target image.

7. The method of claim 1, further comprising training the machine learning model by performing operations comprising: receiving training data comprising training images and associated camera poses, the training images and associated camera poses being associated with a plurality of training ray origins and normalized ray directions and corresponding ground-truth pixel values; obtaining a training ray origin and normalized ray direction of a first training image of the training images and associated camera pose; applying the machine learning model to a set of points along a training ray formed by the training ray origin and normalized ray direction to predict a training pixel value; retrieving a ground-truth pixel value associated with the first training image; computing a deviation between the predicted training pixel value and the ground-truth pixel value; and updating one or more parameters of the machine learning model based on the deviation.

8. The method of claim 1, wherein the machine learning model comprises a first machine learning model, further comprising applying a second machine learning model to a collection of 2D images to generate training data used to train the machine learning model.

9. The method of claim 8, wherein the second machine learning model comprises a trained neural radiance field network.

10. The method of claim 9, wherein the trained neural radiance field network generates an output representing radiance of a sampled point corresponding to a particular ray, wherein a pixel value of the particular ray is generated through alpha-composition of a plurality of points including the sampled point along the particular ray corresponding to respective radiance values.

11. The method of claim 10, further comprising: receiving a first 2D image of the collection of 2D images; based on the first 2D image, selecting a ray origin and normalized direction of a camera pose associated with a second 2D image associated with a different camera pose than the first 2D image; applying the second machine learning model to the ray origin and normalized direction of the camera pose associated with the second 2D image to predict a pixel value for the second 2D image, wherein applying the second machine learning model comprises: applying the second machine learning model to each of a plurality of points along a ray formed by the ray origin and normalized direction of the camera pose associated with the second 2D image to generate a plurality of radiance values; and performing alpha-composition of the plurality of radiance values to predict the pixel value for the second 2D image, wherein the ray origin and normalized direction of the camera pose is randomly selected based on a uniform distribution.

12. The method of claim 8, wherein the training data excludes the collection of 2D images.

13. The method of claim 7, further comprising: generating a plurality of losses for the training data; sorting the plurality of losses in ascending order; and identifying a quantity of samples in the sorted plurality of losses that transgress a specified threshold.

14. The method of claim 13, further comprising augmenting the training data with additional training data corresponding to the quantity of samples.

15. The method of claim 1, further comprising displaying a virtual element associated with an augmented reality or virtual reality experience on a user device within a video comprising the set of images based on the model of the scene.

16. A system comprising: at least one processor; and a memory component having instructions stored thereon that, when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving a set of images representing a first view of a scene; applying a machine learning model comprising a neural light field network to the set of images to predict pixel values of a target image, the pixel values of the target image being predicted without processing multiple points along a camera ray directed to a given pixel value; and generating a model of the scene based on the set of images and the target image.

17. The system of claim 16, wherein the pixel values of the target image are predicted without by the machine learning model without determining radiance values of points along one or more rays.

18. The system of claim 17, the pixel values of the target image being predicted without processing the set of images with a Neural Radiance Field (NeRF) network.

19. The system of claim 18, wherein the NeRF network is used to generate training data for the machine learning model comprising pseudo-generated training images representing different camera poses of an individual scene.

20. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed by at least one processor, cause the at least one processor to perform operations comprising: receiving a set of images representing a first view of a scene; applying a machine learning model comprising a neural light field network to the set of images to predict pixel values of a target image, the pixel values of the target image being predicted without processing multiple points along a camera ray directed to a given pixel value; and generating a model of the scene based on the set of images and the target image.
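
Claims 5, 10, 11 and 13 describe three supporting steps: stratified sampling of points along a ray, alpha-composition of per-point radiance into one pixel value by a radiance field network, and counting the training losses that pass a threshold. The sketch below illustrates each under stated assumptions. The claims do not spell out the compositing weights, so the code uses the standard volume-rendering form with a per-point density, and it reads "transgress a specified threshold" as "exceed" it. All function names and parameters are hypothetical.

```python
import bisect
import math
import random


def stratified_depths(near, far, n, rng):
    """Draw one uniform random depth inside each of n equal bins on [near, far]."""
    width = (far - near) / n
    return [near + (i + rng.random()) * width for i in range(n)]


def alpha_composite(colors, densities, depths, far):
    """Blend per-point RGB colors front to back into one pixel color.

    Assumes the common formulation alpha_i = 1 - exp(-density_i * delta_i),
    where delta_i is the distance to the next sample (or to `far`).
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for i, (color, sigma) in enumerate(zip(colors, densities)):
        next_depth = depths[i + 1] if i + 1 < len(depths) else far
        alpha = 1.0 - math.exp(-sigma * (next_depth - depths[i]))
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel


def count_hard_samples(losses, threshold):
    """Sort losses in ascending order and count those above the threshold."""
    ordered = sorted(losses)
    return len(ordered) - bisect.bisect_right(ordered, threshold)


rng = random.Random(0)
depths = stratified_depths(near=2.0, far=6.0, n=4, rng=rng)
colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
pixel = alpha_composite(colors, [0.5, 0.5, 0.5, 0.5], depths, far=6.0)
hard = count_hard_samples([0.5, 0.1, 0.9, 0.3], threshold=0.4)
```

Compositing needs one radiance value per sampled point, so a radiance field network is evaluated many times per pixel. The independent claims instead recite predicting pixel values "without processing multiple points along a camera ray". In claim 14, the count returned by the last function determines how much additional training data is added.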

Assignees

Snap Inc

Inventors

Classifications

  • Artificial neural networks [ANN]

  • Training; Learning

  • Determining parameters from multiple pictures (depth or shape recovery from multiple images G06T7/55; stereo camera calibration G06T7/85)

  • G06T15/06 (Primary): Ray-tracing

  • G06T17/00 (Primary): Three-dimensional [3D] modelling for computer graphics

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12450822B2 cover?
Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the…
Who is the assignee on this patent?
Snap Inc
What technology area does this patent fall under?
Primary CPC classification G06T15/06. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 21, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).