3D modeling based on neural light field

US12002146B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12002146-B2
Application number	US-202217656778-A
Country	US
Kind code	B2
Filing date	Mar 28, 2022
Priority date	Mar 28, 2022
Publication date	Jun 4, 2024
Grant date	Jun 4, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
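The central idea in the abstract, a network that maps a ray straight to a pixel value, can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the `near`/`far` bounds, the layer sizes, and the random untrained weights are all assumptions made for illustration. The choice to feed the network a concatenation of points sampled along the ray, and the residual MLP shape, follow claims 2 to 6 on this page.

```python
import numpy as np

def sample_points(origin, direction, near, far, num_points, rng=None):
    """Return num_points 3D points on the ray origin + t * direction.

    With rng=None the points are evenly spaced; with an rng, one point is
    drawn uniformly from each of num_points equal bins (stratified sampling).
    The near/far bounds are assumed for this sketch.
    """
    direction = direction / np.linalg.norm(direction)  # normalized ray direction
    if rng is None:
        t = np.linspace(near, far, num_points)
    else:
        edges = np.linspace(near, far, num_points + 1)
        t = rng.uniform(edges[:-1], edges[1:])
    return origin + t[:, None] * direction

def init_params(num_points, width, num_blocks, rng):
    """Random, untrained weights for a small residual MLP (hypothetical sizes)."""
    def layer(n_in, n_out):
        return rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_out, n_in)), np.zeros(n_out)
    return {
        "in": layer(3 * num_points, width),
        "blocks": [layer(width, width) for _ in range(num_blocks)],
        "out": layer(width, 3),
    }

def predict_pixel(params, points):
    """Map one ray, given as concatenated points, to one RGB value in a single pass."""
    x = points.reshape(-1)                       # concatenate the sampled points
    w, b = params["in"]
    h = np.maximum(w @ x + b, 0.0)               # ReLU input layer
    for w, b in params["blocks"]:
        h = h + np.maximum(w @ h + b, 0.0)       # residual block with skip connection
    w, b = params["out"]
    return 1.0 / (1.0 + np.exp(-(w @ h + b)))    # sigmoid keeps RGB within [0, 1]

rng = np.random.default_rng(0)
params = init_params(num_points=8, width=32, num_blocks=4, rng=rng)
points = sample_points(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                       near=1.0, far=5.0, num_points=8)
rgb = predict_pixel(params, points)              # one network evaluation per ray
```

The point of the design is the cost per pixel: the whole ray goes through the network once, rather than once per sampled point followed by compositing.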

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: receiving, by one or more processors, a set of two-dimensional (2D) images representing a first view of a real-world environment; applying, by the one or more processors, a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value, the pixel values of the target image being predicted without processing multiple points along a camera ray directed to the given pixel value; and generating, by the one or more processors, a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
2. The method of claim 1, wherein the machine learning model comprises a deep residual Multi-Layer Perceptron (MLP) network.
3. The method of claim 1, further comprising: selecting a ray origin and direction associated with the second view of the real-world environment; using a given 2D image of the set of 2D images representing the first view of the real-world environment, generating a ray based on the ray origin and direction corresponding to the second view of the real-world environment; and sampling a plurality of points along the ray.
4. The method of claim 3, wherein the plurality of points is spaced evenly along the ray.
5. The method of claim 3, wherein the plurality of points is randomly sampled based on stratified sampling during training of the machine learning model.
6. The method of claim 3, further comprising: concatenating the plurality of points to generate input data; and processing the input data with the machine learning model to predict one of the pixel values of the 2D target image.
7. The method of claim 1, further comprising training the machine learning model by performing operations comprising: receiving training data comprising training images and associated camera poses, the training images and associated camera poses being associated with a plurality of training ray origins and normalized ray directions and corresponding ground-truth pixel values; obtaining the training ray origin and normalized ray direction of a first training image of the training images and associated camera pose; applying the machine learning model to a set of points along a training ray formed by the training ray origin and normalized ray direction to predict a training pixel value; retrieving the ground-truth pixel value associated with the first training image; computing a deviation between the predicted training pixel value and the ground-truth pixel value; and updating one or more parameters of the machine learning model based on the deviation.
8. The method of claim 1, wherein the machine learning model comprises a first machine learning model, further comprising applying a second machine learning model to a collection of 2D images to generate training data used to train the machine learning model.
9. The method of claim 8, wherein the second machine learning model comprises a trained neural radiance field network.
10. The method of claim 9, wherein the neural radiance field network generates an output representing radiance of a sampled point corresponding to a particular ray, wherein a pixel value of the particular ray is generated through alpha-composition of a plurality of points including the sampled point along the particular ray corresponding to respective radiance values.
11. The method of claim 10, further comprising: receiving a first 2D image of the collection of 2D images; based on the first 2D image, selecting a ray origin and normalized direction of a camera pose associated with a second 2D image associated with a different camera pose than the first 2D image; applying the second machine learning model to the ray origin and normalized direction of the camera pose associated with the second 2D image to predict a pixel value for the second 2D image, wherein applying the second machine learning model comprises: applying the second machine learning model to each of a plurality of points along a ray formed by the ray origin and normalized direction of the camera pose associated with the second 2D image to generate a plurality of radiance values; and performing alpha-composition of the plurality of radiance values to predict the pixel value for the second 2D image, wherein the ray origin and normalized direction of the camera pose is randomly selected based on a uniform distribution.
12. The method of claim 8, wherein the training data excludes the collection of 2D images.
13. The method of claim 7, further comprising: generating a plurality of losses for the training data; sorting the plurality of losses in ascending order; and identifying a quantity of samples in the sorted plurality of losses that transgress a specified threshold.
14. The method of claim 13, further comprising augmenting the training data with additional training data corresponding to the quantity of samples.
15. The method of claim 1, further comprising displaying a virtual element associated with an augmented reality or virtual reality experience on a client device within a video comprising the set of 2D images based on the 3D model of the real-world environment.
16. A system comprising: at least one processor of a device; and a memory component having instructions stored thereon that, when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value, the pixel values of the target image being predicted without processing multiple points along a camera ray directed to the given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
17. The system of claim 16, wherein the pixel values of the target image are predicted without by the machine learning model without determining radiance values of points along one or more rays.
18. The system of claim 17, the pixel values of the target image being predicted without processing the set of 2D images with a Neural Radiance Field (NeRF) network.
19. The system of claim 18, wherein the NeRF network is used to generate training data for the machine learning model comprising pseudo-generated training images representing different camera poses of an individual real-world environment.
20. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed by at least one processor of a device, cause the at least one processor to perform operations comprising: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value, the p
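Two training-side steps in the claims can be shown in a short NumPy sketch: the alpha-composition a neural radiance field teacher uses to turn per-point radiance into one pixel (claims 10 and 11), and the sort-and-count step over per-sample losses (claim 13). The claims do not spell out the formulas, so this assumes the standard volume-rendering form of alpha-composition with densities `sigmas` and segment lengths `deltas`, and reads "transgress a specified threshold" as "exceed the threshold". All names are hypothetical.

```python
import numpy as np

def alpha_composite(sigmas, colors, deltas):
    """Blend per-point radiance along one ray into a single pixel value.

    sigmas: (N,) densities, colors: (N, 3) radiance values, deltas: (N,)
    segment lengths. Assumes the usual NeRF-style weighting, which the
    claim text does not specify.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)                   # opacity per segment
    # Transmittance: fraction of light that survives up to each point.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

def count_hard_samples(losses, threshold):
    """Sort losses in ascending order and count those above the threshold."""
    sorted_losses = np.sort(losses)
    # Index of the first sorted loss strictly greater than the threshold.
    first_hard = np.searchsorted(sorted_losses, threshold, side="right")
    return sorted_losses, len(sorted_losses) - first_hard

# A teacher pixel from three sampled points (made-up values).
pixel = alpha_composite(
    sigmas=np.array([0.1, 2.0, 5.0]),
    colors=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),
    deltas=np.array([0.5, 0.5, 0.5]),
)

# Count training samples whose loss exceeds a made-up threshold.
sorted_losses, num_hard = count_hard_samples(np.array([0.5, 0.1, 0.9, 0.3]), 0.4)
```

Under claim 14, a count like `num_hard` would determine how much additional training data to add; how that data is chosen is not shown on this page.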

Assignees

Snap Inc

Inventors

Classifications

G06T2207/20084
Artificial neural networks [ANN] · CPC title
G06T2207/20081
Training; Learning · CPC title
G06T15/06 · Primary
Ray-tracing · CPC title
G06T7/97
Determining parameters from multiple pictures (depth or shape recovery from multiple images G06T7/55; stereo camera calibration G06T7/85) · CPC title
G06T17/00 · Primary
Three-dimensional [3D] modelling for computer graphics · CPC title

Patent family

Related publications grouped by family.

View patent family 86100158

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12002146B2 cover? Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the…
Who is the assignee on this patent? Snap Inc
What technology area does this patent fall under? The primary CPC classification is G06T15/06 (Ray-tracing). Mapped technology areas include Physics.
When was this patent published? It was published on Jun 4, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).