Rendering Video Of A Scene Using Three-Dimensional Gaussians

US2026024268A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2026024268-A1
Application number: US-202418779232-A
Country: US
Kind code: A1
Filing date: Jul 22, 2024
Priority date: Jul 22, 2024
Publication date: Jan 22, 2026
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A set of images of a scene are received. Each image includes temporal data and spatial data relating to the scene. Based on the spatial data of each image, three-dimensional (3D) Gaussian splatting data is generated. The temporal data of each image and the 3D Gaussian splatting data are inputted to a neural network to generate spatial-temporal 3D Gaussian embeddings. Offset data based on the spatial-temporal 3D Gaussian embeddings is generated. The video of the scene is rendered based on the 3D Gaussian splatting data and the offset data, allowing for improved rendering of video of the scene.
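The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names make_mlp, run_mlp, embed_net, offset_net and gaussians_at are hypothetical, the network sizes and random weights are placeholders for trained models, only position parameters are modeled, and the final rasterization step is omitted.

```python
import math
import random

def make_mlp(sizes, seed):
    """Build a tiny MLP as a list of (weights, biases) layers.
    Weights are random placeholders standing in for trained values."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def run_mlp(layers, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

# Static 3D Gaussian splatting data (positions only, for brevity).
gaussians = [[0.0, 0.0, 1.0], [0.5, -0.2, 2.0]]

# Assumed sizes: (x, y, z, t) -> 6-dim embedding -> 3-dim position offset.
embed_net = make_mlp([4, 8, 6], seed=0)
offset_net = make_mlp([6, 8, 3], seed=1)

def gaussians_at(t):
    """Return Gaussian positions deformed to timestamp t."""
    deformed = []
    for pos in gaussians:
        embedding = run_mlp(embed_net, pos + [t])   # spatial-temporal embedding
        offset = run_mlp(offset_net, embedding)     # offset data
        deformed.append([p + o for p, o in zip(pos, offset)])
    return deformed

# One deformed Gaussian set per video frame; a rasterizer would consume these.
frames = [gaussians_at(t) for t in (0.0, 0.5, 1.0)]
```

In this sketch the static Gaussians are stored once and only the small networks are evaluated per timestamp, which is one way to read the split between "3D Gaussian splatting data" and "offset data" in the abstract.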

First claim

Opening claim text (preview).

1. A method of rendering video of a scene, comprising: receiving a set of images of the scene, wherein each image comprises temporal data and spatial data relating to the scene; generating, based on the spatial data of each image, three-dimensional (3D) Gaussian splatting data; inputting the temporal data of each image and the 3D Gaussian splatting data to a neural network to generate spatial-temporal 3D Gaussian embeddings; generating offset data based on the spatial-temporal 3D Gaussian embeddings; and rendering the video of the scene based on the 3D Gaussian splatting data and the offset data.

2. The method of claim 1, wherein rendering the video of the scene comprises: combining the 3D Gaussian splatting data with the offset data to generate spatial-temporal 3D Gaussian representations of the scene; and rendering the video of the scene by inputting the spatial-temporal 3D Gaussian representations to a rasterizer.

3. The method of claim 1, wherein generating the 3D Gaussian splatting data comprises generating the 3D Gaussian splatting data using 3D point cloud reconstruction.

4. The method of claim 1, wherein generating the 3D Gaussian splatting data comprises inputting the spatial data of each image to a machine learning model trained to generate 3D Gaussian splatting data based on spatial data from one or more images.

5. The method of claim 1, wherein: each image further comprises viewpoint data of the scene; and generating the offset data is further based on the viewpoint data.

6. The method of claim 5, wherein generating the offset data comprises inputting the viewpoint data to a neural network to generate one or more spherical harmonics offset parameters.

7. The method of claim 1, wherein the neural network is a multi-layer perceptron.

8. The method of claim 1, wherein generating the offset data comprises inputting the spatial-temporal 3D Gaussian embeddings to one or more neural networks.

9. The method of claim 8, wherein at least one of the one or more neural networks is a multi-layer perceptron.

10. The method of claim 8, wherein each of the one or more neural networks is a multi-layer perceptron.

11. The method of claim 1, wherein: generating the 3D Gaussian splatting data comprises generating one or more of: one or more 3D Gaussian position parameters; one or more 3D Gaussian scale parameters; one or more 3D Gaussian rotation parameters; and one or more 3D Gaussian opacity parameters; and generating the offset data comprises inputting one or more of: the one or more 3D Gaussian position parameters to a neural network to generate one or more position offset parameters; the one or more 3D Gaussian scale parameters to a neural network to generate one or more scale offset parameters; the one or more 3D Gaussian rotation parameters to a neural network to generate one or more rotation offset parameters; and the one or more 3D Gaussian opacity parameters to a neural network to generate one or more opacity offset parameters.

12. The method of claim 1, wherein generating the 3D Gaussian splatting data comprises: identifying, within the spatial data of each image: foreground spatial data relating a foreground of the scene; and background spatial data relating a background of the scene; generating, based on the background spatial data, background 3D Gaussian splatting data; and generating, based on the foreground spatial data, foreground 3D Gaussian splatting data.

13. The method of claim 12, wherein generating the spatial-temporal 3D Gaussian embeddings comprises: generating background spatial-temporal 3D Gaussian embeddings based on the temporal data of each image and the background 3D Gaussian splatting data; and generating foreground spatial-temporal 3D Gaussian embeddings based on the temporal data of each image and the foreground 3D Gaussian splatting data.

14. The method of claim 13, wherein generating the offset data comprises: generating background offset data based on the background spatial-temporal 3D Gaussian embeddings; and generating foreground offset data based on the foreground spatial-temporal 3D Gaussian embeddings.

15. The method of claim 14, wherein rendering the video of the scene comprises: combining the background 3D Gaussian splatting data with the background offset data to generate spatial-temporal 3D Gaussian representations of the background of the scene; combining the foreground 3D Gaussian splatting data with the foreground offset data to generate spatial-temporal 3D Gaussian representations of the foreground of the scene; and rendering the video of the scene by inputting the spatial-temporal 3D Gaussian representations of the background and the foreground of the scene to the rasterizer.

16. The method of claim 14, wherein: generating the background offset data comprises inputting the background spatial-temporal 3D Gaussian embeddings to a single neural network to generate the background offset data; and generating the foreground offset data comprises inputting the foreground spatial-temporal 3D Gaussian embeddings to a single neural network to generate the foreground offset data.

17. The method of claim 1, wherein generating the offset data comprises inputting the spatial-temporal 3D Gaussian embeddings to a single neural network to generate the offset data.

18. A non-transitory, computer-readable medium storing computer program code configured, when executed by one or more processors, to cause the one or more processors to perform a method comprising: receiving a set of images of a scene, wherein each image comprises temporal data and spatial data relating to the scene; generating, based on the spatial data of each image, three-dimensional (3D) Gaussian splatting data; inputting the temporal data of each image and the 3D Gaussian splatting data to a neural network to generate spatial-temporal 3D Gaussian embeddings; generating offset data based on the spatial-temporal 3D Gaussian embeddings; and rendering the video of the scene based on the 3D Gaussian splatting data and the offset data.

19. A computing device comprising one or more graphics processors operable to render video of a scene by: receiving a set of images of a scene, wherein each image comprises temporal data and spatial data relating to the scene; generating, based on the spatial data of each image, three-dimensional (3D) Gaussian splatting data; inputting the temporal data of each image and the 3D Gaussian splatting data to a neural network to generate spatial-temporal 3D Gaussian embeddings; generating offset data based on the spatial-temporal 3D Gaussian embeddings; and rendering the video of the scene based on the 3D Gaussian splatting data and the offset data.
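The combining steps in claims 2, 11 and 15 can be sketched as follows. This is an illustration under stated assumptions, not the claimed method. The Gaussian and Offsets types and the combine and rasterizer_input functions are hypothetical names. The claims say only that the splatting data and offset data are "combined", so the additive combination, the quaternion renormalization and the opacity clamp are assumptions. The offsets are hard-coded stand-ins for neural network outputs, and the rasterizer itself is not modeled.

```python
import math
from dataclasses import dataclass

@dataclass
class Gaussian:
    position: list   # [x, y, z]
    scale: list      # [sx, sy, sz]
    rotation: list   # unit quaternion [w, x, y, z]
    opacity: float   # in [0, 1]

@dataclass
class Offsets:
    position: list
    scale: list
    rotation: list
    opacity: float

def combine(g, d):
    """Apply offsets additively (an assumption), then restore valid ranges."""
    rot = [r + dr for r, dr in zip(g.rotation, d.rotation)]
    norm = math.sqrt(sum(r * r for r in rot)) or 1.0  # avoid divide-by-zero
    return Gaussian(
        position=[p + dp for p, dp in zip(g.position, d.position)],
        scale=[max(s + ds, 1e-6) for s, ds in zip(g.scale, d.scale)],
        rotation=[r / norm for r in rot],
        opacity=min(max(g.opacity + d.opacity, 0.0), 1.0),
    )

def rasterizer_input(background, bg_offsets, foreground, fg_offsets):
    """Deform each group with its own offsets, then merge the groups
    into a single list for one rasterizer call."""
    bg = [combine(g, d) for g, d in zip(background, bg_offsets)]
    fg = [combine(g, d) for g, d in zip(foreground, fg_offsets)]
    return bg + fg

# Stand-in data: a static background Gaussian and a moving foreground one.
static_bg = [Gaussian([0.0, 0.0, 5.0], [1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0], 0.9)]
static_fg = [Gaussian([0.2, 0.0, 1.0], [0.1, 0.1, 0.1], [1.0, 0.0, 0.0, 0.0], 0.8)]
bg_off = [Offsets([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], 0.0)]
fg_off = [Offsets([0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], 0.5)]

deformed = rasterizer_input(static_bg, bg_off, static_fg, fg_off)
```

Keeping one offset field per Gaussian parameter mirrors the per-parameter offsets listed in claim 11, and handling background and foreground as separate lists until the final merge mirrors claims 12 to 15.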

Assignees

Shenzhen Yinwang Intelligent Technology Co Ltd

Inventors

Classifications

  • Three-dimensional [3D] animation · CPC title

  • Perspective computation · CPC title

  • Particle system, point based geometry or rendering · CPC title

  • G06T15/08 · Primary

    Volume rendering · CPC title

  • Navigation within 3D models or images · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2026024268A1 cover?
A set of images of a scene are received. Each image includes temporal data and spatial data relating to the scene. Based on the spatial data of each image, three-dimensional (3D) Gaussian splatting data is generated. The temporal data of each image and the 3D Gaussian splatting data are inputted to a neural network to generate spatial-temporal 3D Gaussian embeddings. Offset data based on the spa…
Who is the assignee on this patent?
Shenzhen Yinwang Intelligent Technology Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T15/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 22, 2026 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).