Image processing apparatus, image processing method, and storage medium
US-2024428519-A1 · Dec 26, 2024 · US
US12530839B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12530839-B2 |
| Application number | US-202318505009-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 8, 2023 |
| Priority date | Nov 11, 2022 |
| Publication date | Jan 20, 2026 |
| Grant date | Jan 20, 2026 |
Official abstract text for this publication.
The present invention sets forth a technique for generating two-dimensional (2D) renderings of a three-dimensional (3D) scene from an arbitrary camera position under arbitrary lighting conditions. This technique includes determining, based on a plurality of 2D representations of a 3D scene, a radiance field function for a neural radiance field (NeRF) model. This technique further includes determining, based on a plurality of 2D representations of a 3D scene, a radiance field function for a “one light at a time” (OLAT) model. The technique further includes rendering a 2D representation of the scene based on a given camera position and illumination data. The technique further includes computing a rendering loss based on the difference between the rendered 2D representation and an associated one of the plurality of 2D representations of the scene. The technique further includes modifying at least one of the NeRF and OLAT models based on the rendering loss.
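The rendering and loss steps described in the abstract can be illustrated with a minimal sketch. It assumes the standard NeRF volume-rendering quadrature along a single camera ray and a per-sample color equal to the sum of the diffuse and specular values; the abstract states neither detail. The function names `composite_ray` and `mse_loss` and the sample spacing `deltas` are hypothetical, and the mean-squared error is just one possible loss (the one named in claim 18).

```python
import math


def composite_ray(densities, diffuse, specular, deltas):
    """Blend per-sample values along one ray into an RGB pixel color.

    Assumptions (not taken from the patent text): standard NeRF
    quadrature, and per-sample color = diffuse + specular.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    for sigma, d, s, delta in zip(densities, diffuse, specular, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for ch in range(3):
            pixel[ch] += weight * (d[ch] + s[ch])
        transmittance *= 1.0 - alpha
    return pixel


def mse_loss(rendered, ground_truth):
    """Mean-squared error between a rendered and a ground-truth color."""
    return sum((r - g) ** 2 for r, g in zip(rendered, ground_truth)) / len(rendered)


# Usage: two samples on one ray, compared against a hypothetical target color.
pixel = composite_ray(
    densities=[0.5, 2.0],
    diffuse=[[0.2, 0.2, 0.2], [0.6, 0.1, 0.1]],
    specular=[[0.1, 0.1, 0.1], [0.0, 0.0, 0.0]],
    deltas=[1.0, 1.0],
)
loss = mse_loss(pixel, [0.5, 0.2, 0.2])
```

In a training loop, a loss of this kind would be backpropagated to update one or both models; that step needs an automatic-differentiation framework and is omitted here.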
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for performing scene rendering, the computer-implemented method comprising: determining, for each of a plurality of three-dimensional (3D) locations in a 3D scene via execution of a first trained machine learning model, a density value associated with the 3D location; determining, for each of the plurality of 3D locations via execution of a second trained machine learning model, a diffuse color value and a specular color value associated with the 3D location based on a given camera location and a given lighting map; determining a pixel color value for each of a plurality of pixels in a two-dimensional (2D) representation of the scene based on the density values, the diffuse color values, and the specular color values associated with the plurality of 3D locations; and generating, based on the pixel color values associated with the plurality of pixels, a 2D rendering of the scene.

2. The computer-implemented method of claim 1, further comprising: generating, using the first trained machine learning model, a neural feature vector based on the 3D location; and transmitting the neural feature vector to the second trained machine learning model.

3. The computer-implemented method of claim 1, wherein the first trained machine learning model is a neural radiance field (NeRF) model and the second trained machine learning model is a “one light at a time” (OLAT) model.

4. The computer-implemented method of claim 1, wherein the lighting map includes position information associated with each of a plurality of light sources.

5. The computer-implemented method of claim 4, further comprising: generating, based on the 3D location and the lighting map, a light direction that represents a direction from one of the plurality of light sources to the 3D location in the scene; calculating, using a spherical codebook, an OLAT code based on the light direction; and transmitting the OLAT code to the second trained machine learning model, wherein the second trained machine learning model is a “one light at a time” (OLAT) model.

6. The computer-implemented method of claim 5, wherein calculating the OLAT code further comprises multiplying a spherical harmonic parameterization of the light direction by a matrix of learned coefficients included in the spherical codebook.

7. The computer-implemented method of claim 5, wherein for each 3D location, the steps of generating the light direction, calculating the OLAT code based on the light direction, and transmitting the OLAT code to the second trained machine learning model are repeated for each of the plurality of light sources included in the lighting map.

8. One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of: determining, for each of a plurality of three-dimensional (3D) locations in a 3D scene via execution of a first trained machine learning model, a density value associated with the 3D location; determining, for each of the plurality of 3D locations via execution of a second trained machine learning model, a diffuse color value and a specular color value associated with the 3D location based on a given camera location and a given lighting map; determining a pixel color value for each of a plurality of pixels in a two-dimensional (2D) representation of the scene based on the density values, diffuse color values, and specular color values associated with the plurality of 3D locations; and generating, based on the pixel color values associated with the plurality of pixels, a 2D rendering of the scene.

9. The one or more non-transitory computer-readable media of claim 8, wherein the instructions further cause the one or more processors to perform the steps of: generating, using the first trained machine learning model, a neural feature vector based on the 3D location; and transmitting the neural feature vector to the second trained machine learning model.

10. The one or more non-transitory computer-readable media of claim 8, wherein the first trained machine learning model is a neural radiance field (NeRF) model and the second trained machine learning model is a one light at a time (OLAT) model.

11. The one or more non-transitory computer-readable media of claim 8, wherein the lighting map includes position information associated with each of a plurality of light sources.

12. The one or more non-transitory computer-readable media of claim 11, wherein the instructions further cause the one or more processors to perform the steps of: generating, based on the 3D location and the lighting map, a light direction that represents a direction from one of the plurality of light sources included in the lighting map to the 3D location in the scene; calculating, using a spherical codebook, an OLAT code based on the light direction; and transmitting the OLAT code to the second trained machine learning model.

13. The one or more non-transitory computer-readable media of claim 12, wherein calculating the OLAT code further comprises multiplying a spherical harmonic parameterization of the light direction by a matrix of learned coefficients included in the spherical codebook.

14. The one or more non-transitory computer-readable media of claim 12, wherein for each 3D location, the steps of generating the light direction, calculating the OLAT code based on the light direction, and transmitting the OLAT code to the second trained machine learning model are repeated for each of the plurality of light sources included in the lighting map.

15. A computer-implemented method for performing scene rendering, the computer-implemented method comprising: determining, based on a plurality of two-dimensional (2D) representations of a three-dimensional (3D) scene, a first radiance field function associated with a first machine learning model; determining, based on the plurality of 2D representations of the 3D scene, a second radiance field function associated with a second machine learning model; generating a combined radiance field function based on the radiance field functions associated with the first machine learning model and the second machine learning model; generating, based on the combined radiance field function, a color value for a pixel in a 2D rendering of the scene; computing a rendering loss based on a difference between the color value for the pixel and a ground truth color value associated with a corresponding pixel in a corresponding one of the plurality of 2D representations; and modifying at least one of the first machine learning model or the second machine learning model based on the rendering loss.

16. The computer-implemented method of claim 15, further comprising: generating, using the first machine learning model, a neural feature vector based on a 3D location in the 3D scene; and transmitting the neural feature vector to the second machine learning model.

17. The computer-implemented method of claim 15, wherein the first machine learning model is a neural radiance field (NeRF) model and the second machine learning model is a “one light at a time” (OLAT) model.

18. The computer-implemented method of claim 15, wherein the rendering loss is a mean-squared error measurement of the difference between the color value for the pixel and the ground truth color value.

19. The computer-implemented method of claim 15, further comprising iteratively training the
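
The OLAT code computation in the claims on the lighting map and spherical codebook (a light direction per light source, a spherical harmonic parameterization, and a multiplication by learned coefficients) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the spherical harmonic degree (0 and 1 only, four real basis functions), the code dimension, the coefficient values, and all function names are hypothetical. Sign conventions for real spherical harmonics vary; the version below omits the Condon-Shortley phase.

```python
import math


def light_direction(light_pos, location):
    """Unit vector pointing from a light source to a 3D location."""
    diff = [p - l for p, l in zip(location, light_pos)]
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0.0:
        raise ValueError("light source coincides with the 3D location")
    return [c / norm for c in diff]


def sh_basis(direction):
    """Real spherical harmonics of degree 0 and 1 (assumed truncation)."""
    x, y, z = direction
    c0 = 0.5 * math.sqrt(1.0 / math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    return [c0, c1 * y, c1 * z, c1 * x]


def olat_code(codebook, direction):
    """Multiply the coefficient matrix (one row per code entry) by the basis."""
    basis = sh_basis(direction)
    return [sum(w * b for w, b in zip(row, basis)) for row in codebook]


def olat_codes_for_lights(codebook, lighting_map, location):
    """Repeat the direction and code steps for every light in the map."""
    return [olat_code(codebook, light_direction(pos, location)) for pos in lighting_map]


# Usage: a 2 x 4 placeholder codebook standing in for learned coefficients.
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
lights = [[0.0, 0.0, 2.0], [1.0, 0.0, 0.0]]  # hypothetical light positions
codes = olat_codes_for_lights(codebook, lights, [0.0, 0.0, 0.0])
```

Each resulting code would then be passed to the second model together with the other inputs named in the claims, such as the neural feature vector and the camera location.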
Color image · CPC title
Colour editing, changing, or manipulating; Use of colour codes · CPC title
Training; Learning · CPC title
Artificial neural networks [ANN] · CPC title
Collision detection, intersection · CPC title