Neural Super-sampling for Real-time Rendering
US-2022277421-A1 · Sep 1, 2022 · US
US11644685B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11644685-B2 |
| Application number | US-202016993788-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 14, 2020 |
| Priority date | Aug 14, 2020 |
| Publication date | May 9, 2023 |
| Grant date | May 9, 2023 |
Official abstract text for this publication.
In one embodiment, a method includes accessing a pair of stereo images for a scene, where each image of the pair of stereo images has incomplete pixel information and k channels, stacking the pair of stereo images to form a stacked input image with 2k channels, processing the stacked input image using a machine-learning model to generate a stacked output image with 2k channels, and separating the stacked output image with 2k channels into a pair of reconstructed stereo images for the scene, where each image of the pair of reconstructed stereo images has complete pixel information and k channels.
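The stack, process, and separate steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes images are nested lists of shape H x W x k, and the names `stack_pair`, `split_pair`, `reconstruct`, and `identity_model` are hypothetical. The `identity_model` placeholder stands in for the machine-learning model and does not fill in any missing pixels.

```python
from typing import Callable, List, Tuple

# Assumed representation: image[y][x] is a list of channel values.
Image = List[List[List[float]]]


def stack_pair(left: Image, right: Image) -> Image:
    """Concatenate channels per pixel: two H x W x k images -> one H x W x 2k image."""
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("stereo images must share height and width")
    return [
        [lpx + rpx for lpx, rpx in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


def split_pair(stacked: Image, k: int) -> Tuple[Image, Image]:
    """Separate an H x W x 2k image back into two H x W x k images."""
    left = [[px[:k] for px in row] for row in stacked]
    right = [[px[k:] for px in row] for row in stacked]
    return left, right


def reconstruct(
    left: Image, right: Image, model: Callable[[Image], Image]
) -> Tuple[Image, Image]:
    """Stack the pair, run the model on the 2k-channel image, then separate."""
    k = len(left[0][0])
    stacked_out = model(stack_pair(left, right))
    return split_pair(stacked_out, k)


def identity_model(stacked: Image) -> Image:
    """Hypothetical placeholder for the reconstruction model (returns its input)."""
    return stacked
```

With k = 3 (RGB), `stack_pair` yields 6 channels per pixel, and `split_pair` recovers the two 3-channel images in the original left and right order.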
Opening claim text (preview).
What is claimed is:
1. A method comprising, by a computing device: accessing a pair of stereo images for a scene, wherein each image of the pair of stereo images has incomplete pixel information and k channels; stacking the pair of stereo images to form a stacked input image with 2k channels by: calculating an importance score associated with each area among a plurality of areas in the scene; identifying an area with a highest importance score among the plurality of areas in the scene; and stacking the channels of both images by aligning the identified area between the pair of stereo images; processing the stacked input image using a machine-learning model to generate a stacked output image with 2k channels; and separating the stacked output image with 2k channels into a pair of reconstructed stereo images for the scene, wherein each image of the pair of reconstructed stereo images has complete pixel information and k channels.
2. The method of claim 1, wherein the pair of stereo images is used to provide a stereoscopic view of the scene to a user.
3. The method of claim 1, wherein an object captured in one of the pair of stereo images is shifted from the other image, wherein a degree of the shift is associated with a distance of the object from a viewpoint of a user.
4. The method of claim 1, wherein stacking the pair of stereo images to form the stacked input image with 2k channels comprises stacking the channels of both images by aligning pixel coordinates between the pair of stereo images.
5. The method of claim 1, wherein calculating the importance score associated with each area is based on a relative distance of the area from a vergence location of a user such that a higher importance score is assigned to a first area with a smaller distance to the vergence location of the user than a second area with a larger distance to the vergence location of the user.
6. The method of claim 1, wherein calculating the importance score associated with each area is based on content associated each area such that a higher importance score is assigned to a first area that is associated with an important content than a second area that is not associated with an important content.
7. The method of claim 1, wherein the k channels comprise RGB channels.
8. The method of claim 1, wherein the k channels comprise RGB channels and an alpha channel, wherein the alpha channel indicates a transparency level of each pixel.
9. The method of claim 1, wherein the pair of stereo images is associated with a frame in a video stream.
10. The method of claim 1, wherein the machine-learning model is an image reconstruction model that reconstructs restores sampled, noisy or damaged images.
11. The method of claim 1, wherein the machine-learning model is trained with a loss function that measures differences between each image of the pair of reconstructed stereo images and a corresponding image of a pair of ground truth stereo images.
12. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access a pair of stereo images for a scene, wherein each image of the pair of stereo images has incomplete pixel information and k channels; stack the pair of stereo images to form a stacked input image with 2k channels by: calculating an importance score associated with each area among a plurality of areas in the scene; identifying an area with a highest importance score among the plurality of areas in the scene; and stacking the channels of both images by aligning the identified area between the pair of stereo images; process the stacked input image using a machine-learning model to generate a stacked output image with 2k channels; and separate the stacked output image with 2k channels into a pair of reconstructed stereo images for the scene, wherein each image of the pair of reconstructed stereo images has complete pixel information and k channels.
13. The media of claim 12, wherein the pair of stereo images is used to provide a stereoscopic view of the scene to a user.
14. The media of claim 12, wherein an object captured in one of the pair of stereo images is shifted from the other image, wherein a degree of the shift is associated with a distance of the object from a viewpoint of a user.
15. The media of claim 12, wherein stacking the pair of stereo images to form the stacked input image with 2k channels comprises stacking the channels of both images by aligning pixel coordinates between the pair of stereo images.
16. The media of claim 12, wherein calculating the importance score associated with each area is based on a relative distance of the area from a vergence location of a user such that a higher importance score is assigned to a first area with a smaller distance to the vergence location of the user than a second area with a larger distance to the vergence location of the user.
17. The media of claim 12, wherein calculating the importance score associated with each area is based on content associated each area such that a higher importance score is assigned to a first area that is associated with an important content than a second area that is not associated with an important content.
18. A system comprising: one or more processors; and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to: access a pair of stereo images for a scene, wherein each image of the pair of stereo images has incomplete pixel information and k channels; stack the pair of stereo images to form a stacked input image with 2k channels by: calculating an importance score associated with each area among a plurality of areas in the scene; identifying an area with a highest importance score among the plurality of areas in the scene; and stacking the channels of both images by aligning the identified area between the pair of stereo images; process the stacked input image using a machine-learning model to generate a stacked output image with 2k channels; and separate the stacked output image with 2k channels into a pair of reconstructed stereo images for the scene, wherein each image of the pair of reconstructed stereo images has complete pixel information and k channels.
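The importance-based stacking recited in the independent claims, with the vergence-distance scoring of claims 5 and 16, could be read as the sketch below. Everything here is an assumption made for illustration: the claims do not specify a scoring formula or an alignment procedure, so the inverse-distance score, the horizontal-shift alignment, the zero fill, and the names `importance`, `most_important`, `shift_columns`, and `stack_aligned` are all hypothetical. Each area is assumed to carry its centre coordinates in both the left and right image.

```python
import math
from typing import Dict, List, Sequence, Tuple

Image = List[List[List[float]]]   # assumed layout: image[y][x] -> channel list
Area = Dict[str, Tuple[int, int]]  # hypothetical keys: "left_xy", "right_xy"


def importance(center: Sequence[float], vergence: Sequence[float]) -> float:
    """Assumed score: larger when the area is closer to the vergence location."""
    return 1.0 / (1.0 + math.dist(center, vergence))


def most_important(areas: List[Area], vergence: Sequence[float]) -> Area:
    """Pick the area with the highest importance score."""
    return max(areas, key=lambda a: importance(a["left_xy"], vergence))


def shift_columns(image: Image, dx: int, fill: List[float]) -> Image:
    """Shift an image dx pixels to the right; uncovered pixels get `fill`."""
    width = len(image[0])
    return [
        [list(row[x - dx]) if 0 <= x - dx < width else list(fill)
         for x in range(width)]
        for row in image
    ]


def stack_aligned(
    left: Image, right: Image, areas: List[Area], vergence: Sequence[float]
) -> Image:
    """Shift the right image so the top-scoring area overlaps, then stack channels."""
    area = most_important(areas, vergence)
    dx = area["left_xy"][0] - area["right_xy"][0]  # horizontal disparity of the area
    k = len(right[0][0])
    shifted = shift_columns(right, dx, [0.0] * k)
    return [
        [lpx + rpx for lpx, rpx in zip(lrow, rrow)]
        for lrow, rrow in zip(left, shifted)
    ]
```

In this reading, the pixels of the most important area occupy the same coordinates in both halves of the 2k-channel image, while other areas may remain offset by their own disparity.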
the unit being a colour or a chrominance component · CPC title
based on super-resolution, i.e. the output image resolution being higher than the sensor resolution · CPC title
characterised by optical features · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Machine learning · CPC title