Image processing system, image processing method, and program
US-2020184651-A1 · Jun 11, 2020 · US
US11423616B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11423616-B1 |
| Application number | US-202016833360-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 27, 2020 |
| Priority date | Mar 27, 2020 |
| Publication date | Aug 23, 2022 |
| Grant date | Aug 23, 2022 |
Official abstract text for this publication.
In one embodiment, a system may access an input image of an object captured by cameras, and the input image depicts appearance information associated with an object. The system may generate a first mesh of the object based on features identified from the input image of the object. The system may generate, by processing the first mesh using a machine-learning model, a position map that defines a contour of the object. Each pixel in the position map corresponds to a three-dimensional coordinate. The system may further generate a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh. The system may render an output image of the object based on the second mesh. The system disclosed in the present application can render a dense mesh which has a higher resolution to provide details which cannot be compensated by texture information.
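The step from position map to second mesh can be illustrated with a minimal sketch. It assumes the position map is an H x W grid in which every pixel stores an (x, y, z) coordinate, as the abstract states. The helper name `position_map_to_mesh`, the regular `stride` sampling, and the two-triangles-per-grid-cell connectivity are illustrative assumptions, not details taken from the patent.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]


def position_map_to_mesh(
    position_map: List[List[Vec3]], stride: int = 1
) -> Tuple[List[Vec3], List[Face]]:
    """Turn an H x W grid of 3D points into a triangle mesh.

    Hypothetical helper: regular-grid sampling and the triangle layout
    are assumptions made for illustration only.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    rows = range(0, len(position_map), stride)
    cols = range(0, len(position_map[0]), stride)
    # One vertex per sampled pixel; the pixel value is the 3D coordinate.
    vertices = [position_map[r][c] for r in rows for c in cols]
    n_rows, n_cols = len(rows), len(cols)
    faces: List[Face] = []
    for i in range(n_rows - 1):
        for j in range(n_cols - 1):
            a = i * n_cols + j  # top-left corner of the grid cell
            b = a + 1           # top-right
            c = a + n_cols      # bottom-left
            d = c + 1           # bottom-right
            # Split each grid cell into two triangles.
            faces.append((a, b, c))
            faces.append((b, d, c))
    return vertices, faces


# Usage: a flat 3 x 3 position map sampled densely and coarsely.
flat_map = [[(float(c), float(r), 0.0) for c in range(3)] for r in range(3)]
dense_vertices, dense_faces = position_map_to_mesh(flat_map, stride=1)
coarse_vertices, coarse_faces = position_map_to_mesh(flat_map, stride=2)
```

Under these assumptions, a smaller stride reads more pixels from the same map and so yields more vertices and faces, which is one way a mesh derived from a position map can be denser than a coarse input mesh.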
Opening claim text (preview).
What is claimed is:

1. A method comprising, by a computing system: accessing one or more input images of an object captured by one or more cameras, the one or more input images depicting appearance information associated with the object; generating a first mesh of the object based on features identified from depth measurements of the one or more input images of the object; generating, by processing the first mesh and an initial texture of the object using a machine-learning model, (a) a position map that is a two-dimensional image defining an unwrapped geometry of the object, (b) an intermediate texture of the object, and (c) a warp field, wherein each pixel in the position map specifies a corresponding three-dimensional location of the object using a three-dimensional coordinate; generating a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh; generating an output texture corresponding to the second mesh by warping the intermediate texture using the warp field; and rendering an output image of the object based on the second mesh and the output texture.

2. The method of claim 1, wherein generating the second mesh based on the position map comprises: sampling pixels in the position map based on geometry; determining vertices from each of the sampled pixels in the position map; and generating the second mesh based on the determined vertices.

3. The method of claim 1, wherein the machine-learning model is trained by: comparing the output image of the object with the one or more input images of the object; and calculating an image loss based on the comparison to update the machine-learning model.

4. The method of claim 1, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; computing output depth measurements in the output image of the object; comparing the output depth measurements with the input depth measurements; and calculating an output depth loss based on the comparison to update the machine-learning model.

5. The method of claim 1, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; smoothing depth in the output image of the object to obtain a curvature of the object; computing smoothed depth measurements in the output image of the object; comparing the smoothed depth measurements with the input depth measurements; and calculating a normal loss based on the comparison to update the machine-learning model.

6. The method of claim 1, wherein the machine-learning model is trained by: comparing each of the features in the one or more input images of the object with its corresponding feature in the output image of the object; and calculating a tracking loss based on the comparison to update the machine-learning model.

7. The method of claim 1, wherein the machine-learning model is configured to generate images for television monitors, cinema screens, computer monitors, mobile phones, or tablets.

8. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access one or more input images of an object captured by one or more cameras, the one or more input images depicting appearance information associated with the object; generate a first mesh of the object based on features identified from depth measurements of the one or more input images of the object; generate, by processing the first mesh and an initial texture of the object using a machine-learning model, (a) a position map that is a two-dimensional image defining an unwrapped geometry of the object, (b) an intermediate texture of the object, and (c) a warp field, wherein each pixel in the position map specifies a corresponding three-dimensional location of the object using a three-dimensional coordinate; generate a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh; generate an output texture corresponding to the second mesh by warping the intermediate texture using the warp field; and render an output image of the object based on the second mesh and the output texture.

9. The media of claim 8, wherein generating the second mesh based on the position map comprises: sampling pixels in the position map based on geometry; determining vertices from each of the sampled pixels in the position map; and generating the second mesh based on the determined vertices.

10. The media of claim 8, wherein the machine-learning model is trained by: comparing the output image of the object with the one or more input images of the object; and calculating an image loss based on the comparison to update the machine-learning model.

11. The media of claim 8, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; computing output depth measurements in the output image of the object; comparing the output depth measurements with the input depth measurements; and calculating an output depth loss based on the comparison to update the machine-learning model.

12. The media of claim 8, wherein the machine-learning model is trained by: measuring input depth measurements in the one or more input images of the object; smoothing depth in the output image of the object to obtain a curvature of the object; computing smoothed depth measurements in the output image of the object; comparing the smoothed depth measurements with the input depth measurements; and calculating a normal loss based on the comparison to update the machine-learning model.

13. The media of claim 8, wherein the machine-learning model is trained by: comparing each of the features in the one or more input images of the object with its corresponding feature in the output image of the object; and calculating a tracking loss based on the comparison to update the machine-learning model.

14. The media of claim 8, wherein the machine-learning model is configured to generate images for television monitors, cinema screens, computer monitors, mobile phones, or tablets.

15. A system comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by one or more of the processors to cause the system to: access one or more input images of an object captured by one or more cameras, the one or more input images depicting appearance information associated with the object; generate a first mesh of the object based on features identified from depth measurements of the one or more input images of the object; generate, by processing the first mesh and an initial texture of the object using a machine-learning model, (a) a position map that is a two-dimensional image defining an unwrapped geometry of the object, (b) an intermediate texture of the object, and (c) a warp field, wherein each pixel in the position map specifies a corresponding three-dimensional location of the object using a three-dimensional coordinate; generate a second mesh based on the position map, wherein the second mesh has a higher resolution than the first mesh; generate an output texture corresponding to the second mesh by warping the intermediate texture using the warp field; and render an output image of the object based on the second mesh and the output texture.

16. The system of claim 15, wherein generating the second mesh based on the position ma
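
The texture step in the independent claims (warping the intermediate texture using the warp field) can be sketched as follows. This is only one possible reading: it assumes the warp field stores, for every output pixel, the (x, y) location in the intermediate texture to read from, in pixel units, and it uses bilinear interpolation with clamped borders. The claims do not specify the warp-field encoding or the interpolation scheme, and the names `bilinear_sample` and `warp_texture` are hypothetical.

```python
import math
from typing import List, Tuple

Image = List[List[float]]                      # single-channel texture, row-major
WarpField = List[List[Tuple[float, float]]]    # per-pixel (x, y) source location


def bilinear_sample(image: Image, x: float, y: float) -> float:
    """Read the image at a fractional location, clamping to the border."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally on the two neighbouring rows, then vertically.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_texture(texture: Image, warp_field: WarpField) -> Image:
    """Build the output texture by sampling where the warp field points.

    Assumption: the warp field holds absolute source coordinates, not offsets.
    """
    return [[bilinear_sample(texture, x, y) for (x, y) in row] for row in warp_field]


# Usage: an identity warp returns the texture unchanged; a half-pixel
# warp blends neighbouring texels.
intermediate = [[0.0, 1.0], [2.0, 3.0]]
identity = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]
unchanged = warp_texture(intermediate, identity)
blended = warp_texture(intermediate, [[(0.5, 0.0), (0.5, 0.5)]])
```

A multi-channel texture would apply the same sampling per channel; a learned warp field of this kind lets the output texture shift texels without regenerating their values.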
Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title
Texture mapping · CPC title
Adversarial learning · CPC title
Generative networks · CPC title
Supervised learning · CPC title