US11748943B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11748943-B2 |
| Application number | US-202117170739-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 8, 2021 |
| Priority date | Mar 31, 2020 |
| Publication date | Sep 5, 2023 |
| Grant date | Sep 5, 2023 |
Official abstract text for this publication.
An electronic device and method of dataset cleaning is provided. The electronic device receives a dataset comprising a plurality of samples, of which a first sample comprises a 2D image of an object of interest and a 3D shape model of the object of interest. The electronic device determines 2D landmarks from the 2D image and extracts 3D landmarks from the 3D shape model. The electronic device computes an error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the 2D image, based on an error metric. Thereafter, the electronic device determines the computed error to be above a threshold. Based on the determination that the computed error is above the threshold, the electronic device updates the dataset by a removal of the first sample from the dataset and trains a neural network on a task of 3D reconstruction, based on the updated dataset.
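The filtering step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `landmark_rmse` and `clean_dataset`, the dictionary keys `detected_2d` and `projected_2d`, and the threshold value are all hypothetical. It assumes the 3D landmarks have already been mapped to 2D pixel locations on the image, and it assumes one common RMSE convention (root of the mean squared Euclidean distance per landmark); the abstract itself only refers to "an error metric".

```python
import numpy as np


def landmark_rmse(detected_2d, projected_2d):
    """RMSE between two (N, 2) landmark arrays, in pixels.

    Assumed convention: square root of the mean squared Euclidean
    distance between corresponding landmarks.
    """
    detected = np.asarray(detected_2d, dtype=float)
    projected = np.asarray(projected_2d, dtype=float)
    if detected.shape != projected.shape:
        raise ValueError("landmark arrays must have the same shape")
    squared_distances = np.sum((detected - projected) ** 2, axis=1)
    return float(np.sqrt(np.mean(squared_distances)))


def clean_dataset(samples, threshold):
    """Return only the samples whose landmark error is not above threshold."""
    kept = []
    for sample in samples:
        error = landmark_rmse(sample["detected_2d"], sample["projected_2d"])
        if error > threshold:
            # Error above the threshold: drop the sample from the dataset.
            continue
        kept.append(sample)
    return kept


# Hypothetical two-sample dataset: the first pair of landmark sets agrees
# closely, the second is badly misaligned.
samples = [
    {"id": "good", "detected_2d": [[10, 10], [20, 20]],
     "projected_2d": [[10, 11], [20, 21]]},
    {"id": "bad", "detected_2d": [[10, 10], [20, 20]],
     "projected_2d": [[13, 14], [23, 24]]},
]
cleaned = clean_dataset(samples, threshold=2.0)
print([s["id"] for s in cleaned])  # ['good']
```

The cleaned list would then be used as the training set for the 3D reconstruction network; that training step is outside the scope of this sketch.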
Opening claim text (preview).
What is claimed is:
1. An electronic device, comprising: circuitry configured to: receive a dataset comprising a plurality of samples, wherein a first sample of the plurality of samples comprises a first two-dimensional (2D) image of an object of interest, a first three-dimensional (3D) shape model of the object of interest, and texture mapping information between the object of interest of the first 2D image and the first 3D shape model; determine 2D landmarks from the first 2D image, wherein the determined 2D landmarks corresponds to shape-features of the object of interest; extract 3D landmarks from the first 3D shape model of the received dataset; compute a first error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the first 2D image, based on an error metric; determine the computed first error to be above a threshold; update the received dataset by a removal of the first sample from the received dataset, wherein the removal is based on the determination that the computed first error is more than the threshold; and train, based on the updated dataset, a neural network on a task of 3D reconstruction from a single 2D image.
2. The electronic device according to claim 1, wherein the error metric is a Root Mean Square Error (RMSE) metric.
3. The electronic device according to claim 2, wherein the first error is a Root Mean Square Error (RMSE) between the determined 2D landmarks and the corresponding 2D locations of the extracted 3D landmarks on the first 2D image.
4. The electronic device according to claim 1, wherein the trained neural network model: receives, as an input, a second 2D image of the object of interest, wherein the second 2D image is part of an unseen dataset of images of the object of interest; and outputs, based on the input, a second 3D shape model corresponding to the object of interest in the second 2D image.
5. The electronic device according to claim 1, wherein the object of interest is one of an inanimate object or an animate object.
6. The electronic device according to claim 1, wherein the object of interest is one of a body of a human subject, a face of the human subject, or an anatomical portion different from the body and the face of the human subject.
7. The electronic device according to claim 1, wherein each 3D landmark of the extracted 3D landmarks includes 3D coordinates of the shape-features included in the first 3D shape model.
8. A method, comprising: receiving a dataset comprising a plurality of samples, wherein a first sample of the plurality of samples comprises a first two-dimensional (2D) image of an object of interest, a first three-dimensional (3D) shape model of the object of interest, and texture mapping information between the object of interest of the first 2D image and the first 3D shape model; determining 2D landmarks from the first 2D image, wherein the determined 2D landmarks corresponds to shape-features of the object of interest; extracting 3D landmarks from the first 3D shape model of the received dataset; computing a first error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the first 2D image, based on an error metric; determining the computed first error to be above a threshold; updating the received dataset by a removal of the first sample from the received dataset, wherein the removal is based on the determination that the computed first error is more than the threshold; and training, based on the updated dataset, a neural network on a task of 3D reconstruction from a single 2D image.
9. The method according to claim 8, wherein the error metric is a Root Mean Square Error (RMSE) metric.
10. The method according to claim 9, wherein the first error is a Root Mean Square Error (RMSE) between the determined 2D landmarks and the corresponding 2D locations of the extracted 3D landmarks on the first 2D image.
11. The method according to claim 8, further comprising: receiving, by the trained neural network model, a second 2D image of the object of interest as an input, wherein the second 2D image is part of an unseen dataset of images of the object of interest; and outputting, by the trained neural network model, a second 3D shape model corresponding to the object of interest in the second 2D image, based on the input.
12. The method according to claim 8, wherein the object of interest is one of an inanimate object or an animate object.
13. The method according to claim 8, wherein the object of interest is one of a body of a human subject, a face of the human subject, or an anatomical portion different from the body and the face of the human subject.
14. The method according to claim 8, wherein each 3D landmark of the extracted 3D landmarks includes 3D coordinates of the shape-features included in the first 3D shape model.
15. A non-transitory computer-readable medium having stored thereon computer implemented instructions that, when executed by an electronic device, causes the electronic device to execute operations, the operations comprising: receiving a dataset comprising a plurality of samples, wherein a first sample of the plurality of samples comprises a first two-dimensional (2D) image of an object of interest, a first three-dimensional (3D) shape model of the object of interest, and texture mapping information between the object of interest of the first 2D image and the first 3D shape model; determining 2D landmarks from the first 2D image, wherein the determined 2D landmarks corresponds to shape-features of the object of interest; extracting 3D landmarks from the first 3D shape model of the received dataset; computing a first error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the first 2D image, based on an error metric; determining the computed first error to be above a threshold; updating the received dataset by a removal of the first sample from the received dataset, wherein the removal is based on the determination that the computed first error is more than the threshold; and training, based on the updated dataset, a neural network on a task of 3D reconstruction from a single 2D image.
16. The non-transitory computer-readable medium according to claim 15, wherein the error metric is a Root Mean Square Error (RMSE) metric.
17. The non-transitory computer-readable medium according to claim 16, wherein the first error is a Root Mean Square Error (RMSE) between the determined 2D landmarks and the corresponding 2D locations of the extracted 3D landmarks on the first 2D image.
18. The non-transitory computer-readable medium according to claim 15, wherein the operations further comprise: receiving, by the trained neural network model, a second 2D image of the object of interest as an input, wherein the second 2D image is part of an unseen dataset of images of the object of interest; and outputting, by the trained neural network model, a second 3D shape model corresponding to the object of interest in the second 2D image, based on the input.
19. The non-transitory computer-readable medium according to claim 15, wherein the object of interest is one of an inanimate object or an animate object.
20. The non-transitory computer-readable medium according to claim 15, wherein the object of interest is one of a body of a human subject, a face of the human subject, or an anatomical portion different from the body and the face of the human subjec
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Generative networks · CPC title
Three-dimensional [3D] modelling for computer graphics · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title