Cleaning dataset for neural network training

US11748943B2 · US · B2

Patent metadata
Field                Value
Publication number   US-11748943-B2
Application number   US-202117170739-A
Country              US
Kind code            B2
Filing date          Feb 8, 2021
Priority date        Mar 31, 2020
Publication date     Sep 5, 2023
Grant date           Sep 5, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An electronic device and method of dataset cleaning is provided. The electronic device receives a dataset comprising a plurality of samples, of which a first sample comprises a 2D image of an object of interest and a 3D shape model of the object of interest. The electronic device determines 2D landmarks from the 2D image and extracts 3D landmarks from the 3D shape model. The electronic device computes an error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the 2D image, based on an error metric. Thereafter, the electronic device determines the computed error to be above a threshold. Based on the determination that the computed error is above the threshold, the electronic device updates the dataset by a removal of the first sample from the dataset and trains a neural network on a task of 3D reconstruction, based on the updated dataset.
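
The cleaning step described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes the 2D landmarks have already been detected and the 3D landmarks have already been mapped to 2D image locations by some earlier step. The function names, the dictionary keys (`id`, `detected_2d`, `projected_2d`), and the threshold value are hypothetical, and RMSE over per-landmark pixel distances is used only because the claims name RMSE as one possible error metric.

```python
import math


def landmark_rmse(detected_2d, projected_2d):
    """Root mean square of per-landmark Euclidean distances (assumed metric)."""
    if len(detected_2d) != len(projected_2d) or not detected_2d:
        raise ValueError("landmark lists must be non-empty and of equal length")
    squared = [
        (u1 - u2) ** 2 + (v1 - v2) ** 2
        for (u1, v1), (u2, v2) in zip(detected_2d, projected_2d)
    ]
    return math.sqrt(sum(squared) / len(squared))


def clean_dataset(samples, threshold):
    """Return only the samples whose landmark error is not above the threshold."""
    kept = []
    for sample in samples:
        error = landmark_rmse(sample["detected_2d"], sample["projected_2d"])
        if error > threshold:
            continue  # error above threshold: the sample is removed
        kept.append(sample)
    return kept


# Hypothetical two-sample dataset; coordinates are in pixels.
samples = [
    {"id": "good", "detected_2d": [(10, 10), (20, 20)],
     "projected_2d": [(11, 10), (20, 21)]},
    {"id": "bad", "detected_2d": [(10, 10), (20, 20)],
     "projected_2d": [(13, 14), (23, 24)]},
]
kept = clean_dataset(samples, threshold=2.0)
print([s["id"] for s in kept])
```

In this sketch the filtered list would then be passed to whatever training routine is used for the 3D reconstruction network; that step is outside the scope of the example.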

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device, comprising: circuitry configured to: receive a dataset comprising a plurality of samples, wherein a first sample of the plurality of samples comprises a first two-dimensional (2D) image of an object of interest, a first three-dimensional (3D) shape model of the object of interest, and texture mapping information between the object of interest of the first 2D image and the first 3D shape model; determine 2D landmarks from the first 2D image, wherein the determined 2D landmarks corresponds to shape-features of the object of interest; extract 3D landmarks from the first 3D shape model of the received dataset; compute a first error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the first 2D image, based on an error metric; determine the computed first error to be above a threshold; update the received dataset by a removal of the first sample from the received dataset, wherein the removal is based on the determination that the computed first error is more than the threshold; and train, based on the updated dataset, a neural network on a task of 3D reconstruction from a single 2D image.

2. The electronic device according to claim 1, wherein the error metric is a Root Mean Square Error (RMSE) metric.

3. The electronic device according to claim 2, wherein the first error is a Root Mean Square Error (RMSE) between the determined 2D landmarks and the corresponding 2D locations of the extracted 3D landmarks on the first 2D image.

4. The electronic device according to claim 1, wherein the trained neural network model: receives, as an input, a second 2D image of the object of interest, wherein the second 2D image is part of an unseen dataset of images of the object of interest; and outputs, based on the input, a second 3D shape model corresponding to the object of interest in the second 2D image.

5. The electronic device according to claim 1, wherein the object of interest is one of an inanimate object or an animate object.

6. The electronic device according to claim 1, wherein the object of interest is one of a body of a human subject, a face of the human subject, or an anatomical portion different from the body and the face of the human subject.

7. The electronic device according to claim 1, wherein each 3D landmark of the extracted 3D landmarks includes 3D coordinates of the shape-features included in the first 3D shape model.

8. A method, comprising: receiving a dataset comprising a plurality of samples, wherein a first sample of the plurality of samples comprises a first two-dimensional (2D) image of an object of interest, a first three-dimensional (3D) shape model of the object of interest, and texture mapping information between the object of interest of the first 2D image and the first 3D shape model; determining 2D landmarks from the first 2D image, wherein the determined 2D landmarks corresponds to shape-features of the object of interest; extracting 3D landmarks from the first 3D shape model of the received dataset; computing a first error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the first 2D image, based on an error metric; determining the computed first error to be above a threshold; updating the received dataset by a removal of the first sample from the received dataset, wherein the removal is based on the determination that the computed first error is more than the threshold; and training, based on the updated dataset, a neural network on a task of 3D reconstruction from a single 2D image.

9. The method according to claim 8, wherein the error metric is a Root Mean Square Error (RMSE) metric.

10. The method according to claim 9, wherein the first error is a Root Mean Square Error (RMSE) between the determined 2D landmarks and the corresponding 2D locations of the extracted 3D landmarks on the first 2D image.

11. The method according to claim 8, further comprising: receiving, by the trained neural network model, a second 2D image of the object of interest as an input, wherein the second 2D image is part of an unseen dataset of images of the object of interest; and outputting, by the trained neural network model, a second 3D shape model corresponding to the object of interest in the second 2D image, based on the input.

12. The method according to claim 8, wherein the object of interest is one of an inanimate object or an animate object.

13. The method according to claim 8, wherein the object of interest is one of a body of a human subject, a face of the human subject, or an anatomical portion different from the body and the face of the human subject.

14. The method according to claim 8, wherein each 3D landmark of the extracted 3D landmarks includes 3D coordinates of the shape-features included in the first 3D shape model.

15. A non-transitory computer-readable medium having stored thereon computer implemented instructions that, when executed by an electronic device, causes the electronic device to execute operations, the operations comprising: receiving a dataset comprising a plurality of samples, wherein a first sample of the plurality of samples comprises a first two-dimensional (2D) image of an object of interest, a first three-dimensional (3D) shape model of the object of interest, and texture mapping information between the object of interest of the first 2D image and the first 3D shape model; determining 2D landmarks from the first 2D image, wherein the determined 2D landmarks corresponds to shape-features of the object of interest; extracting 3D landmarks from the first 3D shape model of the received dataset; computing a first error between the determined 2D landmarks and corresponding 2D locations of the extracted 3D landmarks on the first 2D image, based on an error metric; determining the computed first error to be above a threshold; updating the received dataset by a removal of the first sample from the received dataset, wherein the removal is based on the determination that the computed first error is more than the threshold; and training, based on the updated dataset, a neural network on a task of 3D reconstruction from a single 2D image.

16. The non-transitory computer-readable medium according to claim 15, wherein the error metric is a Root Mean Square Error (RMSE) metric.

17. The non-transitory computer-readable medium according to claim 16, wherein the first error is a Root Mean Square Error (RMSE) between the determined 2D landmarks and the corresponding 2D locations of the extracted 3D landmarks on the first 2D image.

18. The non-transitory computer-readable medium according to claim 15, wherein the operations further comprise: receiving, by the trained neural network model, a second 2D image of the object of interest as an input, wherein the second 2D image is part of an unseen dataset of images of the object of interest; and outputting, by the trained neural network model, a second 3D shape model corresponding to the object of interest in the second 2D image, based on the input.

19. The non-transitory computer-readable medium according to claim 15, wherein the object of interest is one of an inanimate object or an animate object.

20. The non-transitory computer-readable medium according to claim 15, wherein the object of interest is one of a body of a human subject, a face of the human subject, or an anatomical portion different from the body and the face of the human subject.
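
The RMSE metric named in the dependent claims can be written out as follows. This is one common formulation, offered as an assumption since the claims do not spell out the formula. The symbols are introduced here for illustration: $p_i$ is the $i$-th 2D landmark determined from the image, $q_i$ is the 2D location on the image of the corresponding extracted 3D landmark, $N$ is the number of landmark pairs, and $\tau$ is the threshold.

```latex
% Assumed formulation: RMSE over per-landmark Euclidean distances in the image plane
e = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert p_i - q_i \rVert_2^{2}},
\qquad p_i, q_i \in \mathbb{R}^{2}

% Removal rule from the independent claims: the sample is removed when
e > \tau
```

Under this formulation the error is expressed in the same units as the landmark coordinates (for example, pixels), so a threshold would be chosen in those units.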

Assignees

Sony Group Corp

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Generative networks · CPC title

  • G06T17/00 · Primary

    Three-dimensional [3D] modelling for computer graphics · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11748943B2 cover?
An electronic device and method of dataset cleaning is provided. The electronic device receives a dataset comprising a plurality of samples, of which a first sample comprises a 2D image of an object of interest and a 3D shape model of the object of interest. The electronic device determines 2D landmarks from the 2D image and extracts 3D landmarks from the 3D shape model. The electronic device c…
Who is the assignee on this patent?
Sony Group Corp
What technology area does this patent fall under?
Primary CPC classification G06T17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 5, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).