Method and an apparatus for evaluating generative machine learning model
US-2019012581-A1 · Jan 10, 2019 · US
US12159482B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12159482-B2 |
| Application number | US-202217652026-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 22, 2022 |
| Priority date | Feb 22, 2022 |
| Publication date | Dec 3, 2024 |
| Grant date | Dec 3, 2024 |
Official abstract text for this publication.
Systems and methods for diversity auditing are described. The systems and methods include identifying a plurality of images; detecting a face in each of the plurality of images using a face detection network; classifying the face in each of the plurality of images based on a sensitive attribute using an image classification network; generating a distribution of the sensitive attribute in the plurality of images based on the classification; and computing a diversity score for the plurality of images based on the distribution.
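The last two steps of the abstract (building a distribution from the classifier output, then scoring it) can be sketched as follows. This is a minimal illustration, not the patented implementation: the face detection and classification networks are assumed to have already produced one predicted label per face, and the function names `attribute_distribution` and `entropy_diversity_score` are hypothetical. The abstract does not name a specific scoring formula, so normalized Shannon entropy is used here purely as an assumed stand-in for "a diversity score based on the distribution".

```python
import math
from collections import Counter


def attribute_distribution(labels, categories):
    """Return the share of each category among the predicted labels.

    `labels` is a list of per-face predictions (assumed to come from an
    upstream classifier); `categories` fixes the set of possible values.
    """
    counts = Counter(labels)
    total = sum(counts[c] for c in categories)
    if total == 0:
        raise ValueError("no labels fall in the given categories")
    return {c: counts[c] / total for c in categories}


def entropy_diversity_score(distribution):
    """Normalized Shannon entropy in [0, 1] (an assumed scoring choice).

    1.0 means the attribute is uniformly distributed across categories;
    0.0 means every face falls in a single category.
    """
    k = len(distribution)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in distribution.values() if p > 0)
    return h / math.log(k)


# Hypothetical predicted age groups for six detected faces.
predicted = ["adult", "adult", "adult", "child", "senior", "adult"]
dist = attribute_distribution(predicted, ["child", "adult", "senior"])
score = entropy_diversity_score(dist)
```

A score near 1 would indicate that the image set covers the categories evenly; a score near 0 would indicate that one category dominates.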
Opening claim text (preview).
What is claimed is:
1. A method for data auditing, comprising: identifying a plurality of images; detecting a face in each of the plurality of images using a face detection network; classifying the face in each of the plurality of images based on a sensitive attribute using an image classification network, wherein the sensitive attribute relates to an age, a race, or a gender of the face in each of the plurality of images; generating a distribution of the sensitive attribute in the plurality of images based on the classification; and computing a diversity score for the plurality of images based on the distribution, wherein the diversity score indicates a statistical diversity of the distribution of the sensitive attribute in the plurality of images.
2. The method of claim 1, further comprising: identifying a website; and collecting the plurality of images from the website.
3. The method of claim 2, further comprising: performing an image search on the website; and receiving search results for the image search, wherein the plurality of images is collected from the search results.
4. The method of claim 1, further comprising: ordering the plurality of images based at least in part on the diversity score.
5. The method of claim 1, further comprising: generating an image feature vector for each of the plurality of images, wherein the classification is based on the image feature vector.
6. The method of claim 1, further comprising: identifying a comparison population for the plurality of images; and identifying a baseline distribution of the sensitive attribute based on the comparison population, wherein the diversity score is computed by comparing the distribution and the baseline distribution.
7. The method of claim 6, further comprising: computing a Hellinger distance between the distribution and the baseline distribution, wherein the diversity score is based on the Hellinger distance.
8. The method of claim 6, further comprising: identifying an additional baseline distribution of the sensitive attribute, wherein the diversity score is computed by comparing the distribution to the baseline distribution and the additional baseline distribution.
9. The method of claim 1, further comprising: identifying additional images having the sensitive attribute based on the diversity score; and combining the plurality of images with the additional images to obtain a representative set of images.
10. The method of claim 9, further comprising: generating the additional images using a generative adversarial network (GAN).
11. The method of claim 1, wherein: the sensitive attribute comprises race, gender, or age.
12. A method for data auditing, comprising: identifying a training set including a plurality of training images and label data identifying a ground truth sensitive attribute of a face in each of the plurality of training images; classifying the face in each of the plurality of images using an image classification network to obtain a predicted sensitive attribute, wherein the predicted sensitive attribute relates to an age, a race, or a gender of the face in each of the plurality of images; updating parameters of the image classification network by comparing the predicted sensitive attribute to the ground truth sensitive attribute; applying the image classification network to a plurality of images to obtain a distribution of a sensitive attribute in the plurality of images; and computing a diversity score for the plurality of images based on the distribution, wherein the diversity score indicates a statistical diversity of the distribution of the sensitive attribute in the plurality of images.
13. The method of claim 12, further comprising: computing a loss function based on comparing the predicted sensitive attribute to the ground truth sensitive attribute; and computing a gradient of the loss function, wherein the parameters of the image classification network are updated based on the gradient of the loss function.
14. The method of claim 12, further comprising: training a face detection network to detect the face in each of the plurality of images, wherein the image classification network takes the face in each of the plurality of images as input.
15. The method of claim 12, further comprising: identifying a comparison population for the plurality of images; and identifying a baseline distribution of the sensitive attribute based on the comparison population, wherein the diversity score is computed by comparing the distribution and the baseline distribution.
16. An apparatus for data auditing, comprising: a face detection network configured to detect a face in each of a plurality of images; an image classification network configured to classify the face in each of the plurality of images based on a sensitive attribute, wherein the sensitive attribute relates to an age, a race, or a gender of the face in each of the plurality of images; a distribution component configured to generate a distribution of the sensitive attribute in the plurality of images based on the classification; and a scoring component configured to compute a diversity score for the plurality of images based on the distribution, wherein the diversity score indicates a statistical diversity of the distribution of the sensitive attribute in the plurality of images.
17. The apparatus of claim 16, further comprising: an image collection component configured to collect the plurality of images from a website.
18. The apparatus of claim 16, further comprising: a generator network configured to generate additional images based on the diversity score.
19. The apparatus of claim 16, wherein: the face detection network comprises a convolutional neural network (CNN) architecture.
20. The apparatus of claim 16, wherein: the image classification network comprises a residual neural network (ResNet) architecture.
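Claims 6 and 7 describe comparing the observed distribution with a baseline distribution through a Hellinger distance. The sketch below shows that comparison for discrete distributions given as plain dictionaries. The function names and the example figures are hypothetical, and the mapping from distance to score (`1 - distance`) is an assumption: the claims only say the diversity score is "based on" the Hellinger distance, without fixing a formula.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    `p` and `q` map category names to probabilities; a category missing
    from one side is treated as probability 0. The result lies in [0, 1].
    """
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2
        for k in keys
    )
    return math.sqrt(squared / 2.0)


def baseline_diversity_score(observed, baseline):
    """Assumed scoring rule: 1 minus the Hellinger distance.

    1.0 means the observed distribution matches the baseline exactly;
    0.0 means the two distributions share no categories.
    """
    return 1.0 - hellinger_distance(observed, baseline)


# Hypothetical figures: observed shares in an image set versus the
# shares in a comparison population.
observed = {"child": 0.10, "adult": 0.80, "senior": 0.10}
baseline = {"child": 0.20, "adult": 0.60, "senior": 0.20}
score = baseline_diversity_score(observed, baseline)
```

For the additional baseline of claim 8, one option under the same assumptions would be to compute the score against each baseline separately and then combine the results, for example by averaging.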
Classification, e.g. identification · CPC title
Detection; Localisation; Normalisation · CPC title
using neural networks · CPC title