System and methods for diversity auditing

US12159482B2 · US · B2

Patent metadata
Publication number: US-12159482-B2
Application number: US-202217652026-A
Country: US
Kind code: B2
Filing date: Feb 22, 2022
Priority date: Feb 22, 2022
Publication date: Dec 3, 2024
Grant date: Dec 3, 2024

Abstract

Official abstract text for this publication.

Systems and methods for diversity auditing are described. The systems and methods include identifying a plurality of images; detecting a face in each of the plurality of images using a face detection network; classifying the face in each of the plurality of images based on a sensitive attribute using an image classification network; generating a distribution of the sensitive attribute in the plurality of images based on the classification; and computing a diversity score for the plurality of images based on the distribution.
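The last two steps of the abstract (building a distribution from per-face classifications, then scoring it) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the labels are assumed to have already been produced by the classification network, and normalized Shannon entropy is an assumed choice of score, since the abstract does not name a specific formula.

```python
import math
from collections import Counter


def attribute_distribution(labels, categories):
    """Return the share of each category among the predicted labels.

    `labels` holds one predicted attribute value per detected face.
    Labels outside `categories` are ignored (an assumption of this sketch).
    """
    counts = Counter(label for label in labels if label in categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels fall in the given categories")
    return {c: counts.get(c, 0) / total for c in categories}


def normalized_entropy(distribution):
    """Shannon entropy scaled to [0, 1]; 1.0 means a uniform distribution.

    Used here as a stand-in diversity score (assumed, not from the patent).
    """
    k = len(distribution)
    if k < 2:
        return 0.0
    entropy = sum(-p * math.log(p) for p in distribution.values() if p > 0)
    return entropy / math.log(k)


# Hypothetical classifier output for six detected faces.
predicted = ["18-30", "18-30", "31-50", "31-50", "51+", "51+"]
dist = attribute_distribution(predicted, ["18-30", "31-50", "51+"])
score = normalized_entropy(dist)
```

An entropy-based score needs no reference population; a score that compares against a baseline distribution, as some of the claims describe, is a different design choice.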

First claim

Opening claim text (preview).

What is claimed is:

1. A method for data auditing, comprising: identifying a plurality of images; detecting a face in each of the plurality of images using a face detection network; classifying the face in each of the plurality of images based on a sensitive attribute using an image classification network, wherein the sensitive attribute relates to an age, a race, or a gender of the face in each of the plurality of images; generating a distribution of the sensitive attribute in the plurality of images based on the classification; and computing a diversity score for the plurality of images based on the distribution, wherein the diversity score indicates a statistical diversity of the distribution of the sensitive attribute in the plurality of images.

2. The method of claim 1, further comprising: identifying a website; and collecting the plurality of images from the website.

3. The method of claim 2, further comprising: performing an image search on the website; and receiving search results for the image search, wherein the plurality of images is collected from the search results.

4. The method of claim 1, further comprising: ordering the plurality of images based at least in part on the diversity score.

5. The method of claim 1, further comprising: generating an image feature vector for each of the plurality of images, wherein the classification is based on the image feature vector.

6. The method of claim 1, further comprising: identifying a comparison population for the plurality of images; and identifying a baseline distribution of the sensitive attribute based on the comparison population, wherein the diversity score is computed by comparing the distribution and the baseline distribution.

7. The method of claim 6, further comprising: computing a Hellinger distance between the distribution and the baseline distribution, wherein the diversity score is based on the Hellinger distance.

8. The method of claim 6, further comprising: identifying an additional baseline distribution of the sensitive attribute, wherein the diversity score is computed by comparing the distribution to the baseline distribution and the additional baseline distribution.

9. The method of claim 1, further comprising: identifying additional images having the sensitive attribute based on the diversity score; and combining the plurality of images with the additional images to obtain a representative set of images.

10. The method of claim 9, further comprising: generating the additional images using a generative adversarial network (GAN).

11. The method of claim 1, wherein: the sensitive attribute comprises race, gender, or age.

12. A method for data auditing, comprising: identifying a training set including a plurality of training images and label data identifying a ground truth sensitive attribute of a face in each of the plurality of training images; classifying the face in each of the plurality of images using an image classification network to obtain a predicted sensitive attribute, wherein the predicted sensitive attribute relates to an age, a race, or a gender of the face in each of the plurality of images; updating parameters of the image classification network by comparing the predicted sensitive attribute to the ground truth sensitive attribute; applying the image classification network to a plurality of images to obtain a distribution of a sensitive attribute in the plurality of images; and computing a diversity score for the plurality of images based on the distribution, wherein the diversity score indicates a statistical diversity of the distribution of the sensitive attribute in the plurality of images.

13. The method of claim 12, further comprising: computing a loss function based on comparing the predicted sensitive attribute to the ground truth sensitive attribute; and computing a gradient of the loss function, wherein the parameters of the image classification network are updated based on the gradient of the loss function.

14. The method of claim 12, further comprising: training a face detection network to detect the face in each of the plurality of images, wherein the image classification network takes the face in each of the plurality of images as input.

15. The method of claim 12, further comprising: identifying a comparison population for the plurality of images; and identifying a baseline distribution of the sensitive attribute based on the comparison population, wherein the diversity score is computed by comparing the distribution and the baseline distribution.

16. An apparatus for data auditing, comprising: a face detection network configured to detect a face in each of a plurality of images; an image classification network configured to classify the face in each of the plurality of images based on a sensitive attribute, wherein the sensitive attribute relates to an age, a race, or a gender of the face in each of the plurality of images; a distribution component configured to generate a distribution of the sensitive attribute in the plurality of images based on the classification; and a scoring component configured to compute a diversity score for the plurality of images based on the distribution, wherein the diversity score indicates a statistical diversity of the distribution of the sensitive attribute in the plurality of images.

17. The apparatus of claim 16, further comprising: an image collection component configured to collect the plurality of images from a website.

18. The apparatus of claim 16, further comprising: a generator network configured to generate additional images based on the diversity score.

19. The apparatus of claim 16, wherein: the face detection network comprises a convolutional neural network (CNN) architecture.

20. The apparatus of claim 16, wherein: the image classification network comprises a residual neural network (ResNet) architecture.
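Claims 6 and 7 describe comparing the observed distribution with a baseline distribution using a Hellinger distance. For discrete distributions P and Q the Hellinger distance is H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which lies between 0 (identical) and 1 (no overlap). The sketch below computes it with the standard library. The claims only say the score is "based on" the distance, so mapping it to a score as 1 - H is an assumption of this example, and the function names and sample numbers are hypothetical.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    `p` and `q` are sequences of probabilities over the same categories,
    given in the same order.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same categories")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def diversity_score(observed, baseline):
    """Assumed mapping: 1.0 when the observed distribution matches the
    baseline exactly, 0.0 when the two share no probability mass."""
    return 1.0 - hellinger_distance(observed, baseline)


# Hypothetical observed and baseline shares over three categories.
observed = [0.70, 0.20, 0.10]
baseline = [0.40, 0.35, 0.25]
score = diversity_score(observed, baseline)
```

Because the distance is bounded, scores computed against different baselines stay on the same 0 to 1 scale, which makes it straightforward to combine comparisons against more than one baseline in the manner of claim 8.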

Assignees

Adobe Inc

Inventors

Classifications

  • G06V40/172 (Primary)

    Classification, e.g. identification

  • Detection; Localisation; Normalisation

  • G06V10/82 (Primary)

    using neural networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12159482B2 cover?
Systems and methods for diversity auditing are described. The systems and methods include identifying a plurality of images; detecting a face in each of the plurality of images using a face detection network; classifying the face in each of the plurality of images based on a sensitive attribute using an image classification network; generating a distribution of the sensitive attribute in the pl…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06V40/172. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 3, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).