Differential privacy and outlier detection within a non-interactive model

US10445527B2 · US · B2

Patent metadata
Publication number: US-10445527-B2
Application number: US-201615387052-A
Country: US
Kind code: B2
Filing date: Dec 21, 2016
Priority date: Dec 21, 2016
Publication date: Oct 15, 2019
Grant date: Oct 15, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for differential privacy is provided. In some implementations, the system performs operations comprising receiving a plurality of indices for a plurality of perturbed data points, which are anonymized versions of a plurality of unperturbed data points, wherein the plurality of indices indicate that the plurality of unperturbed data points are identified as presumed outliers. The plurality of perturbed data points can lie around a first center point and the plurality of unperturbed data points can lie around a second center point. The operations can further comprise classifying a portion of the presumed outliers as true positives and another portion of the presumed outliers as false positives, based upon differences in distances to the respective first and second center points for the perturbed and corresponding (e.g., same index) unperturbed data points. Related systems, methods, and articles of manufacture are also described.
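The classification step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions: the function names are hypothetical, both center points are passed in rather than computed, distances are Euclidean, the difference is taken as perturbed distance minus unperturbed distance, and a difference exactly equal to the threshold is treated as a false positive (the claim text only covers "less than" and "greater than").

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def distance_differences(
    perturbed: Sequence[Point],
    unperturbed: Sequence[Point],
    center_perturbed: Point,
    center_unperturbed: Point,
) -> List[float]:
    """For each pair of same-index points, return
    dist(perturbed point, first center) - dist(unperturbed point, second center).
    The sign convention is an assumption made for this sketch."""
    return [
        math.dist(p, center_perturbed) - math.dist(u, center_unperturbed)
        for p, u in zip(perturbed, unperturbed)
    ]


def classify_presumed_outliers(
    indices: Sequence[int],
    diffs: Sequence[float],
    threshold: float,
) -> Tuple[List[int], List[int]]:
    """Split presumed-outlier indices into (true_positives, false_positives).
    A difference above the threshold counts as a true positive; anything else,
    including the unspecified equality case, counts as a false positive."""
    true_positives: List[int] = []
    false_positives: List[int] = []
    for idx, diff in zip(indices, diffs):
        if diff > threshold:
            true_positives.append(idx)
        else:
            false_positives.append(idx)
    return true_positives, false_positives


if __name__ == "__main__":
    # Made-up two-dimensional example; both centers sit at the origin.
    diffs = distance_differences(
        perturbed=[(9, 12), (6, 8)],
        unperturbed=[(3, 4), (0, 9)],
        center_perturbed=(0, 0),
        center_unperturbed=(0, 0),
    )
    confirmed, rejected = classify_presumed_outliers([7, 12], diffs, threshold=4.0)
    print("confirmed outliers:", confirmed)
    print("false positives:", rejected)
```

In this made-up example, index 7 has a distance difference of 10 and index 12 a difference of 1, so with a threshold of 4 only index 7 would appear in the list of confirmed outliers.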

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: at least one processor; and at least one memory storing instructions which, when executed by the at least one processor, cause operations comprising: receiving a plurality of indices for a plurality of perturbed data points from an perturbed data set generated by one or more sensors, wherein the plurality of perturbed data points are anonymized versions of a plurality of unperturbed data points with the same plurality of indices, wherein receiving the plurality of indices indicates that the plurality of unperturbed data points are identified as presumed outliers, wherein the plurality of perturbed data points lie around a first center point, and wherein the plurality of unperturbed data points lie around a second center point; classifying, based upon distance differences, a first portion of the presumed outliers as true positives, wherein the distance differences include, for each of the plurality of perturbed data points, a difference between a distance of the perturbed data point from the first center point and a distance of a corresponding unperturbed data point from the second center point; classifying, based upon the distance differences, a second portion of the presumed outliers as false positives, wherein each presumed outlier is classified as a false positive when a corresponding distance difference for the presumed outlier is less than a threshold distance away from the first center point, and wherein each presumed outlier is classified as a true positive when a corresponding distance difference for the presumed outlier is greater than the threshold distance away from the first center point; and providing, based on the classifying, a list of confirmed outliers.

2. The system of claim 1, wherein the operations further comprise: determining the threshold distance from the first center point, and wherein the one or more sensors comprise at least one Internet of Things sensor.

3. The system of claim 2, wherein the operations further comprise: ordering each of the distance differences in an ordered list; identifying a largest change between two consecutive differences in the ordered list; and setting the threshold distance as the largest change.

4. The system of claim 1, wherein the plurality of unperturbed data points are held among a plurality of sensor devices.

5. The system of claim 1, wherein the operations further comprise: determining whether any of the distance differences are negative; and classifying each of the plurality of unperturbed data points with a negative distance difference as a false negative.

6. The system of claim 1, wherein the operations further comprise: identifying a minimum distance difference from among the true positives; receiving indices for a plurality of non-outliers with a distance from the first center point that is greater or equal to the minimum distance; and reclassifying, as false negatives, each of the plurality of non-outliers with a distance difference between zero and the minimum distance difference.

7. The system of claim 1, wherein the operations further comprise: identifying a minimum distance difference from among the true positives; setting an outer boundary value to equal a sum of the minimum distance difference and a width of an outlier layer, wherein the width of the outlier layer is a difference between an end and a beginning of an area in a coordinate space that includes the outliers; receiving indices for a plurality of non-outliers with a distance from the first center point that is greater or equal to the outer boundary value; and reclassifying, as false negatives, each of the plurality of non-outliers with a distance difference between the threshold distance and the outer boundary.

8. The system of claim 1, wherein the plurality of perturbed data points are anonymized based upon a noise function, and wherein the noise function comprises a perturbation function.

9. A method for protecting data associated with one or more sensors, the method comprising: receiving, at a processor, a plurality of indices for a plurality of perturbed data points from an perturbed data set generated by one or more sensors, wherein the plurality of perturbed data points are anonymized versions of a plurality of unperturbed data points with the same plurality of indices, wherein receiving the plurality of indices indicates that the plurality of unperturbed data points are identified as presumed outliers, wherein the plurality of perturbed data points lie around a first center point, and wherein the plurality of unperturbed data points lie around a second center point; classifying, by the processor and based upon distance differences, a first portion of the presumed outliers as true positives, wherein the distance differences include, for each of the plurality of perturbed data points, a difference between a distance of the perturbed data point from the first center point and a distance of a corresponding unperturbed data point from the second center point; classifying, by the processor and based upon the distance differences, a second portion of the presumed outliers as false positives, wherein each presumed outlier is classified as a false positive when a corresponding distance difference for the presumed outlier is less than a threshold distance away from the first center point, and wherein each presumed outlier is classified as a true positive when a corresponding distance difference for the presumed outlier is greater than the threshold distance away from the first center point; and providing, by the processor and based on the classifying, a list of confirmed outliers.

10. The method of claim 9, further comprising: ordering each of the distance differences in an ordered list; identifying a largest change between two consecutive differences in the ordered list; and setting the threshold distance as the largest change, and wherein the one or more sensors comprise at least one Internet of Things sensor.

11. The method of claim 9, further comprising: determining whether any of the distance differences are negative; and classifying each of the plurality of unperturbed data points with a negative distance difference as a false negative.

12. The method of claim 9, further comprising: identifying a minimum distance difference from among the true positives; receiving indices for a plurality of non-outliers with a distance from the first center point that is greater or equal to the minimum distance; and reclassifying, as false negatives, each of the plurality of non-outliers with a distance difference between zero and the minimum distance difference.

13. The method of claim 9, further comprising: identifying a minimum distance difference from among the true positives; setting an outer boundary value to equal a sum of the minimum distance difference and a width of an outlier layer, wherein the width of the outlier layer is a difference between an end and a beginning of an area in a coordinate space that includes the outliers; receiving indices for a plurality of non-outliers with a distance from the first center point that is greater or equal to the outer boundary value; and reclassifying, as false negatives, each of the plurality of non-outliers with a distance difference between the threshold distance and the outer boundary.

14. A non-transitory computer program product storing instructions which, when executed by at least one data processor, causes operations comprising: receiving a plurality of indices for a plurality of perturbed data points from an perturbed data set generated by one or more sens
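
The threshold selection recited in claims 3 and 10, and the negative-difference rule of claims 5 and 11, can be sketched in Python as follows. This is an illustrative reading only, with hypothetical function names. The sketch follows the literal claim wording and sets the threshold to the size of the largest change between consecutive ordered differences; another reading would place the threshold at the position where that change occurs. Ascending order is also an assumption, since the claims only say "ordered list".

```python
from typing import List, Sequence


def threshold_from_largest_gap(diffs: Sequence[float]) -> float:
    """Order the distance differences and return the largest change
    between two consecutive entries (literal reading of the claim)."""
    if len(diffs) < 2:
        raise ValueError("need at least two distance differences")
    ordered = sorted(diffs)  # ascending order is an assumption
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def negative_difference_indices(
    indices: Sequence[int], diffs: Sequence[float]
) -> List[int]:
    """Return the indices whose distance difference is negative; the claims
    classify the corresponding unperturbed points as false negatives."""
    return [idx for idx, diff in zip(indices, diffs) if diff < 0]


if __name__ == "__main__":
    # Made-up differences: ordered they are 1.0, 1.5, 9.0, 10.0.
    example_diffs = [1.0, 10.0, 1.5, 9.0]
    print("threshold:", threshold_from_largest_gap(example_diffs))
    print("false negatives:", negative_difference_indices([3, 8, 11], [2.0, -0.5, 0.0]))
```

With the made-up differences above, the consecutive changes are 0.5, 7.5, and 1.0, so the threshold under this reading is 7.5, and only the differences 9.0 and 10.0 exceed it.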

Assignees

Inventors

Classifications

  • Protecting privacy or anonymity, e.g. protecting personally identifiable information [PII] · CPC title

  • Services for machine-to-machine communication [M2M] or machine type communication [MTC] · CPC title

  • for collecting sensor information · CPC title

  • Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer · CPC title

  • G06F21/604 · Primary

    Tools and structures for managing or administering access control systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10445527B2 cover?
A system for differential privacy is provided. In some implementations, the system performs operations comprising receiving a plurality of indices for a plurality of perturbed data points, which are anonymized versions of a plurality of unperturbed data points, wherein the plurality of indices indicate that the plurality of unperturbed data points are identified as presumed outliers. The plural…
Who is the assignee on this patent?
SAP SE
What technology area does this patent fall under?
Primary CPC classification G06F21/604. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 15, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).