Iterative execution of data de-identification processes

US11036886B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11036886-B2
Application number	US-201916447064-A
Country	US
Kind code	B2
Filing date	Jun 20, 2019
Priority date	Feb 26, 2018
Publication date	Jun 15, 2021
Grant date	Jun 15, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer system de-identifies data by selecting one or more attributes of a dataset and determining a set of data de-identification techniques associated with each attribute. Each de-identification technique is evaluated with respect to an impact on data privacy and an impact on data utility based on a series of metrics, and a data de-identification technique is recommended for each attribute based on the evaluation. The dataset is de-identified by applying the de-identification technique that is recommended for each attribute. Embodiments of the present invention further include a method and program product for de-identifying data in substantially the same manner described above.
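The evaluate-and-recommend step described in the abstract can be illustrated with a minimal Python sketch for a single attribute. Everything named here is hypothetical and chosen for illustration only: the candidate techniques (`suppress`, `generalize_decade`, `identity`), the uniqueness-based privacy score, the distinct-value utility score, and the equal weighting of the two are assumptions, not details taken from the patent.

```python
from collections import Counter

# Hypothetical candidate techniques for a numeric "age" attribute.
def suppress(value):
    """Replace every value with a placeholder."""
    return "*"

def generalize_decade(value):
    """Generalize an age to its decade, e.g. 34 -> '30-39'."""
    low = (value // 10) * 10
    return f"{low}-{low + 9}"

def identity(value):
    """Leave the value unchanged (baseline)."""
    return value

def privacy_score(transformed):
    """Assumed privacy metric: 1 minus the fraction of records with a unique value."""
    counts = Counter(transformed)
    unique = sum(1 for v in transformed if counts[v] == 1)
    return 1.0 - unique / len(transformed)

def utility_score(original, transformed):
    """Assumed utility metric: share of distinct values that survive the transformation."""
    return len(set(transformed)) / len(set(original))

def recommend(values, techniques, privacy_weight=0.5):
    """Score each technique and return the best name plus all scores."""
    scores = {}
    for name, technique in techniques.items():
        transformed = [technique(v) for v in values]
        p = privacy_score(transformed)
        u = utility_score(values, transformed)
        # Assumed combination rule: weighted sum of privacy and utility.
        scores[name] = privacy_weight * p + (1 - privacy_weight) * u
    best = max(scores, key=scores.get)
    return best, scores

ages = [23, 27, 34, 35, 36, 41, 48, 52]
techniques = {
    "suppress": suppress,
    "generalize_decade": generalize_decade,
    "identity": identity,
}
best, scores = recommend(ages, techniques)
print(best, scores)
```

With these sample ages, generalization scores highest because it removes most unique values while keeping four distinct groups. A different weighting or different metrics could change the recommendation.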

First claim

Opening claim text (preview).

The invention claimed is:

1. A method, in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to de-identify data, the method comprising:
selecting a plurality of attributes of a dataset;
determining a set of data de-identification techniques, including one or more data de-identification techniques for each attribute of the plurality of attributes;
evaluating each data de-identification technique with respect to an impact on data privacy and an impact on data utility based on a series of metrics;
recommending a data de-identification technique for each attribute based on the evaluation, wherein the recommended data de-identification technique for each attribute is presented using a user interface dashboard that indicates a plurality of privacy metrics and a plurality of utility metrics resulting from applying the recommended data de-identification technique, and wherein the user interface dashboard indicates the plurality of privacy metrics and the plurality of utility metrics for each attribute of the plurality of attributes;
applying the recommended data de-identification technique for each attribute to de-identify the dataset;
removing each applied data de-identification technique from the recommended data de-identification techniques to be applied to the dataset;
re-evaluating remaining data de-identification techniques for selected attributes of the de-identified data set with respect to an impact on data privacy and an impact on data utility based on the series of metrics;
recommending a second data de-identification technique for each selected attribute of the de-identified data set, wherein the recommended second data de-identification technique for each attribute is presented using the user interface dashboard that indicates the plurality of privacy metrics and the plurality of utility metrics resulting from applying the recommended second data de-identification technique, and wherein the user interface dashboard indicates the plurality of privacy metrics and the plurality of utility metrics for each selected attribute of the selected attributes; and
applying the recommended second data de-identification technique for each selected attribute of the de-identified data set to further de-identify the dataset.

2. The method of claim 1, further comprising: presenting the de-identified data set and recommended data de-identification techniques and corresponding configuration options on the user interface dashboard.

3. The method of claim 1, wherein one or more of the plurality of attributes include a direct identifier associated with one or more of a set of data masking techniques, a set of data pseudonymization techniques, and a set of data encryption techniques.

4. The method of claim 1, wherein one or more of the plurality of attributes include a set of quasi-identifiers associated with a set of data anonymization techniques.

5. The method of claim 1, wherein metrics associated with data privacy include probabilistic data linkage against one or more datasets from a group of publicly available datasets and user provided datasets, and uniqueness criteria of the dataset.

6. The method of claim 1, wherein metrics associated with data utility include data distortion introduced into the dataset by a data de-identification technique and workload-aware metrics that capture usefulness of the de-identified data in supporting certain types of analyses.

7. The method of claim 1, wherein a metric associated with data utility comprises an average relative error metric.
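The iterative flow in claim 1 (recommend, apply, remove the applied technique, re-evaluate what remains) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the techniques `floor_to_10` and `top_code_40`, the range-count queries, the privacy-plus-utility selection rule, and the specific form of the average relative error (mean of |actual - estimate| / max(actual, floor) over count queries, one common formulation) are all hypothetical choices. The `history` list stands in for what a dashboard might display per attribute.

```python
from collections import Counter

def uniqueness(values):
    """Fraction of records whose value appears exactly once."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)

def average_relative_error(original, transformed, queries, floor=1):
    """Mean of |actual - estimate| / max(actual, floor) over range-count queries."""
    errors = []
    for low, high in queries:
        actual = sum(low <= v <= high for v in original)
        estimate = sum(low <= v <= high for v in transformed)
        errors.append(abs(actual - estimate) / max(actual, floor))
    return sum(errors) / len(errors)

def evaluate(original, current, technique, queries):
    """Return (privacy, utility, transformed column) for one candidate technique."""
    transformed = [technique(v) for v in current]
    privacy = 1.0 - uniqueness(transformed)
    utility = 1.0 - min(1.0, average_relative_error(original, transformed, queries))
    return privacy, utility, transformed

def deidentify_iteratively(dataset, candidates, queries, rounds=2):
    """Each round: recommend, apply, and retire one technique per attribute."""
    original = {attr: list(col) for attr, col in dataset.items()}
    current = {attr: list(col) for attr, col in dataset.items()}
    remaining = {attr: dict(techs) for attr, techs in candidates.items()}
    history = []
    for round_no in range(1, rounds + 1):
        for attr, techs in remaining.items():
            if not techs:
                continue  # no candidate techniques left for this attribute
            results = {
                name: evaluate(original[attr], current[attr], fn, queries[attr])
                for name, fn in techs.items()
            }
            # Assumed recommendation rule: highest privacy + utility.
            best = max(results, key=lambda n: results[n][0] + results[n][1])
            privacy, utility, transformed = results[best]
            current[attr] = transformed
            del techs[best]  # applied techniques leave the candidate set
            history.append((round_no, attr, best, privacy, utility))
    return current, history

dataset = {"age": [23, 27, 34, 35, 36, 41, 48, 52]}
candidates = {"age": {
    "floor_to_10": lambda v: (v // 10) * 10,  # hypothetical generalization
    "top_code_40": lambda v: min(v, 40),      # hypothetical top-coding
}}
queries = {"age": [(20, 24), (25, 34), (35, 44), (45, 59)]}
final, history = deidentify_iteratively(dataset, candidates, queries)
for row in history:
    print(row)
```

In this sample run the first round picks `floor_to_10`, and the second round re-evaluates the one remaining technique against the already de-identified column. Utility is measured against the original data so that distortion accumulates across rounds, which is a design choice of this sketch.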

Assignees

IBM

Inventors

Gkoulalas-Divanis Aris

Classifications

G06F21/6254 · Primary
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
G06F21/602
Providing cryptographic facilities or services · CPC title

Patent family

Related publications grouped by family.

View patent family 67683951

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11036886B2 cover? A computer system de-identifies data by selecting one or more attributes of a dataset and determining a set of data de-identification techniques associated with each attribute. Each de-identification technique is evaluated with respect to an impact on data privacy and an impact on data utility based on a series of metrics, and a data de-identification technique is recommended for each attribute…
Who is the assignee on this patent? IBM
What technology area does this patent fall under? The primary CPC classification is G06F21/6254. Mapped technology areas include Physics.
When was this patent published? The publication date is Jun 15, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).