De-identification of data

US9323949B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9323949-B2
Application number: US-201213529294-A
Country: US
Kind code: B2
Filing date: Jun 21, 2012
Priority date: Dec 14, 2010
Publication date: Apr 26, 2016
Grant date: Apr 26, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention relates to a method, computer program product and system for de-identifying data, wherein a de-identification protocol is selectively mapped to a business rule at runtime via an ETL tool.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of de-identifying data from a data source for a target application, the method comprising:
    generating, via a hardware processor, a default rule set including at least one rule, the default rule set including a default de-identification protocol to produce de-identified data from an Extract/Transform/Load (ETL) tool, wherein the default de-identification protocol is selected based on business rules;
    mapping, via a hardware processor, the default rule set to data definitions each generated by a discovery tool and associated with a corresponding sensitive data element identified in the data;
    specifying, via a hardware processor, a runtime rule set comprising at least one runtime rule, the runtime rule including a runtime de-identification protocol to produce de-identified data from the ETL tool, wherein the runtime rule set is specified via an interface;
    replacing, via a hardware processor, the default rule set with the runtime rule set to change the default de-identification protocol to the runtime de-identification protocol at runtime to accommodate changing de-identification requirements of a target environment, and mapping the runtime rule set to the data definitions, wherein each data definition includes a data object comprising metadata, including an indicator of a type of sensitive data from among a plurality of types of sensitive data and information indicating the location of the data element within the data source, for that data element, and each runtime rule is mapped to a corresponding data definition of a sensitive data element based on the type of sensitive data; and
    receiving, via a hardware processor, the data and the data definitions, and for each data definition: obtaining the runtime rule mapped to that data definition; and applying the obtained runtime rule to the sensitive data element corresponding to that data definition in the received data and dynamically de-identifying the sensitive data element for the target application by the ETL tool at runtime via the runtime de-identification protocol of the obtained runtime rule.

2. The computer-implemented method of claim 1, further comprising: consuming the generated data definitions and applying the default de-identification protocol mapped to the data definition of the sensitive data element.

3. The computer-implemented method of claim 2 further comprising: comparing the output of applying the default de-identification protocol with the output of applying the runtime de-identification protocol; and displaying the comparison for review.

4. The computer-implemented method of claim 1 further comprising selectively re-identifying the de-identified data element in accordance with rules to produce an unmasked data element.

5. The computer-implemented method of claim 1, wherein the replacing further comprises: overriding the generated default rule set with the runtime rule set, wherein the default rule set and the runtime rule set correspond to different target environments having different de-identification requirements.

6. The computer-implemented method of claim 1 further comprising specifying the runtime rule set by designating a file location for the runtime rule set via the interface.

7. The computer implemented method of claim 1 further comprising specifying the runtime rule set by entering the runtime rule set into a text box provided via the interface.

8. The computer implemented method of claim 1, wherein each data definition is in the form of an Extensible Markup Language (XML) file.

9. A computer-implemented method of de-identifying data from a data source for a target application, the method comprising:
    identifying sensitive data elements in the data via a discovery tool, wherein identifying a sensitive data element comprises associating the data element with a type of sensitive data from among a plurality of types of sensitive data;
    generating data definitions via the discovery tool, wherein each data definition is associated with an identified sensitive data element and includes a data object comprising metadata, including an indicator of a type of sensitive data and information indicating the location of the data element within the data source, for that data element;
    specifying, via a hardware processor, a default rule set comprising at least one runtime rule, the default rule set including a default de-identification protocol to produce de-identified data from an Extract/Transform/Load (ETL) tool, wherein the default de-identification protocol is selected based on business rules;
    mapping, via a hardware processor, the default rule set to the data definitions generated by the discovery tool for the identified sensitive data elements;
    replacing, via a hardware processor, the default rule set with a runtime rule set comprising at least one runtime rule, the runtime rule including a runtime de-identification protocol to produce de-identified data from the ETL tool, wherein the runtime rule set is specified via an interface and the replacing changes the default de-identification protocol to the runtime de-identification protocol at runtime to accommodate changing de-identification requirements of a target environment;
    mapping, via a hardware processor, the runtime rule set to the data definitions generated by the discovery tool and associated with a corresponding sensitive data element identified in the data, wherein each runtime rule is mapped to a corresponding data definition of a sensitive data element based on the type of sensitive data; and
    receiving, via a hardware processor, the data and the data definitions, and for each data definition: obtaining the runtime rule mapped to that data definition; and applying the obtained runtime rule to the sensitive data element corresponding to that data definition in the received data and dynamically de-identifying the sensitive data element for the target application by the ETL tool at runtime via the runtime de-identification protocol of the obtained runtime rule.
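
For readers who find the claim language dense, the flow recited in claims 1 and 8 can be illustrated with a minimal Python sketch. This is not the patented implementation and does not use any real ETL or discovery tool. Every name in it (`DataDefinition`, `parse_data_definition`, `select_rule_set`, `de_identify`, the XML element names `dataDefinition`, `type`, and `location`, and the three example protocols) is a hypothetical chosen for illustration. It assumes records are flat dictionaries and that a data definition's "location" is simply a field name.

```python
import hashlib
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumption: a de-identification protocol is modelled as a function str -> str.
Protocol = Callable[[str], str]
RuleSet = Dict[str, Protocol]  # maps a sensitive-data type to a protocol


@dataclass(frozen=True)
class DataDefinition:
    """Metadata for one sensitive element: its type and where it lives."""
    sensitive_type: str  # hypothetical type label, e.g. "ssn" or "email"
    location: str        # assumed here to be a field name in the record


def parse_data_definition(xml_text: str) -> DataDefinition:
    """Read a data definition from XML (element names are hypothetical)."""
    root = ET.fromstring(xml_text)
    sensitive_type = root.findtext("type")
    location = root.findtext("location")
    if sensitive_type is None or location is None:
        raise ValueError("data definition needs <type> and <location>")
    return DataDefinition(sensitive_type, location)


# Three illustrative protocols; none is taken from the patent.
def mask_all_but_last4(value: str) -> str:
    return "*" * max(len(value) - 4, 0) + value[-4:]


def hash_value(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def redact(value: str) -> str:
    return "REDACTED"


DEFAULT_RULES: RuleSet = {"ssn": mask_all_but_last4, "email": redact}


def select_rule_set(default_rules: RuleSet,
                    runtime_rules: Optional[RuleSet] = None) -> RuleSet:
    """Replace the default rule set wholesale when a runtime set is given."""
    return dict(runtime_rules) if runtime_rules is not None else dict(default_rules)


def de_identify(records: List[dict],
                definitions: List[DataDefinition],
                rules: RuleSet) -> List[dict]:
    """For each data definition, look up the rule by type and apply it."""
    result = []
    for record in records:
        masked = dict(record)  # copy so the source record is left untouched
        for definition in definitions:
            protocol = rules[definition.sensitive_type]  # KeyError if unmapped
            masked[definition.location] = protocol(record[definition.location])
        result.append(masked)
    return result


if __name__ == "__main__":
    definitions = [
        parse_data_definition(
            "<dataDefinition><type>ssn</type><location>ssn</location></dataDefinition>"),
        DataDefinition("email", "email"),
    ]
    records = [{"name": "Ann", "ssn": "123-45-6789", "email": "ann@example.com"}]
    # Default protocols, then a runtime rule set for a different target environment.
    print(de_identify(records, definitions, select_rule_set(DEFAULT_RULES)))
    runtime_rules = {"ssn": redact, "email": hash_value}
    print(de_identify(records, definitions,
                      select_rule_set(DEFAULT_RULES, runtime_rules)))
```

In this sketch a missing rule raises an error rather than passing the value through unchanged, on the assumption that silently leaking a sensitive field is worse than a failed job. Running both rule sets over the same records, as the demo does, also gives the two outputs that claim 3 describes comparing for review.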

Assignees

Inventors

Classifications

  • G06F21/6254 · by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9323949B2 cover?
The present invention relates to a method, computer program product and system for de-identifying data, wherein a de-identification protocol is selectively mapped to a business rule at runtime via an ETL tool.
Who is the assignee on this patent?
Gupta Ritesh K, Nagaraj Prathima, Padmanabhan Sriram K, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F21/6254. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 26, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).