User data deidentification system for ip addresses
US-2024411929-A1 · Dec 12, 2024 · US
US9582680B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9582680-B2 |
| Application number | US-201414168532-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 30, 2014 |
| Priority date | Jan 30, 2014 |
| Publication date | Feb 28, 2017 |
| Grant date | Feb 28, 2017 |
Official abstract text for this publication.
A personally identifiable information (PII) scrubbing system. The PII scrubbing system surgically scrubs PII from a log based on a scrubber configuration corresponding to the log. The scrubber configuration includes context information about locations and types of PII in the log and rules specifying how to locate and protect the PII. Scrubber configurations are quickly and easily created or modified as scrubbing requirements change or new scenarios are encountered. The flexibility provided by the scrubber configurations allows only the PII to be scrubbed, even from unstructured data, without having to include surrounding data. Many consumers can use the scrubbed data without needing to expose the PII because less non-personal data is obscured. Surgical scrubbing also retains the usefulness of the underlying PII even while protecting the PII. Consumers can correlate the protected PII to locate specific information without having to expose additional PII.
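The idea of scrubbing only the PII spans while keeping them correlatable can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the patent's implementation: the `SCRUBBER_CONFIG` layout, the two regular expressions, the `SECRET_KEY`, and the choice of a keyed hash (HMAC-SHA256) as the protection step are all hypothetical.

```python
import hashlib
import hmac
import re

# Hypothetical scrubber configuration: one entry per PII type, each with a
# pattern that locates that type of PII. Patterns are illustrative only.
SCRUBBER_CONFIG = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

# Assumption: a secret held only by the scrubbing service.
SECRET_KEY = b"example-key"


def protect(pii_type: str, value: str) -> str:
    """Return a deterministic token for a PII value (keyed hash, an assumption)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<{pii_type}:{digest[:12]}>"


def scrub_line(line: str) -> str:
    """Replace only the matched PII spans; surrounding text is left untouched."""
    for pii_type, pattern in SCRUBBER_CONFIG.items():
        line = pattern.sub(lambda m, t=pii_type: protect(t, m.group(0)), line)
    return line


if __name__ == "__main__":
    print(scrub_line("login alice@example.com from 10.0.0.5 ok; retry alice@example.com"))
```

Because the token depends only on the PII type and value, the same address yields the same token wherever it appears, so a consumer can correlate events without seeing the original value. The non-personal text around each match is preserved as written.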
Opening claim text (preview).
What is claimed is:

1. A method of scrubbing a data set having messages containing both non-personal data and personally identifiable information, the method comprising: loading a message containing both non-personal data and personally identifiable information; loading a scrubber configuration containing a rule set for scrubbing the data set; parsing the message into fields based on the rule set, wherein unstructured data fields are formatted and delimiters are added to the unstructured data field such that personally identifiable information is identifiable from unlabeled data; scrubbing only the personally identifiable information in the message based on the rule set to produce a scrubbed message, the personally identifiable information being associated with metadata that identifies a type of personally identifiable information, and applying a corresponding scrubbing rule to the type of personally identifiable information, the corresponding scrubbing rule including: generating replacement values for the personally identifiable information in the message based on the rule set, including generating a replacement value for a first instance of specific personally identifiable information in the message based on the corresponding scrubbing rule and storing a reference to the replacement value associated with the specific personally identifiable information; and substituting replacement values for the personally identifiable information in the message to create the scrubbed message, including retrieving the replacement value associated with the specific personally identifiable information using the reference when additional instances of the specific personally identifiable information are encountered, and using the retrieved replacement value for the additional instances of the specific personally identifiable information; and saving the scrubbed message.

2. The method of claim 1 wherein the rule set comprises a root parsing rule and child rules for scrubbing the data set.

3. The method of claim 2 wherein the act of scrubbing the personally identifiable information in the message based on the rule set to produce a scrubbed message further comprises: parsing the message into fields based on the root parsing rule; and scrubbing the personally identifiable information in selected fields of the message based on the child rules.

4. The method of claim 3 wherein the act of parsing the message into fields based on the root parsing rule further comprises splitting the message into fields based on a delimiter specified in the root parsing rule.

5. The method of claim 3 wherein the act of parsing the message into fields based on the root parsing rule further comprises splitting the message into a predefined set of fields based on a message type specified in the root parsing rule.

6. The method of claim 3 wherein the act of protecting the personally identifiable information in selected fields of the message based on the child rules further comprises the act of applying a filtering rule specified in the child rules to include messages having fields containing personally identifiable information based on a value of a selected field.

7. The method of claim 3 wherein the act of protecting the personally identifiable information in selected fields of the message based on the child rules further comprises the act of applying a filtering rule specified in the child rules to exclude messages not having any fields containing personally identifiable information based on a value of a selected field.

8. The method of claim 3 wherein the act of protecting the personally identifiable information in selected fields of the message based on the child rules further comprises the act of applying a processing rule specified in the child rules to protect personally identifiable information in a selected field specified in the processing rule.

9. The method of claim 8 wherein the act of applying a processing rule specified in the child rules to protect personally identifiable information in a selected field specified in the processing rule further comprises: applying a parsing rule specified in the child rules to search the selected field for personally identifiable information of a type specified the parsing rule; and protecting the personally identifiable information of the specified type found in the selected field.

10. The method of claim 3 wherein the act of protecting the personally identifiable information in selected fields of the message based on the child rules further comprises the act of applying a parsing rule specified in the child rules to separate a selected field specified in the processing rule into sub-fields.

11. The method of claim 10 wherein the act of protecting the personally identifiable information in selected fields of the message based on the child rules further comprises the act of: separating the value of a field into name fields and value fields based on a delimiter pair specified in the child rules; and protecting the personally identifiable information in the value field if found in the selected field; applying a parsing rule specified in the child rules to separate the selected field specified in the parsing rule into sub-fields.

12. The method of claim 1 wherein the act of scrubbing only the personally identifiable information in the message based on the rule set to produce a scrubbed message further comprising: storing a replacement value for each unique instance of personally identifiable information in the message based on the rules from the scrubber configuration; and re-using the replacement value for when duplicate instances of the personally identifiable information.

13. A system for scrubbing personally identifiable information from a message, the system comprising: a processing unit; and a memory including computer executable instructions which, when executed by a processing unit, cause the system to provide: a scrubber configuration including a root parsing rule and a processing rule specifying how to locate and replace the personally identifiable information appearing in the message, the scrubber configuration corresponding to a log containing messages; a scrubbing agent loading the scrubber configuration, the scrubber agent comprising a parsing engine executing the root parsing to separate the message into fields, wherein unstructured data fields are formatted and delimiters are added to the unstructured data field such that personally identifiable information is identifiable from unlabeled data, and a processing engine executing the processing rule to replace the personally identifiable information in a selected field with a replacement value preventing the personally identifiable information from being exposed but allowing specific personally identifiable information to be located by correlation, the personally identifiable information being associated with metadata that identifies a type of personally identifiable information, and applying a corresponding scrubbing rule to the type of personally identifiable information, the corresponding scrubbing rule generating replacement values for the personally identifiable information in the message based on the rule set by generating a replacement value for a first instance of specific personally identifiable information in the message based on the corresponding scrubbing rule and storing a reference to the replacement value associated with the specific personally identifiable information, and substituting replacement values for the personally identifiable information in the message to create the scrubbed message by retrieving the replacement value associated with the specific p
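As a rough illustration of how the claimed pieces might fit together, the following Python sketch combines a root parsing rule that splits a message on a delimiter, child rules that select fields (including name/value splitting by a delimiter pair), and a lookup table that generates a replacement value for the first instance of a PII value and reuses it for later instances. The `RULE_SET` layout, the `Scrubber` class, the field indices, and the `type_NNNN` replacement format are all hypothetical choices for this example and are not taken from the patent.

```python
import itertools

# Hypothetical rule set: a root parsing rule plus child rules.
RULE_SET = {
    "root": {"delimiter": "|"},
    "children": [
        # The whole field at index 1 is treated as PII of type "user".
        {"field": 1, "pii_type": "user"},
        # Field 2 holds name/value pairs; "pair" is (item separator, name/value
        # separator) and "names" maps a name to the PII type of its value.
        {"field": 2, "pair": ("&", "="), "names": {"email": "email"}},
    ],
}


class Scrubber:
    def __init__(self, rule_set):
        self.rule_set = rule_set
        self.replacements = {}             # (pii_type, value) -> replacement value
        self.counter = itertools.count(1)

    def replacement_for(self, pii_type, value):
        key = (pii_type, value)
        if key not in self.replacements:   # first instance: generate and store
            self.replacements[key] = f"{pii_type}_{next(self.counter):04d}"
        return self.replacements[key]      # later instances: reuse stored value

    def scrub(self, message):
        delimiter = self.rule_set["root"]["delimiter"]
        fields = message.split(delimiter)  # root parsing rule
        for rule in self.rule_set["children"]:
            i = rule["field"]
            if i >= len(fields):
                continue                   # message has no such field
            if "pair" in rule:
                item_sep, kv_sep = rule["pair"]
                parts = []
                for item in fields[i].split(item_sep):
                    name, sep, value = item.partition(kv_sep)
                    pii_type = rule["names"].get(name)
                    if sep and pii_type:   # scrub only values flagged as PII
                        value = self.replacement_for(pii_type, value)
                    parts.append(name + sep + value)
                fields[i] = item_sep.join(parts)
            else:
                fields[i] = self.replacement_for(rule["pii_type"], fields[i])
        return delimiter.join(fields)


if __name__ == "__main__":
    s = Scrubber(RULE_SET)
    print(s.scrub("2014-01-30|alice|op=login&email=alice@example.com"))
    print(s.scrub("2014-01-31|bob|op=view&email=alice@example.com"))
```

In this sketch the timestamp field and the `op` value pass through unchanged, while the same user name or email address maps to the same replacement value across messages handled by one `Scrubber` instance. A sequential counter is used only to keep the example small; other ways of generating replacement values would fit the same structure.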
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Protecting personal data, e.g. for financial or medical purposes · CPC title
Search customisation based on user profiles and personalisation · CPC title
involving long-term monitoring or reporting · CPC title
Applying rules; Deductive queries · CPC title