On-demand de-identification of data in computer storage systems
US-2019303610-A1 · Oct 3, 2019 · US
US11449635B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11449635-B2 |
| Application number | US-201916408143-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 9, 2019 |
| Priority date | May 16, 2018 |
| Publication date | Sep 20, 2022 |
| Grant date | Sep 20, 2022 |
Official abstract text for this publication.
A rule-based attribution mechanism analyzes documents having different types of data in different formats through the application of script-based rules that apply a tag to the document identifying the type of sensitive data that is contained in the document. Documents having similar tags are aggregated so that the sensitive data is scrubbed from the document leaving the telemetric data available for downstream processing. The scrubbing entails different actions, such as, eliminating the sensitive data, obfuscating the sensitive data, and converting the sensitive data into a non-sensitive value.
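The tag-then-scrub flow in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the field names, the regular expressions used as rule conditions, the `PROVIDERS` lookup, and the choice of SHA-256 as the obfuscation function. The aggregation of similarly tagged documents is left out for brevity.

```python
import hashlib
import re

# Hypothetical rules: (tag, field name, condition the field value must satisfy).
RULES = [
    ("email", "user_email", lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+", v) is not None),
    ("project_id", "project", lambda v: len(v) > 0),
    ("ip_address", "client_ip", lambda v: re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", v) is not None),
]

# Hypothetical lookup standing in for an IP-to-service-provider mapping.
PROVIDERS = {"203.0.113.": "ExampleNet"}


def eliminate(value):
    """Scrubbing action: signal that the field should be removed."""
    return None


def obfuscate(value):
    """Scrubbing action: replace the value with a one-way hash (SHA-256 assumed)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def convert_ip(value):
    """Scrubbing action: convert an IP address into a non-sensitive provider name."""
    for prefix, provider in PROVIDERS.items():
        if value.startswith(prefix):
            return provider
    return "unknown provider"


# Hypothetical mapping from tag to scrubbing action.
ACTIONS = {"email": eliminate, "project_id": obfuscate, "ip_address": convert_ip}


def tag_document(doc):
    """Return {field: tag} for every field that satisfies a rule and its condition."""
    tags = {}
    for tag, field, condition in RULES:
        value = doc.get(field)
        if isinstance(value, str) and condition(value):
            tags[field] = tag
    return tags


def scrub(doc):
    """Return a copy of doc with tagged fields scrubbed and telemetry left intact."""
    scrubbed = dict(doc)
    for field, tag in tag_document(doc).items():
        new_value = ACTIONS[tag](doc[field])
        if new_value is None:
            del scrubbed[field]
        else:
            scrubbed[field] = new_value
    return scrubbed
```

Calling `scrub({"user_email": "a@example.com", "client_ip": "203.0.113.9", "latency_ms": 12})` drops the email field, replaces the IP address with the provider name from the hypothetical lookup, and leaves `latency_ms` untouched for downstream analysis.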
Opening claim text (preview).
What is claimed:
1. A method, including: receiving a current document at a computing device, the computing device having at least one processor communicatively coupled to a memory, the current document containing telemetric data and unscrubbed sensitive data; applying a tag to the current document to denote an attribute that identifies a scrubbing action to be performed to the unscrubbed sensitive data, the tag based on a field in the current document satisfying a rule and condition for being classified as sensitive data; generating a first obfuscated value for the unscrubbed sensitive data; searching a table of obfuscated values for the first obfuscated value; upon the applied tag identifying a rolling hash scrubbing action and upon a search of the table finding the first obfuscated value, replacing the unscrubbed sensitive data in the current document with a second obfuscated value, the first obfuscated value differs from the second obfuscated value; and analyzing the telemetric data without the unscrubbed sensitive data.
2. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as an identifier associated with a software product, replacing the identifier with a one-way hash.
3. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as a geolocation, converting the field to a non-sensitive location.
4. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as an IP address, converting the field to a name of a service provider.
5. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as an email address, removing the email address from the current document.
6. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as a machine name, removing the machine name from the current document.
7. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as a project identifier, obfuscating the value of the project identifier with a hashed value.
8. The method of claim 1, further comprising: upon the applied tag identifying a field in the current document as a correlation identifier, obfuscating the value of the correlation identifier with a hashed value.
9. The method of claim 1, wherein the sensitive data that was previously replaced includes a MAC address hash.
10. A system, comprising: a processor and a memory; wherein the memory includes instructions that when executed on the processor perform acts that: obtain a current document including telemetric data and unscrubbed sensitive data; apply a tag to the current document to denote an attribute that identifies a scrubbing action to be performed to the unscrubbed sensitive data, the tag based on a field in the current document satisfying a rule and condition for being classified as sensitive data; generate a first obfuscated value for the unscrubbed sensitive data; search a table of obfuscated values for the first obfuscated value; upon the applied tag identifying a rolling hash scrubbing action and upon a search of the table finding the first obfuscated value, replace the unscrubbed sensitive data in the current document with a second obfuscated value, the first obfuscated value differs from the second obfuscated value; and analyze the telemetric data without the unscrubbed sensitive data.
11. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as an identifier associated with a software product, replace the identifier with a one-way hash.
12. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as a geolocation, convert the field to a non-sensitive location.
13. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as an IP address, convert the field to a name of a service provider.
14. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as an email address, remove the email address from the current document.
15. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as a machine name, remove the machine name from the current document.
16. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as a project identifier, obfuscate the value of the project identifier with a hashed value.
17. The system of claim 10, wherein the memory includes further instructions that when executed on the processor perform acts that: upon the applied tag identifying a field in the current document as a correlation identifier, obfuscate the value of the correlation identifier with a hashed value.
18. The system of claim 10, wherein the sensitive data that was previously replaced includes a MAC address hash.
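The rolling hash branch recited in claims 1 and 10 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the use of SHA-256 for the first obfuscated value, and the representation of the table as a dictionary mapping a first obfuscated value to its replacement are all hypothetical. The claims do not say what happens when the search finds nothing, so the fallback to the first obfuscated value is likewise an assumption.

```python
import hashlib


def first_obfuscated_value(raw):
    """Generate the first obfuscated value (SHA-256 is an assumption)."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def rolling_hash_scrub(doc, field, table):
    """Return a copy of doc with doc[field] scrubbed by a rolling hash action.

    `table` is a hypothetical dict mapping a first obfuscated value to the
    second obfuscated value that should replace the sensitive data.
    """
    scrubbed = dict(doc)
    first = first_obfuscated_value(doc[field])
    if first in table:
        # Claimed branch: the search finds the first value, so the sensitive
        # data is replaced with a second, different obfuscated value.
        scrubbed[field] = table[first]
    else:
        # Assumption: the claims do not cover this case; fall back to the
        # first obfuscated value so the raw data never leaves the scrubber.
        scrubbed[field] = first
    return scrubbed
```

With a table that already holds the hash of a MAC address, for example `{first_obfuscated_value("00:1A:2B:3C:4D:5E"): "rolled-value"}`, scrubbing a document whose `mac` field carries that address yields `"rolled-value"` in place of the address, while the telemetry fields are copied through unchanged.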
Hash functions, e.g. MD5, SHA, HMAC or f9 MAC · CPC title
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer · CPC title
Providing cryptographic facilities or services · CPC title
Protecting personal data, e.g. for financial or medical purposes · CPC title