Automated data anonymization
US-10963590-B1 · Mar 30, 2021 · US
US11775684B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11775684-B2 |
| Application number | US-202217888908-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 16, 2022 |
| Priority date | May 16, 2018 |
| Publication date | Oct 3, 2023 |
| Grant date | Oct 3, 2023 |
Official abstract text for this publication.
A rule-based attribution mechanism analyzes documents having different types of data in different formats through the application of script-based rules that apply a tag to the document identifying the type of sensitive data that is contained in the document. Documents having similar tags are aggregated so that the sensitive data is scrubbed from the document leaving the telemetric data available for downstream processing. The scrubbing entails different actions, such as, eliminating the sensitive data, obfuscating the sensitive data, and converting the sensitive data into a non-sensitive value.
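The flow the abstract describes (apply rules, tag, aggregate by tag, scrub) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the rule layout, the `format` key, the field names, and the `<redacted>` placeholder are all hypothetical, and only two of the scrubbing actions are shown.

```python
import copy
from collections import defaultdict

# Hypothetical rule script: each rule names a document format, a sensitive
# field within that format, and the scrubbing action to associate with it.
RULES = [
    {"format": "crash_report", "field": "user_email", "action": "delete"},
    {"format": "usage_event", "field": "machine_name", "action": "convert"},
]

def tag_document(doc, rules):
    """Return (action, field) tags for every rule that matches the document."""
    return [
        (rule["action"], rule["field"])
        for rule in rules
        if doc.get("format") == rule["format"] and rule["field"] in doc
    ]

def anonymize(docs, rules):
    """Tag documents, aggregate them by tag, then scrub each group."""
    scrubbed = copy.deepcopy(docs)  # leave the caller's documents untouched

    # Aggregate: group (document, field) pairs under their scrubbing action.
    groups = defaultdict(list)
    for doc in scrubbed:
        for action, field in tag_document(doc, rules):
            groups[action].append((doc, field))

    # Scrub: perform one action per group of similarly tagged documents.
    for action, items in groups.items():
        for doc, field in items:
            if action == "delete":
                doc.pop(field, None)          # eliminate the sensitive value
            elif action == "convert":
                doc[field] = "<redacted>"     # assumed non-sensitive stand-in
            else:
                raise ValueError(f"unknown scrubbing action: {action}")
    return scrubbed

if __name__ == "__main__":
    docs = [
        {"format": "crash_report", "user_email": "a@example.com", "stack_depth": 12},
        {"format": "usage_event", "machine_name": "HOST1", "clicks": 3},
    ]
    for doc in anonymize(docs, RULES):
        print(doc)
```

Grouping by tag before scrubbing mirrors the aggregation step in the abstract: each action runs once over a batch of similarly tagged documents, and the telemetric fields (`stack_depth`, `clicks` in this example) pass through unchanged.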
Opening claim text (preview).
What is claimed:
1. A system comprising: one or more processors; and a memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs include instructions that: obtain a plurality of documents having telemetric data and sensitive data, a first set of the plurality of documents having a plurality of fields arranged in a first format, a second set of the plurality of documents having a plurality of fields arranged in a second format, wherein the first format and the second format differ; access a script having a plurality of rules, a rule identifying a select one of the plurality of fields of a select one of the plurality of documents of a specific format as sensitive data and including a scrubbing action; apply the plurality of rules of the script to each of the plurality of documents to identify sensitive data and to associate a scrubbing action for the identified sensitive data; tag each of the plurality of documents with a tag indicating the scrubbing action from the application of the plurality of rules; aggregate select ones of the plurality of documents tagged with a similar tag; perform a select scrubbing action associated with the similar tag to each of the selected aggregated documents; and process the telemetric data without the sensitive data.
2. The system of claim 1, wherein the telemetric data includes an event field that identifies an event that triggered collection of the telemetric data; and wherein at least one of the plurality of rules identifies the sensitive data based on the event field.
3. The system of claim 1, wherein the telemetric data includes a condition that specifies circumstances in which the tag is applied; and wherein at least one of the plurality of rules identifies the sensitive data based on the condition being satisfied.
4. The system of claim 1, wherein the scrubbing action deletes the identified sensitive data.
5. The system of claim 1, wherein the scrubbing action obfuscates the identified sensitive data using a simple hash value.
6. The system of claim 1, wherein the scrubbing action converts the identified sensitive data into a non-sensitive value.
7. The system of claim 1, wherein the scrubbing action obfuscates the identified sensitive data using a rolling hash value.
8. The system of claim 1, wherein the telemetric data is collected during engagement of a software product.
9. A computer-implemented method, comprising: accessing a plurality of documents including telemetric data generated from events occurring during execution of one or more software products, wherein the telemetric data includes a plurality of fields, a select one of the fields containing an event triggering collection of the telemetric data; obtaining a rule-based script having a plurality of rules, a rule identifying sensitive data in at least one field of the plurality of fields of the plurality of documents and a scrubbing action for the identified sensitive data; applying the rule-based script to each of the plurality of documents to identify fields containing the sensitive data; tagging select ones of the plurality of documents with a tag based on the applied rule-based script, wherein the tag identifies a scrubbing action; aggregating the selected ones of the plurality of documents having a common tag; performing the scrubbing action of the common tag to the sensitive data; and processing the telemetric data without the scrubbed sensitive data.
10. The computer-implemented method of claim 9, wherein the scrubbing action deletes the identified sensitive data.
11. The computer-implemented method of claim 9, wherein the scrubbing action obfuscates the identified sensitive data using a simple hash value.
12. The computer-implemented method of claim 9, wherein the scrubbing action converts the identified sensitive data into a non-sensitive value.
13. The computer-implemented method of claim 9, wherein the scrubbing action obfuscates the identified sensitive data using a rolling hash value.
14. The computer-implemented method of claim 9, wherein at least one of the plurality of rules identifies the sensitive data based on the event field.
15. The computer-implemented method of claim 9, wherein at least one of the plurality of rules identifies the sensitive data based on a condition in the telemetric data being satisfied.
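The dependent claims add event-based and condition-based rule matching plus hash-based obfuscation. The sketch below shows one way these could look. It is illustrative only: the claims shown here do not define "simple hash" or "rolling hash", so SHA-256 for the former and a keyed hash whose key rotates per time period for the latter are assumptions, as are the rule keys (`field`, `event`, `condition`) and the sample document.

```python
import hashlib
import hmac

def simple_hash(value: str) -> str:
    """Assumed 'simple hash': an unsalted SHA-256 digest of the value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def rolling_hash(value: str, period_key: bytes) -> str:
    """Assumed 'rolling hash': an HMAC keyed by a secret that the operator
    rotates each period, so digests cannot be linked across periods."""
    return hmac.new(period_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def rule_matches(doc: dict, rule: dict) -> bool:
    """Check a hypothetical rule against one telemetry document.

    A rule names a sensitive field and may optionally require a specific
    triggering event (claims 2 and 14) or a predicate over the document
    (claims 3 and 15).
    """
    if rule["field"] not in doc:
        return False
    if "event" in rule and doc.get("event") != rule["event"]:
        return False
    condition = rule.get("condition")
    return condition is None or bool(condition(doc))

if __name__ == "__main__":
    doc = {"event": "build_failed", "user_name": "alice", "duration_ms": 5300}
    rule = {
        "field": "user_name",
        "event": "build_failed",
        "condition": lambda d: d["duration_ms"] > 1000,
        "action": "simple_hash",
    }
    if rule_matches(doc, rule):
        doc[rule["field"]] = simple_hash(doc[rule["field"]])
    print(doc)
```

Under these assumptions, a simple hash maps the same input to the same digest indefinitely, which keeps records joinable but also linkable, whereas rotating the key limits linkage to a single period.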
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Protecting personal data, e.g. for financial or medical purposes · CPC title
Hash functions, e.g. MD5, SHA, HMAC or f9 MAC · CPC title
Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer · CPC title
Providing cryptographic facilities or services · CPC title