Anonymizing user identifiable information

US10754985B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10754985-B2
Application number: US-201815876748-A
Country: US
Kind code: B2
Filing date: Jan 22, 2018
Priority date: Feb 22, 2013
Publication date: Aug 25, 2020
Grant date: Aug 25, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed techniques provide systems and methods for anonymizing various portions of information, action logs, end-user information, and/or other data sets that are stored in non-indexed storage systems. More specifically, various anonymization procedures are described for redacting user identifiable information (UII) and/or replacing UII in raw data with randomly generated information (RGI). The anonymization process is performed on a rolling basis as raw data is received. An anonymization mapping table maps (or associates) the replaced UII in the anonymized data to the RGI, and eventually all raw data can be deleted.
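The rolling replacement described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `uid` field name, the in-memory dictionary standing in for the anonymization mapping table, and the hex-token format of the random identifier are all assumptions made for illustration.

```python
import secrets

def get_or_create_rid(mapping, uid):
    """Return the random identifier for uid, creating one the first time it is seen."""
    if uid not in mapping:
        # Hypothetical RID format: 16 random hex characters.
        mapping[uid] = secrets.token_hex(8)
    return mapping[uid]

def anonymize_batch(rows, mapping, uid_field="uid"):
    """Return copies of the rows with the UID field replaced by its random identifier."""
    return [
        {**row, uid_field: get_or_create_rid(mapping, row[uid_field])}
        for row in rows
    ]

# The mapping table persists across batches, so data that arrives later
# is anonymized consistently with data that arrived earlier.
mapping = {}
batch_1 = [{"uid": "1001", "action": "login"}, {"uid": "1002", "action": "post"}]
batch_2 = [{"uid": "1001", "action": "logout"}]

anon_1 = anonymize_batch(batch_1, mapping)
anon_2 = anonymize_batch(batch_2, mapping)
```

Because the same mapping is reused for each batch, user "1001" receives the same random identifier in both anonymized batches, and the raw batches can then be discarded without losing the ability to group records by user.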

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed by a computing system, comprising: receiving, by the computing system, an indication to delete an account associated with a user of a social networking system; identifying, by the computing system, a user identifier (UID) associated with the user, wherein the UID uniquely identifies the user in the social networking system; accessing, by the computing system, an anonymization identification map from computer memory of a data warehouse, wherein the anonymization identification map associates the UID with a randomly generated user identifier (RID) and instances of the UID are replaced by the RID in one or more anonymized data sets in the computer memory of the data warehouse; and disassociating, by the computing system, the UID from the RID in the anonymization identification map.

2. The method of claim 1 further comprising: removing, by the computing system, the UIDs from a list of active UIDs.

3. The method of claim 1 further comprising: storing, by the computing system, the anonymization identification map with the UID disassociated from the RID in the data warehouse.

4. The method of claim 1, wherein accessing the anonymization identification map includes generating an anonymized data set from a raw data set using the anonymization identification map, the raw data set including one or more instances of the UID.

5. The method of claim 4, wherein generating the anonymized data set includes: scanning, by the computing system, the raw data set to identify a complex structure; determining, by the computing system, a primary UID associated with the complex structure; parsing, by the computing system, the complex structure to identify a key associated with the UID; identifying, by the computing system, a value associated with the key; and replacing, by the computing system, the value with RID associated with the primary UID.

6. The method of claim 5 further comprising: determining, by the computing system, that the key is another complex structure; parsing, by the computing system, the key to identify an additional key if a max depth threshold is not exceeded; identifying, by the computing system, an additional value associated with the additional key; and replacing, by the computing system, the additional value with RID associated with the primary UID.

7. The method of claim 4, wherein generating the anonymized data set using the anonymization identification map comprises: identifying, by the computing system, a type of data in a column of the raw data set based on a metadata tag associated with the column; determining, by the computing system, an action associated with the metadata tag; and performing, by the computing system, the action to anonymize the data in the column.

8. The method of claim 7, wherein performing the action to anonymize the data in the column comprises replacing the one or more instances of UID in the column with an associated RID.

9. The method of claim 7, wherein performing the action to anonymize the data in the column comprises executing a computer script to sanitize the data.

10. A computer-readable storage medium comprising instructions, which when executed by a processor, cause the processor to perform a method comprising: scanning, by the processor, a non-indexed raw data set from computer memory of a data warehouse to determine a list of user identifiers (UIDs), wherein each UID uniquely identifies a user of a social networking system; generating, by the processor, an anonymization identification map by associating the UIDs with randomly generated user identifiers (RIDs); performing, by the processor, an anonymization process on the raw data set by replacing the UIDs in the raw data set with associated RIDs; and removing, by the processor, an association between a specific UID and an associated RID in response to receiving an indication to delete an account associated with a user of a social networking system, wherein the user of the social networking system is uniquely identified by the specific UID.

11. The computer-readable storage medium of claim 10, wherein the method further comprises: counting, by the processor, a first number of rows in the raw data set prior to performing the anonymization process, wherein the non-indexed raw data includes a plurality of tables, each table having a plurality of rows; counting, by the processor, a second number of rows in the data set subsequent to performing the anonymization process; and comparing, by the processor, the first number of rows and the second number of rows to determine if the anonymization process exceeds an error threshold.

12. The computer-readable storage medium of claim 10, wherein performing the anonymization process on the raw data set includes: scanning, by the computing system, the raw data set to identify a complex structure; determining, by the computing system, a primary UID associated with the complex structure; parsing, by the computing system, the complex structure to identify a key associated with the UID; identifying, by the computing system, a value associated with the key; and replacing, by the computing system, the value with RID associated with the primary UID.

13. The computer-readable storage medium of claim 12, wherein the method further comprises: determining, by the computing system, that the key is another complex structure; parsing, by the computing system, the key to identify an additional key if a max depth threshold is not exceeded; identifying, by the computing system, an additional value associated with the additional key; and replacing, by the computing system, the additional value with RID associated with the primary UID.

14. The computer-readable storage medium of claim 10, wherein performing the anonymization process on the raw data set includes: identifying, by the computing system, a type of data in a column of the raw data set based on a metadata tag associated with the column; determining, by the computing system, an action associated with the metadata tag; and performing, by the computing system, the action to anonymize the data in the column.

15. The computer-readable storage medium of claim 14, wherein performing the action to anonymize the data in the column comprises replacing the one or more instances of UID in the column with an associated RID.

16. The computer-readable storage medium of claim 14, wherein performing the action to anonymize the data in the column comprises executing a computer script to sanitize the data.

17. A system, comprising: a processor; and a memory storing instructions, which when executed by the processor, cause the processor to: receive an indication to delete an account associated with a user of a social networking system, identify a user identifier (UID) associated with the user, wherein the UID uniquely identifies the user in the social networking system, access an anonymization identification map from computer memory of a data warehouse, wherein the anonymization identification map associates the UID with a randomly generated user identifier (RID) and instances of the UID are replaced by the RID in one or more anonymized data sets in the computer memory of the data warehouse, disassociate the UID from the RID in the anonymization identification map, and remove the UIDs from a list of active UIDs.

18. The system of claim 17, wherein the processor is configured to generate an anonymized data set from a raw data set using the anonymization identification map, the raw data se
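Two steps recited in the claims lend themselves to a short illustration: replacing UID values inside a nested ("complex") structure up to a max depth threshold, and disassociating a UID from its RID when an account is deleted. The Python sketch below is one possible reading under stated assumptions, not the claimed implementation. The key names in `UID_KEYS`, the value of `MAX_DEPTH`, the dictionary-based map, the sample RID string, and the choice to leave levels beyond the threshold untouched are all hypothetical.

```python
UID_KEYS = {"uid", "user_id"}  # hypothetical key names whose values hold a UID
MAX_DEPTH = 3                  # hypothetical max depth threshold

def replace_nested(struct, rid, depth=0):
    """Return a copy of struct in which values under UID keys are replaced by rid.

    Nested dictionaries are parsed only while depth is below MAX_DEPTH;
    deeper levels are copied unchanged (an assumption of this sketch).
    """
    out = {}
    for key, value in struct.items():
        if key in UID_KEYS:
            out[key] = rid
        elif isinstance(value, dict) and depth < MAX_DEPTH:
            out[key] = replace_nested(value, rid, depth + 1)
        else:
            out[key] = value
    return out

def delete_account(uid, id_map, active_uids):
    """Disassociate uid from its RID and remove it from the active list."""
    id_map.pop(uid, None)
    active_uids.discard(uid)

# Anonymize a nested record using the RID of its primary UID.
id_map = {"1001": "r-7f3a"}
active_uids = {"1001"}
record = {"uid": "1001", "event": {"user_id": "1001", "type": "like"}}
anon = replace_nested(record, id_map["1001"])

# Account deletion removes the link; the anonymized record itself is unchanged.
delete_account("1001", id_map, active_uids)
```

In this sketch the anonymized record still carries the RID after deletion, but the map no longer contains an entry that ties it back to the original UID.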

Assignees

Facebook Inc

Inventors

Classifications

  • Business processes related to social networking or social networking services · CPC title

  • Management thereof · CPC title

  • by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title

  • Usage protection of distributed data files · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10754985B2 cover?
The disclosed techniques provide systems and methods for anonymizing various portions of information, action logs, end-user information, and/or other data sets that are stored in non-indexed storage systems. More specifically, various anonymization procedures are described for redacting UII and/or replacing UII in raw data with randomly generated information (RGI). The anonymization process is …
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/6254. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 25, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).