Generating obfuscated data

US10102398B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10102398-B2
Application number  US-49735409-A
Country             US
Kind code           B2
Filing date         Jul 2, 2009
Priority date       Jun 1, 2009
Publication date    Oct 16, 2018
Grant date          Oct 16, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for obfuscating data includes: reading values occurring in one or more fields of multiple records from a data source; storing a key value; for each of multiple of the records, generating an obfuscated value to replace an original value in a given field of the record using the key value such that the obfuscated value depends on the key value and is deterministically related to the original value; and storing the collection of obfuscated data including records that include obfuscated values in a data storage system.
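As a rough illustration of the property the abstract describes (an obfuscated value that depends on a stored key and is deterministically related to the original), the following minimal sketch uses a keyed hash. The choice of HMAC-SHA256, the fixed-width numeric output, and the names `obfuscate_value` and `obfuscate_records` are assumptions made for this example; the abstract does not prescribe any of them.

```python
import hashlib
import hmac


def obfuscate_value(key: bytes, original: str, digits: int = 9) -> str:
    """Return a fixed-width numeric string derived from key and original.

    Assumption: HMAC-SHA256 stands in for "a function of the key value";
    the abstract does not name a specific function.
    """
    digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


def obfuscate_records(records, field, key):
    """Replace `field` in every record; all other fields are copied unchanged."""
    return [{**r, field: obfuscate_value(key, r[field])} for r in records]


if __name__ == "__main__":
    key = b"example-key"  # hypothetical stored key value
    records = [
        {"id": 1, "ssn": "123-45-6789"},
        {"id": 2, "ssn": "123-45-6789"},
    ]
    for row in obfuscate_records(records, "ssn", key):
        print(row)
```

Because the same key and the same original value always yield the same output, records that shared a value before obfuscation still share one afterwards, which keeps joins and group-by operations on the obfuscated field meaningful.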

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method for obfuscating data in records of a data source, the method including: storing, by one or more data processors, a key value for a set of records from a source of the set of records, the set of records being partitioned into multiple sets of records based on a given field of records in the set of records; reading an original value in the given field of the record from a given one of the multiple sets of records, generating, by one or more data processors, an index value by applying the same stored key value and the read original value from that field to a deterministic mathematical function to produce the index value; applying the index value to a lookup table of predetermined obfuscation values to retrieve from the lookup table an obfuscation value; and replacing the original value in that given field of the record with the obfuscation value to obfuscate the original value of that given field; and storing, by one or more data processors, the given one of the multiple sets of records having remaining original values in fields and obfuscation values in those given fields in a data storage system, as an obfuscated set of records.

2. The method of claim 1, wherein the one or more data processors are plural data processors further includes: processing the multiple sets of records by the plural data processors according to a dataflow graph that provides reformat components that receive as input fields of the given record and the stored key value, and the original value in the given field of the record with the obfuscation value and outputs the reformatted record at its output port, and with the plural processors assigned to different ones of the sets of multiple records using parallel processing to execute the reformat components according to the dataflow graph.

3. The method of claim 1 wherein the deterministic mathematical function includes at least one of a cryptographic hash function that generates the index value, a non-cryptographic function that generates a selection value that is used as an input to a cryptographic hash function to provide the index value, a non-cryptographic function that generates a selection value that is used as an input to a cryptographic hash function to provide the index value or combining the original value and the same key using a cryptographic hash function to yield the index value.

4. The method of claim 2, wherein multiple index values within a range correspond to the same obfuscated value in the predetermined set of obfuscated values.

5. The method of claim 4, further including storing profile information including statistics characterizing values of at least one of the fields, wherein the size of the range is based on the statistics in the stored profile information characterizing values of the given field.

6. The method of claim 1, wherein the deterministic function produces an intermediate selection value that is used to provide the index value, and is a cryptographic hash function.

7. The method of claim 6, wherein the selection value is mapped to the obfuscated value using a deterministic mapping.

8. The method of claim 6, wherein a domain of values from which the obfuscated value is selected includes multiple of the original values in the given field of the records from the data source.

9. The method of claim 8, wherein one or more of the original values are not included in the domain of values.

10. The method of claim 9, wherein one or more of the values in the domain of values are not included in the original values.

11. The method of claim 1, wherein the cryptographic hash function prevents recovery of the original value from the obfuscated value using the key.

12. The method of claim 1, wherein the key is provided from different sequences of selection values.

13. The method of claim 12, wherein a first sequence of selection values for consecutive original values for a first value of the key is not predictable from a second sequence of selection values for consecutive original values for a second value of the key.

14. The method of claim 12, further includes: determining whether the index value corresponds to a valid obfuscated value, and if not repeatedly combining the selection value and the key using a deterministic function to yield an additional selection value until the additional selection value corresponds to a valid obfuscated value.

15. The method of claim 14, wherein a valid obfuscated value consists of a predetermined number of digits.

16. The method of claim 1, wherein replacing the original values in the given field with the generated obfuscated values in records of different ones of the multiple sets of records occurs in parallel using different computing resources.

17. The method of claim 1, wherein at least a first record that includes an obfuscated value in the collection of obfuscated data includes at least one original value that was not replaced with an obfuscated value.

18. The method of claim 1, further including determining whether an original value in the first record is to be replaced with an obfuscated value using the key value based on whether the original value is to be replaced with the same obfuscated value consistently for multiple records in which the original value occurs.

19. A system for obfuscating data, the system including: a data storage system that stores records having values in one or more fields; and one or more processors coupled to the data storage system providing an execution environment to: store a key value for a set of the records from the data storage system the set of records being partitioned into multiple sets of records based on a given field of records in the set of records; the system is configured to: read an original value in the given field of a given one of the multiple sets of records, generate an index value by applying the same stored key value and the read original value from that field to a deterministic mathematical function to produce the index value; apply the index value to a lookup table of predetermined obfuscation values to retrieve from the lookup table an obfuscation value; replace the original value in that field of the record with the obfuscation value to obfuscate the original value of that field; and store the given one of the multiple sets of records having remaining original values in fields and obfuscation values in those given fields in the data storage system as an obfuscated set of records.

20. A non-transitory computer-readable medium storing a computer program for obfuscating data, the computer program including instructions, when executed by a computer, causes the computer to: store a key value for a set of records from a source of records the set of records being partitioned into multiple sets of records based on a given field of records in the set of records; and process each record of a given one of the multiple sets of records, and for each given field in the record, of the given one of the multiple sets of records, which record has a value being obfuscated, by instructions to: read an original value in the given field of the given record, generate an index value by applying the same stored key value and the read original value from that given field to a deterministic mathematical function to produce the index value; apply the index value to a lookup table of predetermined obfuscation values to retrieve from the lookup table an obfuscation value; replace the original value in that given field of the reco…
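The steps recited in claim 1 (partition by a given field, derive an index from the stored key and the original value, look the index up in a table of predetermined values, replace the field) can be sketched roughly as follows. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for the "deterministic mathematical function", and `SURNAME_TABLE`, `index_for`, `partition_by_field`, `obfuscate_partition`, and `select_valid` are hypothetical names that do not come from the patent. The last function loosely mirrors the retry loop of claims 14 and 15, where a selection value is re-combined with the key until a value with a predetermined number of digits results.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical lookup table of predetermined obfuscation values.
SURNAME_TABLE = ["Adams", "Baker", "Clark", "Davis",
                 "Evans", "Foster", "Green", "Hughes"]


def index_for(key: bytes, original: str, table_size: int) -> int:
    """Derive a table index from the stored key and the original value."""
    digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % table_size


def partition_by_field(records, field, n_partitions):
    """Group records so equal field values always share a partition."""
    parts = defaultdict(list)
    for record in records:
        # A stable hash is used because built-in hash() of str is salted
        # per process and would not be reproducible across workers.
        h = hashlib.sha256(record[field].encode("utf-8")).digest()
        parts[int.from_bytes(h[:4], "big") % n_partitions].append(record)
    return dict(parts)


def obfuscate_partition(records, field, key, table):
    """Replace `field` via table lookup; remaining fields keep original values."""
    return [{**r, field: table[index_for(key, r[field], len(table))]}
            for r in records]


def select_valid(key: bytes, original: str, is_valid, max_rounds: int = 1000) -> str:
    """Re-hash the selection value with the key until a valid candidate appears."""
    selection = hmac.new(key, original.encode("utf-8"), hashlib.sha256).digest()
    for _ in range(max_rounds):
        candidate = str(int.from_bytes(selection[:4], "big"))
        if is_valid(candidate):
            return candidate
        selection = hmac.new(key, selection, hashlib.sha256).digest()
    raise RuntimeError("no valid obfuscated value found")


if __name__ == "__main__":
    key = b"example-key"  # hypothetical stored key value
    names = ["Smith", "Jones", "Smith", "Lee"]
    records = [{"id": i, "surname": s} for i, s in enumerate(names)]
    for part in partition_by_field(records, "surname", 3).values():
        print(obfuscate_partition(part, "surname", key, SURNAME_TABLE))
    print(select_valid(key, "555-12-3456", lambda s: len(s) == 9))
```

Because every partition applies the same key, each partition could be handed to a separate worker without coordination, and a given original value would still map to the same obfuscation value wherever it occurs. With a table this small, distinct originals will often collide on the same replacement, which is consistent with claim 4's many-to-one mapping but is a property of this sketch rather than a stated requirement.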

Assignees

Inventors

Classifications

  • Protecting data

  • by anonymising data, e.g. decorrelating personal data from the owner's identification

  • Just-in-time application of countermeasures, e.g., on-the-fly decryption, just-in-time obfuscation or de-obfuscation

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10102398B2 cover?
A method for obfuscating data includes: reading values occurring in one or more fields of multiple records from a data source; storing a key value; for each of multiple of the records, generating an obfuscated value to replace an original value in a given field of the record using the key value such that the obfuscated value depends on the key value and is deterministically related to the origi…
Who is the assignee on this patent?
Neergaard Peter, Ab Initio Technology LLC
What technology area does this patent fall under?
Primary CPC classification G06F21/6254. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 16, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).