Data generation
US-2015169428-A1 · Jun 18, 2015 · US
US10102398B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10102398-B2 |
| Application number | US-49735409-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 2, 2009 |
| Priority date | Jun 1, 2009 |
| Publication date | Oct 16, 2018 |
| Grant date | Oct 16, 2018 |
Official abstract text for this publication.
A method for obfuscating data includes: reading values occurring in one or more fields of multiple records from a data source; storing a key value; for each of multiple of the records, generating an obfuscated value to replace an original value in a given field of the record using the key value such that the obfuscated value depends on the key value and is deterministically related to the original value; and storing the collection of obfuscated data including records that include obfuscated values in a data storage system.
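The property the abstract describes (an obfuscated value that depends on a stored key and is deterministically related to the original value) can be illustrated with a keyed hash. The following is a minimal sketch, not the patented implementation: the use of HMAC-SHA256, the nine-digit output format, and the names `obfuscate_value` and `obfuscate_records` are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def obfuscate_value(key: bytes, original: str, digits: int = 9) -> str:
    """Map an original value to a fixed-width numeric string.

    Assumption: HMAC-SHA256 stands in for the keyed deterministic function;
    the abstract does not name a specific algorithm.
    """
    digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


def obfuscate_records(records, field, key):
    """Return copies of the records with only `field` replaced."""
    return [{**rec, field: obfuscate_value(key, rec[field])} for rec in records]


if __name__ == "__main__":
    key = b"stored-key-value"  # hypothetical key
    rows = [
        {"id": 1, "ssn": "123-45-6789"},
        {"id": 2, "ssn": "123-45-6789"},
    ]
    for row in obfuscate_records(rows, "ssn", key):
        print(row)
```

Because the same key and the same original value always yield the same output, records that shared a value before obfuscation still share one afterwards, which keeps joins and group-by operations on the obfuscated field meaningful.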
Opening claim text (preview).
What is claimed is:

1. A computer implemented method for obfuscating data in records of a data source, the method including: storing, by one or more data processors, a key value for a set of records from a source of the set of records, the set of records being partitioned into multiple sets of records based on a given field of records in the set of records; reading an original value in the given field of the record from a given one of the multiple sets of records, generating, by one or more data processors, an index value by applying the same stored key value and the read original value from that field to a deterministic mathematical function to produce the index value; applying the index value to a lookup table of predetermined obfuscation values to retrieve from the lookup table an obfuscation value; and replacing the original value in that given field of the record with the obfuscation value to obfuscate the original value of that given field; and storing, by one or more data processors, the given one of the multiple sets of records having remaining original values in fields and obfuscation values in those given fields in a data storage system, as an obfuscated set of records.

2. The method of claim 1, wherein the one or more data processors are plural data processors further includes: processing the multiple sets of records by the plural data processors according to a dataflow graph that provides reformat components that receive as input fields of the given record and the stored key value, and the original value in the given field of the record with the obfuscation value and outputs the reformatted record at its output port, and with the plural processors assigned to different ones of the sets of multiple records using parallel processing to execute the reformat components according to the dataflow graph.

3. The method of claim 1 wherein the deterministic mathematical function includes at least one of a cryptographic hash function that generates the index value, a non-cryptographic function that generates a selection value that is used as an input to a cryptographic hash function to provide the index value, a non-cryptographic function that generates a selection value that is used as an input to a cryptographic hash function to provide the index value or combining the original value and the same key using a cryptographic hash function to yield the index value.

4. The method of claim 2, wherein multiple index values within a range correspond to the same obfuscated value in the predetermined set of obfuscated values.

5. The method of claim 4, further including storing profile information including statistics characterizing values of at least one of the fields, wherein the size of the range is based on the statistics in the stored profile information characterizing values of the given field.

6. The method of claim 1, wherein the deterministic function produces an intermediate selection value that is used to provide the index value, and is a cryptographic hash function.

7. The method of claim 6, wherein the selection value is mapped to the obfuscated value using a deterministic mapping.

8. The method of claim 6, wherein a domain of values from which the obfuscated value is selected includes multiple of the original values in the given field of the records from the data source.

9. The method of claim 8, wherein one or more of the original values are not included in the domain of values.

10. The method of claim 9, wherein one or more of the values in the domain of values are not included in the original values.

11. The method of claim 1, wherein the cryptographic hash function prevents recovery of the original value from the obfuscated value using the key.

12. The method of claim 1, wherein the key is provided from different sequences of selection values.

13. The method of claim 12, wherein a first sequence of selection values for consecutive original values for a first value of the key is not predictable from a second sequence of selection values for consecutive original values for a second value of the key.

14. The method of claim 12, further includes: determining whether the index value corresponds to a valid obfuscated value, and if not repeatedly combining the selection value and the key using a deterministic function to yield an additional selection value until the additional selection value corresponds to a valid obfuscated value.

15. The method of claim 14, wherein a valid obfuscated value consists of a predetermined number of digits.

16. The method of claim 1, wherein replacing the original values in the given field with the generated obfuscated values in records of different ones of the multiple sets of records occurs in parallel using different computing resources.

17. The method of claim 1, wherein at least a first record that includes an obfuscated value in the collection of obfuscated data includes at least one original value that was not replaced with an obfuscated value.

18. The method of claim 1, further including determining whether an original value in the first record is to be replaced with an obfuscated value using the key value based on whether the original value is to be replaced with the same obfuscated value consistently for multiple records in which the original value occurs.

19. A system for obfuscating data, the system including: a data storage system that stores records having values in one or more fields; and one or more processors coupled to the data storage system providing an execution environment to: store a key value for a set of the records from the data storage system the set of records being partitioned into multiple sets of records based on a given field of records in the set of records; the system is configured to: read an original value in the given field of a given one of the multiple sets of records, generate an index value by applying the same stored key value and the read original value from that field to a deterministic mathematical function to produce the index value; apply the index value to a lookup table of predetermined obfuscation values to retrieve from the lookup table an obfuscation value; replace the original value in that field of the record with the obfuscation value to obfuscate the original value of that field; and store the given one of the multiple sets of records having remaining original values in fields and obfuscation values in those given fields in the data storage system as an obfuscated set of records.

20. A non-transitory computer-readable medium storing a computer program for obfuscating data, the computer program including instructions, when executed by a computer, causes the computer to: store a key value for a set of records from a source of records the set of records being partitioned into multiple sets of records based on a given field of records in the set of records; and process each record of a given one of the multiple sets of records, and for each given field in the record, of the given one of the multiple sets of records, which record has a value being obfuscated, by instructions to: read an original value in the given field of the given record, generate an index value by applying the same stored key value and the read original value from that given field to a deterministic mathematical function to produce the index value; apply the index value to a lookup table of predetermined obfuscation values to retrieve from the lookup table an obfuscation value; replace the original value in that given field of the reco
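The steps recited in the independent claims (a keyed index value, a lookup table of predetermined obfuscation values, replacement of one field per record across partitioned record sets) and the retry loop of claims 14 and 15 can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the claimed implementation: HMAC-SHA256 as the deterministic function, the `SURNAME_TABLE` contents, the thread pool standing in for "different computing resources", and every function name are hypothetical.

```python
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookup table of predetermined obfuscation values.
SURNAME_TABLE = ["Adler", "Brooks", "Chen", "Diaz", "Evans", "Foster", "Gupta", "Hale"]


def keyed_hash(key: bytes, data: bytes) -> bytes:
    """Assumed deterministic function: HMAC-SHA256 over the data with the stored key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def index_value(key: bytes, original: str, table_size: int) -> int:
    """Derive a table index from the stored key and the original field value."""
    digest = keyed_hash(key, original.encode("utf-8"))
    return int.from_bytes(digest[:8], "big") % table_size


def obfuscate_partition(partition, field, key, table):
    """Replace `field` in each record of one partition; other fields stay original."""
    return [
        {**rec, field: table[index_value(key, rec[field], len(table))]}
        for rec in partition
    ]


def obfuscate_in_parallel(partitions, field, key, table):
    """Process each partition independently with the same stored key."""
    with ThreadPoolExecutor() as pool:
        return list(
            pool.map(lambda p: obfuscate_partition(p, field, key, table), partitions)
        )


def valid_digit_value(key: bytes, original: str, digits: int, max_rounds: int = 10_000) -> str:
    """Re-hash the selection value until it has exactly `digits` digits."""
    selection = original.encode("utf-8")
    for _ in range(max_rounds):
        selection = keyed_hash(key, selection)
        candidate = str(int.from_bytes(selection[:4], "big"))
        if len(candidate) == digits:
            return candidate
    raise ValueError("no valid obfuscated value found within max_rounds")
```

Since every partition uses the same stored key, a value such as a surname maps to the same table entry no matter which partition or worker handles it, so the partitions need no coordination while running. The modulo step also means several distinct hash outputs land on the same table entry, loosely in the spirit of claim 4, although the sketch does not size those ranges from profile statistics as claim 5 describes.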
Protecting data · CPC title
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Just-in-time application of countermeasures, e.g., on-the-fly decryption, just-in-time obfuscation or de-obfuscation · CPC title