Iterative execution of data de-identification processes
US-11036886-B2 · Jun 15, 2021 · US
US11314797B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11314797-B2 |
| Application number | US-201916706657-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 6, 2019 |
| Priority date | Nov 14, 2019 |
| Publication date | Apr 26, 2022 |
| Grant date | Apr 26, 2022 |
Official abstract text for this publication.
A data de-identification apparatus and method are provided. The data de-identification apparatus stores a data set of a first industry, wherein the data set is defined with a plurality of fields. The data de-identification apparatus receives a first instruction and a second instruction, wherein the first instruction corresponds to a second industry and the second instruction corresponds to a use of data. The data de-identification apparatus determines an identification category for each of the fields according to the first industry, the second industry, and the use of data. The data de-identification apparatus transforms the data set into a transformed data set according to the use of data and then transforms the transformed data set into a de-identification data set according to the identification categories.
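The two-stage flow in the abstract (categorize fields, transform for the use of data, then de-identify by category) can be sketched roughly as below. This is only an illustrative sketch, not the patented implementation: the rule table `CATEGORY_RULES`, the industry and use names, the field names, the category labels, and the suppression and generalization helpers are all hypothetical choices made for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]

# Hypothetical rule table: (first industry, second industry, use of data)
# maps to an identification category for each field of the data set.
CATEGORY_RULES: Dict[tuple, Dict[str, str]] = {
    ("telecom", "finance", "credit_scoring"): {
        "name": "direct",
        "zip_code": "indirect",
        "age": "indirect",
        "monthly_bill": "non_identifying",
    },
}


def determine_categories(first_industry: str, second_industry: str, use: str) -> Dict[str, str]:
    """Look up the identification category of every field (assumed lookup)."""
    return CATEGORY_RULES[(first_industry, second_industry, use)]


def transform_for_use(records: List[Record], use: str) -> List[Record]:
    """Stage 1: transform designated fields according to the use of data.

    Assumption: for credit scoring, the bill amount is bucketed into two levels.
    """
    transformed = []
    for record in records:
        row = dict(record)  # copy so the stored data set is not modified
        if use == "credit_scoring":
            row["monthly_bill"] = "high" if row["monthly_bill"] >= 100 else "low"
        transformed.append(row)
    return transformed


def generalize(value: object) -> str:
    """Coarsen a value: integers become decade bands, strings keep a 3-char prefix."""
    if isinstance(value, int):
        low = value // 10 * 10
        return f"{low}-{low + 9}"
    text = str(value)
    return text[:3] + "*" * max(len(text) - 3, 0)


# Hypothetical de-identification method chosen per identification category.
DEIDENTIFIERS: Dict[str, Callable[[object], object]] = {
    "direct": lambda value: "*",          # suppress direct identifiers
    "indirect": generalize,               # generalize indirect identifiers
    "non_identifying": lambda value: value,
}


def deidentify(records: List[Record], categories: Dict[str, str]) -> List[Record]:
    """Stage 2: apply the per-category method to every field of every record."""
    return [
        {field: DEIDENTIFIERS[categories[field]](value) for field, value in row.items()}
        for row in records
    ]


def run_pipeline(records: List[Record], first_industry: str, second_industry: str, use: str) -> List[Record]:
    categories = determine_categories(first_industry, second_industry, use)
    return deidentify(transform_for_use(records, use), categories)
```

Keeping the use-driven transformation separate from the category-driven de-identification mirrors the order stated in the abstract and makes it possible to rerun only the first stage with a different transformation method.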
Opening claim text (preview).
What is claimed is:
1. A data de-identification apparatus, comprising: a storage, being configured to store a data set of a first industry, wherein the data set is defined with a plurality of fields; an input interface, being configured to receive a first instruction and a second instruction, wherein the first instruction corresponds to a second industry, and the second instruction corresponds to a use of data; and a processor, being electrically connected to the storage and the input interface, and being configured to determine an identification category of each of the fields according to the first industry, the second industry, and the use of data, transform the data set of the first industry into a first transformed data set according to the use of data, and transform the first transformed data set into a first de-identification data set according to the identification categories; wherein the processor further retrieves a plurality of feature values from the first de-identification data set, and the processor further estimates, by using the feature values, a classification accuracy of the first de-identification data set applied for the use of data.
2. The data de-identification apparatus of claim 1, wherein the processor further determines a confidentiality category of each of the fields according to the first industry, the second industry, and the use of data, wherein the processor transforms the first transformed data set into the first de-identification data set according to the identification categories and the confidentiality categories.
3. The data de-identification apparatus of claim 1, wherein the processor transforms the data set into the first transformed data set by the following operations: determining a data transformation method of a designated field according to the use of data and transforming a plurality of pieces of data corresponding to the designated field by the data transformation method.
4. The data de-identification apparatus of claim 1, wherein the processor transforms the first transformed data set into the first de-identification data set by the following operations: determining a de-identification method for each of the fields according to the corresponding identification category and performing de-identification on a plurality of pieces of data of each of the fields in the first transformed data set according to the corresponding de-identification method.
5. The data de-identification apparatus of claim 1, wherein the processor further performs a de-identification check on the first de-identification data set.
6. The data de-identification apparatus of claim 5, wherein the de-identification check comprises at least one of a K-Anonymity check, an L-Diversity check, and a T-Closeness check.
7. The data de-identification apparatus of claim 5, wherein the processor further determines an order of importance of the fields according to the use of data, wherein when the processor further determines that the first de-identification data set fails the de-identification check, the processor further determines at least one field comprised in the first de-identification data set according to the order of importance to perform an advanced de-identification.
8. The data de-identification apparatus of claim 1, wherein the processor retrieves the plurality of feature values from the first de-identification data set by an autoencoder, and the processor further estimates the model performance on the first de-identification data set for the use of data by using the feature values.
9. The data de-identification apparatus of claim 8, wherein when the model performance is lower than a threshold value, the processor further transforms the data set into a second transformed data set according to the use of data and transforms the second transformed data set into a second de-identification data set according to the identification categories, wherein the processor determines a data transformation method of a designated field in the fields according to the use of data, and the processor uses different data transformation methods to transform a plurality of pieces of data corresponding to the designated field when transforming the first transformed data set and the second transformed data set.
10. The data de-identification apparatus of claim 1, further comprising: a transmission interface, being electrically connected to the processor and being configured to transmit the first de-identification data set to a model construction apparatus; wherein the model construction apparatus establishes an evaluation model corresponding to the use of data after receiving the first de-identification data set and a third de-identification data set corresponding to the second industry.
11. A data de-identification method, being adapted for use in an electronic computing apparatus, the electronic computing apparatus storing a data set of a first industry, the data set being defined with a plurality of fields, the data de-identification method comprising: (a) receiving a first instruction, wherein the first instruction corresponds to a second industry; (b) receiving a second instruction, wherein the second instruction corresponds to a use of data; (c) determining an identification category of each of the fields according to the first industry, the second industry, and the use of data; (d) transforming the data set of the first industry into a first transformed data set according to the use of data; and (e) transforming the first transformed data set into a first de-identification data set according to the identification categories; (f) retrieving a plurality of feature values from the first de-identification data set, and further estimating, by using the feature values, a classification accuracy of the first de-identification data set applied for the use of data.
12. The data de-identification method of claim 11, further comprising: determining a confidentiality category of each of the fields according to the first industry, the second industry, and the use of data; wherein the step (e) transforms the first transformed data set into the first de-identification data set according to the identification categories and the confidentiality categories.
13. The data de-identification method of claim 11, wherein the step (d) comprises: determining a data transformation method of a designated field according to the use of data; and transforming a plurality of pieces of data corresponding to the designated field by the data transformation method.
14. The data de-identification method of claim 11, wherein the step (e) comprises: determining a de-identification method for each of the fields according to the corresponding identification category; and performing de-identification on a plurality of pieces of data of each of the fields in the first transformed data set according to the corresponding de-identification method.
15. The data de-identification method of claim 11, further comprising: performing a de-identification check on the first de-identification data set.
16. The data de-identification method of claim 15, wherein the de-identification check comprises at least one of a K-Anonymity check, an L-Diversity check, and a T-Closeness check.
17. The data de-identification method of claim 15, further comprising: determining an order of importance of the fields according to the use of data; determining at least one field comprised in the first de-identification data set according to the order of importance to perform an advanced de-identification when the first de-iden
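Two of the de-identification checks named in the claims, K-Anonymity and L-Diversity, can be illustrated with a small sketch, together with one possible reading of the "advanced de-identification" driven by the order of importance. The function names, the choice of quasi-identifier fields, the use of distinct-value L-Diversity, and the policy of suppressing the least important field first are assumptions made for this example; the claims do not specify them. T-Closeness is omitted for brevity.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Record = Dict[str, object]


def equivalence_classes(records: List[Record], quasi_identifiers: Sequence[str]) -> Dict[tuple, List[Record]]:
    """Group records that share the same values on all quasi-identifier fields."""
    groups: Dict[tuple, List[Record]] = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records: List[Record], quasi_identifiers: Sequence[str], k: int) -> bool:
    """True if every equivalence class contains at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(records: List[Record], quasi_identifiers: Sequence[str], sensitive: str, l: int) -> bool:
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len({r[sensitive] for r in group}) >= l for group in groups.values())


def enforce_k_anonymity(
    records: List[Record],
    quasi_identifiers: Sequence[str],
    importance: Sequence[str],
    k: int,
) -> List[Record]:
    """Suppress fields, least important first, until the K-Anonymity check passes.

    `importance` lists fields from most to least important for the use of data.
    Assumption: "advanced de-identification" is modeled as full suppression ("*").
    The result can still fail the check if the data set has fewer than k records.
    """
    current = [dict(record) for record in records]  # leave the input untouched
    for field in reversed(importance):
        if is_k_anonymous(current, quasi_identifiers, k):
            break
        for record in current:
            record[field] = "*"
    return current
```

Suppressing the least important field first is one way to keep the fields that matter most for the use of data intact; a real system might instead coarsen values step by step before resorting to suppression.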
Clustering or classification · CPC title
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Query processing · CPC title
Providing cryptographic facilities or services · CPC title
Filtering based on additional data, e.g. user or group profiles (filtering in web context G06F16/9535, G06F16/9536) · CPC title