Anonymization device, anonymization method and computer readable medium
US-2015033356-A1 · Jan 29, 2015 · US
US9870381B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9870381-B2 |
| Application number | US-201514719663-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 22, 2015 |
| Priority date | May 22, 2015 |
| Publication date | Jan 16, 2018 |
| Grant date | Jan 16, 2018 |
Official abstract text for this publication.
Quasi-identifiers (QIDs) are detected in a dataset using a set of computing tasks. The dataset has a plurality of records and a set of attributes. An index is generated for the dataset. The index has an indicator for each attribute value of each record in the dataset. Each indicator specifies all the records in the dataset having the same value for the attribute. Each task is assigned an attribute combination and a subset of the plurality of records in the dataset and is passed to a thread for execution on computing resources. The executing task inspects the set of records specified by the index indicator for each attribute value in the attribute combination to produce a result. The result of at least one task identifies a unique record for the associated attribute combination. The attribute combination producing the unique record is a QID.
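The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_index`, `run_task`, `detect_qids`), the round-robin split of records into chunks, and the use of `ThreadPoolExecutor` are all assumptions made for illustration. The index maps each record's attribute value to the set of record ids sharing that value; a task intersects those sets for its attribute combination and reports a record whose intersection contains only itself.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def build_index(records, attributes):
    """For each record and attribute, point at the set of record ids sharing that value."""
    by_value = {}
    for rid, rec in enumerate(records):
        for attr in attributes:
            by_value.setdefault((attr, rec[attr]), set()).add(rid)
    # Records with the same value share one set object (the "indicator").
    return [{attr: by_value[(attr, rec[attr])] for attr in attributes} for rec in records]


def run_task(index, combo, record_ids):
    """Return the first record id that is unique on `combo`, or None (stops early)."""
    for rid in record_ids:
        sets = [index[rid][attr] for attr in combo]
        if len(set.intersection(*sets)) == 1:
            return rid
    return None


def detect_qids(records, attributes, size, n_chunks=2, workers=4):
    """Return attribute combinations of the given size that single out some record."""
    index = build_index(records, attributes)
    ids = list(range(len(records)))
    chunks = [ids[i::n_chunks] for i in range(n_chunks)]  # hypothetical record split
    tasks = [(combo, chunk)
             for combo in combinations(attributes, size)
             for chunk in chunks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: run_task(index, *t), tasks)
        return sorted({combo for (combo, _), rid in zip(tasks, results) if rid is not None})


# Made-up sample data for illustration only.
records = [
    {"zip": "10001", "age": 30, "sex": "F"},
    {"zip": "10001", "age": 30, "sex": "M"},
    {"zip": "10002", "age": 30, "sex": "F"},
    {"zip": "10002", "age": 30, "sex": "M"},
]
attributes = ["zip", "age", "sex"]
print(detect_qids(records, attributes, 1))
print(detect_qids(records, attributes, 2))
```

In this sample no single attribute isolates a record, while the pair `("zip", "sex")` does, so only that pair is reported as a QID.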
Opening claim text (preview).
What is claimed is:
1. A computer program product for detecting quasi-identifiers in a dataset using a set of computing tasks, the dataset having a plurality of records and further having a set of attributes, each record having an attribute value for each attribute in the set of attributes, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a processor to perform a method comprising: generating a first index for the dataset, the first index having an index indicator for each attribute value of each record, each index indicator specifying a set of records, the specified set of records including each record in the plurality of records having the same attribute value for the associated attribute as the associated record; assigning an attribute combination to each task in the set of computing tasks, the attribute combination for each task including one or more attributes of the set of attributes; assigning a subset of the plurality of records to each task in the set of computing tasks; detecting at least one quasi-identifier by passing each task to at least one thread for execution on computing resources, the execution of each task comprising inspecting the index indicator for each attribute value in the assigned attribute combination of at least a portion of the assigned subset of the plurality of records to produce a result, the result of at least one task identifying a unique record for the associated attribute combination, the attribute values in the attribute combination for the unique record different from the attribute values in the attribute combination for all other records in the plurality of records, the at least one quasi-identifier being the attribute combination assigned to the at least one task identifying a unique record.
2. The computer program product of claim 1, wherein the method further comprises: assigning a second attribute combination to each task in the set of computing tasks, the second attribute combination for each task including one or more attributes of the set of attributes, the second attribute combination for each task excluding the detected at least one quasi-identifier; detecting a second at least one quasi-identifier by second passing each task to the at least one thread for execution on the computing resources, the execution of each task comprising inspecting the index indicator for each attribute value in the assigned second attribute combination of at least a portion of the assigned subset of the plurality of records to produce a second result, the second result of at least one task identifying a unique record for the associated second attribute combination, the second at least one quasi-identifier being the second attribute combination assigned to the at least one task identifying a unique record.
3. The computer program product of claim 1, wherein the detecting the at least one quasi-identifier includes detecting a first quasi-identifier by passing a first task to a first thread for execution on the computing resources, wherein the first quasi-identifier is detected before inspecting the index indicator for each attribute value in the assigned attribute combination of a last portion of the assigned subset of the plurality of records, and wherein the method further comprises: stopping the first thread upon detecting the first quasi-identifier, the stopping the first thread preventing inspecting the index indicator for each attribute value in the assigned attribute combination of the last portion of the assigned subset.
4. The computer program product of claim 1, wherein each attribute in the set of attributes is represented by a set of distinct attribute values, and wherein the generating the first index for the data set comprises: generating a second index for each attribute in the set of attributes, each second index comprising a tree structure having a hierarchical set of nodes corresponding to the set of distinct attribute values representing the set of attributes, each node specifying a second set of records, the specified second set of records including each record in the plurality of records having the distinct attribute value corresponding to the node; and for each attribute value in the plurality of records, traversing the tree structure associated with the attribute to locate the node corresponding to the attribute value, and causing the index indicator for the attribute value in the first index to specify the second set of records specified by the located node.
5. A computer system for detecting quasi-identifiers in a dataset using a set of computing tasks, the dataset having a plurality of records and further having a set of attributes, each record having an attribute value for each attribute in the set of attributes, the computer system comprising a processor and a computer readable storage medium storing instructions, wherein the computer readable storage medium is not a transitory signal per se, and wherein the processor executes the instructions to perform a method comprising: generating a first index for the dataset, the first index having an index indicator for each attribute value of each record, each index indicator specifying a set of records, the specified set of records including each record in the plurality of records having the same attribute value for the associated attribute as the associated record; and assigning, using a main thread, an attribute combination to each task in the set of computing tasks, the attribute combination for each task including one or more attributes of the set of attributes, the main thread further used by the processor to assign a subset of the plurality of records to each task in the set of computing tasks, and the main thread further used by the processor to detect at least one quasi-identifier by passing each task to at least one thread for execution on computing resources, the execution of each task comprising inspecting the index indicator for each attribute value in the assigned attribute combination of at least a portion of the assigned subset of the plurality of records to produce a result, the result of at least one task identifying a unique record for the associated attribute combination, the attribute values in the attribute combination for the unique record different from the attribute values in the attribute combination for all other records in the plurality of records, the at least one quasi-identifier being the attribute combination assigned to the at least one task identifying a unique record.
6. The computer system of claim 5, wherein the main thread is further used by the processor to assign, before the assigning the attribute combination to each task, a prior attribute combination to each task in the set of computing tasks, the prior attribute combination for each task including one or more attributes of the set of attributes, the prior attribute combination for each task different from all attribute combinations, and wherein the main thread is further used by the processor to detect, before the detecting the at least one quasi-identifier, no quasi-identifier by passing each task to the at least one thread for execution on the computing resources, the execution of each task comprising inspecting the index indicator for each attribute value in the assigned prior attribute combination of the assigned subset of the plurality of records to produce a result, the result of each task identifying no unique record for the associated attribute combination.
7. The computer system of claim 5, wherein the main thread is further used by the processor to assign, before the assigning the subset of the
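The two-level indexing described in claim 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the claim requires a tree with one node per distinct attribute value but does not fix the tree type, so a plain unbalanced binary search tree is used here, and the names `Node`, `insert`, `find`, and `build_first_index` are hypothetical. Each node holds the set of record ids for its value, and the first index simply points each record's attribute value at the set held by the located node.

```python
class Node:
    """One node per distinct attribute value (the per-attribute 'second index')."""

    def __init__(self, value):
        self.value = value
        self.records = set()  # ids of records holding this distinct value
        self.left = None
        self.right = None


def insert(root, value, rid):
    """Record that `rid` has `value`, creating a node if needed; return the root."""
    if root is None:
        root = Node(value)
        root.records.add(rid)
        return root
    node = root
    while True:
        if value == node.value:
            node.records.add(rid)
            return root
        side = "left" if value < node.value else "right"
        child = getattr(node, side)
        if child is None:
            child = Node(value)
            setattr(node, side, child)
        node = child


def find(root, value):
    """Traverse the tree to the node for `value` (None if absent)."""
    node = root
    while node is not None and node.value != value:
        node = node.left if value < node.value else node.right
    return node


def build_first_index(records, attributes):
    """Build one tree per attribute, then point each record's indicator at a node's set."""
    trees = {attr: None for attr in attributes}
    for rid, rec in enumerate(records):
        for attr in attributes:
            trees[attr] = insert(trees[attr], rec[attr], rid)
    return [{attr: find(trees[attr], rec[attr]).records for attr in attributes}
            for rec in records]


# Made-up sample data for illustration only.
sample = [
    {"zip": "10001", "age": 30},
    {"zip": "10002", "age": 30},
    {"zip": "10001", "age": 41},
]
first_index = build_first_index(sample, ["zip", "age"])
print(first_index[0]["zip"], first_index[2]["age"])
```

Because indicators reference the node's set rather than copying it, records with the same value share one set object, which keeps the first index small when attributes have few distinct values.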
considering software capabilities, i.e. software resources associated or available to the machine · CPC title
Indexing structures · CPC title
to service a request · CPC title
Physics · mapped topic