Guaranteeing anonymity of linked data graphs

US9477694B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9477694-B2
Application number: US-201313965870-A
Country: US
Kind code: B2
Filing date: Aug 13, 2013
Priority date: Apr 25, 2013
Publication date: Oct 25, 2016
Grant date: Oct 25, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and computer program product for transforming a Linked Data graph into a corresponding anonymous Linked Data graph, in which semantics is preserved and links can be followed to expand the anonymous graph up to r times without breaching anonymity (i.e., anonymity under r-dereferenceability). Anonymizing a Linked Data graph under r-dereferenceability provides privacy guarantees of k-anonymity or l-diversity variants, while taking into account and preserving the rich semantics of the graph.
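
As an illustration of the r-dereferenceability idea, the following is a minimal sketch, not the patented implementation. It assumes a graph is a set of (subject, property, object) string triples and that dereferencing can be modeled by a hypothetical in-memory `resolver` dictionary mapping a URI to the triples it returns; a real system would fetch these over HTTP. All names and sample URIs here are invented for the example.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

def uris_in(triples: Set[Triple]) -> Set[str]:
    """Collect subjects and objects that look like dereferenceable URIs."""
    return {t for s, _, o in triples for t in (s, o) if t.startswith("http")}

def reaches_original(uri: str, resolver: Dict[str, Set[Triple]],
                     original_uris: Set[str], r: int) -> bool:
    """True if following links from `uri` up to r times yields a graph
    that mentions a URI present in the original (non-anonymous) graph."""
    frontier, seen = {uri}, set()
    for _ in range(r):
        next_frontier: Set[str] = set()
        for u in frontier - seen:
            seen.add(u)
            # Toy stand-in for dereferencing: look the URI up in a dict.
            found = uris_in(resolver.get(u, set()))
            if found & original_uris:
                return True
            next_frontier |= found
        frontier = next_frontier
    return False

def breaching_uris(anon_graph: Set[Triple], resolver: Dict[str, Set[Triple]],
                   original_uris: Set[str], r: int) -> Set[str]:
    """URIs of the anonymous graph that would have to be made non-dereferenceable."""
    return {u for u in uris_in(anon_graph)
            if reaches_original(u, resolver, original_uris, r)}

# Hypothetical sample data.
original_uris = {"http://ex.org/alice"}
anon_graph = {
    ("urn:anon:1", "livesIn", "http://ex.org/city/1"),
    ("urn:anon:1", "worksAt", "http://ex.org/org/9"),
}
resolver = {
    "http://ex.org/city/1": {("http://ex.org/city/1", "mayor", "http://ex.org/person/7")},
    "http://ex.org/person/7": {("http://ex.org/person/7", "sameAs", "http://ex.org/alice")},
    "http://ex.org/org/9": {("http://ex.org/org/9", "label", "Acme")},
}
```

With this toy data, `breaching_uris(anon_graph, resolver, original_uris, 1)` finds no breach, while allowing two expansions flags `http://ex.org/city/1`, because its second hop leads back to a URI from the original graph.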

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer program product to guarantee anonymity under r-dereferenceability in a Linked Data graph, the computer program product comprising: a storage medium, wherein said storage medium is not a propagating signal, said storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: transforming an original Linked Data graph structure having labelled nodes interconnected by directed edges into a corresponding anonymous Linked Data graph, with one or more nodes embodying a searchable Uniform Resource Identifier (URI), and updating the corresponding ontology definitions of the Linked Data graph based on the applied transformations; iteratively expanding said corresponding anonymous Linked Data graph up to r times, where r is an integer >0, wherein said iteratively expanding comprises: dereferencing a searchable URI of a node of said anonymized Linked Data graph structure by following a link to a resource from which a further Linked Data graph structure is obtained, said further Linked Data graph structure having additional labeled nodes embodying additional searchable URIs and property values, and replacing the node embodying the searchable URI of the anonymized Linked Data graph structure with the further Linked Data graph structure to obtain an expanded Linked Data graph, and updating the corresponding ontology definitions of the expanded Linked Data graph to include the ontology definitions of the further Linked Data graph structure; and determining from each said additional URIs and property values in said expanded corresponding anonymous Linked Data graph whether anonymity is breached by searching for a URI whose iterative dereferencing originates a graph containing a Linked Data graph node URI that was present in the original Linked Data graph, and in that case making the URI determined as breaching said anonymity non-dereferenceable.

2. The computer program product as claimed in claim 1, wherein prior to said transforming, the method further comprises: identifying all inferences or indirect relations that can be extracted from the graph nodes, and the graph nodes included in the graph through dereferencing.

3. The computer program product as claimed in claim 2, wherein said transforming comprises: computing instances I of nodes in said Linked Data graph structure, potentially having a direct identifier to be anonymized, and quasi-identifying properties Q of nodes whose values are to be anonymized; anonymizing all direct identifiers of instances I, changing the value of quasi-identifying properties in Q, and changing corresponding ontology definitions of said properties and instance identifiers, wherein said changing values in Q comprises: computing equivalence classes E for quasi-identifying properties in Q; and changing values of such properties to be anonymized based on said computed equivalence classes.

4. The computer program product as claimed in claim 1, wherein said method further comprises: identifying from said original Linked Data graph structure, one or more instances I to protect, said identified one or more instances I comprising: a first set of said instances relating nodes of a given semantic class C that includes equivalent instances, instances of equivalent classes and instances whose inferred type is the given semantic class C, or any equivalent class; and, a second set of instances that are connected through an inverse functional property to any instance in said instances first set; and extracting said instances I to be protected.

5. The computer program product as claimed in claim 1, further comprising: identifying, by said computer system, from said original Linked Data graph structure, one or more properties Q to collectively protect, said identified one or more properties Q comprising: properties that are inferred to be equivalent to any property given in an input set of properties P of said original Linked Data graph structure.

6. The computer program product as claimed in claim 5, wherein said protecting one or more properties Q comprises: for each instance i of a given semantic class C: compute a set Sim_i of a plurality of at least k−1 other instances of said semantic class C which are similar to instance i according to a similarity measure S, said identifying of properties Q further considering a semantic class C and properties in said set P and semantically-equivalent instances and properties computed through inference, wherein said transforming comprises one of: assigning the same generalized value to each property in set P for each instance in said set Sim_i (produced equivalence class), or suppressing a property for all instances in said set Sim_i, wherein said corresponding anonymous Linked Data graph exhibits k-anonymity.

7. The computer program product as claimed in claim 5, wherein said protecting one or more properties Q comprises: for each instance i of a given semantic class C: compute a set Sim_i of a plurality of at least k−1 other instances of said semantic class C which are similar to instance i according to a specified similarity measure S, said identifying of properties Q further considering a semantic class C and properties in said set P and semantically-equivalent instances and properties computed through inference, wherein said transforming comprises one of: selecting the instances in Sim_i that results in at least l well represented values of related instances based on said specified similarity measure S, wherein said corresponding anonymous Linked Data graph exhibits k-diversity.

8. The computer program product as claimed in claim 1, wherein if said anonymity is breached, said dereferencing comprises: computing a subset of URIs in said expanded corresponding anonymized Linked Data graph whose dereferencing breaches anonymity, and for each URI u in said subset, removing from said expanded corresponding anonymized Linked Data graph the Linked Data obtained by dereferencing said URI u; and determining if URI u belongs to the corresponding anonymous Linked Data graph, and substituting URI u with a non-dereferenceable URI if determined that the dereferencing u belongs to the transformed corresponding anonymous Linked Data graph.

9. A system to guarantee anonymity under r-dereferenceability in a Linked Data graph comprising: a memory storage device; a processor unit in communication with said memory storage device and configured to perform a method to: transform an original Linked Data graph structure having labeled nodes interconnected by directed edges into a corresponding anonymous Linked Data graph, with one or more nodes embodying a searchable Uniform Resource Identifier (URI), and updating the corresponding ontology definitions of the Linked Data graph based on the applied transformations; iteratively expand said corresponding anonymous Linked Data graph up to r times, where r is an integer >0, wherein said to iteratively expand, the processor unit is further configured to: dereference a searchable URI of a node of said anonymized Linked Data graph structure by following a link to a resource from which a further Linked Data graph structure is obtained, said further Linked Data graph structure having additional labeled nodes embodying additional searchable URIs and property values, and replace the node embodying the searchable URI of the anonymized Linked Data graph structure with the further Linked Data graph structure to obtain an expanded Linked Data graph, and update the corresponding ontology definitions of the expanded Linked Data graph to include the ontology definitions of the further Linked Data graph stru
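
The generalization step in claims 3 and 6 (grouping each instance with at least k−1 similar instances and giving the group one generalized value) can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method: it handles a single numeric quasi-identifying property, uses closeness in sorted order as a stand-in for the similarity measure S, and the function name `k_anonymize_numeric` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List

def k_anonymize_numeric(values: Dict[str, int], k: int) -> Dict[str, str]:
    """Replace each instance's numeric value with a range shared by >= k instances.

    `values` maps an instance identifier to one quasi-identifying value
    (for example an age). Sorting plays the role of the similarity measure.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need at least k instances to form one equivalence class")

    ordered: List[str] = sorted(values, key=lambda inst: values[inst])

    # Cut the sorted instances into consecutive groups of size k.
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]

    # A short final group would violate k-anonymity, so merge it backwards.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    generalized: Dict[str, str] = {}
    for group in groups:
        low, high = values[group[0]], values[group[-1]]
        label = f"{low}-{high}"  # one generalized value per equivalence class
        for inst in group:
            generalized[inst] = label
    return generalized

# Hypothetical sample data: five instances with an age-like property.
ages = {"a": 23, "b": 25, "c": 31, "d": 34, "e": 40}
result = k_anonymize_numeric(ages, k=2)
class_sizes = Counter(result.values())
```

With `k=2`, the five sample instances fall into two equivalence classes, and every generalized value in `class_sizes` is shared by at least two instances. Suppression, the alternative named in claim 6, would instead drop the property for every instance in a group.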

Assignees

IBM

Classifications

  • Physics · mapped topic

  • by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title

  • Protecting data · CPC title

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • G06F16/21 · Primary

    Design, administration or maintenance of databases · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9477694B2 cover?
A system and computer program product for transforming a Linked Data graph into a corresponding anonymous Linked Data graph, in which semantics is preserved and links can be followed to expand the anonymous graph up to r times without breaching anonymity (i.e., anonymity under r-dereferenceability). Anonymizing a Linked Data graph under r-dereferenceability provides privacy guarantees of k-anon…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/30289. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 25, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).