Watermarking anonymized datasets by adding decoys

US10997279B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10997279-B2
Application number: US-201815859950-A
Country: US
Kind code: B2
Filing date: Jan 2, 2018
Priority date: Jan 2, 2018
Publication date: May 4, 2021
Grant date: May 4, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments for watermarking anonymized datasets using decoys in a computing environment are provided. One or more decoy records may be embedded in an anonymized dataset such that a re-identification attack on the anonymized dataset targets the one or more decoy records.
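
To make the idea in the abstract concrete, here is a minimal Python sketch of embedding decoy records in an anonymized dataset. It is not taken from the patent: the function names (`make_decoys`, `embed_decoys`), the record fields (`pseudonym`, `age_band`, `zip3`), and the use of an HMAC to derive recipient-specific decoys are all assumptions chosen for illustration. The claims describe building decoys from user information known to attract re-identification attacks, or from unrelated or population datasets; this sketch substitutes purely synthetic values to stay self-contained.

```python
import hashlib
import hmac
import random

# Hypothetical value range for one quasi-identifier column.
AGE_BANDS = ["20-29", "30-39", "40-49", "50-59"]


def make_decoys(recipient_id, secret, count):
    """Derive `count` synthetic decoy records specific to one recipient."""
    decoys = []
    for i in range(count):
        # HMAC over (recipient, index): the data owner can regenerate the same
        # decoys later, while each recipient gets a different set.
        message = f"{recipient_id}|{i}".encode()
        digest = hmac.new(secret.encode(), message, hashlib.sha256).digest()
        decoys.append({
            "pseudonym": digest[3:11].hex(),
            "age_band": AGE_BANDS[digest[0] % len(AGE_BANDS)],
            "zip3": f"{int.from_bytes(digest[1:3], 'big') % 1000:03d}",
        })
    return decoys


def embed_decoys(anonymized_rows, decoys, seed):
    """Return a new list with the decoys mixed in at shuffled positions."""
    combined = list(anonymized_rows) + list(decoys)
    random.Random(seed).shuffle(combined)  # the input list is left unchanged
    return combined
```

Under these assumptions, a data owner would call `embed_decoys(rows, make_decoys("recipient-a", owner_secret, 2), seed)` once per release, keeping the secret private so the decoys can be regenerated and recognized later.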

First claim

Opening claim text (preview).

The invention claimed is:

1. A method, implemented by a one or more hardware processors, for watermarking anonymized datasets using decoys in a computing environment, comprising: embedding by the one or more hardware processors one or more decoy records in an anonymized dataset such that a re-identification attack on the anonymized dataset to reverse effects of a masking operation resulting in the anonymization targets the one or more decoy records, wherein the one or more decoy records are embedded in the anonymized dataset according to a defined risk threshold representing a probability that the one or more decoy records will be subject to the re-identification attack such that user information used to create the decoy records is chosen based on the user information being known to be desirable based on the user information being subject to previous re-identification attacks.

2. The method of claim 1, further including creating the one or more decoy records from datasets unrelated to the anonymized dataset.

3. The method of claim 1, further including creating the one or more decoy records from a population dataset linkable with the anonymized dataset.

4. The method of claim 1, further including defining the one or more decoy records to have risk of re-identification equal to or greater than the defined risk threshold.

5. The method of claim 1, further including defining the one or more decoy records to be unique to each third-party recipient of the anonymized dataset.

6. The method of claim 1, further including tracing the re-identification attack using the one or more decoy records to a third-party recipient of the anonymized dataset, wherein the one or more decoy records are unique to the third-party.

7. The method of claim 1, further including using the one or more decoy records to identify a third-party executing the re-identification attack.

8. A system for watermarking anonymized datasets using decoys in a computing environment, comprising: one or more computers with executable instructions that when executed cause the system to: embed one or more decoy records in an anonymized dataset such that a re-identification attack on the anonymized dataset to reverse effects of a masking operation resulting in the anonymization targets the one or more decoy records, wherein the one or more decoy records are embedded in the anonymized dataset according to a defined risk threshold representing a probability that the one or more decoy records will be subject to the re-identification attack such that user information used to create the decoy records is chosen based on the user information being known to be desirable based on the user information being subject to previous re-identification attacks.

9. The system of claim 8, wherein the executable instructions create the one or more decoy records from datasets unrelated to the anonymized dataset.

10. The system of claim 8, wherein the executable instructions create the one or more decoy records from a population dataset linkable with the anonymized dataset.

11. The system of claim 8, wherein the executable instructions define the one or more decoy records to have risk of re-identification equal to or greater than the defined risk threshold.

12. The system of claim 8, wherein the executable instructions define the one or more decoy records to be unique to each third-party recipient of the anonymized dataset.

13. The system of claim 8, wherein the executable instructions trace the re-identification attack using the one or more decoy records to a third-party recipient of the anonymized dataset, wherein the one or more decoy records are unique to the third-party.

14. The system of claim 8, wherein the executable instructions use the one or more decoy records to identify a third-party executing the re-identification attack.

15. A computer program product for, by a processor, watermarking anonymized datasets using decoys in a computing environment, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that embeds one or more decoy records in an anonymized dataset such that a re-identification attack on the anonymized dataset to reverse effects of a masking operation resulting in the anonymization targets the one or more decoy records, wherein the one or more decoy records are embedded in the anonymized dataset according to a defined risk threshold representing a probability that the one or more decoy records will be subject to the re-identification attack such that user information used to create the decoy records is chosen based on the user information being known to be desirable based on the user information being subject to previous re-identification attacks.

16. The computer program product of claim 15, further including an executable portion that creates the one or more decoy records from datasets unrelated to the anonymized dataset.

17. The computer program product of claim 15, further including an executable portion that creates the one or more decoy records from a population dataset linkable with the anonymized dataset.

18. The computer program product of claim 15, further including an executable portion that: defines the one or more decoy records to have risk of re-identification equal to or greater than the defined risk threshold; and defines the one or more decoy records to be unique to each third-party recipient of the anonymized dataset.

19. The computer program product of claim 15, further including an executable portion that traces the re-identification attack using the one or more decoy records to a third-party recipient of the anonymized dataset, wherein the one or more decoy records are unique to the third-party.

20. The computer program product of claim 15, further including an executable portion that uses the one or more decoy records to identify a third-party executing the re-identification attack.
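
The dependent claims mention two operations that a short sketch may help illustrate: keeping only decoys whose re-identification risk meets a defined threshold, and tracing an attacked record back to the recipient whose copy contained it. The Python below is a hedged illustration, not the patented implementation. In particular, the risk measure (1 divided by the size of the record's equivalence class on its quasi-identifiers, as in k-anonymity analysis) is an assumption; the claims define the threshold only as a probability. The names `qi_key`, `select_decoys`, `trace_recipient`, and the `registry` structure are hypothetical.

```python
from collections import Counter


def qi_key(record, quasi_identifiers):
    """Project a record onto its quasi-identifier columns."""
    return tuple(record[q] for q in quasi_identifiers)


def select_decoys(candidates, dataset, quasi_identifiers, risk_threshold):
    """Keep candidates whose estimated risk is at or above the threshold.

    Assumed risk model: 1 / (equivalence-class size after embedding).
    """
    class_sizes = Counter(qi_key(row, quasi_identifiers) for row in dataset)
    selected = []
    for candidate in candidates:
        # +1 because the decoy itself joins the class once it is embedded.
        risk = 1.0 / (class_sizes[qi_key(candidate, quasi_identifiers)] + 1)
        if risk >= risk_threshold:
            selected.append(candidate)
    return selected


def trace_recipient(observed, registry, quasi_identifiers):
    """Return the recipient whose decoys match the observed record, else None.

    `registry` maps each recipient to the decoys embedded in that
    recipient's copy of the dataset.
    """
    observed_key = qi_key(observed, quasi_identifiers)
    for recipient, decoys in registry.items():
        if any(qi_key(d, quasi_identifiers) == observed_key for d in decoys):
            return recipient
    return None
```

Because a rarer combination of quasi-identifiers yields a smaller equivalence class, this risk model favors decoys that stand out, which is consistent with the claims' aim of having a re-identification attack target the decoys.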

Assignees

Inventors

Classifications

  • using deception as countermeasure, e.g. honeypots, honeynets, decoys or entrapment · CPC title

  • Event detection, e.g. attack signature detection · CPC title

  • G06F21/16 · Primary

    Program or content traceability, e.g. by watermarking · CPC title

  • involving long-term monitoring or reporting · CPC title

  • by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10997279B2 cover?
Embodiments for watermarking anonymized datasets using decoys in a computing environment are provided. One or more decoy records may be embedded in an anonymized dataset such that a re-identification attack on the anonymized dataset targets the one or more decoy records.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H04L63/1491. Mapped technology areas include Electricity.
When was this patent published?
Publication date May 4, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).