Anonymizing sensitive data in logic problems for input to a constraint solver

US11093641B1 · US · B1

Patent metadata
Publication number: US-11093641-B1
Application number: US-201816219742-A
Country: US
Kind code: B1
Filing date: Dec 13, 2018
Priority date: Dec 13, 2018
Publication date: Aug 17, 2021
Grant date: Aug 17, 2021

Abstract

Official abstract text for this publication.

A document anonymization system transforms structured documents, such as security policies, that contain user-specific and other sensitive data, producing encoded logic problems in the format or language of one or more constraint solvers; the logic problems do not contain any of the sensitive data. The system may perform a one- or two-stage anonymization process: in a first stage, the electronic document is analyzed according to its document type to identify parameters likely to contain sensitive data, and the associated values are replaced with arbitrary values; in a second stage, after the anonymized electronic document is converted into logic formulae representing the data, the system performs replacements of string constants in the logic formulae with arbitrary strings to further anonymize the sensitive data. The system may confirm that anonymization preserves the document structure, difficulty level, and satisfiability of the original document by executing the constraint solver against the anonymized logic problem.
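The first stage described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the policy layout, the `SENSITIVE_PARAMS` set, and the function names are hypothetical, and the format rule (digits stay digits, letters stay letters, punctuation is kept) is only one assumed way to generate a replacement value "according to the format" of the original.

```python
import random
import string

# Hypothetical: parameters assumed to hold sensitive values for this document type.
SENSITIVE_PARAMS = {"Principal", "Resource"}

# Hypothetical policy document; the layout is illustrative only.
example_policy = {
    "Statement": [
        {"Effect": "Allow", "Action": "storage:GetObject",
         "Principal": "123456789012", "Resource": "bucket-7/reports/*"},
        {"Effect": "Deny", "Action": "storage:DeleteObject",
         "Principal": "123456789012", "Resource": "bucket-7/reports/*"},
    ]
}


def anonymize_value(value, rng):
    """Replace each digit or letter with a random one of the same class.

    Punctuation (including wildcards such as '*') is kept so the value
    keeps its original shape.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)


def anonymize_policy(policy, rng=None):
    """Return a copy of the policy with sensitive values replaced."""
    rng = rng or random.Random()
    mapping = {}  # original value -> replacement, so equal values stay equal

    def walk(node, sensitive=False):
        if isinstance(node, dict):
            return {k: walk(v, sensitive or k in SENSITIVE_PARAMS)
                    for k, v in node.items()}
        if isinstance(node, list):
            return [walk(v, sensitive) for v in node]
        if isinstance(node, str) and sensitive:
            if node not in mapping:
                mapping[node] = anonymize_value(node, rng)
            return mapping[node]
        return node  # public values (e.g. the action name) are kept as-is

    return walk(policy)
```

Calling `anonymize_policy(example_policy)` returns a document with the same keys and nesting. Reusing one replacement per original value is a design choice in this sketch, intended to keep relationships between statements intact so the derived logic problem has a comparable structure.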

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising one or more processors and memory storing computer-executable instructions that, when executed by the one or more processors, cause the system to: receive information associated with a user account of a computing resource service provider, the information specifying a first security policy encoding, according to a document structure, a first set of security permissions for accessing a computing resource provided by the computing resource service provider and associated with the user account; determine, based on the document structure, a first parameter of a plurality of parameters identified by the first set of security permissions, and a first format applied to values of the first parameter, the first parameter being associated with private data associated with one or both of the user account and the computing resource; obtain from the first set of security permissions a first value of the first parameter; generate an anonymized value according to the first format; generate, according to the document structure, a second set of security permissions identifying the plurality of parameters and comprising one or more transformations of the private data into anonymized data, the one or more transformations including a replacement of the first value in association with the first parameter with the anonymized value; determine a first propositional logic expression in a solver language recognized by a first constraint solver based at least in part on the second set of security permissions and comprising: a plurality of string constants representing the corresponding values, stored in the second set of security permissions, of the plurality of parameters; and a plurality of constraints on the plurality of string constants; anonymize one or more of the plurality of string constants to produce an anonymized propositional logic expression representing the first security policy according to the document structure and in the solver language, wherein the private data contained in the first security policy cannot be determined from the anonymized propositional logic expression; and perform an action associated with including the anonymized propositional logic expression in a set of test instances delivered to a developer of the first constraint solver, the test instances representing anonymized logic problems associated with the document structure.

2. The system of claim 1, wherein executing the instructions further causes the system to: determine, based on the document structure, a second parameter, of the plurality of parameters, that is associated with public data describing an operation of a service of the computing resource service provider; obtain from the first set of security permissions a second value of the second parameter; and to generate the second set of security permissions, include the second value in association with the second parameter in the second set of security permissions.

3. The system of claim 1, wherein to anonymize one or more of the plurality of string constants, the instructions, when executed, cause the system to: identify a plurality of matches between various pairs of the plurality of string constants, each match indicating that at least part of one string constant of the corresponding pair appears in the other string constant of the corresponding pair; classify a first match of the plurality of matches as a first inclusion relation, of a plurality of inclusion relations, wherein a first string constant of the first match appears entirely in a second string constant of the first match; classify a second match of the plurality of matches as an overlap relation, wherein a third string constant of the second match includes a first substring, at a beginning or an end of the third string constant, that appears at a beginning or an end of a fourth string constant of the second match; add the first substring to the plurality of string constants; add to the plurality of matches: a third match between the first substring and the third string constant; and a fourth match between the first substring and the fourth string constant; classify the third and fourth matches as second and third inclusion relations, respectively, of the plurality of inclusion relations; and for each of the plurality of matches classified as one of the plurality of inclusion relations: identify in the corresponding pair of string constants an included string and a compound string that includes the included string and an additional substring; generate a first arbitrary string and a second arbitrary string; replace all appearances of the included string with the first arbitrary string; and replace all appearances of the additional substring with the second arbitrary string.

4. The system of claim 1, wherein the instructions, when executed, cause the system to: determine a second propositional logic expression in the solver language based at least in part on the first set of security permissions, the second propositional logic expression representing the first security policy with non-anonymized data including the private data; and to perform the action: cause a first instance of the first constraint solver to execute against the first propositional logic expression and, after a first execution duration, produce a first output representing a satisfiability of the first propositional logic expression; cause the first instance or a second instance of the first constraint solver to execute against the second propositional logic expression and, after a second execution duration, produce a second output representing a satisfiability of the second propositional logic expression; and determine that the first and second propositional logic expressions have the same satisfiability and approximately equal difficulty represented by the first and second execution durations.

5. A system, comprising one or more processors and memory storing computer-executable instructions that, when executed by the one or more processors, cause the system to: receive information associated with a user account of a computing resource service provider, the information specifying an electronic document of a first document type; determine a first set of constraints on a plurality of parameters defined by the first document type, the electronic document encoding the plurality of parameters and a plurality of values each associated with a corresponding parameter of the plurality of parameters; determine that the plurality of values include sensitive data; anonymize the sensitive data in one or both of the electronic document and the first set of constraints to produce a second set of constraints on the plurality of parameters, the second set of constraints comprising anonymized data that obscures the sensitive data of the plurality of values; determine whether the second set of constraints accurately represents the electronic document; and responsive to a determination that the second set of constraints accurately represents the electronic document, perform an action associated with including the second set of constraints in a first test instance of a set of test instances delivered to a developer of a first constraint solver, the test instances representing anonymized logic problems associated with the first document type.

6. The system of claim 5, wherein to anonymize the sensitive data, executing the instructions further causes the system to: identify structured data in the electronic document, the structured data comprising the sensitive data; determine a document structure associated with the first document type and defining the structured data and one or more data dependencies associated with various ones of the plurality of values; determine an anony
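The inclusion-relation replacement recited in claim 3 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed method in full: it handles only inclusion relations (overlap relations are not modeled), it splits each compound string around the longest constant it contains, and the helper names `arbitrary_string` and `anonymize_constants` are hypothetical.

```python
import random
import string


def arbitrary_string(length, rng):
    """Generate an arbitrary lowercase string of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def anonymize_constants(constants, rng=None):
    """Map each string constant to an arbitrary string, keeping inclusions.

    If constant A is the longest constant appearing inside constant B,
    the replacement for A appears inside the replacement for B.
    """
    rng = rng or random.Random()
    pieces = {}  # original piece -> arbitrary replacement (reused everywhere)
    result = {}  # original constant -> anonymized constant

    def replace_piece(piece):
        if piece not in pieces:
            pieces[piece] = arbitrary_string(len(piece), rng)
        return pieces[piece]

    def anonymize(s):
        if s in result:
            return result[s]
        # Other constants that appear entirely inside s (inclusion relations).
        included = [c for c in constants if c and c != s and c in s]
        if not included:
            result[s] = replace_piece(s)
        else:
            inner = max(included, key=len)       # the included string
            start = s.index(inner)
            prefix = s[:start]                   # additional substring (before)
            suffix = s[start + len(inner):]      # additional substring (after)
            result[s] = replace_piece(prefix) + anonymize(inner) + replace_piece(suffix)
        return result[s]

    for c in constants:
        anonymize(c)
    return result
```

For example, given the constants `"bucket"`, `"bucket/logs"`, and `"bucket/logs/2018"`, the replacement for `"bucket"` is a prefix of the replacement for `"bucket/logs"`, which is in turn a prefix of the replacement for `"bucket/logs/2018"`. Constraints such as prefix or substring checks over those constants should therefore evaluate the same way before and after anonymization, which is the property the claim appears to be aiming at.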

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

  • Machine learning

  • Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

  • for managing network security; network security policies in general (filtering policies H04L63/0227)

  • Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/6254. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 17, 2021 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).