Compact fuzzy private matching using a fully-homomorphic encryption scheme

US9749128B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9749128-B2
Application numberUS-201414278570-A
CountryUS
Kind codeB2
Filing dateMay 15, 2014
Priority dateMay 15, 2014
Publication dateAug 29, 2017
Grant dateAug 29, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for data matching includes providing two sets of encrypted data elements by converting data elements to respective sets of vectors and encrypting each vector with a public key of a homomorphic encryption scheme. Each data element includes a sequence of characters drawn from an alphabet. For pairs of encrypted data elements, a comparison measure is computed between the sets of encrypted vectors. An obfuscated vector is generated for each encrypted data element in the first set, which renders the first encrypted data element indecipherable when the comparison measure does not meet a threshold for at least one of the pairs of data encrypted elements comprising that encrypted data element. The obfuscated vectors can be decrypted with a private key, allowing data elements in the first set to be deciphered if the comparison measure meets the threshold for at least one of the data elements in the second set.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for data matching, comprising: encrypting a first set of data elements to generate a first set of encrypted data elements, each of the encrypted data elements in the first set being formed by converting a respective one of the first set of data elements to a set of vectors and encrypting each vector with a public key of a fully homomorphic encryption scheme, each data element in the first set including a sequence of at least two characters drawn from an alphabet, each vector including vector elements, one vector element for each character in the data element; receiving a second set of encrypted data elements, each of the encrypted data elements in the second set having been formed by converting a respective one of a second set of data elements to a set of vectors and encrypting each vector with the public key, each data element in the second set including a sequence of at least two characters drawn from the alphabet, each data element in the first set having a same number of characters as each data element in the second set; for each of a plurality of pairs of encrypted data elements, each pair comprising an encrypted data element from the first set and an encrypted data elements second set, computing a comparison measure between the encrypted vectors of the encrypted data element in the second set and the encrypted vectors of the encrypted data element in the first set, wherein the comparison measure is based on a Hamming distance, the fully homomorphic encryption scheme using right rotation operations on encrypted similarity measure vectors to compute the Hamming distance; for each encrypted data element in the first set, generating an obfuscated vector which renders the first encrypted data element indecipherable when the comparison measure does not meet a threshold for at least one of the pairs of data encrypted elements comprising that encrypted data element; and outputting the obfuscated vectors, whereby when the obfuscated vectors are decrypted with a private key of the fully homomorphic encryption scheme, only those data elements in the first set for which the comparison measure meets the threshold for at least one of the data elements in the second set are decipherable, the threshold of the comparison measure being set such those data elements in the first set which are t-fuzzy to elements of the second dataset are decipherable, where t is less than the number of characters in each data element; wherein at least one of the computing of the comparison measures and generating of the obfuscated vectors is performed with a computer processor. 2. The method of claim 1 , wherein the fully homomorphic encryption scheme provides for addition, multiplication and optionally right rotation operations to be performed on encrypted data. 3. The method of claim 1 , wherein the fully homomorphic encryption scheme is based on a Brakerski-Gentry-Vaikuntanathan (BGV) encryption scheme. 4. The method of claim 1 , wherein each data element in the first set includes at least three characters. 5. The method of claim 1 , wherein for each data element in the first and second sets, the set of encrypted vectors consists of a same number of encrypted vectors. 6. The method of claim 5 , wherein the number of encrypted vectors corresponds to a number of the characters in the alphabet. 7. The method of claim 1 , wherein the conversion of each data element to a set of vectors comprises, for each character in the alphabet, generating a binary vector by comparing the character of the alphabet with each character in the sequence of characters and setting a corresponding vector element to a first binary value if there is a match, otherwise setting the corresponding vector element to a second binary value. 8. The method of claim 1 , wherein for each data element in the first and second sets, each vector encrypted with the public key includes a same number of vector elements. 9. The method of claim 1 , wherein the computing of the comparison measure comprises computing the encrypted similarity vector according to: { H j i } K ={Σ D δ=1 r i δ r′ j δ } K   (2) where D represents the number of encrypted vectors for each data element and each i represents a data element in the second set, each j represents a data element in the first set, r i δ and r′ j δ represent respective vectors of two data elements being compared, and K indicates encryption with the public key. 10. The method of claim 9 , wherein the computing of the comparison measure further comprises computing an encrypted distance vector according to: {Δ H ( X i ,Y j )} K ={ T} K −{Σ w=1 T rot( H i j ,w )} k   (3) where {T} K is an encrypted vector in which all elements have a value of T, where T is a number of characters in each data element, and rot represents a homomorphic rotation operator, indexed by w. 11. The method of claim 1 , wherein the generating of the obfuscated vectors comprises computing, for each of the encrypted data elements in the first set, a sum of the encrypted data element and a product which generates a null value when the comparison measure meets the threshold for at least one of the data elements in the second set and a random number when the comparison measure does not meet the threshold for at least one of the data elements in the second set. 12. The method of claim 11 , wherein the comparison measure comprises an encrypted distance vector in which each element of the vector is a Hamming distance and the product is computed by subtracting an encrypted unitary vector from the encrypted distance vector, multiplying the result by a random number, and computing the product of the multiplied result over each of a set of encrypted unitary vectors, the encoded unitary vectors each representing a value of less than the threshold. 13. The method of claim 11 , wherein the generating of the obfuscated vectors comprises computing, for each of the encrypted data elements {Y j } K in the first set one of: {Π i=1 m Π w=0 T−t (Δ H ( X i ,Y j )− w )· r} K +{Y j } K   (4) where Δ H (X i , Y j ) is a vector which represents the computed comparison measure as a distance between one of the m data elements encrypted in the second set and the data element encrypted as {Y j } K in the first set, each w is a unitary vector from 0 up to T−t, where t is the threshold, T is the number of characters in each element, r is a random vector, and K indicates encryption with the public key; and {Π i=1 m Π w=t T (Δ H inv ( X i ,Y j )− w )· r} K +{Y j } K   (5) where Δ H inv (X i , Y j ) is a vector which represents the computed comparison measure as a similarity between one of the m data elements encrypted in the second set and the data element encrypted as {Y j } K in the first set. 14. The method of claim 1 wherein the first and second sets of encrypted data elements comprise encrypted license plate numbers. 15. The method of claim 1 , wherein for each data element, the encrypted vectors comprise a respective vector for each character in the alphabet and the computing of the comparison measure between the encrypted vectors of the encrypted data element in the second set and the encrypted vectors of the encrypted data element in the first set comprises multiplying a respective pair of the encrypted vectors for each character in the alphabet. 16. A method for data matching, comprising: encrypting a first set of data elements to generate a first set of encrypted data elements, each of the encrypted data elements in the first set being formed by convert

Assignees

Inventors

Classifications

  • Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy · CPC title

  • H04L9/008Primary

    involving homomorphic encryption · CPC title

  • where protection concerns the structure of data, e.g. records, types, queries · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9749128B2 cover?
A method for data matching includes providing two sets of encrypted data elements by converting data elements to respective sets of vectors and encrypting each vector with a public key of a homomorphic encryption scheme. Each data element includes a sequence of characters drawn from an alphabet. For pairs of encrypted data elements, a comparison measure is computed between the sets of encrypted…
Who is the assignee on this patent?
Xerox Corp
What technology area does this patent fall under?
Primary CPC classification H04L9/008. Mapped technology areas include Electricity.
When was this patent published?
Publication date Tue Aug 29 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).