Privacy preserving data querying
US-9202079-B2 · Dec 1, 2015 · US
US9749128B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9749128-B2 |
| Application number | US-201414278570-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 15, 2014 |
| Priority date | May 15, 2014 |
| Publication date | Aug 29, 2017 |
| Grant date | Aug 29, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method for data matching includes providing two sets of encrypted data elements by converting data elements to respective sets of vectors and encrypting each vector with a public key of a homomorphic encryption scheme. Each data element includes a sequence of characters drawn from an alphabet. For pairs of encrypted data elements, a comparison measure is computed between the sets of encrypted vectors. An obfuscated vector is generated for each encrypted data element in the first set, which renders the first encrypted data element indecipherable when the comparison measure does not meet a threshold for at least one of the pairs of data encrypted elements comprising that encrypted data element. The obfuscated vectors can be decrypted with a private key, allowing data elements in the first set to be deciphered if the comparison measure meets the threshold for at least one of the data elements in the second set.
Opening claim text (preview).
What is claimed is: 1. A method for data matching, comprising: encrypting a first set of data elements to generate a first set of encrypted data elements, each of the encrypted data elements in the first set being formed by converting a respective one of the first set of data elements to a set of vectors and encrypting each vector with a public key of a fully homomorphic encryption scheme, each data element in the first set including a sequence of at least two characters drawn from an alphabet, each vector including vector elements, one vector element for each character in the data element; receiving a second set of encrypted data elements, each of the encrypted data elements in the second set having been formed by converting a respective one of a second set of data elements to a set of vectors and encrypting each vector with the public key, each data element in the second set including a sequence of at least two characters drawn from the alphabet, each data element in the first set having a same number of characters as each data element in the second set; for each of a plurality of pairs of encrypted data elements, each pair comprising an encrypted data element from the first set and an encrypted data elements second set, computing a comparison measure between the encrypted vectors of the encrypted data element in the second set and the encrypted vectors of the encrypted data element in the first set, wherein the comparison measure is based on a Hamming distance, the fully homomorphic encryption scheme using right rotation operations on encrypted similarity measure vectors to compute the Hamming distance; for each encrypted data element in the first set, generating an obfuscated vector which renders the first encrypted data element indecipherable when the comparison measure does not meet a threshold for at least one of the pairs of data encrypted elements comprising that encrypted data element; and outputting the obfuscated vectors, whereby when the obfuscated vectors are decrypted with a private key of the fully homomorphic encryption scheme, only those data elements in the first set for which the comparison measure meets the threshold for at least one of the data elements in the second set are decipherable, the threshold of the comparison measure being set such those data elements in the first set which are t-fuzzy to elements of the second dataset are decipherable, where t is less than the number of characters in each data element; wherein at least one of the computing of the comparison measures and generating of the obfuscated vectors is performed with a computer processor. 2. The method of claim 1 , wherein the fully homomorphic encryption scheme provides for addition, multiplication and optionally right rotation operations to be performed on encrypted data. 3. The method of claim 1 , wherein the fully homomorphic encryption scheme is based on a Brakerski-Gentry-Vaikuntanathan (BGV) encryption scheme. 4. The method of claim 1 , wherein each data element in the first set includes at least three characters. 5. The method of claim 1 , wherein for each data element in the first and second sets, the set of encrypted vectors consists of a same number of encrypted vectors. 6. The method of claim 5 , wherein the number of encrypted vectors corresponds to a number of the characters in the alphabet. 7. The method of claim 1 , wherein the conversion of each data element to a set of vectors comprises, for each character in the alphabet, generating a binary vector by comparing the character of the alphabet with each character in the sequence of characters and setting a corresponding vector element to a first binary value if there is a match, otherwise setting the corresponding vector element to a second binary value. 8. The method of claim 1 , wherein for each data element in the first and second sets, each vector encrypted with the public key includes a same number of vector elements. 9. The method of claim 1 , wherein the computing of the comparison measure comprises computing the encrypted similarity vector according to: { H j i } K ={Σ D δ=1 r i δ r′ j δ } K (2) where D represents the number of encrypted vectors for each data element and each i represents a data element in the second set, each j represents a data element in the first set, r i δ and r′ j δ represent respective vectors of two data elements being compared, and K indicates encryption with the public key. 10. The method of claim 9 , wherein the computing of the comparison measure further comprises computing an encrypted distance vector according to: {Δ H ( X i ,Y j )} K ={ T} K −{Σ w=1 T rot( H i j ,w )} k (3) where {T} K is an encrypted vector in which all elements have a value of T, where T is a number of characters in each data element, and rot represents a homomorphic rotation operator, indexed by w. 11. The method of claim 1 , wherein the generating of the obfuscated vectors comprises computing, for each of the encrypted data elements in the first set, a sum of the encrypted data element and a product which generates a null value when the comparison measure meets the threshold for at least one of the data elements in the second set and a random number when the comparison measure does not meet the threshold for at least one of the data elements in the second set. 12. The method of claim 11 , wherein the comparison measure comprises an encrypted distance vector in which each element of the vector is a Hamming distance and the product is computed by subtracting an encrypted unitary vector from the encrypted distance vector, multiplying the result by a random number, and computing the product of the multiplied result over each of a set of encrypted unitary vectors, the encoded unitary vectors each representing a value of less than the threshold. 13. The method of claim 11 , wherein the generating of the obfuscated vectors comprises computing, for each of the encrypted data elements {Y j } K in the first set one of: {Π i=1 m Π w=0 T−t (Δ H ( X i ,Y j )− w )· r} K +{Y j } K (4) where Δ H (X i , Y j ) is a vector which represents the computed comparison measure as a distance between one of the m data elements encrypted in the second set and the data element encrypted as {Y j } K in the first set, each w is a unitary vector from 0 up to T−t, where t is the threshold, T is the number of characters in each element, r is a random vector, and K indicates encryption with the public key; and {Π i=1 m Π w=t T (Δ H inv ( X i ,Y j )− w )· r} K +{Y j } K (5) where Δ H inv (X i , Y j ) is a vector which represents the computed comparison measure as a similarity between one of the m data elements encrypted in the second set and the data element encrypted as {Y j } K in the first set. 14. The method of claim 1 wherein the first and second sets of encrypted data elements comprise encrypted license plate numbers. 15. The method of claim 1 , wherein for each data element, the encrypted vectors comprise a respective vector for each character in the alphabet and the computing of the comparison measure between the encrypted vectors of the encrypted data element in the second set and the encrypted vectors of the encrypted data element in the first set comprises multiplying a respective pair of the encrypted vectors for each character in the alphabet. 16. A method for data matching, comprising: encrypting a first set of data elements to generate a first set of encrypted data elements, each of the encrypted data elements in the first set being formed by convert
Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy · CPC title
involving homomorphic encryption · CPC title
where protection concerns the structure of data, e.g. records, types, queries · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.