Encrypted data deduplication in cloud storage
US-2017177899-A1 · Jun 22, 2017 · US
US10942906B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10942906-B2 |
| Application number | US-201816026819-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 3, 2018 |
| Priority date | May 31, 2018 |
| Publication date | Mar 9, 2021 |
| Grant date | Mar 9, 2021 |
Official abstract text for this publication.
Disclosed herein are system, method, and computer program product embodiments for detecting duplicates with exact and fuzzy matching on encrypted match indexes using an encryption key in a cloud computing platform. An embodiment operates by determining a match rule index value upon reception of a new record. The embodiment encrypts the match rule index value using the customer's encryption key and a deterministic encryption method and stores the encrypted match rule index value. Duplicate detection may be later performed by using the same deterministic encryption method to determine a ciphertext for a candidate entry and comparing the ciphertext to the stored encrypted match indexes.
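The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: HMAC-SHA256 stands in for the deterministic encryption method (it is a keyed, deterministic transform, but not reversible encryption), and the names `build_match_index`, `deterministic_token`, `check_and_store`, the key value, and the lowercase/trim normalization are all assumptions made for illustration.

```python
import hashlib
import hmac


def build_match_index(record: dict, fields: list) -> str:
    """Join the rule's fields into one normalized string (normalization is an assumption)."""
    return "|".join(str(record.get(f, "")).strip().lower() for f in fields)


def deterministic_token(key: bytes, plaintext: str) -> str:
    """Stand-in for deterministic encryption: same key and input always give the same output."""
    return hmac.new(key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


def check_and_store(record: dict, fields: list, key: bytes, stored: set) -> bool:
    """Return True and store the token if the record is new; return False on a duplicate."""
    token = deterministic_token(key, build_match_index(record, fields))
    if token in stored:
        return False
    stored.add(token)
    return True


customer_key = b"hypothetical-customer-key"  # placeholder, not a real key
stored_indexes = set()
fields = ["name", "email"]

first = check_and_store({"name": "Ada Lovelace", "email": "ada@example.com"},
                        fields, customer_key, stored_indexes)
second = check_and_store({"name": " ada lovelace ", "email": "ADA@example.com"},
                         fields, customer_key, stored_indexes)
print(first, second)  # True False
```

Because the transform is deterministic, the comparison works on the stored tokens alone, so the plaintext match index never has to be kept alongside them.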
Opening claim text (preview).
What is claimed is:
1. A method, comprising: receiving, by one or more processors, a new record added to a cloud computing platform; selecting, by the one or more processors, a match rule applicable to the new record comprising a unique identifier, a match type, and one or more fields of applicability, wherein the match rule defines a duplicate record when performing duplicate detection on the new record; calculating, by the one or more processors, an encrypted match index based on the one or more fields of applicability, wherein the unique identifier is used as an initialization vector in an encryption scheme used to create the encrypted match index; comparing, by the one or more processors, the encrypted match index to an encrypted match index column to determine if the encrypted match index duplicates a previously generated and stored encrypted match index in the encrypted match index column; and when the encrypted match index does not duplicate the previously generated and stored encrypted match index in the encrypted match index column, storing, by the one or more processors, the new record in the cloud computing platform and adding the encrypted match index to the encrypted match index column.
2. The method of claim 1, further comprising: receiving, by the one or more processors, match rule parameters comprising the match type and the one or more fields of applicability; creating, by the one or more processors, the unique identifier based on the one or more fields of applicability; and storing, by the one or more processors, a custom match rule comprising the unique identifier, the match type, and the one or more fields of applicability.
3. The method of claim 1, further comprising: displaying, by the one or more processors, an error message when the encrypted match index duplicates the previously generated and stored encrypted match index in the encrypted match index column.
4. The method of claim 1, further comprising: storing, by the one or more processors, the encrypted match index in an unencrypted form if encryption is not enabled on any of the one or more fields of applicability.
5. The method of claim 1, further comprising: scanning, by the one or more processors, the encrypted match index column to determine one or more duplicates; and displaying, by the one or more processors, the one or more duplicates in a web interface.
6. The method of claim 1, wherein the match type can be exact or fuzzy.
7. The method of claim 1, wherein the encryption scheme is a deterministic scheme.
8. The method of claim 1, wherein the cloud computing platform is a customer relationship management platform.
9. A system, comprising: a memory; and at least one processor coupled to the memory and configured to: receive a new record added to a cloud computing platform; select a match rule applicable to the new record comprising a unique identifier, a match type, and one or more fields of applicability, wherein the match rule defines a duplicate record when performing duplicate detection on the new record; calculate an encrypted match index based on the one or more fields of applicability, wherein the unique identifier is used as an initialization vector in an encryption scheme used to create the encrypted match index; compare the encrypted match index to an encrypted match index column to determine if the encrypted match index duplicates a previously generated and stored encrypted match index in the encrypted match index column; and when the encrypted match index does not duplicate the previously generated and stored encrypted match index in the encrypted match index column, store the new record in the cloud computing platform and add the encrypted match index to the encrypted match index column.
10. The system of claim 9, the at least one processor further configured to: receive match rule parameters comprising the match type and the one or more fields of applicability; create the unique identifier based on the one or more fields of applicability; and store a custom match rule comprising the unique identifier, the match type, and the one or more fields of applicability.
11. The system of claim 9, the at least one processor further configured to: display an error message when the encrypted match index duplicates the previously generated and stored encrypted match index in the encrypted match index column.
12. The system of claim 9, the at least one processor further configured to: scan the encrypted match index column to determine one or more duplicates; and display the one or more duplicates in a web interface.
13. The system of claim 9, the at least one processor further configured to: store the match index value in an unencrypted form if encryption is not enabled on any of the one or more fields of applicability.
14. The system of claim 9, wherein the match type is exact or fuzzy.
15. The system of claim 9, wherein the match rule further comprises an indicator of whether a blank field should be treated as a match.
16. The system of claim 9, wherein if the match type is fuzzy, the at least one processor can use one of Jaro-Winkler, Kullback-Leibler distance, name variant, keyboard distance, metaphone 3, or syllable alignment to determine a match.
17. The system of claim 9, wherein the encryption scheme is a deterministic scheme.
18. The system of claim 9, wherein the one or more fields of applicability can be standard or custom fields in the cloud computing platform.
19. The system of claim 9, wherein the cloud computing platform is a customer relationship management platform.
20. A non-transitory computer-readable device having instructions stored thereon that, when executed by at least one computing device, causes the at least one computing device to perform operations comprising: receiving a new record added to a cloud computing platform; selecting a match rule applicable to the new record comprising a unique identifier, a match type, and one or more fields of applicability, wherein the match rule defines a duplicate record when performing duplicate detection on the new record; calculating an encrypted match index based on the one or more fields of applicability, wherein the unique identifier is used as an initialization vector in an encryption scheme used to create the encrypted match index; comparing the encrypted match index to an encrypted match index column to determine if the encrypted match index duplicates a value in the encrypted match index column; and when the encrypted match index does not duplicate a previously generated and stored encrypted match index in the encrypted match index column, storing the new record in the cloud computing platform and adding the encrypted match index to the encrypted match index column.
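The steps recited in the independent claims (select a match rule, derive an encrypted match index using the rule's unique identifier as the initialization vector, compare against the stored column, store only when no duplicate exists) can be sketched as follows. This is an illustrative toy under stated assumptions, not the claimed implementation: `toy_encrypt` is an insecure SHA-256 keystream XOR used only because it is deterministic and takes an IV, and hashing the rule identifier down to 16 bytes is one hypothetical way to turn an identifier into an IV. `MatchRule`, `Platform`, and all field names are invented for the example, and only exact matching is shown.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchRule:
    rule_id: str      # unique identifier; also the source of the IV
    match_type: str   # "exact" or "fuzzy" (only exact is sketched here)
    fields: tuple     # fields of applicability


def toy_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Insecure toy cipher: XOR with a SHA-256 keystream. Fixed key + IV gives a fixed output."""
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


class Platform:
    def __init__(self, key: bytes):
        self.key = key
        self.records = []
        self.encrypted_match_index_column = set()

    def encrypted_match_index(self, record: dict, rule: MatchRule) -> bytes:
        # Assumption: the IV is the first 16 bytes of a hash of the rule identifier.
        iv = hashlib.sha256(rule.rule_id.encode("utf-8")).digest()[:16]
        plaintext = "|".join(str(record.get(f, "")).strip().lower() for f in rule.fields)
        return toy_encrypt(self.key, iv, plaintext.encode("utf-8"))

    def add_record(self, record: dict, rule: MatchRule) -> bool:
        """Store the record and its index unless the index is already in the column."""
        index = self.encrypted_match_index(record, rule)
        if index in self.encrypted_match_index_column:
            return False  # duplicate; a caller could show an error message here
        self.records.append(record)
        self.encrypted_match_index_column.add(index)
        return True


rule = MatchRule("rule-001", "exact", ("name", "email"))
platform = Platform(b"hypothetical-customer-key")
record = {"name": "Ada Lovelace", "email": "ada@example.com"}
added_first = platform.add_record(record, rule)
added_again = platform.add_record(dict(record), rule)
print(added_first, added_again)  # True False
```

Tying the IV to the rule identifier means the same field values encrypt to different index values under different rules, while staying stable within one rule so that equality comparison still detects duplicates.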
wherein the data content is protected, e.g. by encrypting or encapsulating the payload · CPC title
Information retrieval; Database structures therefor; File system structures therefor · CPC title
Ensuring data consistency and integrity · CPC title
Providing cryptographic facilities or services · CPC title
Vectors, bitmaps or matrices · CPC title