Systems and methods for performing data analytics on sensitive data in remote network environments without exposing content of the sensitive data

US12526404B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12526404-B2
Application number: US-202418920737-A
Country: US
Kind code: B2
Filing date: Oct 18, 2024
Priority date: Apr 12, 2023
Publication date: Jan 13, 2026
Grant date: Jan 13, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for performing data analytics on sensitive data, such as deduplication, are disclosed herein. A first request to perform a first data analytics operation for a set of sensitive data instances may be received. A first set of image representations for the set of sensitive data instances may be retrieved. A first sensitive data instance and a second sensitive data instance may be clustered into a first cluster. The first data analytics operation may be performed on the first cluster.
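As an illustration only, the idea of an image representation of a text record can be sketched as a row of pixels in which each alphanumeric character is replaced by a color. The names `make_color_template` and `to_image_row`, the seed-based template, and the lowercase-plus-digits alphabet are all assumptions made for this sketch and do not come from the patent. A seeded random substitution like this is not a secure encryption scheme; it only shows the shape of the mapping.

```python
import random
import string

# Assumed alphabet for this sketch: lowercase letters and digits.
ALPHABET = string.ascii_lowercase + string.digits


def make_color_template(seed: int) -> dict[str, tuple[int, int, int]]:
    """Hypothetical mapping template: one distinct RGB color per character."""
    rng = random.Random(seed)
    # Sampling without replacement guarantees that no two characters share a color.
    values = rng.sample(range(256 ** 3), len(ALPHABET))
    return {
        ch: ((v >> 16) & 0xFF, (v >> 8) & 0xFF, v & 0xFF)
        for ch, v in zip(ALPHABET, values)
    }


def to_image_row(text: str, template: dict[str, tuple[int, int, int]]) -> list[tuple[int, int, int]]:
    """Replace each known character with its color; skip everything else."""
    return [template[ch] for ch in text.lower() if ch in template]


template = make_color_template(seed=7)
row = to_image_row("AB-12", template)
print(len(row))  # one pixel per alphanumeric character
```

Two devices that share the same template produce the same pixel row for the same text, so rows can be compared without the comparing party seeing the original characters.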

First claim

Opening claim text (preview).

What is claimed is:

1. A system for performing data analytics on sensitive data in remote network environments without exposing content of the sensitive data by comparing sets of encrypted data, the system comprising: one or more processors; and a non-transitory, computer-readable medium storing instructions that, when executed by the one or more processors, cause operations comprising: retrieving a first encrypted mappings for a set of sensitive data instances, wherein the first encrypted mappings comprises: a first encrypted mapping for a first sensitive data instance for the set of sensitive data instances, wherein the first encrypted mapping comprises a first set of alphanumeric characters in the first sensitive data instance mapped using a first coding; and a second encrypted mapping for a second sensitive data instance for the set of sensitive data instances; clustering the first sensitive data instance and the second sensitive data instance into a first cluster based on similarities between the first encrypted mapping and the second encrypted mapping; retrieving a first instance identifier for the first sensitive data instance and a second instance identifier for the second sensitive data instance; and labeling the first instance identifier and the second instance identifier as corresponding to a duplicate.

2. A method for performing data analytics on sensitive data in remote network environments without exposing content of the sensitive data, the method comprising: retrieving a first encrypted mappings for a set of sensitive data instances, wherein the first encrypted mappings comprises: a first encrypted mapping for a first sensitive data instance for the set of sensitive data instances, wherein the first encrypted mapping comprises a first set of alphanumeric characters in the first sensitive data instance mapped using a first coding; and a second encrypted mapping for a second sensitive data instance for the set of sensitive data instances, wherein the second encrypted mapping comprises a second set of alphanumeric characters in the second sensitive data instance mapped using the first coding; clustering the first sensitive data instance and the second sensitive data instance into a first cluster based on similarities between the first encrypted mapping and the second encrypted mapping; and performing a first data analytics operation on the first cluster.

3. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping template maps the first set of alphanumeric characters in the first sensitive data instance using the first coding by: indicating a first graphical element corresponding to a first alphanumeric character; and replacing the first alphanumeric character in the first set of alphanumeric characters with the first graphical element to generate the first encrypted mapping.

4. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping template maps the first set of alphanumeric characters in the first sensitive data instance using the first coding by: indicating a first dimensional characteristic corresponding to a first alphanumeric character; and replacing the first alphanumeric character in the first set of alphanumeric characters with the first dimensional characteristic to generate the first encrypted mapping.

5. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping template maps the first set of alphanumeric characters in the first sensitive data instance using the first coding by: indicating a first mapping characteristic corresponding to a first alphanumeric character; and replacing the first alphanumeric character in the first set of alphanumeric characters with the first mapping characteristic to generate the first encrypted mapping.

6. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping template maps the first set of alphanumeric characters in the first sensitive data instance using the first coding by: indicating a first position in a vector space corresponding to a first alphanumeric character; and replacing the first alphanumeric character in the first set of alphanumeric characters with the first position in the vector space to generate the first encrypted mapping.

7. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping template maps the first set of alphanumeric characters in the first sensitive data instance using the first coding by: indicating a first distance between positions in a vector space corresponding to a first alphanumeric character; and replacing the first alphanumeric character in the first set of alphanumeric characters with the first distance between positions in the vector space to generate the first encrypted mapping.

8. The method of claim 2, wherein retrieving the first encrypted mappings for the set of sensitive data instances further comprises: generating the first encrypted mapping at a first device in a network; and generating the second encrypted mapping at a second device in the network.

9. The method of claim 2, wherein performing the first data analytics operation on the first cluster further comprises: retrieving a first instance identifier for the first sensitive data instance; retrieving a second instance identifier for the second sensitive data instance; and labeling the first instance identifier and the second instance identifier as corresponding to a duplicate.

10. The method of claim 2, wherein performing the first data analytics operation on the first cluster further comprises: determining a privilege required to access the first sensitive data instance and the second sensitive data instance; determining a user with the privilege; and transmitting the first sensitive data instance and the second sensitive data instance to the user.

11. The method of claim 2, wherein retrieving the first encrypted mappings for the set of sensitive data instances further comprises: retrieving first text corresponding to the first sensitive data instance; and generating a modified first text by removing a special character from the first text, wherein the modified first text comprises the first set of alphanumeric characters.

12. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping template maps the first set of alphanumeric characters in the first sensitive data instance using the first coding by: indicating a first color corresponding to a first alphanumeric character; and replacing the first alphanumeric character in the first set of alphanumeric characters with the first color to generate the first encrypted mapping.

13. The method of claim 2, wherein the first set of alphanumeric characters in the first sensitive data instance is mapped using the first coding using a first mapping template, and wherein the first mapping templat
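The clustering and duplicate-labeling steps recited in the claims can be pictured with a small sketch. Everything below is an assumption made for illustration: the integer coding built by `make_coding`, the `difflib`-based similarity score, the greedy single-pass clustering, the 0.9 threshold, and the sample records are not taken from the patent, which does not specify these details in the text shown here. Special characters are dropped before encoding, loosely following the normalization described in claim 11.

```python
import random
import string
from difflib import SequenceMatcher

ALPHABET = string.ascii_lowercase + string.digits  # assumed alphabet


def make_coding(seed: int) -> dict[str, int]:
    """Hypothetical shared coding: a seeded permutation of integer codes."""
    codes = list(range(len(ALPHABET)))
    random.Random(seed).shuffle(codes)
    return dict(zip(ALPHABET, codes))


def encode(text: str, coding: dict[str, int]) -> list[int]:
    """Drop special characters and whitespace, then map each character to its code."""
    return [coding[ch] for ch in text.lower() if ch in coding]


def similarity(a: list[int], b: list[int]) -> float:
    """Similarity of two encoded mappings in [0, 1]; plaintext is never compared."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_instances(encoded: dict[str, list[int]], threshold: float = 0.9) -> list[list[str]]:
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    clusters: list[list[str]] = []
    for instance_id, mapping in encoded.items():
        for members in clusters:
            if similarity(mapping, encoded[members[0]]) >= threshold:
                members.append(instance_id)
                break
        else:
            clusters.append([instance_id])
    return clusters


def label_duplicates(clusters: list[list[str]]) -> dict[str, bool]:
    """Mark an instance identifier as a duplicate if its cluster has more than one member."""
    return {iid: len(members) > 1 for members in clusters for iid in members}


coding = make_coding(seed=42)
records = {  # made-up sample data
    "id-1": "Jane Doe, 123-45-6789",
    "id-2": "jane doe 123456789",
    "id-3": "John Roe, 987-65-4321",
}
encoded = {iid: encode(text, coding) for iid, text in records.items()}
print(label_duplicates(cluster_instances(encoded)))
```

In this sketch only the instance identifiers and the encoded mappings reach the clustering step, which is the property the claims emphasize. A production design would need a real similarity metric and a coding with actual security guarantees.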

Assignees

Inventors

Classifications

  • using extracted text · CPC title

  • using information manually generated, e.g. tags, keywords, comments, manually generated location and time information · CPC title

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • using colour · CPC title

  • Clustering; Classification · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12526404B2 cover?
Systems and methods for performing data analytics on sensitive data, such as deduplication, are disclosed herein. A first request to perform a first data analytics operation for a set of sensitive data instances may be received. A first set of image representations for the set of sensitive data instances may be retrieved. A first sensitive data instance and a second sensitive data instance may b…
Who is the assignee on this patent?
Citibank Na
What technology area does this patent fall under?
Primary CPC classification G06F21/6245. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 13, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).