Method and system for indexing, relating and managing information about entities
US-9600563-B2 · Mar 21, 2017 · US
US10698755B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10698755-B2 |
| Application number | US-201414290030-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 29, 2014 |
| Priority date | Sep 28, 2007 |
| Publication date | Jun 30, 2020 |
| Grant date | Jun 30, 2020 |
Official abstract text for this publication.
Embodiments disclosed herein provide a system and method for analyzing an identity hub. Particularly, a user can connect to the identity hub, load an initial set of data records, create and/or edit an identity hub configuration locally, analyze and/or validate the configuration via a set of analysis tools, including an entity analysis tool, a data analysis tool, a bucket analysis tool, and a linkage analysis tool, and remotely deploy the validated configuration to an identity hub instance. In some embodiments, through a graphical user interface, these analysis tools enable the user to analyze and modify the configuration of the identity hub in real time while the identity hub is operating to ensure data quality and enhance system performance.
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for analyzing a system for matching data records, the method comprising: producing a configuration of said system for matching data records, the configuration of the system including a bucketing strategy employing matching functions and matching parameters to create buckets containing data records, wherein said buckets are created by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said system, wherein each bucket is associated with a corresponding set of attributes, and wherein data records associated with a same entity are determined and linked by comparing one or more attributes of the initial data records to corresponding attributes of the candidate data records within the buckets in accordance with the matching functions and matching parameters; applying said configuration to said system and analyzing buckets created during operation of said system according to the bucketing strategy associated with said configuration of said system; analyzing an effect of said buckets on throughput of said system via a bucket analysis tool providing a user interface, wherein analyzing an effect of said buckets further comprises: executing one or more queries from the user interface of the bucket analysis tool to produce characteristics associated with the buckets created during operation of the system, wherein the characteristics include distribution of data within the created buckets and data records not placed in the created buckets; and identifying performance issues of the system from the characteristics of the created buckets produced from the one or more queries; and modifying said configuration during operation of said system to adjust distribution of the data records within said buckets in real time to address the identified performance issues and enable the throughput of said system to reside within a predetermined desired range, wherein modifying said configuration includes: changing said matching functions and matching parameters of said bucketing strategy for creating said buckets based on said identified performance issues to alter the comparing of said attributes and determination of the association of data records with the same entity for said buckets, wherein changing said matching functions and matching parameters includes providing a different combination of attributes for the corresponding set of attributes for at least one bucket.

2. The method of claim 1, wherein said changing said matching functions and matching parameters of said bucketing strategy further comprises editing an algorithm utilized in creating said buckets or changing one or more parameter values associated with said algorithm.

3. The method of claim 1, wherein said modifying said configuration further comprises: estimating performance of said system with said modified configuration under a real time load via the bucket analysis tool to ensure the throughput of said system resides within said predetermined desired range.

4. The method of claim 2, wherein said algorithm is associated with an entity type, and said method further comprises analyzing entities categorized as having said entity type in said system.

5. The method of claim 4, wherein said analyzing said entities further comprises one or more from a group of analyzing an entity size distribution, analyzing said entities by size, analyzing said entities by composition, analyzing a score distribution associated with said entities, and analyzing member comparisons associated with said entities.

6. The method of claim 1, further comprising analyzing validity of attributes of said initial data records.

7. The method of claim 1, wherein said analyzing said buckets further comprises one or more from a group of analyzing statistics associated with said buckets, analyzing a bucket size distribution, analyzing said buckets by size, analyzing said buckets by composition, analyzing a bulk cross match comparison distribution, analyzing members by bucket count, analyzing member bucket values, analyzing member bucket frequencies, and analyzing a member comparison distribution.

8. The method of claim 1, further comprising analyzing error rates associated with said initial data records, wherein said error rates comprise a record error rate and a person error rate.

9. The method of claim 1, wherein said configuration of said system comprises a clerical review threshold and an autolink threshold, and wherein said clerical review threshold and said autolink threshold are indicative of tolerance of said system to false positive and false negative rates in matching said initial data records, further comprising analyzing said clerical review threshold and said autolink threshold.

10. A system for analyzing an identity system for matching data records, the system comprising: at least one processor with logic to: produce a configuration of said identity system for matching data records, the configuration of the identity system including a bucketing strategy employing matching functions and matching parameters to create buckets containing data records, wherein said buckets are created by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said identity system, wherein each bucket is associated with a corresponding set of attributes, and wherein data records associated with a same entity are determined and linked by comparing one or more attributes of the initial data records to corresponding attributes of the candidate data records within the buckets in accordance with the matching functions and matching parameters; apply said configuration to said identity system and analyze buckets created during operation of said identity system according to the bucketing strategy associated with said configuration of said identity system; analyze an effect of said buckets on throughput of said identity system via a bucket analysis tool providing a user interface, wherein analyzing an effect of said buckets further comprises: executing one or more queries from the user interface of the bucket analysis tool to produce characteristics associated with the buckets created during operation of the identity system, wherein the characteristics include distribution of data within the created buckets and data records not placed in the created buckets; and identifying performance issues of the identity system from the characteristics of the created buckets produced from the one or more queries; and modify said configuration during operation of said identity system to adjust distribution of the data records within said buckets in real time to address the identified performance issues and enable the throughput of said identity system to reside within a predetermined desired range, wherein modifying said configuration includes: changing said matching functions and matching parameters of said bucketing strategy for creating said buckets based on said identified performance issues to alter the comparing of said attributes and determination of the association of data records with the same entity for said buckets, wherein changing said matching functions and matching parameters includes providing a different combination of attributes for the corresponding set of attributes for at least one bucket.

11. The system of claim 10, wherein said at least one processor further displays an algorithm editor through which an algorithm utilized in creating said buckets is edited.

12. The system of claim 10, wherein said bucketing strategy is associated with an entity type, and wherein said at least o
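
As an informal illustration of the bucketing analysis recited in claim 1, the following Python sketch groups records into buckets keyed by a chosen set of attributes, then reports the bucket size distribution, the records not placed in any bucket, and the number of pairwise comparisons the buckets imply. It is only a sketch under stated assumptions: the record fields, the `build_buckets` and `analyze` helpers, and the use of within-bucket pair counts as a stand-in for throughput cost are all hypothetical and are not taken from the patent text.

```python
from collections import defaultdict

# Hypothetical sample records; field names are illustrative only.
RECORDS = [
    {"id": 1, "last": "Smith", "zip": "78701", "dob": "1980-01-02"},
    {"id": 2, "last": "Smith", "zip": "78701", "dob": "1975-06-30"},
    {"id": 3, "last": "Smith", "zip": "78702", "dob": "1980-01-02"},
    {"id": 4, "last": "Jones", "zip": "78701", "dob": "1990-12-12"},
    {"id": 5, "last": "", "zip": "", "dob": ""},
]


def build_buckets(records, attribute_sets):
    """Place each record in one bucket per attribute set (assumed strategy).

    A record is skipped for an attribute set when any of its values is empty;
    records that land in no bucket at all are returned separately.
    """
    buckets = defaultdict(list)
    unbucketed = []
    for rec in records:
        placed = False
        for attrs in attribute_sets:
            values = tuple(rec[a].strip().lower() for a in attrs)
            if all(values):
                buckets[(attrs, values)].append(rec["id"])
                placed = True
        if not placed:
            unbucketed.append(rec["id"])
    return dict(buckets), unbucketed


def analyze(buckets):
    """Summarize bucket sizes and the pairwise comparisons they imply."""
    sizes = sorted((len(members) for members in buckets.values()), reverse=True)
    comparisons = sum(n * (n - 1) // 2 for n in sizes)
    return {"bucket_sizes": sizes, "pair_comparisons": comparisons}


if __name__ == "__main__":
    # Coarse strategy: bucket on last name only.
    coarse, missing = build_buckets(RECORDS, [("last",)])
    print(analyze(coarse), "unbucketed:", missing)

    # Modified strategy: a different attribute combination for the bucket.
    fine, missing = build_buckets(RECORDS, [("last", "zip")])
    print(analyze(fine), "unbucketed:", missing)
```

In this toy data set, switching the bucket's attribute combination from last name alone to last name plus ZIP code splits the largest bucket and lowers the number of candidate comparisons. That mirrors, in miniature, the claim's idea of changing the attribute combination to adjust how records are distributed across buckets.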
Approximate or statistical queries · CPC title
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
Readable error formats, e.g. cross-platform generic formats, human understandable formats · CPC title