Using self-information scores for entities to determine whether to perform entity resolution
US-2018365338-A1 · Dec 20, 2018 · US
US12547663B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12547663-B2 |
| Application number | US-202217808740-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 24, 2022 |
| Priority date | Jun 24, 2022 |
| Publication date | Feb 10, 2026 |
| Grant date | Feb 10, 2026 |
Official abstract text for this publication.
Records linking is provided. Two records are selected from a plurality of records corresponding to a customer for pair-wise record comparison. It is determined whether the two records are included in different entities. A local auto-link-threshold value of the different entities is identified in response to determining that the two records are included in different entities. An attribute comparison is performed between the two records. A comparison score is generated based on the attribute comparison between the two records. It is determined whether the comparison score is greater than the local auto-link-threshold value of the different entities. The two records are linked in response to determining that the comparison score is greater than the local auto-link-threshold value of the different entities.
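The decision flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Entity` class, the function names, and the weighted exact-match scoring in `compare_attributes` are all hypothetical placeholders, since the text here does not specify how the comparison score is computed. The threshold choice follows the claims: the highest local threshold when the records sit in different entities, the single entity's threshold when only one record belongs to an entity, and a customer-set default when neither does.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """Hypothetical entity: a group of linked records with its own local threshold."""
    threshold: float
    record_ids: set = field(default_factory=set)


def select_threshold(entity_a: Optional[Entity],
                     entity_b: Optional[Entity],
                     default_threshold: float) -> float:
    """Pick the auto-link threshold that applies to one record pair."""
    # Collect the local thresholds of whichever entities the records belong to.
    thresholds = [e.threshold for e in (entity_a, entity_b) if e is not None]
    # Two entities: the highest wins. One entity: its own value. None: the default.
    return max(thresholds) if thresholds else default_threshold


def compare_attributes(rec_a: dict, rec_b: dict, weights: dict) -> float:
    """Placeholder scoring (an assumption): sum the weights of attributes that match exactly."""
    return sum(
        weight
        for attr, weight in weights.items()
        if rec_a.get(attr) is not None and rec_a.get(attr) == rec_b.get(attr)
    )


def should_link(score: float, threshold: float) -> bool:
    """Link only when the score is strictly greater than the threshold."""
    return score > threshold
```

Note the strict inequality: a score equal to the threshold does not link, matching the "greater than" wording of the abstract and claims.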
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for records linking, the computer-implemented method comprising: accessing, by a computer in response to the computer receiving an input to perform a pair-wise record comparison of a plurality of records corresponding to a customer in a data management system from a plurality of data record sources, the plurality of records from the plurality of data record sources to generate a complete view of data records of the customer; selecting, by the computer, two records from the plurality of records from the plurality of data record sources corresponding to the customer in the data management system for the pair-wise record comparison; determining, by the computer, whether the two records are included in different entities such that a first record of the two records is included in a first entity and a second record of the two records is included in a second entity of the different entities; identifying, by the computer, a highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities in response to the computer determining that the two records are included in the different entities, wherein each respective entity in the data management system has its own entity-based local auto-link-threshold value; performing, by the computer, an attribute comparison between the two records; generating, by the computer, a comparison score based on the attribute comparison between the two records; determining, by the computer, whether the comparison score is greater than the highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities; linking, by the computer, the two records in response to the computer determining that the comparison score is greater than the highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities; and selecting, by the computer in response to the computer determining that there are more records in the plurality of records from the plurality of data record sources to be pair-wise compared, two more records from the plurality of records to continue the pair-wise record comparison of the plurality of records until the complete view of the data records of the customer is generated.
2. The computer-implemented method of claim 1, further comprising: preventing, by the computer, the linking of the two records in response to the computer determining that the comparison score is not greater than the highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities.
3. The computer-implemented method of claim 1 further comprising: coalescing, by the computer, the different entities to form an aggregate entity; and utilizing, by the computer, the highest entity-based local auto-link-threshold value between the first entity and the second entity as an entity-based local auto-link-threshold value for the aggregate entity.
4. The computer-implemented method of claim 3 further comprising: determining, by the computer, whether the entity-based local auto-link-threshold value of the aggregate entity is less than a defined maximum entity-based auto-link-threshold value; and increasing, by the computer in response to the computer determining that the entity-based local auto-link-threshold value of the aggregate entity is less than the defined maximum entity-based auto-link-threshold value, the entity-based local auto-link-threshold value of the aggregate entity whenever an entering record is added to the aggregate entity to only allow stronger matching records to be added to the aggregate entity based on subtracting a generated comparison score of the entering record from a self-score of a center record of the aggregate entity and multiplying that difference by a hyperparameter that controls a rate at which the entity-based local auto-link-threshold value of the aggregate entity increases.
5. The computer-implemented method of claim 4 further comprising: preventing, by the computer, the increasing of the entity-based local auto-link-threshold value of the aggregate entity in response to the computer determining that the entity-based local auto-link-threshold value of the aggregate entity is not less than the defined maximum entity-based auto-link-threshold value.
6. The computer-implemented method of claim 1 further comprising: determining, by the computer in response to the computer determining that the two records are not included in the different entities, whether one of the two records is included in a given entity in the data management system; identifying, by the computer in response to the computer determining that the one of the two records is included in the given entity, a particular entity-based local auto-link-threshold value of the given entity; determining, by the computer, whether the comparison score is greater than the particular entity-based local auto-link-threshold value of the given entity; linking, by the computer, the two records in response to the computer determining that the comparison score is greater than the particular entity-based local auto-link-threshold value of the given entity; and terminating, by the computer in response to determining that there are no more records in the plurality of records from the plurality of data record sources to be pair-wise compared, the pair-wise record comparison of the plurality of records from the plurality of data record sources corresponding to the customer in the data management system when the complete view of the data records of the customer is generated.
7. The computer-implemented method of claim 6 further comprising: identifying, by the computer, a particular record of the two records that was not included in the given entity; and adding, by the computer, the particular record to the given entity.
8. The computer-implemented method of claim 6 further comprising: utilizing, by the computer, a default entity-based auto-link-threshold value set by the customer in response to the computer determining that neither of the two records is included in any existing entity in the data management system; determining, by the computer, whether the comparison score is greater than the default entity-based auto-link-threshold value; and linking, by the computer, the two records to form a new entity in response to the computer determining that the comparison score is greater than the default entity-based auto-link-threshold value.
9. The computer-implemented method of claim 8 further comprising: utilizing, by the computer, the default entity-based auto-link-threshold value as an initial entity-based auto-link-threshold value for the new entity formed by linking the two records in response to the computer determining that the comparison score is greater than the default entity-based auto-link-threshold value.
10. A computer system for records linking, the computer system comprising: a bus system; a storage device connected to the bus system, wherein the storage device stores program instructions; and a processor connected to the bus system, wherein the processor executes the program instructions to: in response to receiving an input to perform a pair-wise record comparison of a plurality of records corresponding to a customer in a data management system from a plurality of data record sources, access the plurality of records from the plurality of data record sources to generate a complete view of data records of the customer; select two records from the plurality of records from the plurality of data recor
Clustering; Classification · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title