Dynamic threshold-based records linking

US12547663B2 · US · B2

Patent metadata
Publication number: US-12547663-B2
Application number: US-202217808740-A
Country: US
Kind code: B2
Filing date: Jun 24, 2022
Priority date: Jun 24, 2022
Publication date: Feb 10, 2026
Grant date: Feb 10, 2026

Abstract

Records linking is provided. Two records are selected from a plurality of records corresponding to a customer for pair-wise record comparison. It is determined whether the two records are included in different entities. A local auto-link-threshold value of the different entities is identified in response to determining that the two records are included in different entities. An attribute comparison is performed between the two records. A comparison score is generated based on the attribute comparison between the two records. It is determined whether the comparison score is greater than the local auto-link-threshold value of the different entities. The two records are linked in response to determining that the comparison score is greater than the local auto-link-threshold value of the different entities.
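
The decision described in the abstract can be illustrated with a minimal Python sketch. The `Record` class, the weight-based `comparison_score` function, and the sample weights are assumptions made for illustration; the abstract does not say how the attribute comparison is scored, only that a score is generated and tested against the local auto-link-threshold value.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical record: an identifier plus a flat attribute map."""
    record_id: str
    attributes: dict


def comparison_score(a: Record, b: Record, weights: dict) -> float:
    """Assumed scoring: add an attribute's weight when both records
    carry the same non-missing value for it."""
    score = 0.0
    for name, weight in weights.items():
        value_a = a.attributes.get(name)
        value_b = b.attributes.get(name)
        if value_a is not None and value_a == value_b:
            score += weight
    return score


def should_link(a: Record, b: Record, local_threshold: float, weights: dict) -> bool:
    """Link only when the score is strictly greater than the threshold,
    matching the 'greater than' wording of the abstract."""
    return comparison_score(a, b, weights) > local_threshold


# Illustrative data only; the weights and values are made up.
WEIGHTS = {"name": 5.0, "email": 8.0, "phone": 4.0}
r1 = Record("r1", {"name": "Ana Diaz", "email": "ana@example.com", "phone": "555-0100"})
r2 = Record("r2", {"name": "Ana Diaz", "email": "ana@example.com", "phone": "555-0199"})
```

With these sample values, name and email agree and phone does not, so the pair would link against a threshold of 12.0 but not against 13.0, because the test is strict.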

First claim

What is claimed is:

1. A computer-implemented method for records linking, the computer-implemented method comprising: accessing, by a computer in response to the computer receiving an input to perform a pair-wise record comparison of a plurality of records corresponding to a customer in a data management system from a plurality of data record sources, the plurality of records from the plurality of data record sources to generate a complete view of data records of the customer; selecting, by the computer, two records from the plurality of records from the plurality of data record sources corresponding to the customer in the data management system for the pair-wise record comparison; determining, by the computer, whether the two records are included in different entities such that a first record of the two records is included in a first entity and a second record of the two records is included in a second entity of the different entities; identifying, by the computer, a highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities in response to the computer determining that the two records are included in the different entities, wherein each respective entity in the data management system has its own entity-based local auto-link-threshold value; performing, by the computer, an attribute comparison between the two records; generating, by the computer, a comparison score based on the attribute comparison between the two records; determining, by the computer, whether the comparison score is greater than the highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities; linking, by the computer, the two records in response to the computer determining that the comparison score is greater than the highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities; and selecting, by the computer in response to the computer determining that there are more records in the plurality of records from the plurality of data record sources to be pair-wise compared, two more records from the plurality of records to continue the pair-wise record comparison of the plurality of records until the complete view of the data records of the customer is generated.

2. The computer-implemented method of claim 1, further comprising: preventing, by the computer, the linking of the two records in response to the computer determining that the comparison score is not greater than the highest entity-based local auto-link-threshold value between the first entity and the second entity of the different entities.

3. The computer-implemented method of claim 1 further comprising: coalescing, by the computer, the different entities to form an aggregate entity; and utilizing, by the computer, the highest entity-based local auto-link-threshold value between the first entity and the second entity as an entity-based local auto-link-threshold value for the aggregate entity.

4. The computer-implemented method of claim 3 further comprising: determining, by the computer, whether the entity-based local auto-link-threshold value of the aggregate entity is less than a defined maximum entity-based auto-link-threshold value; and increasing, by the computer in response to the computer determining that the entity-based local auto-link-threshold value of the aggregate entity is less than the defined maximum entity-based auto-link-threshold value, the entity-based local auto-link-threshold value of the aggregate entity whenever an entering record is added to the aggregate entity to only allow stronger matching records to be added to the aggregate entity based on subtracting a generated comparison score of the entering record from a self-score of a center record of the aggregate entity and multiplying that difference by a hyperparameter that controls a rate at which the entity-based local auto-link-threshold value of the aggregate entity increases.

5. The computer-implemented method of claim 4 further comprising: preventing, by the computer, the increasing of the entity-based local auto-link-threshold value of the aggregate entity in response to the computer determining that the entity-based local auto-link-threshold value of the aggregate entity is not less than the defined maximum entity-based auto-link-threshold value.

6. The computer-implemented method of claim 1 further comprising: determining, by the computer in response to the computer determining that the two records are not included in the different entities, whether one of the two records is included in a given entity in the data management system; identifying, by the computer in response to the computer determining that the one of the two records is included in the given entity, a particular entity-based local auto-link-threshold value of the given entity; determining, by the computer, whether the comparison score is greater than the particular entity-based local auto-link-threshold value of the given entity; linking, by the computer, the two records in response to the computer determining that the comparison score is greater than the particular entity-based local auto-link-threshold value of the given entity; and terminating, by the computer in response to determining that there are no more records in the plurality of records from the plurality of data record sources to be pair-wise compared, the pair-wise record comparison of the plurality of records from the plurality of data record sources corresponding to the customer in the data management system when the complete view of the data records of the customer is generated.

7. The computer-implemented method of claim 6 further comprising: identifying, by the computer, a particular record of the two records that was not included in the given entity; and adding, by the computer, the particular record to the given entity.

8. The computer-implemented method of claim 6 further comprising: utilizing, by the computer, a default entity-based auto-link-threshold value set by the customer in response to the computer determining that neither of the two records is included in any existing entity in the data management system; determining, by the computer, whether the comparison score is greater than the default entity-based auto-link-threshold value; and linking, by the computer, the two records to form a new entity in response to the computer determining that the comparison score is greater than the default entity-based auto-link-threshold value.

9. The computer-implemented method of claim 8 further comprising: utilizing, by the computer, the default entity-based auto-link-threshold value as an initial entity-based auto-link-threshold value for the new entity formed by linking the two records in response to the computer determining that the comparison score is greater than the default entity-based auto-link-threshold value.

10. A computer system for records linking, the computer system comprising: a bus system; a storage device connected to the bus system, wherein the storage device stores program instructions; and a processor connected to the bus system, wherein the processor executes the program instructions to: in response to receiving an input to perform a pair-wise record comparison of a plurality of records corresponding to a customer in a data management system from a plurality of data record sources, access the plurality of records from the plurality of data record sources to generate a complete view of data records of the customer; select two records from the plurality of records from the plurality of data recor
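
As a rough illustration of how claims 1 through 9 choose and adjust the threshold, the following Python sketch covers the three threshold-selection cases, the coalescing of two entities, and the threshold increase of claim 4. The `Entity` class and all function and parameter names are hypothetical. The claim text states that the increase is based on the difference between the center record's self-score and the entering record's score, multiplied by a rate hyperparameter; adding that product to the current threshold, flooring it at zero, and capping the result at the defined maximum are assumptions of this sketch, not statements from the claims.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Hypothetical entity: linked record ids plus its own local threshold."""
    record_ids: set
    threshold: float


def select_threshold(entity_a: Optional[Entity],
                     entity_b: Optional[Entity],
                     default_threshold: float) -> float:
    """Choose the threshold a comparison score must exceed.

    entity_a / entity_b are the entities that currently hold each record,
    or None when a record is not yet in any entity.
    """
    # Claim 1: records in two different entities use the higher threshold.
    if entity_a is not None and entity_b is not None and entity_a is not entity_b:
        return max(entity_a.threshold, entity_b.threshold)
    # Claim 6: a record in a given entity uses that entity's threshold.
    existing = entity_a if entity_a is not None else entity_b
    if existing is not None:
        return existing.threshold
    # Claim 8: neither record is in an entity, so the customer default applies.
    return default_threshold


def coalesce(entity_a: Entity, entity_b: Entity) -> Entity:
    """Claim 3: the aggregate entity inherits the higher of the two thresholds."""
    return Entity(entity_a.record_ids | entity_b.record_ids,
                  max(entity_a.threshold, entity_b.threshold))


def raised_threshold(current: float,
                     center_self_score: float,
                     entering_score: float,
                     rate: float,
                     max_threshold: float) -> float:
    """Claims 4 and 5: raise the threshold when a record enters the entity."""
    # Claim 5: no increase once the threshold is not less than the maximum.
    if current >= max_threshold:
        return current
    # Claim 4: (self-score of center record - entering score) * rate.
    increase = rate * (center_self_score - entering_score)
    # Assumptions: the product is added, never lowers the threshold,
    # and the result is capped at the defined maximum.
    return min(max_threshold, current + max(0.0, increase))


# Illustrative values only.
e1 = Entity({"r1"}, 10.0)
e2 = Entity({"r2"}, 14.0)
merged = coalesce(e1, e2)
```

Under these assumptions, a weaker entering match (a score further below the center record's self-score) raises the threshold more, which is one way to read the claim's stated aim of admitting only stronger matching records as the entity grows.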

Assignees

IBM

Classifications

  • G06F16/906 (Primary)

    Clustering; Classification

  • G06F16/215 (Primary)

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Frequently asked questions

What does patent US12547663B2 cover?
Records linking is provided. Two records are selected from a plurality of records corresponding to a customer for pair-wise record comparison. It is determined whether the two records are included in different entities. A local auto-link-threshold value of the different entities is identified in response to determining that the two records are included in different entities. An attribute compar…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/906. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 10, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.