Entity resolution of master data using qualified relationship score

US2022012219A1 · US · A1

Patent metadata
Field                Value
Publication number   US-2022012219-A1
Application number   US-202016927258-A
Country              US
Kind code            A1
Filing date          Jul 13, 2020
Priority date        Jul 13, 2020
Publication date     Jan 13, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A first score associated with matching between entity records of a plurality of entities of master data of an MDM system is received. A set of entity records with a first score above a lower threshold score and below an upper threshold score is identified as unresolved; neither confirmed as matched or unmatched. A second score associated with relationships between entity records is generated. Overall scores for pairs of the set of entity records are determined by combining the first matching score with the second relationship score. The overall score of respective pairs of the set of entities is compared to the upper threshold, and if the upper threshold is exceeded, then the information of the pair of entity records of the set of entity records are combined into a single record, and redundant entity records are removed from the MDM system.
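
The flow in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the abstract does not say how the two scores are combined, so plain addition is assumed here. The threshold values, the dict-based record layout, and the names `is_unresolved`, `merge_records`, and `resolve` are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Record = Dict[str, str]
Pair = Tuple[str, str]

# Hypothetical thresholds; the patent does not give numeric values.
LOWER, UPPER = 0.5, 0.9


def is_unresolved(first_score: float, lower: float = LOWER, upper: float = UPPER) -> bool:
    """A pair is unresolved when its matching score lies strictly between the thresholds."""
    return lower < first_score < upper


def merge_records(a: Record, b: Record) -> Record:
    """Combine two records into one, filling empty fields of `a` from `b` (assumed policy)."""
    merged = dict(a)
    for key, value in b.items():
        if not merged.get(key):
            merged[key] = value
    return merged


def resolve(
    records: Dict[str, Record],
    first_scores: Dict[Pair, float],
    relationship_score: Callable[[str, str], float],
    lower: float = LOWER,
    upper: float = UPPER,
) -> Dict[str, Record]:
    """Merge unresolved pairs whose overall score exceeds the upper threshold."""
    records = dict(records)  # work on a copy
    for (id_a, id_b), first in first_scores.items():
        if id_a not in records or id_b not in records:
            continue  # one side was already merged away
        if not is_unresolved(first, lower, upper):
            continue  # confirmed matches and non-matches are handled elsewhere
        # Assumption: "combining" the two scores means adding them.
        overall = first + relationship_score(id_a, id_b)
        if overall > upper:
            records[id_a] = merge_records(records[id_a], records[id_b])
            del records[id_b]  # remove the redundant record
    return records


records = {
    "r1": {"name": "Jon Smith", "phone": ""},
    "r2": {"name": "Jon Smith", "phone": "555-0100"},
    "r3": {"name": "Joan Smyth", "phone": "555-0199"},
}
first_scores = {("r1", "r2"): 0.7, ("r1", "r3"): 0.6}
result = resolve(
    records,
    first_scores,
    lambda a, b: 0.3 if {a, b} == {"r1", "r2"} else 0.0,
)
```

In this toy run, the pair `r1`/`r2` starts in the unresolved band, the relationship score lifts it above the upper threshold, and `r2` is folded into `r1`. The pair `r1`/`r3` receives no relationship boost and stays unresolved.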

First claim

Opening claim text (preview).

What is claimed is:

1. A method for resolving entity records of a master data management (MDM) system, the method comprising: receiving, by one or more processors, a first score associated with matching between respective entity records of a plurality of entities of master data of an MDM system; identifying, by the one or more processors, a set of unresolved entity records, wherein the first score between pairings of respective entity records of the set of unresolved entity records is above a lower threshold score and below an upper threshold score; generating, by the one or more processors, a second score associated with a relationship between the pairings of the respective entity records of the unresolved entity records, based on relationship data of the plurality of entities added to the master data of the MDM system; generating, by the one or more processors, an overall score by combining the first score and the second score for the pairings of the respective entity records of the set of unresolved entity records; determining, by the one or more processors, whether the overall score associated with the pairings of the respective entity records of the set of unresolved entity records exceeds the upper threshold; and responsive to the overall score of the pair of the respective entity records of the set of unresolved entity records exceeding the upper threshold, combining, by the one or more processors, information of the pairings of the respective entity records of the set of unresolved entity records into a single entity record.

2. The method of claim 1, wherein the first score is generated exclusive of relationship information, hierarchy information, and grouping information of the respective entity records of the plurality of entities of the master data of the MDM system.

3. The method of claim 1, wherein the second score is based on additional information of relationship, grouping, and hierarchy information associated with entities of the MDM system.

4. The method of claim 1, further comprising: performing, by the one or more processors, a matching assessment of a first entity of the plurality of entities of the MDM system with a second entity of the plurality of entities, for each entity of the plurality of entities; and generating, by the one or more processors, the first score associated with matching of the first entity of the plurality of entities with the second entity of the plurality of entities.

5. The method of claim 1, wherein the second score is based on qualified data of relationship information, hierarchy information, and grouping information associated with the pairings of the unresolved entity records, respectively, and includes weighting factors for a determination of a relationship of the pairings of the unresolved entity records, respectively, with a third entity, and weighting factors for a determination of no relationship.

6. The method of claim 1, further comprising: removing, by the one or more processors, redundant entity records from the master data of the MDM system in response to combining information of the pairings of the respective entity records of the set of unresolved entity records into the single entity record.

7. The method of claim 1, further comprising: creating, by the one or more processors, a machine learning model that generates the second score associated with the relationship between the pairings of the set of unresolved entity records, respectively, based on the relationship data of the plurality of entities; receiving, by the one or more processors, second scores and weighting factors corresponding to relationship types, hierarchy conditions and common grouping attributes of the set of unresolved entity records; training, by the one or more processors, the machine learning model by applying the second scores and the weighting factors corresponding to the relationship types, hierarchy conditions and common grouping attributes of the set of unresolved entity records as supervised learning; and applying, by the one or more processors, the machine learning model, trained by the second scores and the weighting factors of the set of unresolved entity records, to a new set of unresolved entity records.

8. A computer program product for resolving entity records of a master data management (MDM) system, the computer system comprising: one or more computer-readable storage media; program instructions stored on the one or more computer-readable storage media, the program instructions comprising: program instructions to receive a first score associated with matching between respective entity records of a plurality of entities of master data of an MDM system; program instructions to identify a set of unresolved entity records, wherein the first score between pairings of respective entity records of the set of unresolved entity records is above a lower threshold score and below an upper threshold score; program instructions to generate a second score associated with a relationship between the pairings of the respective entity records of the unresolved entity records, based on relationship data of the plurality of entities added to the master data of the MDM system; program instructions to generate an overall score by combining the first score and the second score for the pairings of entity records for the respective entity records of the set of unresolved entity records; program instructions to determine whether the overall score associated with the pairings of the respective entity records of the set of unresolved entity records exceeds the upper threshold; and responsive to the overall score of the pair of the respective entity records of the set of unresolved entity records exceeding the upper threshold, program instructions to combine information of the pairings of the respective entity records of the set of unresolved entity records into a single entity record.

9. The computer program product of claim 8, wherein the first score is generated exclusive of relationship information, hierarchy information, and grouping information of the respective entity records of the plurality of entities of the master data of the MDM system.

10. The computer program product of claim 8, wherein the second score is based on additional information of relationship, grouping, and hierarchy information associated with entities of the MDM system.

11. The computer program product of claim 8, further comprising: program instructions to perform a matching assessment of a first entity of the plurality of entities of the MDM system with a second entity of the plurality of entities, for each entity of the plurality of entities; and program instructions to generate the first score associated with matching of the first entity of the plurality of entities with the second entity of the plurality of entities.

12. The computer program product of claim 8, wherein the second score is based on relationship information, hierarchical information, and grouping information associated with the pairings of the unresolved entity records, respectively, and includes weighting factors for a determination of a relationship of the pairings of the unresolved entity records, respectively, with a third entity, and weighting factors for a determination of no relationship.

13. The computer program product of claim 8, further comprising: program instructions to remove redundant entity records from the master data of the MDM system in response to combining information of the pairings of the respective entity records of the set of unresolved entity records into the single entity record.
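
The weighted second score described in claims 5 and 12 can be illustrated with a short Python sketch. It is a hedged reading of the claim language, not the patented method: each evidence type (relationship, hierarchy, grouping) contributes one weight when both records of a pair link to a common third entity and another weight when they do not. The weight values, the `Links` layout, and the function name `relationship_score` are hypothetical.

```python
from typing import Dict, Set, Tuple

# Evidence kind -> (weight when a third entity is shared, weight when none is shared).
# All values are hypothetical; the claims do not specify them.
WEIGHTS: Dict[str, Tuple[float, float]] = {
    "relationship": (0.15, -0.05),
    "hierarchy": (0.10, 0.0),
    "grouping": (0.05, 0.0),
}

# Record id -> evidence kind -> ids of third entities the record is linked to.
Links = Dict[str, Dict[str, Set[str]]]


def relationship_score(
    id_a: str,
    id_b: str,
    links: Links,
    weights: Dict[str, Tuple[float, float]] = WEIGHTS,
) -> float:
    """Sum per-kind weights depending on whether the pair shares a third entity."""
    score = 0.0
    for kind, (w_shared, w_none) in weights.items():
        third_a = links.get(id_a, {}).get(kind, set())
        third_b = links.get(id_b, {}).get(kind, set())
        # A non-empty intersection means both records relate to the same third entity.
        score += w_shared if third_a & third_b else w_none
    return score


links: Links = {
    "r1": {"relationship": {"spouse:e9"}, "grouping": {"household:h1"}},
    "r2": {"relationship": {"spouse:e9"}, "grouping": {"household:h2"}},
}
score_r1_r2 = relationship_score("r1", "r2", links)  # shared spouse only
score_r1_r3 = relationship_score("r1", "r3", links)  # r3 has no links at all
```

Here `r1` and `r2` share a spouse link to the same third entity, so only the relationship weight for a shared entity applies. `r3` has no relationship data, so the pair `r1`/`r3` receives the "no relationship" weights instead.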

Assignees

IBM

Inventors

Classifications

  • Machine learning · CPC title

  • G06F16/215 · Primary

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • G06F16/288 · Primary

    Entity relationship models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022012219A1 cover?
A first score associated with matching between entity records of a plurality of entities of master data of an MDM system is received. A set of entity records with a first score above a lower threshold score and below an upper threshold score is identified as unresolved; neither confirmed as matched or unmatched. A second score associated with relationships between entity records is generated. O…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/215. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 13, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).