Machine learning system for automated attribute name mapping between source data models and destination data models

US11556508B1 · US · B1

Patent metadata
Publication number: US-11556508-B1
Application number: US-202016895163-A
Country: US
Kind code: B1
Filing date: Jun 8, 2020
Priority date: Jun 8, 2020
Publication date: Jan 17, 2023
Grant date: Jan 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of mapping attribute names of a source data model to a destination data model includes obtaining multiple source attribute names from the source data model, and obtaining multiple destination attribute names from the destination data model. The destination data model includes multiple attributes that correspond to attributes in the source data model having different attribute names. The method includes processing the obtained source attribute names and the obtained destination attribute names to standardize the attribute names according to specified character formatting, supplying the standardized attribute names to a machine learning network model to predict a mapping of each source attribute name to a corresponding one of the destination attribute names, and outputting, according to mapping results of the machine learning network model, an attribute mapping table indicating the predicted destination attribute name corresponding to each source attribute name.
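
As a rough illustration of the standardization and mapping-table steps described in the abstract, the sketch below normalizes attribute names and pairs each source name with its closest destination name. The function names `standardize` and `build_mapping_table` and the sample attribute names are hypothetical. `difflib.SequenceMatcher` is a simple string-similarity stand-in for the machine learning network model, which it does not attempt to reproduce.

```python
import re
from difflib import SequenceMatcher


def standardize(name: str) -> str:
    """Lowercase a name and drop spaces, underscores, and special characters.

    Lowercasing is used here as one way to neutralize camel case; the exact
    formatting rules are an assumption, not taken from the patent.
    """
    return re.sub(r"[^a-z0-9]", "", name.lower())


def build_mapping_table(source_names, destination_names):
    """Return {source name: (best destination name, similarity score)}."""
    table = {}
    for src in source_names:
        src_std = standardize(src)
        # Stand-in scorer: character-level similarity, NOT an LSTM/NLP model.
        best = max(
            destination_names,
            key=lambda dst: SequenceMatcher(None, src_std, standardize(dst)).ratio(),
        )
        score = SequenceMatcher(None, src_std, standardize(best)).ratio()
        table[src] = (best, round(score, 2))
    return table


if __name__ == "__main__":
    source = ["memberFirstName", "DOB", "zip_code"]
    destination = ["member_first_nm", "date_of_birth", "Zip Code"]
    for src, (dst, score) in build_mapping_table(source, destination).items():
        print(f"{src} -> {dst} ({score})")
```

The score returned alongside each match plays the role of a match percentage; a real system would presumably derive it from the model's own confidence rather than from character overlap.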

First claim

Opening claim text (preview).

What is claimed is:

1. A computerized method of mapping attribute names of a source data model to a destination data model, the method comprising: obtaining multiple source attribute names from the source data model; obtaining multiple destination attribute names from the destination data model, wherein the destination data model includes multiple attributes that correspond to attributes in the source data model having different attribute names; processing the obtained source attribute names and the obtained destination attribute names to standardize the attribute names according to specified character formatting; supplying the standardized attribute names to a machine learning network model to predict a mapping of each source attribute name to a corresponding one of the destination attribute names; outputting, according to mapping results of the machine learning network model, an attribute mapping table indicating the predicted destination attribute name corresponding to each source attribute name; and in response to determining that the machine learning network model is unable to predict a mapping for one of the source attribute names: obtaining a matching input from a user identifying one of the destination attribute names that corresponds to an unmapped source attribute name; and updating a custom dictionary to include the matching input obtained from the user, wherein supplying the standardized attribute names to the machine learning network model includes: determining whether any of the source attribute names have an existing entry in the custom dictionary; for each source attribute name that has an existing entry in the custom dictionary, recording an output mapping for the source attribute name according to the existing entry in the custom dictionary; and for each source attribute name that does not have an existing entry in the custom dictionary, using the machine learning model to predict the mapping of each source attribute name to the corresponding one of the destination attribute names.

2. The method of claim 1, further comprising repeating multiple cycles of processing each source attribute name by predicting a corresponding destination attribute name using the machine learning network model or identifying a matching existing entry in the custom dictionary, until the attribute mapping table includes a mapped destination attribute name for a threshold number of source attribute names.

3. The method of claim 2, wherein the threshold number of source attribute names includes one hundred percent of the source attribute names.

4. The method of claim 1, wherein supplying the standardized attribute names to a machine learning network model includes: selecting a first one of the source attribute names and using the machine learning network model to identify one of the destination attribute names having a highest likelihood of correspondence with the selected first source attribute name; recording a mapping of the selected first source attribute name to the identified destination attribute name; and iteratively proceeding through remaining source attribute names to identify and record a corresponding destination attribute name for each source attribute name and returning a match percentage based on a confidence interval.

5. The method of claim 1, wherein processing the obtained source attribute names and the obtained destination attribute names to standardize the attribute names includes at least one of: removing camel cases from the source attribute names and the destination attribute names; removing spaces from the source attribute names and the destination attribute names; removing underscores from the source attribute names and the destination attribute names; and removing special characters from the source attribute names and the destination attribute names.

6. The method of claim 1, wherein the machine learning network model includes a natural language processing (NLP)-based algorithm.

7. The method of claim 1, wherein the machine learning network model includes a long short-term memory (LSTM) network.

8. The method of claim 1, wherein the source data model and the destination data model belong to different database schemas within a distributed data storage system.

9. The method of claim 8, wherein the source data model belongs to one of a data lake bounded context and a de-normalized data zone bounded context of a health care publisher platform, and the destination model belongs to one of the de-normalized data zone bounded context and a subscribing client entity bounded context of the health care publisher platform.

10. A computer system comprising: memory configured to store a source data model, a destination data model, a machine learning network model, and computer-executable instructions and at least one processor configured to execute the instructions, wherein the instructions include: obtaining multiple source attribute names from the source data model; obtaining multiple destination attribute names from the destination data model, wherein the destination data model includes multiple attributes that correspond to attributes in the source data model having different attribute names; processing the obtained source attribute names and the obtained destination attribute names to standardize the attribute names according to specified character formatting; supplying the standardized attribute names to a machine learning network model to predict a mapping of each source attribute name to a corresponding one of the destination attribute names; outputting, according to mapping results of the machine learning network model, an attribute mapping table indicating the predicted destination attribute name corresponding to each source attribute name; and in response to determining that the machine learning network model is unable to predict a mapping for one of the source attribute names: receiving a matching input from a user identifying one of the destination attribute names that corresponds to an unmapped source attribute name; and updating a custom dictionary to include the matching input received from the user, wherein supplying the standardized attribute names to the machine learning network model includes: determining whether any of the source attribute names have an existing entry in the custom dictionary; for each source attribute name that has an existing entry in the custom dictionary, recording an output mapping for the source attribute name according to the existing entry in the custom dictionary; and for each source attribute name that does not have an existing entry in the custom dictionary, using the machine learning network model to predict the mapping of the source attribute name to the corresponding one of the destination attribute names.

11. The computer system of claim 10, wherein the instructions include repeating multiple cycles of processing each source attribute name by predicting a corresponding destination attribute name using the machine learning network model or identifying a matching existing entry in the custom dictionary, until the attribute mapping table includes a mapped destination attribute name for a threshold number of source attribute names.

12. The computer system of claim 11, wherein the threshold number of source attribute names includes one hundred percent of the source attribute names.

13. The computer system of claim 10, wherein supplying the standardized attribute names to a machine learning network model includes: selecting a first one of the source attribute names and using the machine learning network model to identify one of the destination attribute
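
The dictionary-first flow recited in claims 1 to 3 can be sketched as follows. This is a minimal illustration under stated assumptions: `predict` stands in for the machine learning network model and returns `None` when it cannot map a name, and `ask_user` stands in for the user's matching input. Those two names, along with `map_attributes`, `threshold`, and `max_cycles`, are hypothetical and not taken from the patent.

```python
from typing import Callable, Dict, List, Optional


def map_attributes(
    source_names: List[str],
    custom_dictionary: Dict[str, str],
    predict: Callable[[str], Optional[str]],
    ask_user: Callable[[str], Optional[str]],
    threshold: float = 1.0,
    max_cycles: int = 5,
) -> Dict[str, str]:
    """Map source names, checking the custom dictionary before the model."""
    mapping: Dict[str, str] = {}
    for _ in range(max_cycles):
        for src in source_names:
            if src in mapping:
                continue
            if src in custom_dictionary:
                # Existing dictionary entry wins; the model is not consulted.
                mapping[src] = custom_dictionary[src]
                continue
            predicted = predict(src)
            if predicted is not None:
                mapping[src] = predicted
                continue
            # Model could not map this name: fall back to user input and
            # store the answer so later cycles and runs can reuse it.
            answer = ask_user(src)
            if answer is not None:
                custom_dictionary[src] = answer
        # Stop once the mapped fraction reaches the threshold (1.0 = 100%).
        if len(mapping) >= threshold * len(source_names):
            break
    return mapping


if __name__ == "__main__":
    known = {"dob": "date_of_birth"}
    toy_model = {"memberfirstname": "member_first_nm"}  # toy lookup, not a real model
    result = map_attributes(
        ["dob", "memberfirstname", "zipcd"],
        known,
        predict=toy_model.get,
        ask_user=lambda src: "zip_code",  # pretend the user always answers
    )
    print(result)
```

In this sketch a user-supplied match is written to the dictionary in one cycle and picked up as a dictionary hit in the next, which is one plausible reading of why the claims describe repeated cycles. `max_cycles` is an added safeguard so the loop ends even if the threshold is never reached.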

Assignees

Cigna Intellectual Property Inc

Inventors

Classifications

  • for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

  • Machine learning

  • Architecture, e.g. interconnection topology

  • Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

  • G06F40/242 (Primary): Dictionaries

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11556508B1 cover?
A computer-implemented method of mapping attribute names of a source data model to a destination data model includes obtaining multiple source attribute names from the source data model, and obtaining multiple destination attribute names from the destination data model. The destination data model includes multiple attributes that correspond to attributes in the source data model having differen…
Who is the assignee on this patent?
Cigna Intellectual Property Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/242. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 17, 2023 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).