Intelligent data enrichment using knowledge graph

US2022237185A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2022237185-A1
Application number: US-202117160153-A
Country: US
Kind code: A1
Filing date: Jan 27, 2021
Priority date: Jan 27, 2021
Publication date: Jul 28, 2022
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method can receive a source table containing data tuples and a source schema defining attributes of the data tuples, and match the source schema to an ontology of a knowledge graph. The knowledge graph can include a plurality of instances and the ontology defines properties of the plurality of instances. The computer-implemented method can link the data tuples to respective instances in the knowledge graph, and identifying non-matching properties of the respective instances, wherein the non-matching properties are defined in the ontology and not matched to the source schema. The computer-implemented method can obtain property values associated with the non-matching properties from the knowledge graph, and add one or more of the non-matching properties and the associated property values to respective data tuples of the source table.
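The steps in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the toy knowledge graph `KG`, the `ONTOLOGY` set, the function names, the case-insensitive name comparison used for schema matching, and the exact-value comparison used for instance linking. A real system would likely use far more robust matching and linking techniques.

```python
# Hypothetical toy knowledge graph: instance id -> {property: value}.
KG = {
    "kg:acme": {"name": "Acme", "country": "US", "founded": 1990, "employees": 120},
    "kg:globex": {"name": "Globex", "country": "DE", "founded": 2005},
}
# Hypothetical ontology: the set of properties defined for instances.
ONTOLOGY = {"name", "country", "founded", "employees", "industry"}


def match_schema(source_schema, ontology):
    """Map source attributes to ontology properties (naive: case-insensitive names)."""
    lowered = {prop.lower(): prop for prop in ontology}
    return {attr: lowered[attr.lower()] for attr in source_schema if attr.lower() in lowered}


def link_instance(row, mapping, kg):
    """Return the id of the first instance agreeing with the row on all matched attributes."""
    if not mapping:
        return None
    for instance_id, props in kg.items():
        if all(props.get(prop) == row[attr] for attr, prop in mapping.items()):
            return instance_id
    return None


def enrich(rows, source_schema, ontology, kg):
    """Add non-matching properties and their values to each data tuple."""
    mapping = match_schema(source_schema, ontology)
    matched = set(mapping.values())
    links = [link_instance(row, mapping, kg) for row in rows]

    # Non-matching properties: defined in the ontology, present on a linked
    # instance, and not matched to any source attribute.
    non_matching = set()
    for instance_id in links:
        for prop in kg.get(instance_id, {}):
            if prop in ontology and prop not in matched:
                non_matching.add(prop)
    ordered = sorted(non_matching)

    enriched = []
    for row, instance_id in zip(rows, links):
        props = kg.get(instance_id, {})
        new_row = dict(row)
        for prop in ordered:
            # None stands in for a null when there is no value or no link.
            new_row[prop] = props.get(prop)
        enriched.append(new_row)
    return ordered, enriched


rows = [
    {"Name": "Acme", "Country": "US"},
    {"Name": "Globex", "Country": "DE"},
    {"Name": "Initech", "Country": "US"},
]
new_props, enriched_rows = enrich(rows, ["Name", "Country"], ONTOLOGY, KG)
```

In this sketch the third row has no linked instance, so every added column is `None` for it, which mirrors the null handling described in the dependent claims.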

First claim

Opening claim text (preview).

1. A computer-implemented method, implemented by one or more computing devices comprising at least one hardware processor and one or more tangible memories coupled to the at least one hardware processor, the method comprising: receiving a source table containing data tuples and a source schema defining attributes of the data tuples; matching the source schema to an ontology of a knowledge graph, wherein the knowledge graph comprises a plurality of instances and the ontology defines properties of the plurality of instances; linking the data tuples to respective instances in the knowledge graph; identifying non-matching properties of the respective instances, wherein the non-matching properties are defined in the ontology and not matched to the source schema; obtaining property values associated with the non-matching properties from the knowledge graph; and adding one or more of the non-matching properties and the associated property values to respective data tuples of the source table.

2. The method of claim 1, further comprising creating a ranked list of non-matching properties and the associated property values, wherein the creating comprises ranking the non-matching properties of the respective instances.

3. The method of claim 2, further comprising presenting the ranked list of non-matching properties and the associated property values on a user interface, and receiving an input from the user interface indicating which non-matching properties and the associated properties values are added to respective data tuples of the source table.

4. The method of claim 2, wherein ranking the non-matching properties of the respective instances comprises counting instances in the knowledge graph having property values associated with the non-matching properties.

5. The method of claim 1, further comprising logging data linkage information, wherein the data linkage information records at least which non-matching properties have been added to respective data tuples of the source table.

6. The method of claim 5, further comprising ranking the non-matching properties of the respective instances based on previously logged data linkage information.

7. The method of claim 1, further comprising assigning a null value to a data tuple of the source table if a corresponding non-matching property of the respective instance linked to the data tuple has no associated property value or if the data tuple is not linked to a respective instance in the knowledge graph.

8. The method of claim 7, further comprising ranking the non-matching properties of the respective instances based on minimizing a total number of null values assigned to the data tuple of the source table.

9. The method of claim 1, further comprising mapping the data tuples to respective property tuple lists, wherein a property tuple list corresponds to a respective instance linked to the data tuple and comprises one or more pairs of non-matching property and property value associated with the non-matching property.

10. The method of claim 1, wherein the knowledge graph is one of a plurality of knowledge graphs, wherein the method further comprises identifying non-matching properties and obtaining property values associated with the non-matching properties from the plurality of knowledge graphs.

11. A system comprising: one or more processors; and memory coupled to the one or more processors comprising instructions causing the one or more processors implementing a table enrichment engine, wherein the table enrichment engine comprises: a user interface configured to receive a source table containing data tuples and a source schema defining attributes of the data tuples; a schema matching operator configured to match the source schema to an ontology of a knowledge graph, wherein the knowledge graph comprises a plurality of instances and the ontology defines properties of the plurality of instances; an instance linking operator configured to link the data tuples to respective instances in the knowledge graph; a new property finder configured to identify non-matching properties of the respective instances, wherein the non-matching properties are defined in the ontology and not matched to the source schema; a property value retriever configured to obtain property values associated with the non-matching properties from the knowledge graph; and an inserter configured to add one or more of the non-matching properties and the associated property values to respective data tuples of the source table.

12. The system of claim 11, wherein the table enrichment engine further comprises a ranking operator configured to rank the non-matching properties of the respective instances so as to create a ranked list of non-matching properties and the associated property values.

13. The system of claim 12, wherein the user interface is configured to present the ranked list of non-matching properties and the associated property values to a user, and receive an input from the user indicating which non-matching properties and the associated property values are added to respective data tuples of the source table.

14. The system of claim 12, wherein the ranking operator is configured to rank the non-matching properties of the respective instances based on counting instances in the knowledge graph having property values associated with the non-matching properties.

15. The system of claim 12, wherein the table enrichment engine further comprises a data logger configured to log data linkage information, wherein the data linkage information records at least which non-matching properties have been added to respective data tuples of the source table.

16. The system of claim 15, wherein the ranking operator is configured to rank the non-matching properties of the respective instances based on previously logged data linkage information.

17. The system of claim 12, wherein the inserter is configured to assign a null value to a data tuple of the source table if a corresponding non-matching property of the respective instance linked to the data tuple has no associated property value or if the data tuple is not linked to a respective instance in the knowledge graph.

18. The system of claim 17, wherein the ranking operator is configured to rank the non-matching properties of the respective instances based on minimizing a total number of null values assigned to the data tuple of the source table.

19. The system of claim 11, wherein the knowledge graph is one of a plurality of knowledge graphs, wherein the table enrichment engine is configured to identify non-matching properties and obtain property values associated with the non-matching properties from the plurality of knowledge graphs.

20. One or more non-transitory computer-readable media having encoded thereon computer-executable instructions causing one or more processors to perform a method comprising: receiving a source table containing data tuples and a source schema defining attributes of the data tuples; matching the source schema to an ontology of a knowledge graph, wherein the knowledge graph comprises a plurality of instances and the ontology defines properties of the plurality of instances; linking the data tuples to respective instances in the knowledge graph; identifying non-matching properties of the respective instances, wherein the non-matching properties are defined in the ontology and not matched to the source schema; obtaining property values associated with the non-matching propertie
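The dependent claims name three ranking signals for non-matching properties: counting knowledge graph instances that have a value (claims 4 and 14), previously logged linkage information (claims 6 and 16), and minimizing null values (claims 8 and 18). The sketch below shows one plausible reading of each. The function names, the data layout, the flat list used as a linkage log, and the alphabetical tie-breaking are all assumptions made for illustration; the claims do not prescribe them.

```python
from collections import Counter


def rank_by_kg_coverage(candidates, kg):
    """More knowledge graph instances holding a value for a property ranks it higher."""
    counts = {prop: 0 for prop in candidates}
    for props in kg.values():
        for prop in candidates:
            if props.get(prop) is not None:
                counts[prop] += 1
    return sorted(candidates, key=lambda prop: (-counts[prop], prop))


def rank_by_fewest_nulls(candidates, linked_instances, kg):
    """Fewer nulls introduced into the source table ranks a property higher.

    linked_instances holds one instance id per data tuple, or None when the
    tuple could not be linked (which always yields a null).
    """
    def nulls(prop):
        return sum(1 for iid in linked_instances if kg.get(iid, {}).get(prop) is None)

    return sorted(candidates, key=lambda prop: (nulls(prop), prop))


def rank_by_log(candidates, linkage_log):
    """Properties added more often in earlier enrichments rank higher."""
    usage = Counter(linkage_log)
    return sorted(candidates, key=lambda prop: (-usage[prop], prop))


# Hypothetical data, invented for this example.
kg = {
    "a": {"founded": 1990, "employees": 120},
    "b": {"founded": 2005},
    "c": {"employees": 40, "industry": "retail"},
    "d": {"employees": 7},
}
candidates = ["founded", "employees", "industry"]

by_coverage = rank_by_kg_coverage(candidates, kg)
by_nulls = rank_by_fewest_nulls(candidates, ["a", "b", None], kg)
by_log = rank_by_log(candidates, ["industry", "industry", "founded"])
```

The three signals can disagree, as they do on this toy data, so a ranking operator would presumably need to pick one or combine them; how that is done is not stated in the claim text shown here.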

Assignees

SAP SE

Inventors

Classifications

  • Query processing support for facilitating data mining operations in structured databases · CPC title

  • G06F16/213 · Primary

    with details for schema evolution support · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022237185A1 cover?
A computer-implemented method can receive a source table containing data tuples and a source schema defining attributes of the data tuples, and match the source schema to an ontology of a knowledge graph. The knowledge graph can include a plurality of instances and the ontology defines properties of the plurality of instances. The computer-implemented method can link the data tuples to respecti…
Who is the assignee on this patent?
SAP SE
What technology area does this patent fall under?
Primary CPC classification G06F16/2465. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 28, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).