Method and apparatus for processing electronic data

US9773053B2 · US · B2

Patent metadata
Publication number: US-9773053-B2
Application number: US-201113996840-A
Country: US
Kind code: B2
Filing date: Dec 23, 2011
Priority date: Dec 23, 2010
Publication date: Sep 26, 2017
Grant date: Sep 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system (100) for generating a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure (e.g. a database schema) and a second representation of a set of concepts or of a data structure (e.g. an ontology), each representation comprising a plurality of complex representational elements (e.g. tables in a database schema and concepts in an ontology) each of which may itself include a number of associated subordinate representational elements (e.g. columns/fields of a table in a database schema and attributes of a concept in an ontology). The system (100) includes a semantic similarity calculation module (134) for calculating a semantic similarity measure between a subordinate element of the first representation and each of the subordinate elements in the second representation and a mapping generation module (137) for generating a mapping between the subordinate element of the first representation and one of the subordinate elements of the second representation selected in dependence upon the calculated semantic similarity measures between the subordinate elements.
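The selection step described in the abstract (score one subordinate element against every candidate, then map it to the best-scoring one) might be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the similarity function is a stand-in based on name-token overlap, the zero-score cutoff is an assumption, and `tokens`, `token_overlap`, `generate_mapping`, and the sample element names are all hypothetical.

```python
import re
from typing import Callable, Dict, List, Set


def tokens(name: str) -> Set[str]:
    """Split a name such as 'cust_name' or 'customerName' into lowercase tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)  # break camelCase
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t}


def token_overlap(a: str, b: str) -> float:
    """Stand-in similarity measure (an assumption): Jaccard overlap of name tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def generate_mapping(
    first: List[str],
    second: List[str],
    similarity: Callable[[str, str], float] = token_overlap,
) -> Dict[str, str]:
    """Map each element of `first` to its highest-scoring element of `second`."""
    mapping: Dict[str, str] = {}
    for src in first:
        # Score the source element against every candidate in the second representation.
        scores = {dst: similarity(src, dst) for dst in second}
        best = max(scores, key=scores.get)
        # Assumption: leave an element unmapped when nothing scores above zero.
        if scores[best] > 0:
            mapping[src] = best
    return mapping


# Hypothetical table columns and ontology attributes.
columns = ["cust_name", "cust_phone", "created"]
attributes = ["customerName", "phoneNumber"]
print(generate_mapping(columns, attributes))
```

Passing the similarity function as a parameter keeps the selection logic separate from the scoring logic, so a different measure can be substituted without touching `generate_mapping`.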

First claim

Opening claim text (preview).

The invention claimed is:

1. A data integration method of integrating data from a first and a second heterogeneous data source, each heterogeneous data source taking the form of an electronic database, the method comprising implementing a first wrapper around the first heterogeneous data source, the first wrapper being configured to convert requests and responses between a common format and one specific to the first data source and implementing a second wrapper around the second heterogeneous data source, the second wrapper being configured to convert requests and responses between the common format and one specific to the second data source; wherein each wrapper includes a mapping in the form of a computer readable data file automatically generated according to a method of generating a computer readable data file, on a computer system comprising a digital processor and a memory, the computer readable data file being representative of a mapping between a first representation of a set of concepts or of a data structure associated with the common format and a second representation of a set of concepts or of a data structure associated with a respective one of the first and second data sources, each representation comprising a plurality of complex representational elements which include a number of associated subordinate representational elements, the method of generating a computer readable data file comprising: the computer system calculating a semantic similarity measure between a subordinate element of the first representation and each of the subordinate elements in the second representation; and the computer system generating a mapping between the subordinate element of the first representation and one of the subordinate elements of the second representation selected in dependence upon the calculated semantic similarity measures between the subordinate elements; wherein calculation of a semantic similarity measure by the computer system includes: the computer system using a linked top ontology data structure stored within the memory of the computer system, the stored data structure comprising a plurality of concept nodes arranged to form a top ontology, the top ontology being a partial subset of a full ontology having at least twice as many nodes as the top ontology, the nodes in the top ontology being selected from the full ontology based on their ancestral closeness to a root node and/or their ancestral remoteness from a leaf node of the full ontology, the linked top ontology further comprising a plurality of pre-processed vocabulary terms each of which is linked to one or more of the nodes in the top ontology, the linked top ontology data structure being used by the computer system as follows: the names of the subordinate elements between whom a semantic similarity is being calculated being compared by the computer system with the vocabulary terms and for any vocabulary terms which match the names of the subordinate elements, the computer system identifying the top ontology nodes associated with the matched vocabulary terms and comparing the identified top ontology nodes associated with each name of the subordinate elements, and the computer system determining a semantic similarity based on the degree of commonality between the top ontology nodes associated with each of the subordinate elements.

2. The method according to claim 1 wherein the calculation of a semantic similarity further includes the computer system comparing the names of the complex representational elements with vocabulary terms and identifying the top ontology nodes associated with any matched names and determining the degree of commonality between on the one hand the identified top ontology nodes associated with either one of the subordinate elements and its associated complex representational element and, on the other hand, the other subordinate element and its associated complex representational element.

3. The method according to claim 1 further comprising the computer system performing steps of matching names to vocabulary terms, identifying the top ontology nodes associated with any matched vocabulary terms and determining a degree of commonality between the so identified top ontology nodes in respect of the names of the complex representational elements associated with or which include the respective subordinate elements between which the semantic similarity is being calculated and the converse subordinate elements, and using the degree of commonality determined between these complex elements and their converse subordinate elements as a factor in the determination of overall semantic distance.

4. The data integration method according to claim 1 further comprising: receiving a complex query from a human user or from a computer application, the complex query being expressed in the common format, processing the complex query to form a first sub query for sending to the first heterogeneous data source and a second sub-query for sending to the second heterogeneous data source, sending the first sub-query to the first data source via the first wrapper, the first wrapper converting the first sub-query from the common format to the format specific to the first data source, and sending the second sub-query to the second data source via the second wrapper, the second wrapper converting the second sub-query from the common format to the format specific to the second data source, receiving a first reply to the first sub-query from the first data source in the format specific to the first data source via the first wrapper which converts the first reply from the format specific to the first data source into the common format, receiving a second reply to the second sub-query from the second data source in the format specific to the second data source via the second wrapper which converts the second reply from the format specific to the second data source into the common format, combining the first and second responses together to form a complex response expressed in the common format, and returning the complex response to the requesting human user or computer application.

5. A data integration system including a mapping generating computer system and further including first and second heterogeneous data sources, each data source taking the form of an electronic database, and a first wrapper for wrapping around the first data source and a second wrapper for wrapping around the second data source, wherein the first wrapper is configured to convert requests and responses between a common format and one specific to the first data source and the second wrapper is configured to convert requests and responses between the common format and one specific to the second data source; and wherein each wrapper includes a mapping in the form of a computer readable data file automatically generated by the mapping generating system, and wherein the mapping generating computer system is configured to generate a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure associated with the common format and a second representation of a set of concepts or of a data structure associated with a respective one of the first and second data sources, each representation comprising a plurality of complex representational elements which include a number of associated subordinate representational elements, the computer system including: a semantic similarity calculation module which is executable by the computer system for calculating a semantic similarity measure between a subordinate element of the first representation and each of the subordinate elements in the second representation; and a mapping generation module which is executable by the computer system for generating a mapping between the subordinate element of the first
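Claim 1 describes comparing element names against vocabulary terms, looking up the top ontology nodes linked to each matched term, and scoring similarity by the degree of commonality between the two node sets. A rough sketch of that lookup is given below. The claim does not define how commonality is measured, so Jaccard overlap is used here purely as an assumption; the ontology contents, the underscore-based name matching, and the names `LINKED_TOP_ONTOLOGY`, `nodes_for`, and `semantic_similarity` are likewise hypothetical.

```python
from typing import Dict, Set

# Hypothetical linked top ontology: each pre-processed vocabulary term is
# linked to one or more top ontology concept nodes (toy data, not from the patent).
LINKED_TOP_ONTOLOGY: Dict[str, Set[str]] = {
    "phone": {"Device", "Communication"},
    "telephone": {"Device", "Communication"},
    "email": {"Communication", "Document"},
    "salary": {"Money"},
    "wage": {"Money"},
}


def nodes_for(name: str, linked: Dict[str, Set[str]]) -> Set[str]:
    """Collect the top ontology nodes of every vocabulary term matching part of the name."""
    found: Set[str] = set()
    # Assumption: names are matched token by token after splitting on '_' or '-'.
    for token in name.lower().replace("-", "_").split("_"):
        found |= linked.get(token, set())
    return found


def semantic_similarity(
    name_a: str,
    name_b: str,
    linked: Dict[str, Set[str]] = LINKED_TOP_ONTOLOGY,
) -> float:
    """Commonality of the two node sets, measured as Jaccard overlap (an assumption)."""
    a, b = nodes_for(name_a, linked), nodes_for(name_b, linked)
    if not a or not b:
        return 0.0  # no vocabulary term matched one of the names
    return len(a & b) / len(a | b)


print(semantic_similarity("home_phone", "telephone"))
print(semantic_similarity("phone", "email"))
print(semantic_similarity("salary", "phone"))
```

In this toy data, "phone" and "telephone" share no characters-level rule but resolve to the same nodes, which is the kind of match a purely string-based comparison would miss.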

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9773053B2 cover?
A system (100) for generating a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure (e.g. a database schema) and a second representation of a set of concepts or of a data structure (e.g. an ontology), each representation comprising a plurality of complex representational elements (e.g. tables in a database schema a…
Who is the assignee on this patent?
Lee Beum Seuk, Cui Zhan, British Telecomm
What technology area does this patent fall under?
Primary CPC classification G06F17/30598. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 26, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).