System for data management in a large scale data repository

US11409764B2 · US · B2

Patent metadata
Publication number: US-11409764-B2
Application number: US-202016895496-A
Country: US
Kind code: B2
Filing date: Jun 8, 2020
Priority date: Sep 15, 2016
Publication date: Aug 9, 2022
Grant date: Aug 9, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of managing data in a data repository is disclosed. The method comprises maintaining a data repository, the data repository storing data imported from one or more data sources. A database entity added to the data repository is identified, and a metadata object for storing metadata relating to the database entity is created and stored in a metadata repository. The metadata object is also added to a documentation queue. Metadata for the metadata object is received from a user via a metadata management user interface, and the received metadata is stored in the metadata repository and associated with the metadata object.
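The workflow described in the abstract can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the names `MetadataObject`, `MetadataRepository`, `register_entity` and `document` are hypothetical, the repository is a plain in-memory dictionary, and removing an entity from the documentation queue once it has been documented is an assumption that the abstract does not state.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MetadataObject:
    """Holds metadata about one database entity (hypothetical structure)."""
    entity_name: str
    metadata: dict = field(default_factory=dict)


class MetadataRepository:
    """In-memory stand-in for a metadata repository with a documentation queue."""

    def __init__(self):
        self.objects = {}                   # entity name -> MetadataObject
        self.documentation_queue = deque()  # entities awaiting documentation

    def register_entity(self, entity_name):
        """Create and store a metadata object for a newly identified entity."""
        obj = MetadataObject(entity_name)
        self.objects[entity_name] = obj
        self.documentation_queue.append(entity_name)
        return obj

    def document(self, entity_name, metadata):
        """Store user-supplied metadata against the entity's metadata object."""
        self.objects[entity_name].metadata.update(metadata)
        # Assumption: a documented entity leaves the documentation queue.
        if entity_name in self.documentation_queue:
            self.documentation_queue.remove(entity_name)


repo = MetadataRepository()
repo.register_entity("sales.orders")  # entity detected after an import
repo.document("sales.orders", {"description": "One row per customer order"})
```

In this sketch the queue holds entity names rather than the metadata objects themselves, so the repository dictionary remains the single place where metadata is stored.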

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising the steps of: maintaining a data repository, the data repository storing data imported from one or more data sources; storing and managing, by a metadata management module, metadata, in a metadata repository, relating to the data in the data repository; constructing, using a query builder interface, a first data query, using first metadata from the metadata repository, for querying the data in the data repository; storing the first data query in the metadata repository; creating a first metadata object for the first data query in the metadata repository, and adding the first metadata object to a documentation queue; constructing, using the query builder interface, a second data query, using second metadata from the metadata repository, for querying the data in the data repository; storing the second data query in the metadata repository; creating a second metadata object for the second data query in the metadata repository, and adding the second metadata object to the documentation queue; searching existing queries based on existing metadata objects stored in the metadata repository, the existing metadata objects including the first and second metadata objects; selecting one of the existing data queries based on its metadata object; editing the selected data query to construct, using the query builder interface, a third query using third metadata from the metadata repository, for querying the data in the data repository; storing the third data query in the metadata repository; and creating a third metadata object for the third data query in the metadata repository, and adding the third metadata object to the documentation queue, wherein the first, second and third metadata objects are stored in the metadata repository and are available for retrieval and execution or for retrieval and editing to create modified data queries.

2. A method according to claim 1 further comprising the steps of: in response to receiving a selection of two tables as data sources for the first data query, identifying based on the stored metadata at least one relationship between columns of the selected tables, and providing an indication of the identified at least one relationship on the query builder interface.

3. A method according to claim 2, further comprising the step of displaying an indication of a relationship strength on the query builder interface.

4. A method according to claim 3, wherein the identified relationships comprise one or more relationships automatically identified by analysis of data in the data repository, the relationship strength indication being computed during the analysis.

5. A method according to claim 1, wherein the query builder interface comprises an interface for selection of data entities used in the first, second or third data query.

6. A method according to claim 1, wherein the query builder interface enables selection of: one or more source tables for a data query; and one or more table relationships between selected tables; the method further comprising creating the first, second or third data query based on the selected tables and relationships.

7. A method according to claim 6, wherein a selected relationship is used to define a table join on selected tables.

8. A method according to claim 1, wherein the first, second and third metadata objects for the first, second and third data queries are available to other users of the data repository.

9. A method according to claim 1, wherein the query builder interface comprises a data entity selection interface for selection of data entities used in the first, second or third data query, the data entity selection interface displaying the metadata for at least one data entity stored in the metadata repository.

10. A method according to claim 1, further comprising at least one of the steps of: executing the first, second or third data query; transmitting the results of the first, second or third data query to a user device, storing the results of the first, second or third data query in the data repository, or transmitting the results of the first, second or third data query to a remote computer system.

11. A method according to claim 1, wherein each of the first, second and third data queries comprises more than two tables and/or multiple join relationships and/or a nested query.

12. A method according to claim 1, further comprising the steps of: after receiving the first metadata, adding the first metadata object to an approval queue; receiving a positive approval indication or a negative approval indication for the first metadata object from a second user via the metadata management user interface; and in response to a positive approval indication, marking the first metadata object as approved in the metadata repository.

13. A method according to claim 12, comprising, in response to receiving the negative approval indication, adding the first metadata object to a dispute queue, the negative approval indication associated with dispute reason information entered by the second user.

14. A tangible non-transitory computer-readable medium comprising software code adapted, when executed on a data processing apparatus, to perform a method comprising the steps of: maintaining a data repository, the data repository storing data imported from one or more data sources; storing and managing, by a metadata management module, metadata, in a metadata repository, relating to the data in the data repository; and constructing, using a query builder interface, a first data query, using first metadata from the metadata repository, for querying the data in the data repository; storing the first data query in the metadata repository; creating a first metadata object for the query in the metadata repository, and adding the first metadata object to a documentation queue; constructing, using the query builder interface, a second data query, using second metadata from the metadata repository, for querying the data in the data repository; storing the second data query in the metadata repository; creating a second metadata object for the second data query in the metadata repository, and adding the second metadata object to the documentation queue; searching existing queries based on existing metadata objects stored in the metadata repository, the existing metadata objects including the first and second metadata objects; selecting one of the existing data queries based on its metadata object; editing the selected data query to construct, using the query builder interface, a third query using third metadata from the metadata repository, for querying the data in the data repository; storing the third data query in the metadata repository; and creating a third metadata object for the third data query in the metadata repository, and adding the third metadata object to the documentation queue, wherein the first, second and third metadata objects are stored in the metadata repository and are available for retrieval and execution or for retrieval and editing to create modified data queries.

15. An apparatus comprising a processor with associated memory storing instructions arranged, when executed by the processor, to configure the processor to: maintain a data repository, the data repository storing data imported from one or more data sources; store and manage, by a metadata management module, metadata, in a metadata repository, relating to the data in the data repository; and construct, using a query builder interface, a first data query, using first metadata fro
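
The query lifecycle recited in claim 1 (store a query with its metadata object, queue it for documentation, search stored queries by metadata, then edit a selected query into a new one) can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the claimed implementation: `StoredQuery`, `QueryStore`, `save`, `search` and `derive` are hypothetical names, queries are modelled as lists of tables and joins instead of executable SQL, and the keyword match over metadata values is an assumed search strategy.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StoredQuery:
    """A saved data query plus its metadata object (hypothetical structure)."""
    query_id: int
    tables: list
    joins: list
    metadata: dict = field(default_factory=dict)


class QueryStore:
    """In-memory stand-in for queries kept in a metadata repository."""

    def __init__(self):
        self.queries = {}                   # query id -> StoredQuery
        self.documentation_queue = deque()  # query ids awaiting documentation
        self._next_id = 1

    def save(self, tables, joins, metadata=None):
        """Store a query, create its metadata object, and queue it."""
        query = StoredQuery(self._next_id, list(tables), list(joins),
                            dict(metadata or {}))
        self.queries[query.query_id] = query
        self.documentation_queue.append(query.query_id)
        self._next_id += 1
        return query

    def search(self, keyword):
        """Return stored queries whose metadata values mention the keyword."""
        keyword = keyword.lower()
        return [q for q in self.queries.values()
                if any(keyword in str(v).lower() for v in q.metadata.values())]

    def derive(self, query_id, extra_tables=(), extra_joins=(), metadata=None):
        """Edit an existing query into a new stored query; the original is kept."""
        base = self.queries[query_id]
        return self.save(base.tables + list(extra_tables),
                         base.joins + list(extra_joins), metadata)


store = QueryStore()
first = store.save(["orders", "customers"],
                   [("orders.customer_id", "customers.id")],
                   {"description": "Orders with customer names"})
second = store.save(["products"], [], {"description": "Product catalogue"})

# Find an existing query by its metadata, then edit it into a third query.
match = store.search("customer")[0]
third = store.derive(match.query_id, ["regions"],
                     [("customers.region_id", "regions.id")],
                     {"description": "Orders by region"})
```

Deriving a query copies the table and join lists, so the original stored query stays available for retrieval alongside the edited one.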

Assignees

Hitachi Vantara LLC

Inventors

Classifications

  • Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

  • using data annotations, e.g. user-defined metadata

  • Interactive query statement specification based on a database schema

  • G06F16/213 (Primary): with details for schema evolution support

  • G06F16/254 (Primary): Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11409764B2 cover?
A computer-implemented method of managing data in a data repository is disclosed. The method comprises maintaining a data repository, the data repository storing data imported from one or more data sources. A database entity added to the data repository is identified and a metadata object for storing metadata relating to the database entity is created and stored in a metadata repository. The me…
Who is the assignee on this patent?
Hitachi Vantara LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/213. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 9, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).