Data warehouse data model adapters

US9542469B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9542469-B2
Application number  US-86836310-A
Country             US
Kind code           B2
Filing date         Aug 25, 2010
Priority date       Aug 25, 2010
Publication date    Jan 10, 2017
Grant date          Jan 10, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In the context of data administration in enterprises, an effective manner of providing a central data warehouse, particularly via employing a tool that helps by analyzing existing data and reports from different business units. In accordance with at least one embodiment of the invention, such a tool analyzes the data model of an enterprise and proposes alternatives for building a new data warehouse. The tool, in accordance with at least one embodiment of the invention, models the problem of identifying fact/dimension attributes of a warehouse model as a graph cut on a Dependency Analysis Graph (DAG). The DAG is built using existing data models and the report generation scripts. The tool also uses the DAG for generation of ETL (Extract, Transform Load) scripts that can be used to populate the newly proposed data warehouse from data present in the existing schemas.
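As a rough illustration of the dependency analysis described in the abstract, the sketch below traces report columns back to the base-table attributes they derive from. It is a minimal sketch under stated assumptions, not the patented method: the edge table `EDGES`, the column names, and the function `trace_to_base` are all hypothetical, the graph is assumed to be acyclic, and the graph-cut step that the abstract mentions is not implemented here.

```python
# Hypothetical dependency edges: each derived column maps to the columns it
# is computed from. The second tuple element records the operation applied
# on that edge (None means a plain projection). All names are illustrative.
EDGES = {
    "report.total_sales": [("staging.amount", "SUM")],
    "report.region": [("staging.region", None)],
    "staging.amount": [("orders.amount", None)],
    "staging.region": [("stores.region", None)],
}


def trace_to_base(column, edges):
    """Walk the graph from a report column down to base-table attributes.

    Returns a set of (base_column, aggregated) pairs, where `aggregated` is
    True if any edge on the path applied an operation such as SUM.
    Assumes the graph has no cycles.
    """
    results = set()
    stack = [(column, False)]
    while stack:
        node, aggregated = stack.pop()
        parents = edges.get(node)
        if not parents:
            # No incoming definition: treat the node as a base-table attribute.
            results.add((node, aggregated))
            continue
        for parent, operation in parents:
            stack.append((parent, aggregated or operation is not None))
    return results


sales_sources = trace_to_base("report.total_sales", EDGES)
region_sources = trace_to_base("report.region", EDGES)
```

In this toy graph, a base attribute reached through an aggregating edge would be a candidate fact attribute, while one reached only through plain projections would be a candidate for a dimension. How the actual tool makes that split is described in the patent's full specification, which is not reproduced on this page.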

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising utilizing at least one processor to execute computer code that performs the steps of: analyzing base table scripts which generate a plurality of reports from preexisting base tables; and developing a schema for a new data warehouse with merged data from the preexisting base tables and the schema configured for singly generating reports relating to the merged data, said developing comprising: forming a fact table in the new data warehouse, wherein the forming a fact table comprises identifying a set of fact attributes, from the plurality of reports generated, wherein the set of fact attributes comprises attributes on which an aggregate operation is defined and attributes referenced in at least one of the plurality of reports generated; forming dimensions in the new data warehouse, wherein the forming dimensions comprises forming a candidate table set by identifying a set of preexisting base tables which include at least one candidate attribute, wherein the at least one candidate attribute comprises an attribute that is contained within a group-by clause within the plurality of reports; and generating warehouse scripts for populating the formed fact table and the formed dimensions for the new data warehouse with data from the preexisting base tables.

2. The method according to claim 1, further comprising ensuring adherence to predetermined design criteria.

3. The method according to claim 1, further comprising merging like attributes from different base tables.

4. The method according to claim 1, wherein said generating comprises referring to base tables.

5. The method according to claim 1, wherein the forming a fact table comprises scanning a report-generating query to identify attributes on which the aggregate operation is defined.

6. The method according to claim 5, wherein said scanning comprises identifying a direct projection attribute and an indirect projection attribute.

7. The method according to claim 5, wherein said scanning comprises employing a dependency analysis graph which represents the report-generating query.

8. The method according to claim 1, wherein the forming a fact table comprises ascertaining a need for multiple fact tables in the new data warehouse.

9. The method according to claim 1, wherein the forming dimensions comprises identifying a set of all attributes which are used in a group-by clause of a report-generating query.

10. The method according to claim 1, wherein the forming dimensions comprises determining candidate dimension tables and ascertaining a potential hierarchy among the candidate dimension tables.

11. The method according to claim 1, wherein said generating comprises determining a granularity for the fact table in the new data warehouse.

12. An apparatus comprising: one or more hardware processors; and a computer readable storage medium having computer readable program code embodied therewith and executable by the one or more hardware processors, the computer readable program code comprising: computer readable program code configured to analyze scripts which generate a plurality of reports from preexisting base tables; and computer readable program code configured to develop a schema for a new data warehouse with merged data from the preexisting base tables and the schema configured for singly generating reports relating to the merged data, via: forming a fact table in the new data warehouse, wherein the forming a fact table comprises identifying a set of fact attributes, from the plurality of reports generated, wherein the set of fact attributes comprises attributes on which an aggregate operation is defined and attributes referenced in at least one of the plurality of reports generated; forming dimensions in the new data warehouse, wherein the forming dimensions comprises forming a candidate table set by identifying a set of preexisting base tables which include at least one candidate attribute, wherein the at least one candidate attribute comprises an attribute that is contained within a group-by clause within the plurality of reports; and generating scripts for populating the formed fact table and the formed dimensions for the new data warehouse with data from the preexisting base tables.

13. A computer program product comprising: a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising: computer readable program code configured to analyze scripts which generate a plurality of reports from preexisting base tables; and computer readable program code configured to develop a schema for a new data warehouse with merged data from the preexisting base tables and the schema configured for singly generating reports relating to the merged data, via: forming a fact table in the new data warehouse, wherein the forming a fact table comprises identifying a set of fact attributes, from the plurality of reports generated, wherein the set of fact attributes comprises attributes on which an aggregate operation is defined and attributes referenced in at least one of the plurality of reports generated; forming dimensions in the new data warehouse, wherein the forming dimensions comprises forming a candidate table set by identifying a set of preexisting base tables which include at least one candidate attribute, wherein the at least one candidate attribute comprises an attribute that is contained within a group-by clause within the plurality of reports; and generating scripts for populating the formed fact table and the formed dimensions for the new data warehouse with data from the preexisting base tables.

14. The computer program product according to claim 13, wherein said computer readable program code is further configured to ensure adherence to predetermined design criteria.

15. The computer program product according to claim 13, wherein said computer readable program code is further configured to merge like attributes from different base tables.

16. The computer program product according to claim 13, wherein said computer readable program code is configured to refer to base tables in generating scripts for populating the new data warehouse.

17. The computer program product according to claim 13, wherein said computer readable program code is configured to scan a report-generating query to identify attributes on which the aggregate operation is defined.

18. The computer program product according to claim 17, wherein said computer readable program code is configured to identify a direct projection attribute and an indirect projection attribute.

19. The computer program product according to claim 17, wherein said computer readable program code is configured to employ a dependency analysis graph which represents the report-generating query.

20. The computer program product according to claim 13, wherein said computer readable program code is configured to ascertain a need for multiple fact tables in the new data warehouse.

21. The computer program product according to claim 13, wherein said computer readable program code is configured to identify a set of all attributes which are used in a group-by clause of a report-generating query.

22. The computer program product according to claim 13, wherein said computer readable program code is configured to determine candidate dimension tables and ascertain a potential hierarchy among the candidate dimension tables.
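To make the claim language more concrete, the following sketch scans a single report-generating SQL query for attributes under an aggregate operation (candidate fact attributes) and attributes in the group-by clause (candidate dimension attributes), then looks up which base tables contain the latter. This is a simplified illustration, not the claimed implementation: the function names, the sample query, and the `SCHEMA` dictionary are hypothetical, and the regex-based scan assumes unqualified column names, a single unnested query, and no `COUNT(*)`. A real tool would need a proper SQL parser.

```python
import re

# Matches simple aggregate calls such as SUM(amount); COUNT(*) is not handled.
AGGREGATE = re.compile(
    r"\b(SUM|AVG|MIN|MAX|COUNT)\s*\(\s*([\w.]+)\s*\)", re.IGNORECASE
)
# Captures the column list after GROUP BY, up to HAVING, ORDER BY, ';' or end.
GROUP_BY = re.compile(
    r"\bGROUP\s+BY\s+(.+?)(?:\bHAVING\b|\bORDER\s+BY\b|;|$)",
    re.IGNORECASE | re.DOTALL,
)


def classify_attributes(sql):
    """Return (fact_candidates, dimension_candidates) for one report query."""
    facts = {m.group(2).lower() for m in AGGREGATE.finditer(sql)}
    dims = set()
    match = GROUP_BY.search(sql)
    if match:
        dims = {c.strip().lower() for c in match.group(1).split(",") if c.strip()}
    return facts, dims


def candidate_tables(dims, schema):
    """Return base tables that contain at least one group-by attribute."""
    return {table for table, columns in schema.items() if columns & dims}


# Hypothetical base-table schema and report query, for illustration only.
SCHEMA = {
    "orders": {"id", "store_id", "amount"},
    "stores": {"id", "region"},
}
QUERY = (
    "SELECT region, SUM(amount) FROM orders "
    "JOIN stores ON orders.store_id = stores.id "
    "GROUP BY region ORDER BY region;"
)

facts, dims = classify_attributes(QUERY)
tables = candidate_tables(dims, SCHEMA)
```

Here `facts` holds the aggregated attribute, `dims` holds the group-by attribute, and `tables` is the candidate table set in the sense of claim 1. Resolving table aliases and merging like attributes across tables (claims 3 and 15) are left out of this sketch.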

Assignees

Inventors

Classifications

  • G06F16/283 · Primary

    Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9542469B2 cover?
In the context of data administration in enterprises, an effective manner of providing a central data warehouse, particularly via employing a tool that helps by analyzing existing data and reports from different business units. In accordance with at least one embodiment of the invention, such a tool analyzes the data model of an enterprise and proposes alternatives for building a new data wareh…
Who is the assignee on this patent?
Batra Vishal Singh, Bhide Manish Anand, Mohania Mukesh Kumar, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F16/283. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 10, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).