System and method for automating data warehousing processes

US9519695B2 · US · B2

Patent metadata
Field · Value
Publication number · US-9519695-B2
Application number · US-201313908172-A
Country · US
Kind code · B2
Filing date · Jun 3, 2013
Priority date · Apr 16, 2013
Publication date · Dec 13, 2016
Grant date · Dec 13, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and computer-implemented method for automating data warehousing processes is provided. The system comprises a code generator configured to generate codes for Extract, Transform and Load (ETL) tools, wherein the codes facilitate the ETL tools in extracting, transforming and loading data read from data sources. The system further comprises a code reviewer configured to review and analyze the generated codes. Furthermore, the system comprises a data migration module configured to facilitate migrating the data read from the data sources to one or more data warehouses. Also, the system comprises a data generator configured to mask the data read from the data sources to generate processed data. In addition, the system comprises a Data Warehouse Quality Assurance module configured to facilitate testing the read and the processed data. The system further comprises a reporting module configured to provide status reports on the data warehousing processes.
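As an illustration of the masking step in the abstract, here is a minimal Python sketch of how data read from a source might be masked to produce processed data for testing. The abstract does not say how masking is performed. The salted SHA-256 tokens, the `MASK_` prefix, the column names, and the helper names `mask_value` and `mask_rows` are all assumptions made for this example, not details from the patent.

```python
import hashlib


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a sensitive value with a stable, non-reversible token.

    Hypothetical helper: the patent does not specify a masking technique.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "MASK_" + digest[:12]


def mask_rows(rows, sensitive_columns):
    """Return copies of the rows with the sensitive columns masked."""
    masked = []
    for row in rows:
        masked.append({
            column: mask_value(str(value)) if column in sensitive_columns else value
            for column, value in row.items()
        })
    return masked


# Sample "read data" (invented for illustration).
source_rows = [
    {"id": 1, "name": "Alice", "balance": 120.5},
    {"id": 2, "name": "Bob", "balance": 87.0},
]

# "Processed data": same shape, with the name column masked.
processed_rows = mask_rows(source_rows, {"name"})
```

Because the token is deterministic, the same input always maps to the same masked value, so joins across masked tables still line up in tests. The source rows are left unmodified, which keeps both the read data and the processed data available for testing.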

First claim

Opening claim text (preview).

We claim:

1. A computer system for automating one or more data warehousing processes, the computer system comprising a processor and a memory, the computer system further comprising: a code generator connected to one or more Extract, Transform and Load (ETL) tools using adapters and connectors, the code generator is configured to generate, using the processor, codes that facilitate the one or more ETL tools to extract, transform and load data read by a data acquisition module from one or more data sources, wherein the code generator is connected to the one or more Extract, Transform and Load (ETL) tools based on one or more connection parameters received from one or more users; a code reviewer configured to review and analyze, using the processor, the generated codes, wherein the review and analysis of the generated codes comprises identifying unused variables in the generated codes that clog the memory; a data migration module configured to facilitate, using the processor, migrating the data read from the one or more data sources to one or more data warehouses, wherein the reviewed generate codes facilitate in migrating the read data to the one or more data warehouses; a data generator configured to mask, using the processor, the data read from the one or more data sources to generate processed data for testing; a Data Warehouse Quality Assurance (DW QA) module configured to facilitate, using the processor, testing at least one of: the read data and the processed data; and a reporting module configured, using the processor, to provide one or more status reports on the one or more data warehousing processes, wherein a dictionary matcher facilitates the reporting module in generating a report on outliers in the read data by identifying and highlighting outliers in the read data.

2. The computer system of claim 1, wherein the system is connected with one or more external systems and further wherein the one or more external systems comprise at least one of: the data acquisition module, the one or more data sources, and one or more On-Line Analytical Processing (OLAP) tools.

3. The computer system of claim 1 further comprising a business requirement module configured to provide, using the processor, an interface to facilitate one or more users to provide and manage one or more business requirements.

4. The computer system of claim 1 further comprising a data profiler configured to integrate, using the processor, the read data, wherein integrating the read data comprises at least one of: identifying redundancies in the read data, profiling the read data and checking quality of the read data.

5. The computer system of claim 1, wherein the generated codes comprise at least one of: automatically generated codes and manually developed codes.

6. The computer system of claim 1, wherein the DW QA module comprises: a Total Automated Software Quality (TASQ) Engine configured to validate the read data and the processed data; a data comparator configured to compare the read data with the data in the one or more data sources to identify errors; and a data reconciler configured to validate the migrated data stored in the one or more data warehouses.

7. The computer system of claim 1 further comprising: a metadata manager configured to: read, using the processor, metadata across at least one of: the system, the one or more data warehouses and one or more external systems; provide, using the processor, a unified view of the read metadata to facilitate data lineage; and store, using the processor, the read metadata in a metadata repository.

8. The computer system of claim 1 further comprising an impact analyzer configured to perform, using the processor, impact analysis across at least one of: the system, the one or more data warehouses and one or more external systems.

9. The computer system of claim 1 further comprising a script generator configured to generate, using the processor, scripts for loading the read data from the one or more data sources into the one or more data warehouses.

10. The computer system of claim 1 further comprising a data compression module configured to generate, using the processor, one or more scripts for compressing at least one of: one or more tables and one or more data structures of the one or more data warehouses.

11. The computer system of claim 1 further comprising an On-Line Analytical Processing (OLAP) companion module configured to analyze, using the processor, metadata of one or more reports generated by one or more OLAP tools.

12. The computer system of claim 1, wherein the one or more status reports include at least one of: a status report for Data Definition Language (DDL) replication, a status report for data validation, a status report for data replication, a drill down report and test cases execution report.

13. A computer-implemented method for automating one or more data warehousing processes, via program instructions stored in a memory and executed by a processor, the computer-implemented method comprising: generating codes that facilitate one or more Extract, Transform and Load (ETL) tools to extract, transform, and load data read from one or more data sources, wherein the codes are generated by a code generator connected to the one or more ETL tools using adapters and connectors, wherein the code generator is connected to the one or more Extract, Transform and Load (ETL) tools based on one or more connection parameters received from one or more users; reviewing and analyzing the generated codes, wherein the review and analysis of the generated codes comprises identifying unused variables in the generated codes that clog the memory; migrating the data read from the one or more data sources to one or more data warehouses, wherein the reviewed generate codes facilitate in migrating the read data to the one or more data warehouses; masking the data read from the one or more data sources to generate processed data for testing; testing at least one of: the read data and the processed data; and providing one or more status reports on the one or more data warehousing processes, wherein a dictionary matcher facilitates the reporting module in generating a report on outliers in the read data by identifying and highlighting outliers in the read data.

14. The computer-implemented method of claim 13 further comprising the step of facilitating one or more users to provide one or more business requirements.

15. The computer-implemented method of claim 13, wherein the generated codes comprise at least one of: automatically generated codes and manually developed codes.

16. The computer-implemented method of claim 13, wherein the step of testing at least one of: the read data and the processed data comprises: validating the read data and the processed data; comparing the read data with the data in the one or more data sources to identify one or more errors; and validating the migrated data stored in the one or more data warehouses.

17. The computer-implemented method of claim 13, wherein the one or more status reports include at least one of: a status report for Data Definition Language (DDL) replication, a status report for data validation, a status report for data replication, a drill down report and test cases execution report.

18. A computer program product for automating one or more data warehousing processes, the computer program product comprising: a non-transitory computer-readable medium having computer-readable program code stored thereon, the computer-readable program code comprising instructions that
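
Claims 1 and 13 recite a code review step that identifies unused variables in the generated codes. The sketch below illustrates one way such a check could work. It assumes, purely for the example, that the generated code is Python source and uses the standard `ast` module. The patent does not name a target language or a detection technique, and `find_unused_variables`, `extract`, `transform`, and `load` are hypothetical names.

```python
import ast


def find_unused_variables(source: str) -> list[str]:
    """Return names that are assigned somewhere but never read.

    Simplified sketch: it ignores scoping, so a name read anywhere in
    the source counts as used everywhere.
    """
    tree = ast.parse(source)
    assigned = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)
    return sorted(assigned - used)


# Stand-in for generated ETL code (invented for illustration; it is only
# parsed, never executed, so extract/transform/load need not exist).
generated_code = """
rows = extract()
staging = transform(rows)
debug_copy = list(rows)
load(staging)
"""

unused = find_unused_variables(generated_code)
print(unused)  # ['debug_copy']
```

A production reviewer would need scope-aware analysis and a parser for the actual ETL tool's language. The point here is only the shape of the check: collect assignments, collect reads, and report the difference.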

Assignees

Cognizant Tech Solutions India Pvt Ltd

Inventors

Classifications

  • G06F16/254 · Primary

    Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9519695B2 cover?
A system and computer-implemented method for automating data warehousing processes is provided. The system comprises a code generator configured to generate codes for Extract, Transform and Load (ETL) tools, wherein the codes facilitate the ETL tools in extracting, transforming and loading data read from data sources. The system further comprises a code reviewer configured to review and analyze…
Who is the assignee on this patent?
Cognizant Tech Solutions India Pvt Ltd
What technology area does this patent fall under?
Primary CPC classification G06F16/254. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 13, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).