Data processing method, data processing apparatus, and non-transitory computer-readable storage medium
US-2024320235-A1 · Sep 26, 2024 · US
US9519695B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9519695-B2 |
| Application number | US-201313908172-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 3, 2013 |
| Priority date | Apr 16, 2013 |
| Publication date | Dec 13, 2016 |
| Grant date | Dec 13, 2016 |
Official abstract text for this publication.
A system and computer-implemented method for automating data warehousing processes is provided. The system comprises a code generator configured to generate codes for Extract, Transform and Load (ETL) tools, wherein the codes facilitate the ETL tools in extracting, transforming and loading data read from data sources. The system further comprises a code reviewer configured to review and analyze the generated codes. Furthermore, the system comprises a data migration module configured to facilitate migrating the data read from the data sources to one or more data warehouses. Also, the system comprises a data generator configured to mask the data read from the data sources to generate processed data. In addition, the system comprises a Data Warehouse Quality Assurance module configured to facilitate testing the read and the processed data. The system further comprises a reporting module configured to provide status reports on the data warehousing processes.
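The abstract describes a data generator that masks data read from the sources to produce processed data for testing, but it does not say how the masking is done. The following is a minimal sketch of one common approach, salted hashing, offered only as an illustration. The function names `mask_value` and `mask_rows`, the salt, the `MASK_` prefix, and the sample rows are all hypothetical and do not come from the patent.

```python
import hashlib


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a sensitive value with a stable, non-reversible token.

    Assumption: salted SHA-256 is used here purely for illustration;
    the patent does not name a masking technique.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "MASK_" + digest[:12]


def mask_rows(rows, sensitive_columns):
    """Return copies of the rows with the listed columns masked."""
    masked = []
    for row in rows:
        new_row = dict(row)  # copy so the data read from the source is untouched
        for column in sensitive_columns:
            if new_row.get(column) is not None:
                new_row[column] = mask_value(str(new_row[column]))
        masked.append(new_row)
    return masked


# Hypothetical rows standing in for "data read from the data sources".
source_rows = [
    {"customer_id": 1, "name": "Alice Example", "balance": 120.50},
    {"customer_id": 2, "name": "Bob Example", "balance": 75.00},
]

# "Processed data" in the abstract's sense: same shape, sensitive fields masked.
processed_rows = mask_rows(source_rows, ["name"])
```

Because the token is derived deterministically from the input, equal source values map to equal masked values, so joins and duplicate checks in test environments still behave as they would on the unmasked data.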
Opening claim text (preview).
We claim:
1. A computer system for automating one or more data warehousing processes, the computer system comprising a processor and a memory, the computer system further comprising: a code generator connected to one or more Extract, Transform and Load (ETL) tools using adapters and connectors, the code generator is configured to generate, using the processor, codes that facilitate the one or more ETL tools to extract, transform and load data read by a data acquisition module from one or more data sources, wherein the code generator is connected to the one or more Extract, Transform and Load (ETL) tools based on one or more connection parameters received from one or more users; a code reviewer configured to review and analyze, using the processor, the generated codes, wherein the review and analysis of the generated codes comprises identifying unused variables in the generated codes that clog the memory; a data migration module configured to facilitate, using the processor, migrating the data read from the one or more data sources to one or more data warehouses, wherein the reviewed generate codes facilitate in migrating the read data to the one or more data warehouses; a data generator configured to mask, using the processor, the data read from the one or more data sources to generate processed data for testing; a Data Warehouse Quality Assurance (DW QA) module configured to facilitate, using the processor, testing at least one of: the read data and the processed data; and a reporting module configured, using the processor, to provide one or more status reports on the one or more data warehousing processes, wherein a dictionary matcher facilitates the reporting module in generating a report on outliers in the read data by identifying and highlighting outliers in the read data.
2. The computer system of claim 1, wherein the system is connected with one or more external systems and further wherein the one or more external systems comprise at least one of: the data acquisition module, the one or more data sources, and one or more On-Line Analytical Processing (OLAP) tools.
3. The computer system of claim 1 further comprising a business requirement module configured to provide, using the processor, an interface to facilitate one or more users to provide and manage one or more business requirements.
4. The computer system of claim 1 further comprising a data profiler configured to integrate, using the processor, the read data, wherein integrating the read data comprises at least one of: identifying redundancies in the read data, profiling the read data and checking quality of the read data.
5. The computer system of claim 1, wherein the generated codes comprise at least one of: automatically generated codes and manually developed codes.
6. The computer system of claim 1, wherein the DW QA module comprises: a Total Automated Software Quality (TASQ) Engine configured to validate the read data and the processed data; a data comparator configured to compare the read data with the data in the one or more data sources to identify errors; and a data reconciler configured to validate the migrated data stored in the one or more data warehouses.
7. The computer system of claim 1 further comprising: a metadata manager configured to: read, using the processor, metadata across at least one of: the system, the one or more data warehouses and one or more external systems; provide, using the processor, a unified view of the read metadata to facilitate data lineage; and store, using the processor, the read metadata in a metadata repository.
8. The computer system of claim 1 further comprising an impact analyzer configured to perform, using the processor, impact analysis across at least one of: the system, the one or more data warehouses and one or more external systems.
9. The computer system of claim 1 further comprising a script generator configured to generate, using the processor, scripts for loading the read data from the one or more data sources into the one or more data warehouses.
10. The computer system of claim 1 further comprising a data compression module configured to generate, using the processor, one or more scripts for compressing at least one of: one or more tables and one or more data structures of the one or more data warehouses.
11. The computer system of claim 1 further comprising an On-Line Analytical Processing (OLAP) companion module configured to analyze, using the processor, metadata of one or more reports generated by one or more OLAP tools.
12. The computer system of claim 1, wherein the one or more status reports include at least one of: a status report for Data Definition Language (DDL) replication, a status report for data validation, a status report for data replication, a drill down report and test cases execution report.
13. A computer-implemented method for automating one or more data warehousing processes, via program instructions stored in a memory and executed by a processor, the computer-implemented method comprising: generating codes that facilitate one or more Extract, Transform and Load (ETL) tools to extract, transform, and load data read from one or more data sources, wherein the codes are generated by a code generator connected to the one or more ETL tools using adapters and connectors, wherein the code generator is connected to the one or more Extract, Transform and Load (ETL) tools based on one or more connection parameters received from one or more users; reviewing and analyzing the generated codes, wherein the review and analysis of the generated codes comprises identifying unused variables in the generated codes that clog the memory; migrating the data read from the one or more data sources to one or more data warehouses, wherein the reviewed generate codes facilitate in migrating the read data to the one or more data warehouses; masking the data read from the one or more data sources to generate processed data for testing; testing at least one of: the read data and the processed data; and providing one or more status reports on the one or more data warehousing processes, wherein a dictionary matcher facilitates the reporting module in generating a report on outliers in the read data by identifying and highlighting outliers in the read data.
14. The computer-implemented method of claim 13 further comprising the step of facilitating one or more users to provide one or more business requirements.
15. The computer-implemented method of claim 13, wherein the generated codes comprise at least one of: automatically generated codes and manually developed codes.
16. The computer-implemented method of claim 13, wherein the step of testing at least one of: the read data and the processed data comprises: validating the read data and the processed data; comparing the read data with the data in the one or more data sources to identify one or more errors; and validating the migrated data stored in the one or more data warehouses.
17. The computer-implemented method of claim 13, wherein the one or more status reports include at least one of: a status report for Data Definition Language (DDL) replication, a status report for data validation, a status report for data replication, a drill down report and test cases execution report.
18. A computer program product for automating one or more data warehousing processes, the computer program product comprising: a non-transitory computer-readable medium having computer-readable program code stored thereon, the computer-readable program code comprising instructions that
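Claims 1 and 13 recite a review step that identifies unused variables in the generated codes. The claims do not say how this is detected or what language the generated codes are in. As a rough sketch only, the check can be illustrated on Python source with the standard `ast` module: collect every name that is assigned, collect every name that is read, and report the difference. The function `find_unused_variables` and the sample `generated_code` string are hypothetical and are not taken from the patent.

```python
import ast


def find_unused_variables(source: str) -> list[str]:
    """Return names that are assigned somewhere but never read, sorted.

    Simplification: scopes are ignored, so a name read in any scope
    counts as used everywhere. A real reviewer would track scopes.
    """
    tree = ast.parse(source)
    assigned, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)  # name appears as an assignment target
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)      # name is read somewhere
    return sorted(assigned - used)


# Hypothetical stand-in for a piece of generated load code.
generated_code = """
batch_size = 500
staging_table = "stg_orders"
retry_limit = 3
print(staging_table, batch_size)
"""

unused = find_unused_variables(generated_code)
print(unused)
```

Here `retry_limit` is assigned but never read, so it is the only name reported. The same assigned-minus-read comparison would apply to other target languages given a parser for them.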
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
Physics · mapped topic