Post-migration validation of ETL jobs and exception management

US10067993B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10067993-B2
Application number: US-201615234038-A
Country: US
Kind code: B2
Filing date: Aug 11, 2016
Priority date: Aug 6, 2013
Publication date: Sep 4, 2018
Grant date: Sep 4, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Handling extract-transform-load (ETL) job mismatches as “exceptions.” Exception handling may include the following steps: (i) determining a mismatch while running an extract-transform-load job with the mismatch being a mismatch of at least one of the following types: design time information mismatch, and/or operational metadata mismatch; and (ii) responsive to determining the mismatch, handling the mismatch as an exception.
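The two steps in the abstract (detect a mismatch, then treat it as an exception) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names EtlMismatchError, diff_metadata and check_job, and the dictionary layout with "design_time" and "operational" keys, are assumptions made for this example and do not appear in the patent.

```python
class EtlMismatchError(Exception):
    """Hypothetical exception type carrying the mismatch kind and details."""

    def __init__(self, kind, differences):
        super().__init__(f"{kind} mismatch: {sorted(differences)}")
        self.kind = kind
        self.differences = differences


def diff_metadata(source, target):
    """Return {key: (source_value, target_value)} for every differing key."""
    keys = source.keys() | target.keys()
    return {
        key: (source.get(key), target.get(key))
        for key in keys
        if source.get(key) != target.get(key)
    }


def check_job(source_run, target_run):
    """Step (i): look for a mismatch; step (ii): surface it as an exception."""
    # Assumed layout: each run is a dict with "design_time" and "operational"
    # sub-dicts, mirroring the two mismatch types named in the abstract.
    for kind in ("design_time", "operational"):
        differences = diff_metadata(source_run[kind], target_run[kind])
        if differences:
            raise EtlMismatchError(kind, differences)
```

A caller would wrap check_job in a try/except block and route any EtlMismatchError to whatever exception-handling process is in place.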

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: running a first instance of a first extract-transform-load (ETL) job on a first system, with the running of the first instance including generation of first job runtime operational metadata; running a second instance of the first ETL job on a second system, with the running of the second instance including generation of second job runtime operational metadata; responsive to the running of the first and second instances of the first ETL job, determining existence of a job runtime issue with the running of the second instance on the second system; responsive to the determination of existence of the job runtime issue of the second instance on the second system, creating an exception corresponding to the existence of the job runtime issue of the second instance on the second system; and handling the exception by correcting the runtime issue with respect to the second system based on intelligent analysis of design and runtime operational metadata, including: parsing a first runtime log generated by running the first instance and a second runtime log generated by running the second instance, determining information about the runtime issue based on the parsing of the first and second runtime logs, and finding out a root cause of the failure based upon the information determined about the runtime issue; wherein the root cause is one of the following: an environment variable failure, or an output folder failure.

2. The method of claim 1 wherein the correction of the runtime issue with respect to the second system comprises: finding a relevant fix in a fix repository; and applying the relevant fix.

3. The method of claim 1 further comprising: generating a new fix for correcting the runtime issue with respect to the second system based on the intelligent analysis of design and runtime operational metadata; and adding the new fix to a fix repository along with information identifying the runtime issue with respect to the second system that led to generation of the new fix.

4. A computer program product comprising: a non-transitory computer-readable storage medium; and computer readable program instructions stored on the computer-readable storage medium; wherein the program instructions include: first program instructions programmed to run a first instance of a first extract-transform-load (ETL) job on a first system, with the running of the first instance including generation of first job runtime operational metadata; second program instructions programmed to run a second instance of the first ETL job on a second system, with the running of the second instance including generation of second job runtime operational metadata; third program instructions programmed to, responsive to the running of the first and second instances of the first ETL job, determine existence of a job runtime issue with the running of the second instance on the second system; fourth program instructions programmed to, responsive to the determination of existence of the job runtime issue of the second instance on the second system, create an exception corresponding to the existence of the job runtime issue of the second instance on the second system; and fifth program instructions programmed to handle the exception by correcting the runtime issue with respect to the second system based on intelligent analysis of design and runtime operational metadata, including: parsing a first runtime log generated by running the first instance and a second runtime log generated by running the second instance, determining information about the runtime issue based on the parsing of the first and second runtime logs, and finding out a root cause of the failure based upon the information determined about the runtime issue; wherein the root cause is one of the following: an environment variable failure, or an output folder failure.

5. The computer program product of claim 4 wherein the correction of the runtime issue with respect to the second system comprises: finding a relevant fix in a fix repository; and applying the relevant fix.

6. The computer program product of claim 4 further comprising: generating a new fix for correcting the runtime issue with respect to the second system based on the intelligent analysis of design and runtime operational metadata; and adding the new fix to a fix repository along with information identifying the runtime issue with respect to the second system that led to generation of the new fix.

7. A computer system comprising: a set of processor(s); a non-transitory computer-readable storage medium; and computer readable program instructions executable on the set of processor(s) and stored on the computer-readable storage medium; wherein the program instructions include: first program instructions programmed to run a first instance of a first extract-transform-load (ETL) job on a first system, with the running of the first instance including generation of first job runtime operational metadata, second program instructions programmed to run a second instance of the first ETL job on a second system, with the running of the second instance including generation of second job runtime operational metadata, third program instructions programmed to, responsive to the running of the first and second instances of the first ETL job, determine existence of a job runtime issue with the running of the second instance on the second system, fourth program instructions programmed to, responsive to the determination of existence of the job runtime issue of the second instance on the second system, create an exception corresponding to the existence of the job runtime issue of the second instance on the second system; and fifth program instructions programmed to handle the exception by correcting the runtime issue with respect to the second system based on intelligent analysis of design and runtime operational metadata, including: parsing a first runtime log generated by running the first instance and a second runtime log generated by running the second instance, determining information about the runtime issue based on the parsing of the first and second runtime logs, and finding out a root cause of the failure based upon the information determined about the runtime issue; wherein the root cause is one of the following: an environment variable failure, or an output folder failure.

8. The computer system of claim 7 wherein the correction of the runtime issue with respect to the second system comprises: finding a relevant fix in a fix repository; and applying the relevant fix.

9. The computer system of claim 7 further comprising: generating a new fix for correcting the runtime issue with respect to the second system based on the intelligent analysis of design and runtime operational metadata; and adding the new fix to a fix repository along with information identifying the runtime issue with respect to the second system that led to generation of the new fix.
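The flow recited in the claims (parse both runtime logs, derive a root cause, then find or add a fix in a fix repository) could look roughly like the sketch below. Everything here is an assumption for illustration: the log line formats, the regular expressions, and the names RootCause, find_root_cause, FixRepository and handle_exception are hypothetical and are not taken from the patent or from any real ETL product.

```python
import re
from enum import Enum


class RootCause(Enum):
    # The two root causes named in the independent claims.
    ENVIRONMENT_VARIABLE = "environment variable failure"
    OUTPUT_FOLDER = "output folder failure"


# Hypothetical log patterns; real ETL tools use their own message formats.
PATTERNS = {
    RootCause.ENVIRONMENT_VARIABLE: re.compile(
        r"environment variable '?(\w+)'? (?:is )?not (?:set|defined)", re.I
    ),
    RootCause.OUTPUT_FOLDER: re.compile(
        r"output (?:folder|directory) '?([^\s']+)'? (?:does not exist|not found)",
        re.I,
    ),
}


def error_lines(log_text):
    """Keep only lines flagged as errors (assumes an 'ERROR' marker)."""
    return [line for line in log_text.splitlines() if "ERROR" in line]


def find_root_cause(first_log, second_log):
    """Classify the first error that appears only in the second system's log."""
    baseline = set(error_lines(first_log))
    for line in error_lines(second_log):
        if line in baseline:
            continue  # Also present on the first system, so not migration-specific.
        for cause, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                return cause, match.group(1)
    return None


class FixRepository:
    """In-memory stand-in for a fix repository, keyed by root cause."""

    def __init__(self):
        self._fixes = {}

    def find(self, cause):
        entry = self._fixes.get(cause)
        return entry[0] if entry else None

    def add(self, cause, fix, issue_description):
        # Store the fix together with the issue that led to it (claims 3, 6, 9).
        self._fixes[cause] = (fix, issue_description)


def handle_exception(first_log, second_log, repo, environment):
    """Return (cause, detail, applied), or None if no root cause was recognized."""
    found = find_root_cause(first_log, second_log)
    if found is None:
        return None
    cause, detail = found
    fix = repo.find(cause)
    if fix is None:
        return cause, detail, False
    fix(environment, detail)  # Apply the relevant fix (claims 2, 5, 8).
    return cause, detail, True
```

In this sketch a fix is any callable that takes the target environment (a plain dict here) and the detail extracted from the log, for example a function that sets a missing variable. Comparing raw log lines only works when lines carry no timestamps; a real parser would normalize them first.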

Assignees

Inventors

Classifications

  • G06F16/254 · Primary

    Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

  • Query processing · CPC title

  • Ensuring data consistency and integrity · CPC title

  • Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10067993B2 cover?
Handling extract-transform-load (ETL) job mismatches as “exceptions.” Exception handling may include the following steps: (i) determining a mismatch while running an extract-transform-load job with the mismatch being a mismatch of at least one of the following types: design time information mismatch, and/or operational metadata mismatch; and (ii) responsive to determining the mismatch, handling…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/254. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 4, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).