Performance checking component for an ETL job

US9710530B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9710530-B2
Application number: US-201615234139-A
Country: US
Kind code: B2
Filing date: Aug 11, 2016
Priority date: May 30, 2014
Publication date: Jul 18, 2017
Grant date: Jul 18, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Generation of a performance determination report for an Extract, Transform, Load (ETL) job includes decomposing the ETL job into two or more stage instances, and identifying one or more conditions for each of the stage instances. A set of tests for each of the identified conditions are generated. A first set of test results are generated by performing the set of tests. It is determined whether a test result from the first set of test results is outside of a first range. Conditions that can be identified include a non-volatile free memory condition, a network reliability condition, a network configuration condition, an application availability condition, a database availability condition, a database performance condition, a schema validity condition, an installed libraries condition, a configuration parameter condition, a volatile free memory condition, and a third party tool condition.
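The decomposition described in the abstract can be pictured as a lookup from stage instances to conditions to tests. The following is a minimal sketch, not the patented implementation: the stage kinds, condition names, and test names follow the pairing given in the first claim, while `StageInstance`, `decompose_job`, `generate_tests`, and the stage names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageInstance:
    """One stage instance of a decomposed ETL job (hypothetical type)."""
    name: str
    kind: str  # "extraction", "data_validation", "transform", or "transfer"


# Conditions per stage kind, following the pairing stated in the first claim.
CONDITIONS_BY_KIND = {
    "extraction": ["network reliability"],
    "data_validation": ["database performance"],
    "transform": ["third party tool"],
    "transfer": ["network reliability"],
}

# Tests per condition, again following the first claim.
TESTS_BY_CONDITION = {
    "network reliability": ["ping test"],
    "database performance": ["database query test"],
    "third party tool": ["validate transformed data values"],
}


def decompose_job():
    """Return the six stage instances named in the first claim.

    The stage names are assumptions made for this sketch.
    """
    return [
        StageInstance("extract_1", "extraction"),
        StageInstance("extract_2", "extraction"),
        StageInstance("validate_1", "data_validation"),
        StageInstance("validate_2", "data_validation"),
        StageInstance("transform", "transform"),
        StageInstance("transfer", "transfer"),
    ]


def generate_tests(stages):
    """Map each stage name to {condition: [test names]}."""
    plan = {}
    for stage in stages:
        plan[stage.name] = {
            condition: list(TESTS_BY_CONDITION[condition])
            for condition in CONDITIONS_BY_KIND[stage.kind]
        }
    return plan


if __name__ == "__main__":
    for stage_name, tests in generate_tests(decompose_job()).items():
        print(stage_name, tests)
```

The sketch only names the tests; actually running a ping or a database query is left out because the source does not describe how those tests are executed.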

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method performed by a processor, comprising:

  decomposing an Extract, Transform, and Load (ETL) job into two or more stage instances, the two or more stage instances including a first extraction stage instance and a second extraction stage instance, a first data validation stage instance and a second data validation stage instance, a transform stage instance, and a transfer stage instance;

  identifying one or more conditions for each of the stage instances, the one or more conditions including a network reliability condition for the first and second extraction stage instances and for the transfer stage instance, the one or more conditions including a database performance condition for the first and second data validation stage instances, and the one or more conditions including a third party tool condition for the transform stage instance;

  generating a set of tests for each of the one or more conditions, wherein each set of tests includes one or more tests for a corresponding condition, wherein the set of tests for the network reliability condition includes a ping test, the set of tests for the database performance condition includes a database query test, and the set of tests for the third party tool condition includes a test to validate transformed data values;

  extracting data from a first source in the first extraction stage instance and extracting data from a second source in the second extraction stage instance;

  validating data values for the first source in the first data validation stage instance and validating data values from the second source in the second data validation stage instance;

  transforming the data from the first source and second source in the transform stage instance, generating a first joined data;

  transferring the first joined data to one or more target databases in the transfer stage instance;

  generating a first test result for the ETL job by performing a first set of tests during a runtime phase of the ETL job;

  saving the first test result to an archive, the archive including data of a second test result and one or more resolutions for an underperforming condition, the second test result being generated by performing the first set of tests on a prior ETL job;

  determining a first range for the first test result by calculating an average and a standard deviation from two or more historical test results, the two or more historical test results including the second test result, and wherein the first range is one standard deviation from the average;

  determining that the first test result from the first set of test results is outside of the first range by comparing the first test result with the first range;

  generating a performance determination report, the performance determination report including the first test result; and

  providing the performance determination report to a user.
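The range check in the claim (an average and a standard deviation computed from two or more historical test results, with the range set at one standard deviation from the average) can be illustrated with a short calculation. This is a sketch under stated assumptions: the claim does not say whether the standard deviation is the sample or population form, so the sample form (`statistics.stdev`) is assumed here, and the function names and the numeric values are hypothetical.

```python
from statistics import mean, stdev


def first_range(historical_results):
    """Return (low, high): one standard deviation either side of the average.

    Assumption: sample standard deviation; the claim does not specify which
    form is used.
    """
    if len(historical_results) < 2:
        raise ValueError("two or more historical test results are required")
    average = mean(historical_results)
    deviation = stdev(historical_results)
    return (average - deviation, average + deviation)


def is_outside(result, value_range):
    """True when the result lies strictly outside the (low, high) range."""
    low, high = value_range
    return result < low or result > high


def build_report(test_name, result, historical_results):
    """Assemble a small dictionary standing in for the performance report."""
    value_range = first_range(historical_results)
    return {
        "test": test_name,
        "result": result,
        "range": value_range,
        "outside_range": is_outside(result, value_range),
    }


if __name__ == "__main__":
    # Hypothetical ping latencies in milliseconds from prior ETL job runs.
    history = [10.0, 12.0, 14.0]
    print(build_report("ping test", 15.0, history))
```

With the hypothetical history above, the average is 12 and the sample standard deviation is 2, so the range is 10 to 14 and a new result of 15 is flagged as outside it.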

Assignees

Inventors

Classifications

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • for performance assessment · CPC title

  • Physics · mapped topic

  • Logging of test results · CPC title

  • where the computing system component is a software system · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710530B2 cover?
Generation of a performance determination report for an Extract, Transform, Load (ETL) job includes decomposing the ETL job into two or more stage instances, and identifying one or more conditions for each of the stage instances. A set of tests for each of the identified conditions are generated. A first set of test results are generated by performing the set of tests. It is determined whether …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/30563. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 18, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).