Dynamic optimization of workload execution based on statistical data collection and updated job profiling

US9996389B2 · US · B2

Patent metadata
Publication number: US-9996389-B2
Application number: US-201414205080-A
Country: US
Kind code: B2
Filing date: Mar 11, 2014
Priority date: Mar 11, 2014
Publication date: Jun 12, 2018
Grant date: Jun 12, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments presented herein provide techniques for optimizing parallel data flows of a batch processing job using a profile of the processing job. An application retrieves a job profile for a processing job. The processing job has a plurality of processing stages specified in an execution profile. The job profile includes statistical data for at least one of the processing stages obtained during prior executions of the job. The application modifies properties of the execution profile based on the job profile to optimize the execution of the job. The application executes the processing job with the modified execution profile.
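The flow in the abstract (retrieve a job profile, adjust the execution profile, then run with the adjusted profile) can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the patented implementation: the class names `StageParams` and `StageStats`, the function `tune`, the default block size, the `records_per_block` heuristic, and the choice of "zlib" are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class StageParams:
    """Tunable parameters for one stage of the execution profile (hypothetical)."""
    transfer_block_size: int = 32 * 1024   # bytes; assumed default
    compression: Optional[str] = None


@dataclass(frozen=True)
class StageStats:
    """Statistics recorded for one stage during a prior execution (hypothetical)."""
    avg_record_size: int   # bytes
    spilled_bytes: int     # bytes spilled to disk


def tune(execution_profile: Dict[str, StageParams],
         job_profile: Dict[str, StageStats],
         records_per_block: int = 16) -> Dict[str, StageParams]:
    """Return a modified copy of the execution profile based on prior-run statistics."""
    tuned = {}
    for stage, params in execution_profile.items():
        stats = job_profile.get(stage)
        if stats is None:
            # No statistics for this stage: leave its parameters unchanged.
            tuned[stage] = params
            continue
        # Assumed heuristic: a block should hold at least `records_per_block` records.
        wanted = stats.avg_record_size * records_per_block
        block = max(params.transfer_block_size, wanted)
        # Assumed heuristic: enable compression for stages that spilled to disk.
        compression = "zlib" if stats.spilled_bytes > 0 else params.compression
        tuned[stage] = replace(params, transfer_block_size=block,
                               compression=compression)
    return tuned


execution_profile = {"extract": StageParams(), "sort": StageParams(),
                     "load": StageParams()}
job_profile = {"sort": StageStats(avg_record_size=4096, spilled_bytes=10_000_000)}
tuned = tune(execution_profile, job_profile)
print(tuned["sort"])
```

The original execution profile is left untouched and a modified copy is returned, so a run with the tuned profile can be compared against the baseline.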

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor; and a memory storing a program, which, when executed on the processor, performs an operation, the operation comprising: retrieving, via a processor, a job profile for a processing job, wherein the processing job has a plurality of processing stages specified in an execution profile, wherein the execution profile specifies a respective set of one or more parameters for each of one or more of the processing stages of the processing job, and wherein the job profile includes statistical data for at least one of the processing stages obtained during prior executions of the processing job, wherein the statistical data includes an amount of data that was spilled to a disk during at least one of the one or more processing stages, and wherein the statistical data was obtained by, for each of the processing stages during at least a first of the prior executions: identifying a plurality of operations in the processing stage, and during performance of each of the operations: determining whether a flag is set for the operation, wherein the flag indicates to gather statistical data relating to the operation, and upon determining that a flag is set, collecting the statistical data relating to the operation, identifying a plurality of optimizations to apply to the execution profile based on the statistical data and based on an indication that at least a first one of the processing stages of the processing job was selected for optimization during one of the prior executions, receiving, via a user interface, a selection of at least a first optimization of the identified optimizations to apply to the execution profile, wherein the first optimization comprises a selection of a compression algorithm; modifying the respective set of one or more parameters of the execution profile for the at least the first one of the processing stages of the processing job by applying the selected first optimization to the at least the first one of the processing stages to optimize an execution environment used to execute the processing job, and executing the processing job in the optimized execution environment based on the modified execution profile.

2. The system of claim 1, wherein the statistical data includes at least one of a total data size processed in the processing stages, an average record size processed by data transfer blocks, and an amount of system resource usage for each processing stage.

3. The system of claim 1, wherein the job profile was generated during a prior execution of the processing job by monitoring the processing stages during the prior execution.

4. The system of claim 1, wherein the job profile comprises a plurality of job profiles generated during a corresponding plurality of prior executions of the processing job.

5. The system of claim 1, wherein the processing job is an extract, transform, and load job.

6. The system of claim 1, wherein the modified respective set of one or more parameters includes at least one of a transfer block size, table chaining, and a record layout specified in the execution profile.

7. A computer program product, comprising: a non-transitory computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code configured to perform an operation, the operation comprising: retrieving, via a processor, a job profile for a processing job, wherein the processing job has a plurality of processing stages specified in an execution profile, wherein the execution profile specifies a respective set of one or more parameters for each of one or more of the processing stages of the processing job, and wherein the job profile includes statistical data for at least one of the processing stages obtained during a prior execution of the processing job, wherein the statistical data includes an amount of data that was spilled to a disk during at least one of the one or processing stages, and wherein the statistical data was obtained by, for each of the processing stages during at least a first of the prior executions: identifying a plurality of operations in the processing stage, and during performance of each of the operations: determining whether a flag is set for the operation, wherein the flag indicates to gather statistical data relating to the operation, and upon determining that a flag is set, collecting the statistical data relating to the operation; identifying a plurality of optimizations to apply to the execution profile based on the statistical data and based on an indication that at least a first one of the processing stages of the processing job was selected for optimization during one of the prior executions, receiving, via a user interface, a selection of at least a first optimization of the identified optimizations to apply to the execution profile, wherein the first optimization comprises a selection of a compression algorithm; modifying the respective set of one or more parameters of the execution profile for the at least the first one of the processing stages of the processing job by applying the selected first optimization to the at least the first one of the processing stages to optimize an execution environment used to execute the processing job; and executing the processing job in the optimized execution environment based on the modified execution profile.

8. The computer program product of claim 7, wherein the statistical data includes at least one of a total data size processed in the processing stages, an average record size processed by data transfer blocks, and an amount of system resource usage for each processing stage.

9. The computer program product of claim 7, wherein the job profile was generated during a prior execution of the processing job by monitoring the processing stages during the prior execution.

10. The computer program product of claim 7, wherein the operation further comprises, receiving a specification of the parameters of the execution profile which are available to modify based on the job profile.

11. The computer program product of claim 7, wherein the job profile comprises a plurality of job profiles generated during a corresponding plurality of prior executions of the processing job.

12. The computer program product of claim 7, wherein the processing job is an extract, transform, and load job.

13. The computer program product of claim 7, wherein the modified respective set of one or more parameters includes at least one of a transfer block size, table chaining, and a record layout specified in the execution profile.
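The independent claims describe three steps that are easy to lose in the legal phrasing: statistics are collected only for operations whose flag is set, candidate optimizations are derived from those statistics, and a user picks one (a compression algorithm) to apply to the stage parameters. The Python sketch below walks through those steps under loudly stated assumptions: `Operation`, `run_stage`, `identify_optimizations`, and `apply_selection` are hypothetical names, the spill sizes are made-up inputs, and the two compression choices are placeholders, not options named in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Operation:
    """One operation inside a processing stage (hypothetical model)."""
    name: str
    run: Callable[[], int]        # returns bytes spilled to disk (assumed metric)
    collect_stats: bool = False   # the per-operation flag described in the claims


def run_stage(operations: List[Operation]) -> Dict[str, int]:
    """Run every operation; record statistics only where the flag is set."""
    stats: Dict[str, int] = {}
    for op in operations:
        spilled = op.run()
        if op.collect_stats:
            stats[op.name] = spilled
    return stats


def identify_optimizations(stats: Dict[str, int]) -> List[str]:
    """Offer compression choices if any flagged operation spilled data (assumed rule)."""
    if sum(stats.values()) > 0:
        return ["compression=zlib", "compression=lzma"]
    return []


def apply_selection(params: Dict[str, str], selection: str) -> Dict[str, str]:
    """Return a copy of the stage parameters with the selected optimization applied."""
    key, value = selection.split("=", 1)
    return {**params, key: value}


ops = [
    Operation("sort", run=lambda: 5_000_000, collect_stats=True),
    Operation("dedupe", run=lambda: 1_000_000),  # flag not set, so nothing is recorded
]
stats = run_stage(ops)
candidates = identify_optimizations(stats)
selected = candidates[0]  # stands in for the selection received via a user interface
stage_params = apply_selection({"transfer_block_size": "32768"}, selected)
print(stats, stage_params)
```

Gating collection on a per-operation flag keeps profiling overhead limited to the operations someone has chosen to measure, which is one plausible reading of why the claims include the flag check.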

Assignees

Inventors

Classifications

  • Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation {; Recording or statistical evaluation of user activity, e.g. usability assessment} · CPC title

  • G06F9/4843 · Primary

    by program, e.g. task dispatcher, supervisor, operating system · CPC title

  • Allocation of resources, e.g. of the central processing unit [CPU] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9996389B2 cover?
Embodiments presented herein provide techniques for optimizing parallel data flows of a batch processing job using a profile of the processing job. An application retrieves a job profile for a processing job. The processing job has a plurality of processing stages specified in an execution profile. The job profile includes statistical data for at least one of the processing stages obtained duri…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/4843. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 12, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).