Snapshots to train prediction models and improve workflow execution
US-10909503-B1 · Feb 2, 2021 · US
US11119879B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11119879-B2 |
| Application number | US-201816038373-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 18, 2018 |
| Priority date | Jul 18, 2018 |
| Publication date | Sep 14, 2021 |
| Grant date | Sep 14, 2021 |
Official abstract text for this publication.
Techniques are provided for detecting resource bottlenecks in workflow task executions using provenance data. An exemplary method comprises: obtaining a state of multiple workflow executions of multiple concurrent workflows performed with different resource allocation configurations in a shared infrastructure environment; obtaining first and second signature execution traces of a task representing first and second resource allocation configurations, respectively; identifying first and second corresponding sequences of time intervals in the first and second signature execution traces for the task, respectively, based on a similarity metric; and identifying a given time interval as a resource bottleneck of a resource that differs between the first and second resource allocation configurations based on a change in execution time for the given time interval between the first and second signature execution traces. The first signature execution trace may be obtained by disaggregating data related to batches of workflow executions.
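The two identification steps in the abstract (matching intervals across two signature execution traces, then flagging intervals whose execution time changes) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Interval` record, the telemetry features, the inverse-distance `similarity` function, the order-preserving dynamic-programming alignment, and the `min_speedup` threshold are all hypothetical choices.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Interval:
    duration: float               # seconds spent in this interval
    telemetry: tuple[float, ...]  # hypothetical features, e.g. (cpu_util, io_util)


def similarity(a: Interval, b: Interval) -> float:
    """Assumed metric: 1.0 for identical telemetry, approaching 0 as it diverges."""
    return 1.0 / (1.0 + dist(a.telemetry, b.telemetry))


def align(first: list[Interval], second: list[Interval]) -> list[tuple[int, int]]:
    """Order-preserving matching of intervals that maximizes total similarity."""
    n, m = len(first), len(second)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j],  # leave first[i-1] unmatched
                score[i][j - 1],  # leave second[j-1] unmatched
                score[i - 1][j - 1] + similarity(first[i - 1], second[j - 1]),
            )
    # Backtrack through the table to recover the matched index pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j]:
            i -= 1
        elif score[i][j] == score[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
    return pairs[::-1]


def bottlenecks(first, second, min_speedup=0.2):
    """Return (i, j, relative_change) for matched intervals that got faster."""
    flagged = []
    for i, j in align(first, second):
        if first[i].duration <= 0:
            continue  # avoid dividing by zero on empty intervals
        change = (first[i].duration - second[j].duration) / first[i].duration
        if change >= min_speedup:
            flagged.append((i, j, change))
    return flagged


# Hypothetical traces of one task: the second configuration has more CPU cores.
first_trace = [
    Interval(10.0, (0.10, 0.90)),  # load
    Interval(40.0, (0.95, 0.10)),  # compute
    Interval(8.0, (0.10, 0.80)),   # write
]
second_trace = [
    Interval(10.0, (0.10, 0.90)),
    Interval(21.0, (0.90, 0.10)),
    Interval(8.0, (0.10, 0.80)),
]
result = bottlenecks(first_trace, second_trace)
print(result)
```

In this made-up example only the middle interval shortens when cores are added, so it is the one flagged as sensitive to the resource that differs between the two configurations. A real system would need a similarity metric and threshold chosen for its own telemetry.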
Opening claim text (preview).
What is claimed is:
1. A method, comprising: obtaining a state of multiple workflow executions of a plurality of concurrent workflows in a shared infrastructure environment, wherein said multiple workflow executions are performed with a plurality of different resource allocation configurations, wherein said state comprises provenance data of said multiple workflow executions and wherein each of said multiple workflow executions is comprised of one or more tasks; obtaining a first signature execution trace of at least one task within the plurality of concurrent workflows representing a first resource allocation configuration, and a second signature execution trace of said at least one task within the plurality of concurrent workflows representing a second resource allocation configuration; identifying, using at least one processing device, a first sequence of time intervals in said first signature execution trace for said at least one task that corresponds to a second sequence of time intervals in said second signature execution trace for said at least one task based on a similarity metric; and identifying, using the at least one processing device, a given time interval in said first and second corresponding sequences of time intervals for said at least one task as a resource bottleneck of one or more resources that differ between said first resource allocation configuration and said second resource allocation configuration based on a change in execution time for the given time interval between the first signature execution trace and the second signature execution trace.
2. The method of claim 1, wherein the resource bottleneck identifies said at least one task within the plurality of concurrent workflows as responsive to changes in the resource allocation of the corresponding resources.
3. The method of claim 2, wherein the allocation of said corresponding resources in a new execution of a plurality of workflows comprised of said at least one task is adjusted to substantially minimize said resource bottleneck.
4. The method of claim 2, wherein a new execution of a plurality of workflows comprised of said at least one task is substantially optimized by adjusting said corresponding resources in order to substantially minimize said resource bottleneck.
5. The method of claim 1, wherein said step of obtaining said first signature execution trace comprises disaggregating data related to batches of executions of the plurality of concurrent workflows.
6. The method of claim 1, wherein said step of identifying said first and second sequences of time intervals comprises aligning said first sequence of time intervals and said second sequence of time intervals based on telemetry data information in said first signature execution trace and second signature execution trace.
7. The method of claim 1, wherein step of said identifying said first and second sequences of time intervals comprises substantially maximizing the similarity metric.
8. A system, comprising: a memory; and at least one processing device, coupled to the memory, operative to implement the following steps: obtaining a state of multiple workflow executions of a plurality of concurrent workflows in a shared infrastructure environment, wherein said multiple workflow executions are performed with a plurality of different resource allocation configurations, wherein said state comprises provenance data of said multiple workflow executions and wherein each of said multiple workflow executions is comprised of one or more tasks; obtaining a first signature execution trace of at least one task within the plurality of concurrent workflows representing a first resource allocation configuration, and a second signature execution trace of said at least one task within the plurality of concurrent workflows representing a second resource allocation configuration; identifying a first sequence of time intervals in said first signature execution trace for said at least one task that corresponds to a second sequence of time intervals in said second signature execution trace for said at least one task based on a similarity metric; and identifying a given time interval in said first and second corresponding sequences of time intervals for said at least one task as a resource bottleneck of one or more resources that differ between said first resource allocation configuration and said second resource allocation configuration based on a change in execution time for the given time interval between the first signature execution trace and the second signature execution trace.
9. The system of claim 8, wherein the resource bottleneck identifies said at least one task within the plurality of concurrent workflows as responsive to changes in the resource allocation of the corresponding resources.
10. The system of claim 9, wherein the allocation of said corresponding resources in a new execution of a plurality of workflows comprised of said at least one task is adjusted to substantially minimize said resource bottleneck.
11. The system of claim 9, wherein a new execution of a plurality of workflows comprised of said at least one task is substantially optimized by adjusting said corresponding resources in order to substantially minimize said resource bottleneck.
12. The system of claim 8, wherein said step of obtaining said first signature execution trace comprises disaggregating data related to batches of executions of the plurality of concurrent workflows.
13. The system of claim 8, wherein said step of identifying said first and second sequences of time intervals comprises aligning said first sequence of time intervals and said second sequence of time intervals based on telemetry data information in said first signature execution trace and second signature execution trace.
14. The system of claim 8, wherein step of said identifying said first and second sequences of time intervals comprises substantially maximizing the similarity metric.
15. A computer program product, comprising a tangible machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by at least one processing device perform the following steps: obtaining a state of multiple workflow executions of a plurality of concurrent workflows in a shared infrastructure environment, wherein said multiple workflow executions are performed with a plurality of different resource allocation configurations, wherein said state comprises provenance data of said multiple workflow executions and wherein each of said multiple workflow executions is comprised of one or more tasks; obtaining a first signature execution trace of at least one task within the plurality of concurrent workflows representing a first resource allocation configuration, and a second signature execution trace of said at least one task within the plurality of concurrent workflows representing a second resource allocation configuration; identifying a first sequence of time intervals in said first signature execution trace for said at least one task that corresponds to a second sequence of time intervals in said second signature execution trace for said at least one task based on a similarity metric; and identifying a given time interval in said first and second corresponding sequences of time intervals for said at least one task as a resource bottleneck of one or more resources that differ between said first resource allocation configuration and said second resource allocation configuration based on a change in execution time for the given time interval between the first
by assessing time · CPC title
where the assessed time is active or idle time · CPC title
Performance evaluation by tracing or monitoring · CPC title
considering the load · CPC title
Timestamp · CPC title