Workflow simulation using provenance data similarity and sequence alignment

US11263369B2 · US · B2

Patent metadata
Publication number: US-11263369-B2
Application number: US-201816023116-A
Country: US
Kind code: B2
Filing date: Jun 29, 2018
Priority date: Jun 29, 2018
Publication date: Mar 1, 2022
Grant date: Mar 1, 2022

Abstract

Official abstract text for this publication.

Techniques are provided for workflow simulation using provenance data similarity and sequence alignment. An exemplary method comprises: obtaining a state of workflow executions of concurrent workflows with multiple resource allocation configurations, wherein the state comprises provenance data of the concurrent workflows; obtaining execution traces of the concurrent workflows representing different resource allocation configurations; identifying a set of states in a first execution trace and a set of states in a second execution trace as corresponding anchor states; mapping a first intermediate state to a second intermediate state between a pair of anchor states using the provenance data; generating a simulation model of the workflow executions representing the different configurations of the resource allocation; and generating new simulation traces of the workflow executions with resource allocation configurations that are not represented in the provenance data.
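
The anchor-state idea in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical representation that the patent text does not specify: each trace is a list of snapshots, and each snapshot is a tuple of completed-task counts per task type taken from the provenance data. States whose counts first appear in both traces are treated as anchors, and an intermediate state is placed proportionally between the enclosing anchors (the linear mapping that claim 2 mentions). The names `find_anchors` and `map_state` are illustrative only.

```python
def find_anchors(trace_a, trace_b):
    """Pair the first index of every snapshot that occurs in both traces.

    Assumption: task counts never decrease over time, so the returned
    pairs are increasing in both coordinates.
    """
    first_in_b = {}
    for j, state in enumerate(trace_b):
        first_in_b.setdefault(state, j)  # keep the earliest occurrence only

    anchors, seen = [], set()
    for i, state in enumerate(trace_a):
        if state in first_in_b and state not in seen:
            seen.add(state)
            anchors.append((i, first_in_b[state]))
    return anchors


def map_state(i, anchors):
    """Map index i of trace A to a (possibly fractional) position in trace B
    by linear interpolation between the enclosing pair of anchors."""
    for (a0, b0), (a1, b1) in zip(anchors, anchors[1:]):
        if a0 <= i <= a1:
            fraction = (i - a0) / (a1 - a0)
            return b0 + fraction * (b1 - b0)
    raise ValueError("index lies outside the anchored range")


# Hypothetical traces: (completed tasks of type X, completed tasks of type Y),
# one snapshot per time step, under two resource allocation configurations.
trace_a = [(0, 0), (0, 0), (1, 0), (1, 0), (1, 1), (1, 1), (2, 1)]
trace_b = [(0, 0), (0, 0), (0, 0), (0, 0), (1, 0), (1, 1), (1, 1), (1, 1), (2, 1)]

anchors = find_anchors(trace_a, trace_b)
mapped = [map_state(i, anchors) for i in range(len(trace_a))]
```

Because every intermediate state is interpolated inside its own anchor interval, the mapped state always falls between the corresponding anchors of the second trace, which is the constraint the first claim places on the mapping.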

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: obtaining a state of one or more workflow executions of a plurality of concurrent workflows in a shared infrastructure environment with a plurality of different resource allocation configurations, wherein said state comprises provenance data of said concurrent workflows; obtaining a first execution trace of the concurrent workflows representing a first resource allocation configuration, and a second execution trace of the concurrent workflows representing a second resource allocation configuration; identifying a first set of first states in said first execution trace and a second set of second states in said second execution trace as corresponding anchor states; mapping, using at least one processing device, a first intermediate state between a pair of said anchor states in said first execution trace to a second intermediate state between a pair of said anchor states in said second execution trace using the provenance data such that said second intermediate state is between the anchor states in said second execution trace that correspond to said pair of anchor states in the first execution trace, wherein the provenance data additionally comprises telemetry data, and wherein the mapping is further based on a predefined similarity metric with respect to a consumption of one or more resources by the first intermediate state and the second intermediate state; generating a simulation model of said one or more workflow executions representing a plurality of the different configurations of the resource allocation; and generating, using the at least one processing device, one or more new simulation traces of said one or more workflow executions with one or more resource allocation configurations that are not represented in the provenance data.

2. The method of claim 1, wherein the first intermediate state and the second intermediate state are between two anchor states that satisfy predefined task completion criteria and the mapping of the first intermediate state to the second intermediate state is based on a linear mapping between the two anchor states.

3. The method of claim 1, wherein the mapping is based on a completion of tasks, obtained from an analysis of the provenance data, of the plurality of concurrent workflows.

4. The method of claim 1, wherein the provenance data indicates a number of substantially completed tasks of each type for each of the plurality of concurrent workflows, and wherein the mapping is based on a partial completion of tasks, obtained from an analysis of the provenance data, of the plurality of concurrent workflows based on a task completion cost function.

5. The method of claim 1, wherein the consumption of resources by the first intermediate state and the second intermediate state correlate with a type of work being executed for the first state and the second state, and wherein the mapping further comprises the step of aligning the telemetry data for two different workflow executions.

6. The method of claim 1, wherein said mapping is performed for a first plurality of said intermediate states in said first execution trace to a second plurality of said intermediate states in said second execution trace such that ordering relations between states in said first plurality of intermediate states is preserved in said second plurality of intermediate states.

7. The method of claim 1, further comprising adjusting an allocation of at least one resource for an execution of said plurality of concurrent workflows based on said mapping.

8. A system, comprising: a memory; and at least one processing device, coupled to the memory, operative to implement the following steps: obtaining a state of one or more workflow executions of a plurality of concurrent workflows in a shared infrastructure environment with a plurality of different resource allocation configurations, wherein said state comprises provenance data of said concurrent workflows; obtaining a first execution trace of the concurrent workflows representing a first resource allocation configuration, and a second execution trace of the concurrent workflows representing a second resource allocation configuration; identifying a first set of first states in said first execution trace and a second set of second states in said second execution trace as corresponding anchor states; mapping, using the at least one processing device, a first intermediate state between a pair of said anchor states in said first execution trace to a second intermediate state between a pair of said anchor states in said second execution trace using the provenance data such that said second intermediate state is between the anchor states in said second execution trace that correspond to said pair of anchor states in the first execution trace, wherein the provenance data additionally comprises telemetry data, and wherein the mapping is further based on a predefined similarity metric with respect to a consumption of one or more resources by the first intermediate state and the second intermediate state; generating a simulation model of said one or more workflow executions representing a plurality of the different configurations of the resource allocation; and generating, using the at least one processing device, one or more new simulation traces of said one or more workflow executions with one or more resource allocation configurations that are not represented in the provenance data.

9. The system of claim 8, wherein the first intermediate state and the second intermediate state are between two anchor states that satisfy predefined task completion criteria and the mapping of the first intermediate state to the second intermediate state is based on a linear mapping between the two anchor states.

10. The system of claim 8, wherein the mapping is based on a completion of tasks, obtained from an analysis of the provenance data, of the plurality of concurrent workflows.

11. The system of claim 8, wherein the provenance data indicates a number of substantially completed tasks of each type for each of the plurality of concurrent workflows, and wherein the mapping is based on a partial completion of tasks, obtained from an analysis of the provenance data, of the plurality of concurrent workflows based on a task completion cost function.

12. The system of claim 8, wherein the consumption of resources by the first intermediate state and the second intermediate state correlate with a type of work being executed for the first state and the second state, and wherein the mapping further comprises the step of aligning the telemetry data for two different workflow executions.

13. The system of claim 8, wherein said mapping is performed for a first plurality of said intermediate states in said first execution trace to a second plurality of said intermediate states in said second execution trace such that ordering relations between states in said first plurality of intermediate states is preserved in said second plurality of intermediate states.

14. The system of claim 8, further comprising adjusting an allocation of at least one resource for an execution of said plurality of concurrent workflows based on said mapping.

15. A computer program product, comprising a non-transitory machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by at least one processing device perform the following steps: obtaining a state of one or more workflow executions of a plurality of concurrent workflows in a shared infrastructure environment with a plurality o
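
The claims above call for a mapping based on a predefined similarity metric over resource consumption, with telemetry aligned across two executions and ordering relations preserved. The sketch below is one possible way to realize such an alignment, not the method the patent discloses: it swaps in classic dynamic time warping (DTW) as the sequence-alignment technique and absolute difference in CPU utilization as the similarity metric. Both choices, the function name `align_telemetry`, and the sample values are assumptions made for illustration.

```python
def align_telemetry(seg_a, seg_b, dist=lambda x, y: abs(x - y)):
    """Align two telemetry segments (e.g. CPU utilization between one pair of
    anchor states) with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    The path never moves backwards in either segment, so the order of states
    in seg_a is preserved among their matches in seg_b.
    """
    n, m = len(seg_a), len(seg_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of seg_a[:i] with seg_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seg_a[i - 1], seg_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j - 1],  # advance both
                                    cost[i - 1][j],      # advance seg_a only
                                    cost[i][j - 1])      # advance seg_b only

    # Walk back from the end to recover the matched index pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Hypothetical CPU utilization (%) sampled between the same pair of anchor
# states under two different resource allocation configurations.
cpu_a = [10, 10, 90, 90]
cpu_b = [10, 90, 90, 90, 90]

total_cost, path = align_telemetry(cpu_a, cpu_b)
```

A monotone warping path is a natural fit for the ordering requirement in claims 6 and 13, since a later state in the first trace can never be matched to an earlier state in the second. Any other order-preserving aligner, or a different similarity metric, could be substituted.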

Assignees

Inventors

Classifications

  • Workflow analysis · CPC title

  • G06F30/20 · Primary

    Design optimisation, verification or simulation (optimisation, verification or simulation of circuit designs G06F30/30) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11263369B2 cover?
Techniques are provided for workflow simulation using provenance data similarity and sequence alignment. An exemplary method comprises: obtaining a state of workflow executions of concurrent workflows with multiple resource allocation configurations, wherein the state comprises provenance data of the concurrent workflows; obtaining execution traces of the concurrent workflows representing diffe…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06Q10/0633. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 1, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).