Workflow service using state transfer
US-2017093988-A1 · Mar 30, 2017 · US
US9684543B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9684543-B1 |
| Application number | US-201715425749-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 6, 2017 |
| Priority date | Feb 5, 2016 |
| Publication date | Jun 20, 2017 |
| Grant date | Jun 20, 2017 |
Official abstract text for this publication.
An apparatus includes a processor and a storage storing instructions causing the processor to: maintain a federated area; receive a request to perform a job flow with a data set from a remote device; retrieve a job flow definition specifying the tasks of the job flow from the federated area; determine whether there is an instance log in the federated area generated by a previous performance of the job flow with the data set; in response to there being such an instance log, compare the version specified in the instance log of each task routine for each task to the most recent version stored in the federated area; and in response to each version specified in the instance log matching the most recent version, provide the remote device with access to a result report generated by the previous performance in lieu of generating a new result report.
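The reuse-or-run decision in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: in-memory dictionaries stand in for the federated area, and the names `FederatedArea`, `InstanceLog`, `perform_job_flow`, and the `run_task` callback are hypothetical. As a simplification, a version mismatch here re-runs every task of the job flow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class InstanceLog:
    # Version of each task routine used by the logged performance, keyed by task id.
    routine_versions: Dict[str, int]
    # Result report produced by that performance.
    result_report: str


@dataclass
class FederatedArea:
    # Hypothetical in-memory stand-in for the federated area.
    job_flows: Dict[str, List[str]] = field(default_factory=dict)      # flow id -> ordered task ids
    latest_versions: Dict[str, int] = field(default_factory=dict)      # task id -> most recent version
    instance_logs: Dict[Tuple[str, str], InstanceLog] = field(default_factory=dict)


def perform_job_flow(
    area: FederatedArea,
    flow_id: str,
    data_set_id: str,
    run_task: Callable[[str, int, str], str],
) -> Tuple[str, bool]:
    """Return (result_report, reused_previous_report)."""
    tasks = area.job_flows[flow_id]
    current = {task: area.latest_versions[task] for task in tasks}

    # Look for a log from a previous performance of this flow with this data set.
    log = area.instance_logs.get((flow_id, data_set_id))
    if log is not None and all(log.routine_versions.get(t) == current[t] for t in tasks):
        # Every logged version matches the most recent one: reuse the old report.
        return log.result_report, True

    # No log (or a stale one): execute the most recent routines in order.
    outputs = [run_task(task, current[task], data_set_id) for task in tasks]
    report = "; ".join(outputs)
    area.instance_logs[(flow_id, data_set_id)] = InstanceLog(dict(current), report)
    return report, False
```

A second request for the same job flow and data set returns the stored report without calling `run_task`, as long as no task routine has been given a newer version in the meantime.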
Opening claim text (preview).
The invention claimed is:

1. An apparatus comprising a processor and a storage to store instructions that, when executed by the processor, cause the processor to perform operations comprising: maintain, within one or more storage devices, a federated area to store multiple data sets, multiple job flow definitions, multiple task routines, multiple result reports and multiple instance logs; provide, on a network, a portal to control access by a remote device to the federated area via the network; receive, at the portal, and from the remote device via the network, a first request to execute at least one task routine stored in the federated area to perform at least one corresponding task of a job flow described in a job flow definition stored in the federated area with at least one data set stored in the federated area, wherein the first request specifies the job flow definition and the at least one data set; retrieve the job flow definition from among the multiple job flow definitions stored in the federated area, wherein the job flow definition comprises a flow task identifier to identify each task of the job flow and specifies a relative order in which each task is to be performed in the job flow; for each task of the job flow, retrieve, from among the multiple task routines stored in the federated area, a most recent version of the corresponding task routine of the at least one task routine; determine whether there is an instance log among the multiple instance logs stored in the federated area that was generated by a previous performance of the at least one task of the job flow with the at least one data set; and in response to a determination that there is an instance log among the multiple instance logs stored in the federated area that was generated by a previous performance of the at least one task of the job flow with the at least one data set, perform operations comprising: retrieve, from among the multiple task routines stored in the federated area, a version specified by the instance log of each task routine of the at least one task routine; for each task of the at least one task of the job flow, compare the version specified by the instance log of each task routine of the at least one task routine to the most recent version of each task routine of the at least one task routine; in response to each version specified by the instance log of each task routine of the at least one task routine matching the most recent version of the same task routine, perform operations comprising: retrieve a result report that was generated by the previous performance of the at least one task of the job flow along with the instance log; and provide access to the result report to the remote device via the network; and in response to a determination that there is no instance log among the multiple instance logs stored in the federated area that was generated by a previous performance of the at least one task of the job flow with the at least one data set, perform operations comprising: retrieve the at least one data set from among the multiple data sets stored in the federated area; execute the most recent version of each task routine of the at least one task routine to perform the at least one corresponding task of the job flow with the at least one data set to generate a new result report and a new instance log; store the new result report among the multiple result reports in the federated area; store the new instance log among the multiple instance logs in the federated area; and provide access to the new result report to the remote device via the network.

2. The apparatus of claim 1, wherein, in the generation of the new instance log, the processor is caused to perform operations comprising: take at least a first hash of the at least one data set; take at least a second hash of the retrieved version of a task routine of the at least one task routine; take at least a third hash of the new result report; concatenate at least the first, second and third hashes to generate a string; and generate the new instance log to comprise the string.

3. The apparatus of claim 1, wherein the processor is caused to, in response to one version specified by the instance log of a task routine of the at least one task routine not matching the most recent version of the same task routine, perform operations comprising: retrieve the at least one data set from among the multiple data sets stored in the federated area; starting with an earliest task to be performed of the at least one task of the job flow indicated in the job flow definition, identify the earliest task for which the version of the corresponding task routine specified by the instance log does not match the most recent version of the same task routine; for each task of the at least one task of the job flow, starting with the identified earliest task, execute the most recent version of the corresponding task routine of the at least one task routine to generate a new result report and a new instance log; store the new result report among the multiple result reports in the federated area; store the new instance log among the multiple instance logs in the federated area; and provide access to the new result report to the remote device via the network.

4. The apparatus of claim 1, wherein: the determination of whether there is an instance log among the multiple instance logs stored in the federated area that was generated by a previous performance of the at least one task of the job flow with the at least one data set comprises a determination, by the processor, of whether there is more than one instance log among the multiple instance logs that were each generated by a previous performance of the at least one task of the job flow with the at least one data set; and in response to a determination that there is more than one instance log among the multiple instance logs that were each generated by a previous performance of the at least one task of the job flow with the at least one data set, the processor is caused to retrieve, from among the multiple task routines stored in the federated area, a version of each task routine of the at least one task routine specified by the most recently generated one of the more than one instance logs.

5. The apparatus of claim 1, wherein, in the comparison of the version specified by the instance log of each task routine to the most recent version of each task routine for each task of the at least one task of the job flow, the processor is caused to compare a hash taken of the version specified by the instance log of each task routine to a hash taken of the most recent version of each task routine for each task of the at least one task of the job flow.

6. The apparatus of claim 1, wherein, in the determination of whether there is an instance log stored among the multiple instance logs that was generated by a previous performance of the at least one task of the job flow with the at least one data set, the processor is caused to perform operations comprising: use at least one data set identifier of the at least one data set and a flow identifier of the job flow as portions of an index to a location in the federated area; and search the location for an instance log that was generated by a previous performance of the at least one task of the job flow with the at least one data set.

7. The apparatus of claim 1, wherein the processor is caused to perform operations comprising: receive, at the portal, and from a source device via the network, a second request to store a task routine among the multiple task routines in the federated area, wherein the task routine comprises a flow task identifier to indicate a corresponding task that is performed when the task routine is executed;
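The hash-based steps in claims 2, 3 and 5 can be illustrated with a short sketch. The claims do not name a hash function, so SHA-256 is an assumption here, and the helper names `sha256_hex`, `build_instance_log_string`, and `earliest_stale_task` are hypothetical. The second function returns only the position at which re-execution would begin; it does not run any task routines.

```python
import hashlib
from typing import Dict, List, Optional


def sha256_hex(payload: bytes) -> str:
    # Assumed hash function; the claims only say "hash".
    return hashlib.sha256(payload).hexdigest()


def build_instance_log_string(data_set: bytes, task_routine: bytes, result_report: bytes) -> str:
    # Claim 2: hash the data set, the task routine and the result report,
    # then concatenate the three hashes into one string for the instance log.
    first = sha256_hex(data_set)
    second = sha256_hex(task_routine)
    third = sha256_hex(result_report)
    return first + second + third


def earliest_stale_task(
    ordered_tasks: List[str],
    logged_hashes: Dict[str, str],
    latest_routines: Dict[str, bytes],
) -> Optional[int]:
    # Claims 3 and 5: walk the tasks in job-flow order and compare the hash
    # recorded in the instance log with the hash of the most recent routine.
    # Return the index of the first mismatch, or None if every routine matches.
    for index, task in enumerate(ordered_tasks):
        if logged_hashes.get(task) != sha256_hex(latest_routines[task]):
            return index
    return None
```

For a flow ordered as `load`, `clean`, `fit` in which only the `clean` routine has changed since the logged performance, `earliest_stale_task` returns `1`, so a caller following claim 3 would execute `clean` and `fit` again with their most recent versions.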
Task life-cycle, e.g. stopping, restarting, resuming execution (G06F9/4881 takes precedence) · CPC title
in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title
Techniques for rebalancing the load in a distributed system · CPC title
by using string matching techniques · CPC title