Data provenance in computing infrastructure

US9710332B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9710332-B1
Application number: US-201414151228-A
Country: US
Kind code: B1
Filing date: Jan 9, 2014
Priority date: Dec 21, 2011
Publication date: Jul 18, 2017
Grant date: Jul 18, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for generating data provenance associated with a computing system. For example, a method comprises the following steps. Information associated with the execution of a given process in a given computing environment in accordance with a given process data set is captured. A provenance data set is generated based on the captured information. The generated provenance data set comprises one or more states associated with one or more execution components of the given computing environment that existed during execution of the given process, the one or more execution components comprising one or more virtual machines and one or more storage units. At least a portion of the generated provenance data set may be utilized to revert the computing environment back to the one or more states associated with the one or more execution components of the given computing environment that existed during the execution of the given process.
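The capture, generate, and revert flow described in the abstract can be illustrated with a minimal in-memory sketch. Everything here is an assumption for illustration only: the class names `ComponentState` and `ProvenanceDataSet`, the `"vm"`/`"storage"` labels, and the opaque snapshot references are hypothetical and are not taken from the patent or from any real product API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ComponentState:
    """State of one execution component (a VM or a storage unit) at capture time."""
    component_id: str
    kind: str          # "vm" or "storage" (hypothetical labels)
    snapshot_ref: str  # opaque reference to a snapshot assumed to exist elsewhere


@dataclass
class ProvenanceDataSet:
    """Ordered list of captured environment states for one process run."""
    process_name: str
    states: List[Dict[str, ComponentState]] = field(default_factory=list)

    def capture(self, components: List[ComponentState]) -> int:
        """Record the current state of every component and return its index."""
        self.states.append({c.component_id: c for c in components})
        return len(self.states) - 1

    def revert_plan(self, index: int) -> List[str]:
        """List the snapshot references needed to restore the state at `index`."""
        return sorted(s.snapshot_ref for s in self.states[index].values())


# Two captures during one (hypothetical) process run.
prov = ProvenanceDataSet("nightly-report")
first = prov.capture([
    ComponentState("vm-1", "vm", "vm-1@t0"),
    ComponentState("vol-1", "storage", "vol-1@t0"),
])
prov.capture([
    ComponentState("vm-1", "vm", "vm-1@t1"),
    ComponentState("vol-1", "storage", "vol-1@t1"),
])

# Reverting to the first capture needs the t0 snapshots of both components.
plan = prov.revert_plan(first)
print(plan)
```

The sketch only records which snapshots a revert would need; actually restoring virtual machines and storage units depends on the virtualization and storage platform, which the abstract does not specify.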

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising the steps of:

capturing information during execution of a given application in a given computing environment, wherein the given application is executed by execution components of the given computing environment in accordance with a given application data set, the execution components comprising one or more virtual machines and one or more associated external storage volumes, and wherein the information capturing step further comprises capturing one or more complex asset representations of the given computing environment, each complex asset representation being an abstraction of the given computing environment for the given application across the execution components of the given computing environment, and wherein each complex asset representation comprises state information for a given time corresponding to the execution components of the given computing environment;

generating a provenance data set based on the captured information, the generated provenance data set comprising provenance metadata associated with two or more provenance nodes corresponding to respective ones of the one or more captured complex asset representations, wherein each of the provenance nodes represents one or more states associated with the execution components of the given computing environment that existed during the execution of the given application at a particular time;

providing the provenance data via a user interface in the form of a directed acyclic graph, wherein the user interface is configured to browse provenance of a current version of the given application data set by traversing the two or more provenance nodes in the directed acyclic graph, and wherein the user interface is further configured to query the provenance metadata associated with individual ones of the two or more provenance nodes; and

utilizing at least a portion of the generated provenance data set to revert the execution components of the given computing environment back to at least one of the one or more states associated with the execution components of the given computing environment that existed during the execution of the given application responsive to input received via the user interface selecting a given one of the two or more provenance nodes in the directed acyclic graph, wherein the execution components of the given computing environment are reverted back to the at least one state by: extracting the complex asset representation for the at least one state from the provenance data set; unlocking the one or more external storage volumes corresponding to the extracted complex asset representation; resuming the one or more virtual machines corresponding to the extracted complex asset representation; and re-executing the given application for the given application data set with the one or more unlocked external storage volumes and the one or more resumed virtual machines;

wherein the steps are performed by at least one processing device comprising a processor coupled to a memory.

2. The method of claim 1, wherein the execution components of the given computing environment comprise one or more virtual resources.

3. The method of claim 1, wherein the execution components of the given computing environment comprise one or more processing resources.

4. The method of claim 1, wherein the execution components of the given computing environment comprise one or more storage resources.

5. The method of claim 1, wherein the given computing environment comprises a cloud computing environment.

6. The method of claim 1, further comprising creating a new provenance node in response to a request received via the user interface.

7. The method of claim 1, further comprising creating a new provenance node in response to a scheduled event triggered according to a provenance policy.

8. The method of claim 1, further comprising creating a new provenance node in response to receiving a request to generate a new complex asset snapshot.

9. The method of claim 1, further comprising creating a new provenance node in response to the given application triggering a milestone event.

10. A computer program product comprising a processor-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by the processor of the processing device implement the steps of:

capturing information during execution of a given application in a given computing environment, wherein the given application is executed by execution components of the given computing environment in accordance with a given application data set, the execution components comprising one or more virtual machines and one or more associated external storage volumes, and wherein the information capturing step further comprises capturing one or more complex asset representations of the given computing environment, each complex asset being an abstraction of the given computing environment for the given application across the execution components of the given computing environment, and wherein each complex asset representation comprises state information for a given time corresponding to the execution components of the given computing environment;

generating a provenance data set based on the captured information, the generated provenance data set comprising provenance metadata associated with two or more provenance nodes corresponding to respective ones of the one or more captured complex asset representations, wherein each of the provenance nodes represents one or more states associated with the execution components of the given computing environment that existed during the execution of the given application at a particular time;

providing the provenance data via a user interface in the form of a directed acyclic graph, wherein the user interface is configured to browse provenance of a current version of the given application data set by traversing the two or more provenance nodes in the directed acyclic graph, and wherein the user interface is further configured to query the provenance metadata associated with individual ones of the two or more provenance nodes; and

utilizing at least a portion of the generated provenance data set to revert the execution components of the given computing environment back to at least one of the one or more states associated with the execution components of the given computing environment that existed during the execution of the given application responsive to input received via the user interface selecting a given one of the two or more provenance nodes in the directed acyclic graph, wherein the execution components of the given computing environment are reverted back to the at least one state by: extracting the complex asset representation for the at least one state from the provenance data set; unlocking the one or more external storage volumes corresponding to the extracted complex asset representation; resuming the one or more virtual machines corresponding to the extracted complex asset representation; and re-executing the given application for the given application data set with the one or more unlocked external storage volumes and the one or more resumed virtual machines.

11. An apparatus comprising: a memory; and a processor operatively coupled to the memory and configured to: capture information during execution of a given application in a given computing environment, wherein the given application is executed by execution components of the given computing environment in accordance with a given application data set, the execution components comprising one or more virtual machines and one or more associated external storage v
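As a rough illustration of claim 1, the sketch below models provenance nodes in a directed acyclic graph, a lineage traversal from the current node, a metadata query, and the four revert sub-steps (extract, unlock, resume, re-execute). It is a simulation under stated assumptions, not the patented implementation: the names `ComplexAsset`, `ProvenanceNode`, `ProvenanceGraph`, and the `trigger` metadata key are hypothetical, and the revert only returns a list of action strings rather than touching real virtual machines or storage volumes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ComplexAsset:
    """Snapshot of VMs and external volumes at one point in time (hypothetical shape)."""
    timestamp: str
    vm_ids: List[str]
    volume_ids: List[str]


@dataclass
class ProvenanceNode:
    node_id: str
    asset: ComplexAsset
    metadata: Dict[str, str] = field(default_factory=dict)
    parents: List[str] = field(default_factory=list)  # edges point to earlier nodes


class ProvenanceGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, ProvenanceNode] = {}

    def add_node(self, node: ProvenanceNode) -> None:
        # Parents must already exist, so edges only point backwards: no cycles.
        missing = [p for p in node.parents if p not in self.nodes]
        if missing or node.node_id in self.nodes:
            raise ValueError(f"invalid node {node.node_id!r}")
        self.nodes[node.node_id] = node

    def lineage(self, node_id: str) -> List[str]:
        """Walk from a node back through its ancestors (depth-first, no repeats)."""
        seen: List[str] = []
        stack = [node_id]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.append(current)
            stack.extend(reversed(self.nodes[current].parents))
        return seen

    def query(self, node_id: str, key: str) -> Optional[str]:
        """Look up one metadata value on a single provenance node."""
        return self.nodes[node_id].metadata.get(key)

    def revert(self, node_id: str) -> List[str]:
        """Return the ordered actions for the four revert sub-steps in the claim."""
        asset = self.nodes[node_id].asset                     # 1. extract
        actions = [f"extract {node_id}@{asset.timestamp}"]
        actions += [f"unlock {v}" for v in asset.volume_ids]  # 2. unlock volumes
        actions += [f"resume {vm}" for vm in asset.vm_ids]    # 3. resume VMs
        actions.append("re-execute application")              # 4. re-run
        return actions


graph = ProvenanceGraph()
graph.add_node(ProvenanceNode(
    "n0", ComplexAsset("t0", ["vm-1"], ["vol-1"]), {"trigger": "user request"}))
graph.add_node(ProvenanceNode(
    "n1", ComplexAsset("t1", ["vm-1"], ["vol-1"]), {"trigger": "milestone"},
    parents=["n0"]))

history = graph.lineage("n1")         # browse provenance of the current version
trigger = graph.query("n0", "trigger")
steps = graph.revert("n0")            # user selects node n0 in the graph
print(history, trigger, steps)
```

Requiring parents to exist before a node is added is one simple way to keep the graph acyclic; the `trigger` values loosely mirror the node-creation events named in claims 6 through 9.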

Assignees

EMC Corp, EMC IP Holding Co LLC

Inventors

Classifications

  • at system level · CPC title

  • Saving, restoring, recovering or retrying · CPC title

  • of structured data, e.g. relational data · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710332B1 cover?
Techniques are disclosed for generating data provenance associated with a computing system. For example, a method comprises the following steps. Information associated with the execution of a given process in a given computing environment in accordance with a given process data set is captured. A provenance data set is generated based on the captured information. The generated provenance data s…
Who is the assignee on this patent?
EMC Corp, EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/1415. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 18, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).