Deduplication of virtual machine content

US10678448B2 · US · B2

Patent metadata
Publication number: US-10678448-B2
Application number: US-201715804382-A
Country: US
Kind code: B2
Filing date: Nov 6, 2017
Priority date: Nov 4, 2014
Publication date: Jun 9, 2020
Grant date: Jun 9, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for managing, storing, and serving data within a virtualized environment are described. In some embodiments, a data management system may manage the extraction and storage of virtual machine snapshots, provide near instantaneous restoration of a virtual machine or one or more files located on the virtual machine, and enable secondary workloads to directly use the data management system as a primary storage target to read or modify past versions of data. The data management system may allow a virtual machine snapshot of a virtual machine stored within the system to be directly mounted to enable substantially instantaneous virtual machine recovery of the virtual machine.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for operating a data management system, comprising:
    acquiring an instruction from a computing device;
    detecting that a second version of a second virtual machine should be generated based on the instruction;
    concurrently acquiring a base image associated with a first version of a first virtual machine different from the second virtual machine from a first storage device of a first type while acquiring one or more incremental files associated with the second virtual machine from a second storage device of a second type in response to detecting that the second version of the second virtual machine should be generated;
    generating at least a portion of a file corresponding with the second version of the second virtual machine by patching the one or more incremental files associated with the second virtual machine to the base image associated with the first virtual machine;
    acquiring a full image snapshot corresponding with a third version of a third virtual machine;
    generating a third signature for the third version of the third virtual machine, the generating the third signature includes generating a plurality of hash values corresponding with a plurality of data blocks within the full image snapshot, the plurality of data blocks is arranged such that data blocks of a first plurality of the plurality of data blocks are spaced at a fixed distance from each other and data blocks of a second plurality of the plurality of data blocks are spaced at monotonically increasing distances from each other;
    comparing the third signature for the third version of the third virtual machine with a first signature for the first version of the first virtual machine;
    generating a dependent base file comprising data differences between the first version of the first virtual machine and the third version of the third virtual machine;
    storing the dependent base file using the second storage device of the second type;
    acquiring a second instruction from the computing device;
    detecting that the third version of the third virtual machine should be generated based on the second instruction;
    concurrently acquiring the base image associated with the first version of the first virtual machine from the first storage device of the first type while acquiring the dependent base file from the second storage device of the second type in response to detecting that the third version of the third virtual machine should be generated,
    generating at least a portion of a fourth file corresponding with the third version of the third virtual machine using the base image associated with the first version of the first virtual machine and the dependent base file; and
    transmitting the at least the portion of the fourth file to the computing device.

2. The method of claim 1, wherein: the concurrently acquiring the base image associated with the first version of the first virtual machine while acquiring the one or more incremental files associated with the second virtual machine includes sequentially reading the base image from the first storage device of the first type while reading the one or more incremental files from the second storage device of the second type.

3. The method of claim 1, wherein: the first plurality corresponds with a first data region within the full image snapshot; and the second plurality corresponds with a second data region within the full image snapshot that does not overlap with the first data region.

4. The method of claim 3, wherein: each data block of the first plurality is separated by a fixed data length; and each data block of the second plurality is separated by an increasing data length.

5. The method of claim 1, wherein: the plurality of data blocks comprises a plurality of noncontiguous data blocks, each data block of the plurality of noncontiguous data blocks does not overlap with any of the other data blocks of the plurality of noncontiguous data blocks, each data block of the plurality of noncontiguous data blocks does not share a data boundary with any of the other data blocks of the plurality of noncontiguous data blocks.

6. The method of claim 1, wherein: the generating the plurality of hash values includes applying a cryptographic hash function to each of the plurality of data blocks within the full image snapshot.

7. The method of claim 1, wherein: the first version of the first virtual machine corresponds with a first state of the first virtual machine at a first point in time; and the second version of the second virtual machine corresponds with a second state of the second virtual machine at a second point in time subsequent to the first point in time.

8. The method of claim 7, wherein: the base image comprises a full image snapshot of the first virtual machine at the first point in time; and the one or more incremental files comprise a plurality of incremental files.

9. The method of claim 1, wherein: the first storage device of the first type comprises a magnetic storage device; and the second storage device of the second type comprises a solid-state storage device.

10. The method of claim 9, wherein: the magnetic storage device comprises a hard disk drive; and the solid-state storage device comprises a solid-state drive.

11. The method of claim 1, wherein: the first storage device of the first type has a first read speed; and the second storage device of the second type has a second read speed that is faster than the first read speed.

12. A data management system, comprising:
    a first storage device of a first type;
    a second storage device of a second type; and
    one or more processors configured to acquire an instruction from a computing device and determine that a second version of a second virtual machine should be generated based on the instruction,
    the one or more processors configured to concurrently acquire a base image associated with a first version of a first virtual machine different from the second virtual machine from the first storage device of the first type while a plurality of incremental files associated with the second virtual machine are acquired from the second storage device of the second type,
    the one or more processors configured to generate at least a portion of a file corresponding with the second version of the second virtual machine via application of the plurality of incremental files associated with the second virtual machine to the base image associated with the first virtual machine,
    the one or more processors configured to transmit the at least the portion of the file to the computing device,
    wherein the one or more processors configured to acquire a full image snapshot corresponding with a third version of a third virtual machine and generate a third signature for the third version of the third virtual machine,
    the one or more processors configured to generate a plurality of hash values corresponding with a plurality of data blocks within the full image snapshot, the plurality of data blocks is arranged such that data blocks of a first plurality of the plurality of data blocks are spaced at a fixed distance from each other and data blocks of a second plurality of the plurality of data blocks are spaced at monotonically increasing distances from each other,
    the one or more processors configured to compare the third signature for the third version of the third virtual machine with a first signature for the first version of the first virtual machine,
    the one or more processors configured to generate a dependent base file comprising data differences between the first version of the first virtual machine and the third version of the version virtual m…
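
The signature and dependent-base-file steps in claim 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names, SHA-256 as the cryptographic hash, the 4096-byte diff chunk, the similarity score, and the dictionary layout of the dependent base file are all assumptions made here for readability. The claim text only fixes the sampling pattern (one group of blocks at a fixed spacing, a second group at monotonically increasing spacing), the comparison of two signatures, and a file holding the data differences against the base image.

```python
import hashlib


def sample_offsets(image_size, block_size, fixed_end, fixed_stride, first_gap, gap_growth):
    """Pick block offsets: fixed spacing below fixed_end, growing spacing after it."""
    if fixed_stride <= block_size or first_gap <= 0 or gap_growth <= 0:
        raise ValueError("blocks must stay noncontiguous and gaps must grow")
    offsets = []
    pos = 0
    # First plurality: blocks whose start offsets are a fixed distance apart.
    while pos + block_size <= min(fixed_end, image_size):
        offsets.append(pos)
        pos += fixed_stride
    # Second plurality: the gap after each block grows by gap_growth.
    pos = max(pos, fixed_end)
    gap = first_gap
    while pos + block_size <= image_size:
        offsets.append(pos)
        pos += block_size + gap
        gap += gap_growth
    return offsets


def signature(image, offsets, block_size):
    """Hash each sampled block (SHA-256 is an assumed choice of hash)."""
    return [hashlib.sha256(image[o:o + block_size]).hexdigest() for o in offsets]


def similarity(sig_a, sig_b):
    """Fraction of sampled positions whose hashes match (hypothetical metric)."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / max(len(sig_a), len(sig_b), 1)


def dependent_base(base, image, chunk=4096):
    """Record only the chunks of image that differ from base, plus image length."""
    diffs = {}
    for off in range(0, len(image), chunk):
        piece = image[off:off + chunk]
        if piece != base[off:off + chunk]:
            diffs[off] = piece
    return diffs, len(image)


def apply_dependent(base, diffs, length):
    """Rebuild the image from the base image and the stored differences."""
    out = bytearray(base[:length])
    out.extend(b"\0" * (length - len(out)))
    for off, piece in diffs.items():
        out[off:off + len(piece)] = piece
    return bytes(out)
```

In this sketch a new full image would be sampled with `sample_offsets`, hashed with `signature`, and compared against the signature of an existing base image. If the two are similar enough, only the output of `dependent_base` needs to be stored, and `apply_dependent` reconstructs the full image on demand. Hashing a sparse sample rather than every block keeps the comparison cheap, at the cost of being an estimate of similarity rather than an exact measure.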

Assignees

Rubrik Inc

Inventors

Classifications

  • Replication mechanisms · CPC title

  • by checking functioning · CPC title

  • Management of the data involved in backup or backup restore · CPC title

  • using de-duplication of the data · CPC title

  • Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10678448B2 cover?
Methods and systems for managing, storing, and serving data within a virtualized environment are described. In some embodiments, a data management system may manage the extraction and storage of virtual machine snapshots, provide near instantaneous restoration of a virtual machine or one or more files located on the virtual machine, and enable secondary workloads to directly use the data manage…
Who is the assignee on this patent?
Rubrik Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0619. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 9, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).