Deduplication of virtual machine content

US11354046B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11354046-B2
Application number  US-202016813020-A
Country             US
Kind code           B2
Filing date         Mar 9, 2020
Priority date       Nov 4, 2014
Publication date    Jun 7, 2022
Grant date          Jun 7, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for managing, storing, and serving data within a virtualized environment are described. In some embodiments, a data management system may manage the extraction and storage of virtual machine snapshots, provide near instantaneous restoration of a virtual machine or one or more files located on the virtual machine, and enable secondary workloads to directly use the data management system as a primary storage target to read or modify past versions of data. The data management system may allow a virtual machine snapshot of a virtual machine stored within the system to be directly mounted to enable substantially instantaneous virtual machine recovery of the virtual machine.

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method for operating a data management system, comprising:
  accessing an instruction from a computing device;
  detecting that a second version of a second virtual machine should be generated based on the instruction;
  accessing a base image associated with a first version of a first virtual machine different from the second virtual machine from a first storage device while accessing one or more incremental files associated with the second virtual machine from a second storage device in response to detecting that the second version of the second virtual machine should be generated;
  generating at least a portion of a file corresponding with the second version of the second virtual machine by patching the one or more incremental files associated with the second virtual machine to the base image associated with the first virtual machine;
  accessing a full image snapshot corresponding with a third version of a third virtual machine;
  generating a third signature for the third version of the third virtual machine, the generating the third signature includes generating a plurality of hash values corresponding with a plurality of data blocks within the full image snapshot;
  comparing the third signature for the third version of the third virtual machine with a first signature for the first version of the first virtual machine;
  generating a dependent base file comprising data differences between the first version of the first virtual machine and the third version of the third virtual machine;
  storing the dependent base file using the second storage device;
  accessing a second instruction from the computing device;
  detecting that the third version of the third virtual machine should be generated based on the second instruction;
  accessing the base image associated with the first version of the first virtual machine from the first storage device while accessing the dependent base file from the second storage device in response to detecting that the third version of the third virtual machine should be generated;
  generating at least a portion of a fourth file corresponding with the third version of the third virtual machine using the base image associated with the first version of the first virtual machine and the dependent base file; and
  transmitting the at least the portion of the fourth file to the computing device.

2. The computer-implemented method of claim 1, further comprising: sequentially reading the base image from the first storage device while reading the one or more incremental files from the second storage device.

3. The computer-implemented method of claim 1, wherein: the plurality of hash values corresponds with a first data region within the full image snapshot; and a second plurality of hash values corresponds with a second data region within the full image snapshot that does not overlap with the first data region.

4. The computer-implemented method of claim 3, wherein: each data block of the plurality of hash values is separated by a fixed data length; and each data block of the second plurality of hash values is separated by an increasing data length.

5. The computer-implemented method of claim 1, wherein: the plurality of data blocks comprises a plurality of noncontiguous data blocks, and wherein each data block of the plurality of noncontiguous data blocks does not overlap with any of the other data blocks of the plurality of noncontiguous data blocks, and wherein each data block of the plurality of noncontiguous data blocks does not share a data boundary with any of the other data blocks of the plurality of noncontiguous data blocks.

6. The computer-implemented method of claim 1, further comprising: applying a cryptographic hash function to each of the plurality of data blocks within the full image snapshot.

7. The computer-implemented method of claim 1, wherein: the first version of the first virtual machine corresponds with a first state of the first virtual machine at a first point in time; and the second version of the second virtual machine corresponds with a second state of the second virtual machine at a second point in time subsequent to the first point in time.

8. The computer-implemented method of claim 7, wherein: the base image comprises a full image snapshot of the first virtual machine at the first point in time; and the one or more incremental files comprise a plurality of incremental files.

9. The computer-implemented method of claim 1, wherein: the first storage device comprises a magnetic storage device; and the second storage device comprises a solid-state storage device.

10. The computer-implemented method of claim 9, wherein: the magnetic storage device comprises a hard disk drive; and the solid-state storage device comprises a solid-state drive.

11. The computer-implemented method of claim 1, wherein: the first storage device has a first read speed; and the second storage device has a second read speed that is faster than the first read speed.

12. A data management system, comprising:
  a first storage device;
  a second storage device; and
  one or more processors configured to perform operations comprising, at least:
  accessing an instruction from a computing device;
  detecting that a second version of a second virtual machine should be generated based on the instruction;
  accessing a base image associated with a first version of a first virtual machine different from the second virtual machine from the first storage device while accessing one or more incremental files associated with the second virtual machine from the second storage device in response to detecting that the second version of the second virtual machine should be generated;
  generating at least a portion of a file corresponding with the second version of the second virtual machine by patching the one or more incremental files associated with the second virtual machine to the base image associated with the first virtual machine;
  accessing a full image snapshot corresponding with a third version of a third virtual machine;
  generating a third signature for the third version of the third virtual machine, the generating the third signature includes generating a plurality of hash values corresponding with a plurality of data blocks within the full image snapshot;
  comparing the third signature for the third version of the third virtual machine with a first signature for the first version of the first virtual machine;
  generating a dependent base file comprising data differences between the first version of the first virtual machine and the third version of the third virtual machine;
  storing the dependent base file using the second storage device;
  accessing a second instruction from the computing device;
  detecting that the third version of the third virtual machine should be generated based on the second instruction;
  accessing the base image associated with the first version of the first virtual machine from the first storage device while accessing the dependent base file from the second storage device in response to detecting that the third version of the third virtual machine should be generated;
  generating at least a portion of a fourth file corresponding with the third version of the third virtual machine using the base image associated with the first version of the first virtual machine and the dependent base file; and
  transmitting the at least the portion of the fourth file to the computing device.

13. The data management system of claim 12, wherein the operations further comprise: sequentially reading the base image from the first storage
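
To make the claimed flow easier to follow, here is a minimal Python sketch of three steps from claim 1: building a signature from per-block hash values, deriving a dependent base file from the blocks that differ from a shared base image, and patching that file back onto the base image. It is an illustration under stated assumptions, not the patented implementation. The function names, the fixed 4-byte block size, the use of SHA-256, and the dictionary layout of the dependent base file are all hypothetical; the claims only require hash values over data blocks and, in claim 6, a cryptographic hash function.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size chosen only for illustration


def signature(image: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of a full image."""
    return [
        hashlib.sha256(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    ]


def dependent_base(base: bytes, other: bytes,
                   block_size: int = BLOCK_SIZE) -> dict:
    """Keep only the blocks of `other` whose hash differs from `base`.

    The returned dict (an assumed layout) records the target length and a
    mapping from block index to the differing block's bytes.
    """
    base_sig = signature(base, block_size)
    other_sig = signature(other, block_size)
    blocks = {}
    for idx, digest in enumerate(other_sig):
        # A block is stored if the base has no such block or the hashes differ.
        if idx >= len(base_sig) or digest != base_sig[idx]:
            start = idx * block_size
            blocks[idx] = other[start:start + block_size]
    return {"length": len(other), "blocks": blocks}


def patch(base: bytes, delta: dict, block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild an image by overlaying the delta's blocks on the base image."""
    length = delta["length"]
    # Truncate or zero-pad the base to the target length before overlaying.
    out = bytearray(base[:length].ljust(length, b"\0"))
    for idx, data in delta["blocks"].items():
        start = idx * block_size
        out[start:start + len(data)] = data
    return bytes(out)


# Usage: the "first VM" base image and a "third VM" full image snapshot.
base_image = b"AAAABBBBCCCC"
third_vm_image = b"AAAAXXXXCCCCDD"

delta = dependent_base(base_image, third_vm_image)
restored = patch(base_image, delta)
```

In this sketch only the changed block and the trailing extra block are kept in the dependent base file, so the third virtual machine's image is stored as a small delta against the first virtual machine's base image. The claims additionally read the base image from a first storage device and the dependent base file from a second storage device (for example a hard disk drive and a solid-state drive in claims 9 and 10); the sketch keeps both in memory for simplicity.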

Assignees

Inventors

Classifications

  • Hypervisor-specific management and integration aspects · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • involving virtual machines · CPC title

  • G06F3/0619 · Primary

    in relation to data integrity, e.g. data losses, bit errors · CPC title

  • for networked environments · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11354046B2 cover?
Methods and systems for managing, storing, and serving data within a virtualized environment are described. In some embodiments, a data management system may manage the extraction and storage of virtual machine snapshots, provide near instantaneous restoration of a virtual machine or one or more files located on the virtual machine, and enable secondary workloads to directly use the data manage…
Who is the assignee on this patent?
Rubrik Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/45558. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 7, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).