Sparse metadata segment tree for efficient file storage operations in evolving backup workloads

US9916203B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-9916203-B1
Application numberUS-201514755114-A
CountryUS
Kind codeB1
Filing dateJun 30, 2015
Priority dateJun 30, 2015
Publication dateMar 13, 2018
Grant dateMar 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are directed to a method of minimizing latency and input/output (I/O) operations in a data storage system by defining a sparse metadata segment tree to identify changed data blocks, wherein a full version of the tree is stored in a memory and modified versions of the tree are stored in cache memory, and using the sparse metadata segment tree to perform at least one data storage application including file verification, file replication, file restores, and file system snapshots.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method of minimizing latency and input/output (I/O) operations in a data storage system, comprising: defining a sparse metadata segment tree to identify changed data blocks, wherein a full version of the tree is stored in a memory and modified versions of the tree are stored in cache memory, wherein the memory comprises one of a hard drive (HDD) memory or solid state (SSD) memory; using the sparse metadata segment tree to perform at least one data storage application selected from the group consisting of: file verification, file replication, file restores, and file system snapshots; caching changed data references to the changed data blocks as metadata segments; and determining if a minimum number of changed data references have been cached and if so, updating a base metadata segment with a new metadata segment and fingerprint calculated over modified segments from a base fingerprint. 2. The method of claim 1 further comprising: storing a sparse metadata segment the memory; indexing the new metadata segment by the new fingerprint that contains a reference to the base fingerprint; and upon a read to the updated metadata segment, issue the read to the base metadata segment and the sparse metadata segment to merge and form a full segment in the memory. 3. The method of claim 2 wherein the sparse metadata segment tree comprises an incremental tree structure based on a Merkle tree scheme. 4. The method of claim 3 wherein the sparse metadata segment tree complements a full segment tree comprising a sum of full segments stored in the memory, and wherein the at least one data storage application reads only the sparse metadata segment tree to find changes in the files between backup operations. 5. The method of claim 4 wherein the amount of metadata read by the at least one data storage application is proportional to a data change rate, and not a metadata change rate. 6. The method of claim 5 wherein the file verification application detects portions of a file that have been changed to verify a modified file, and wherein the sparse metadata segment tree contains changed data that is checked by the verification application. 7. The method of claim 5 wherein the verification process further verifies the merged tree at time of merging of the sparse tree with the original tree. 8. The method of claim 5 wherein the file replication application identifies changes between a last version and a present version, and wherein the sparse metadata segment tree is used to replicate only a changed part of the file. 9. The method of claim 5 wherein the file restore application comprises an incremental file restore process that determines differences between a present good version of a file and a recent lost version of the file, and wherein the sparse metadata segment tree is used to send only the changed part of the file. 10. The method of claim 5 wherein the file system snapshot application maintains snapshots of the file system and the sparse metadata segment tree is used to provide changed information for files in the file system and avoids the need to have a complete metadata representation of each of the files. 11. A system minimizing latency and input/output (I/O) operations for data storage applications, comprising: a first component defining a sparse metadata segment tree to identify changed data blocks, wherein a full version of the tree is stored in a memory and modified versions of the tree are stored in cache memory, wherein the memory comprises one of a hard drive (HDD) memory or solid state (SSD) memory; a second component using the sparse metadata segment tree to perform at least one data storage application of the data storage applications and selected from the group consisting of: file verification, file replication, file restores, and file system snapshots; a cache memory storing changed data references to the changed data blocks as metadata segments, wherein the second component determines if a minimum number of changed data references have been cached and if so, updates a base metadata segment with a new metadata segment and fingerprint calculated over modified segments from a base fingerprint; a persistent memory storing a sparse metadata segment the memory; and a third component indexing the new metadata segment by the new fingerprint that contains a reference to the base fingerprint and, upon a read to the updated metadata segment, issuing the read to the base metadata segment and the sparse metadata segment to merge and form a full segment in the persistent memory. 12. The system of claim 11 wherein the sparse metadata segment tree comprises an incremental tree structure based on a Merkle tree scheme and complements a full segment tree comprising a sum of full segments stored in the memory, and wherein the data storage applications read only the sparse metadata segment tree to find changes in the files between backup operations, and further wherein the amount of metadata read by the at least one data storage application is proportional to a data change rate, and not a metadata change rate. 13. The system of claim 12 wherein the file verification application detects portions of a file that have been changed to verify a modified file, and wherein the sparse metadata segment tree contains changed data that is checked by the verification application, and wherein the verification process further verifies the merged tree at time of merging of the sparse tree with the original tree. 14. The system of claim 12 wherein the file replication application identifies changes between a last version and a present version, and wherein the sparse metadata segment tree is used to replicate only a changed part of the file. 15. The system of claim 12 wherein the file restore application comprises an incremental file restore process that determines differences between a present good version of a file and a recent lost version of the file, and wherein the sparse metadata segment tree is used to send only the changed part of the file. 16. The system of claim 12 wherein the file system snapshot application maintains snapshots of the file system and the sparse metadata segment tree is used to provide changed information for files in the file system and avoids the need to have a complete metadata representation of each of the files. 17. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, the computer-readable program code adapted to be executed by one or more processors to minimizing latency and input/output (I/O) operations in a data storage system, comprising: defining a sparse metadata segment tree to identify changed data blocks, wherein a full version of the tree is stored in a memory and modified versions of the tree are stored in cache memory, wherein the memory comprises one of a hard drive (HDD) memory or solid state (SSD) memory; using the sparse metadata segment tree to perform at least one data storage application selected from the group consisting of: file verification, file replication, file restores, and file system snapshots; caching changed data references to the changed data blocks as metadata segments; and determining if a minimum number of changed data references have been cached and if so, updating a base metadata segment with a new metadata segment and fingerprint calculated over modified segments from a base fingerprint.

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9916203B1 cover?
Embodiments are directed to a method of minimizing latency and input/output (I/O) operations in a data storage system by defining a sparse metadata segment tree to identify changed data blocks, wherein a full version of the tree is stored in a memory and modified versions of the tree are stored in cache memory, and using the sparse metadata segment tree to perform at least one data storage appl…
Who is the assignee on this patent?
Emc Corp, Emc Ip Holding Co Llc
What technology area does this patent fall under?
Primary CPC classification G06F11/1451. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Mar 13 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).