Log-structured formats for managing archived storage of objects

US11436102B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11436102-B2
Application number  US-202016998060-A
Country             US
Kind code           B2
Filing date         Aug 20, 2020
Priority date       Aug 20, 2020
Publication date    Sep 6, 2022
Grant date          Sep 6, 2022

Abstract

Official abstract text for this publication.

Solutions for managing archived storage include receiving, at a first node, a snapshot comprising object data (e.g., a virtual machine disk snapshot) from a second node (e.g., a software defined data center), and storing the snapshot in a tiered structure that includes a data tier and a metadata tier. Snapshots may be used for fail-over operations and/or backups, to support disaster recovery. The data tier comprises a log-structured file system (LFS), and the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS. The metadata tier also comprises a logical layer indicating content in the CAS. Segment cleaning of the data tier is performed using a segment usage table (SUT). Some examples include performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery. In some examples, the CAS comprises a log-structured merge-tree (LSM-tree).
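
The tiered structure described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the class name TieredStore, its fields, the use of SHA-256 as the content hash, and the in-memory list standing in for the LFS are all assumptions made for illustration. It shows only the lookup chain the abstract names: a logical layer pointing into the CAS, and the CAS pointing at addresses within the log.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy model of the data tier / metadata tier split (all names hypothetical)."""

    # Data tier: an append-only list standing in for the LFS.
    log: list = field(default_factory=list)
    # Metadata tier, CAS: content hash -> address (index) within the log.
    cas: dict = field(default_factory=dict)
    # Metadata tier, logical layer: (snapshot id, block number) -> content hash.
    logical: dict = field(default_factory=dict)

    def put_block(self, snapshot_id: str, block_no: int, data: bytes) -> int:
        """Store one block of a snapshot and return its log address."""
        digest = hashlib.sha256(data).hexdigest()  # hash choice is an assumption
        if digest not in self.cas:
            # New content: append to the log, never overwrite in place.
            self.log.append(data)
            self.cas[digest] = len(self.log) - 1
        # Known content is deduplicated: only the logical mapping is added.
        self.logical[(snapshot_id, block_no)] = digest
        return self.cas[digest]

    def get_block(self, snapshot_id: str, block_no: int) -> bytes:
        """Resolve logical layer -> CAS -> log address -> data."""
        digest = self.logical[(snapshot_id, block_no)]
        return self.log[self.cas[digest]]


store = TieredStore()
store.put_block("snap-1", 0, b"boot sector")
store.put_block("snap-1", 1, b"user data")
store.put_block("snap-2", 0, b"boot sector")  # same content as snap-1, block 0
```

In this sketch the second snapshot reuses the block already written for the first one, so the log grows by two entries rather than three. That is one way the CAS could support the deduplication mentioned in the claims.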

First claim

Opening claim text (preview).

What is claimed is:

1. A method of managing archived storage, the method comprising: receiving, at a first node, from an upload agent at a second node, a snapshot comprising object data; storing the snapshot in a primary storage in a tiered structure, wherein the tiered structure comprises a data tier and a metadata tier, wherein the data tier comprises a log-structured file system (LFS) for storing the snapshot, wherein the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS, and wherein the metadata tier further comprises a logical layer indicating content in the CAS; and performing segment cleaning of the data tier using a segment usage table (SUT).

2. The method of claim 1, further comprising: performing a backup using at least the stored snapshot; or performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery.

3. The method of claim 1, wherein the segment cleaning comprises: determining, based at least on numbers of live blocks indicated in the SUT, a plurality of segment cleaning candidates; and for each segment cleaning candidate of the plurality of segment cleaning candidates: determining whether a block in the segment cleaning candidate is live; based at least on the block not being live, marking the block as free; and based at least on the block being live, including the block in a coalescing operation.

4. The method of claim 1, further comprising: based at least on access costs, calculating an expected cost of a segment cleaning operation; based at least on storage costs, calculating an expected cost savings from the segment cleaning; based at least on the expected cost of the segment cleaning and the expected cost savings from the segment cleaning, determining whether to perform the segment cleaning; and based at least on making a determination to perform the segment cleaning, performing the segment cleaning.

5. The method of claim 1, further comprising: performing deduplication of the snapshot using at least the CAS; and based at least on a retention schedule, deleting at least a portion of the snapshot or moving at least a portion of the snapshot from the primary storage to a long-term storage.

6. The method of claim 1, wherein the CAS comprises a log-structured merge-tree (LSM-tree).

7. The method of claim 1, wherein the second node comprises a software defined data center (SDDC), and wherein the snapshot comprises a versioned object difference.

8. A computer system for managing archived storage, the computer system comprising: a processor; and a non-transitory computer readable medium having stored thereon program code for transferring data to another computer system, the program code causing the processor to: receive, at a first node, from an upload agent at a second node, a snapshot comprising object data; store the snapshot in a primary storage in a tiered structure, wherein the tiered structure comprises a data tier and a metadata tier, wherein the data tier comprises a log-structured file system (LFS) for storing the snapshot, wherein the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS, and wherein the metadata tier further comprises a logical layer indicating content in the CAS; and perform segment cleaning of the data tier using a segment usage table (SUT).

9. The computer system of claim 8, wherein the program code is further operative to: perform a backup using at least the stored snapshot; or perform a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery.

10. The computer system of claim 8, wherein the program code is further operative to: determine, based at least on numbers of live blocks indicated in the SUT, a plurality of segment cleaning candidates; and for each segment cleaning candidate of the plurality of segment cleaning candidates: determine whether a block in the segment cleaning candidate is live; based at least on the block not being live, mark the block as free; and based at least on the block being live, include the block in a coalescing operation.

11. The computer system of claim 8, wherein the program code is further operative to: based at least on access costs, calculate an expected cost of a segment cleaning operation; based at least on storage costs, calculate an expected cost savings from the segment cleaning; based at least on the expected cost of the segment cleaning and the expected cost savings from the segment cleaning, determine whether to perform the segment cleaning; and based at least on making a determination to perform the segment cleaning, perform the segment cleaning.

12. The computer system of claim 8, wherein the program code is further operative to: perform deduplication of the snapshot using at least the CAS; and based at least on a retention schedule, delete at least a portion of the snapshot or move at least a portion of the snapshot from the primary storage to a long-term storage.

13. The computer system of claim 8, wherein the CAS comprises a log-structured merge-tree (LSM-tree).

14. The computer system of claim 8, wherein the second node comprises a software defined data center (SDDC), and wherein the snapshot comprises a versioned object difference.

15. A non-transitory computer readable storage medium having stored thereon program code executable by a first computer system at a first site, the program code embodying a method comprising: receiving, at a first node, from an upload agent at a second node, a snapshot comprising object data; storing the snapshot in a primary storage in a tiered structure, wherein the tiered structure comprises a data tier and a metadata tier, wherein the data tier comprises a log-structured file system (LFS) for storing the snapshot, wherein the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS, and wherein the metadata tier further comprises a logical layer indicating content in the CAS; and performing segment cleaning of the data tier using a segment usage table (SUT).

16. The non-transitory computer readable storage medium of claim 15, wherein the program code further comprises: performing a backup using at least the stored snapshot; or performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery.

17. The non-transitory computer readable storage medium of claim 15, wherein the program code further comprises: determining, based at least on numbers of live blocks indicated in the SUT, a plurality of segment cleaning candidates; and for each segment cleaning candidate of the plurality of segment cleaning candidates: determining whether a block in the segment cleaning candidate is live; based at least on the block not being live, marking the block as free; and based at least on the block being live, including the block in a coalescing operation.

18. The non-transitory computer readable storage medium of claim 15, wherein the program code further comprises: based at least on access costs, calculating an expected cost of a segment cleaning operation; based at least on storage costs, calculating an expected cost savings from the segment cleaning; based at least on the expected cost of the segment cleaning and the expected cost savings from the segment cleaning, determining whether to perform the segment cleaning;
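
The segment cleaning steps and the cost-based decision recited in the claims can be sketched as follows. Everything here is an illustrative assumption rather than the claimed implementation: the Segment class, the max_live threshold for picking candidates, the is_live callback, and the linear cost model in should_clean (per-block read and write costs, per-period storage cost) are hypothetical. The claims state only that candidates are chosen from live-block counts in the SUT and that expected cost is weighed against expected savings.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One fixed-size log segment; None marks a free slot (hypothetical layout)."""
    blocks: list


def pick_candidates(sut: dict, max_live: int) -> list:
    """Choose candidates from the SUT (segment id -> live block count).

    Assumed policy: any segment with at most max_live live blocks,
    least-utilized first.
    """
    ids = [sid for sid, live in sut.items() if live <= max_live]
    return sorted(ids, key=lambda sid: sut[sid])


def clean(segments: dict, sut: dict, is_live, max_live: int) -> list:
    """Free each candidate segment and return the live blocks to coalesce."""
    coalesced = []
    for sid in pick_candidates(sut, max_live):
        seg = segments[sid]
        for i, block in enumerate(seg.blocks):
            if block is None:
                continue
            if is_live(block):
                # Live: include the block in the coalescing operation.
                coalesced.append(block)
            # Dead blocks are marked free; live ones are vacated because the
            # caller is expected to rewrite them into a fresh segment.
            seg.blocks[i] = None
        sut[sid] = 0
    return coalesced


def should_clean(blocks_to_read: int, blocks_to_write: int, blocks_freed: int,
                 read_cost: float, write_cost: float,
                 storage_cost_per_period: float, periods: int) -> bool:
    """Clean only if expected storage savings exceed expected access cost."""
    expected_cost = blocks_to_read * read_cost + blocks_to_write * write_cost
    expected_savings = blocks_freed * storage_cost_per_period * periods
    return expected_savings > expected_cost


segments = {0: Segment(["a", "b", "c", "d"]), 1: Segment(["e", "f", "g", "h"])}
live_set = {"a", "e", "f", "g", "h"}
sut = {0: 1, 1: 4}

if should_clean(blocks_to_read=4, blocks_to_write=1, blocks_freed=3,
                read_cost=0.01, write_cost=0.02,
                storage_cost_per_period=0.05, periods=12):
    moved = clean(segments, sut, lambda b: b in live_set, max_live=2)
else:
    moved = []
```

With these made-up numbers, only segment 0 qualifies as a candidate. Its single live block is collected for coalescing and the whole segment becomes free, while the fully live segment 1 is left alone.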

Assignees

VMware Inc

Inventors

Classifications

  • involving logging of persistent data for recovery

  • Backup scheduling policy

  • Trees, e.g. B+trees

  • for networked environments

  • Saving storage space on storage systems

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11436102B2 cover?
Solutions for managing archived storage include receiving, at a first node, a snapshot comprising object data (e.g., a virtual machine disk snapshot) from a second node (e.g., a software defined data center), and storing the snapshot in a tiered structure that includes a data tier and a metadata tier. Snapshots may be used for fail-over operations and/or backups, to support disaster recovery. T…
Who is the assignee on this patent?
VMware Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1451. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 6, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).