Garbage collection scope detection for distributed storage

US10061697B2 · US · B2

Patent metadata
Publication number: US-10061697-B2
Application number: US-201615193142-A
Country: US
Kind code: B2
Filing date: Jun 27, 2016
Priority date: Dec 16, 2015
Publication date: Aug 28, 2018
Grant date: Aug 28, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for determining garbage collection (GC) scope in a distributed storage system using chunk-based storage. The systems and methods are compatible with multi-version concurrency control (MVCC) semantics.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for use with a distributed storage system comprising a plurality of storage devices, the method comprising: initializing a garbage collection (GC) front value as the maximum sequence number associated with a storage chunk within a set of consecutively sealed storage chunks, the storage chunks corresponding to storage capacity allocated within the storage devices and having associated sequence numbers; using the GC front value to determine GC scope, the GC scope including zero or more of the storage chunks; retrieving metadata information about the GC scope storage chunks; identifying unreferenced storage chunks from the GC scope storage chunks using the metadata information; and reclaiming storage capacity corresponding to the unreferenced storage chunks.

2. The method of claim 1 further comprising: sealing additional storage chunks; and advancing the GC front value if the additional sealed storage chunk have sequence numbers consecutive to a previous GC front value.

3. The method of claim 2 wherein the storage chunks are used to store search tree elements, wherein advancing the GC front value comprises advancing the GC front value unless a search tree is being updated.

4. The method of claim 3 further comprising: setting a first GC front block when a first search tree update commences; and setting a second GC front block when a second search tree update commences, wherein advancing the GC front value comprises advancing the GC front value after the first search tree update completes to the value of the second GC front block.

5. The method of claim 2 wherein sealing additional storage chunks comprises sealing additional storage chunks in response to a timeout expiring.

6. The method of claim 1 wherein retrieving metadata information about the GC candidate storage chunks comprises looking up metadata information in a metadata table using storage chunk sequence numbers.

7. A distributed storage system comprising: a plurality of storage devices; two or more storage nodes configured to: initialize a garbage collection (GC) front value as the maximum sequence number associated with a storage chunk within a set of consecutively sealed storage chunks, the storage chunks corresponding to storage capacity allocated within the storage devices and having associated sequence numbers; use the GC front value to determine GC scope, the GC scope including zero or more of the storage chunks; retrieve metadata information about the GC scope storage chunks; identify unreferenced storage chunks from the GC scope storage chunks using the metadata information; and reclaim storage capacity corresponding to the unreferenced storage chunks.

8. The distributed storage system of claim 7 wherein the two or more storage nodes are further configured to: seal additional storage chunks; and advance the GC front value if the additional sealed storage chunk have sequence numbers consecutive to a previous GC front value.

9. The distributed storage system of claim 8 wherein the storage chunks are used to store search tree elements, wherein the two or more storage nodes are configured to advance the GC front value unless a search tree is being updated.

10. The distributed storage system of claim 9 wherein the two or more storage nodes are configured to: set a first GC front block when a first search tree update commences; set a second GC front block when a second search tree update commences; and advance the GC front value after the first search tree update completes to the value of the second GC front block.

11. The distributed storage system of claim 8 wherein the two or more storage nodes are configured to seal additional storage chunks in response to a timeout expiring.

12. The distributed storage system of claim 7 wherein the two or more storage nodes are configured lookup metadata information in a metadata table using storage chunk sequence numbers.
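The steps recited in claims 1 through 4 and 6 can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the patented implementation: the names `Chunk`, `GCScope`, `begin_update`, and `end_update` are hypothetical, the `referenced` flag stands in for whatever the real metadata records, the scope rule "sequence number at or below the GC front" is an assumed reading of "using the GC front value to determine GC scope", and the GC front block value is passed in by the caller because the claims do not say how it is chosen.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    seq: int                 # sequence number of the chunk
    sealed: bool = False     # sealed chunks accept no further writes
    referenced: bool = True  # assumption: stand-in for the chunk's metadata record


@dataclass
class GCScope:
    table: dict = field(default_factory=dict)   # metadata table keyed by sequence number
    front: int = 0                              # GC front; 0 means the scope is empty
    blocks: list = field(default_factory=list)  # GC front blocks of in-flight tree updates

    def add(self, seq, sealed=False, referenced=True):
        self.table[seq] = Chunk(seq, sealed, referenced)

    def seal(self, seq):
        self.table[seq].sealed = True
        self.advance()

    def advance(self):
        # Move the front across consecutively sealed chunks, but never past
        # the lowest outstanding GC front block.
        limit = min(self.blocks) if self.blocks else None
        while True:
            nxt = self.table.get(self.front + 1)
            if nxt is None or not nxt.sealed:
                break
            if limit is not None and nxt.seq > limit:
                break
            self.front = nxt.seq
        return self.front

    def begin_update(self, block):
        # A search tree update commences: record its GC front block.
        self.blocks.append(block)

    def end_update(self, block):
        # The update completes: drop its block and let the front move on.
        self.blocks.remove(block)
        self.advance()

    def scope(self):
        # Assumed scope rule: every known chunk at or below the GC front.
        return [s for s in sorted(self.table) if s <= self.front]

    def collect(self):
        # Look up metadata by sequence number, pick the unreferenced chunks
        # in scope, and reclaim them (here: drop them from the table).
        garbage = [s for s in self.scope() if not self.table[s].referenced]
        for s in garbage:
            del self.table[s]
        return garbage


gc = GCScope()
for seq in range(1, 6):
    gc.add(seq)
gc.table[2].referenced = False
gc.table[5].referenced = False
for seq in (1, 2, 3, 5):   # chunk 4 is still open
    gc.seal(seq)
print(gc.front)      # 3
print(gc.collect())  # [2]
```

In this run chunk 5 is sealed and unreferenced but is not collected, because open chunk 4 breaks the run of consecutively sealed chunks and keeps the front at 3. Calling `begin_update(3)` and `begin_update(4)` before sealing chunk 4 would hold the front at 3 until the first update ends, at which point it moves to 4, the value of the second block.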

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10061697B2 cover?
Systems and methods for determining garbage collection (GC) scope in a distributed storage system using chunk-based storage. The systems and methods are compatible with multi-version concurrency control (MVCC) semantics.
Who is the assignee on this patent?
EMC Corp, EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F12/0261. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 28, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).