Tracing garbage collector for search trees under multi-version concurrency control

US10402316B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10402316-B2
Application numberUS-201615083324-A
CountryUS
Kind codeB2
Filing dateMar 29, 2016
Priority dateSep 14, 2015
Publication dateSep 3, 2019
Grant dateSep 3, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Structures and processes for garbage collection of search trees under Multi-Version Concurrency Control (MVCC). Such search trees may be used to store data within a distributed storage system. A process detects live search tree elements using tracing and then identify storage chunks having no live elements as garbage to be reclaimed. The process can be paused and resumed to reduce impact on other system processing.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for garbage collection in a storage system having a plurality of storage devices, the method comprising: identifying a plurality of storage chunks that are marked as full, the storage chunks storing a plurality of tree elements, the storage chunks being formed on one or more of the storage devices; classifying each of the plurality of tree elements as either a live tree element or an unreferenced tree element by traversing a plurality of search trees to identify ones of the plurality of tree elements that are currently active, wherein only tree elements in storage chunks that are marked as full are classified as either a live tree element or an unreferenced tree element; and identifying one or more of the plurality of storage chunks that include only unreferenced tree elements and no live tree elements, and reclaiming the identified one or more storage chunks. 2. The method of claim 1 further comprising: saving a checkpoint; pausing the traversing of the plurality of search trees; determining a first one of the tree elements to resume from based upon the checkpoint; and resuming traversing the search trees from the first one of the tree elements. 3. The method of claim 2 wherein pausing traversing the search trees comprises pausing traversing the search trees in response to a search tree update. 4. The method of claim 3 wherein the traversing of the plurality of search trees is paused in response to a search tree journal update. 5. The method of claim 2 wherein saving the checkpoint comprises saving information about a last tree element traversed. 6. The method of claim 5 wherein saving information about the last tree element traversed comprises saving a search key associated with the last tree element traversed. 7. The method of claim 6 further comprising selecting one of the tree elements to resume from based upon the search key associated with the last tree element traversed. 8. The method of claim 2 wherein the traversing of the plurality of search trees is paused on a first node of the storage system and the traversing of the plurality of search trees is resumed on a second node of the storage system. 9. The method of claim 8 wherein the pausing traversing the search trees comprises determining tree ownership changed. 10. A distributed storage system, comprising: a plurality of storage devices; and one or more processors operatively coupled to the plurality of storage devices, the one or more processors being configured to: identify a plurality of storage chunks that are marked as full, the storage chunks storing a plurality of tree elements, the storage chunks being formed on one or more of the storage devices; classify each of the plurality of tree elements as either a live tree element or an unreferenced tree element by traversing a plurality of search trees to identify ones of the plurality of tree elements that are currently active, wherein only tree elements in the storage chunks that are marked as full are classified as either a live tree element or an unreferenced tree element; and identify one or more of the plurality of storage chunks that include only unreferenced tree elements and no live tree elements, and reclaim the identified one or more storage chunks. 11. The distributed storage system of claim 10 wherein the one or more processors are further configured to: save a checkpoint; pause traversing the search trees; determine a first one of the tree elements to resume from based upon the checkpoint; and resume traversing the search trees from the first one of the tree elements. 12. The distributed storage system of claim 11 wherein the one or more processors are further configured to pause traversing the search trees in response to a search tree update. 13. The distributed storage system of claim 12 wherein the traversing of the plurality of search trees is paused in response to a search tree journal update. 14. The distributed storage system of claim 11 wherein the checkpoint includes information about a last tree element traversed. 15. The distributed storage system of claim 14 wherein the checkpoint includes a search key associated with the last tree element traversed. 16. The distributed storage system of claim 15 wherein the one or more processors are further configured to select one of the tree elements to resume from based upon the search key associated with the last tree element traversed. 17. The distributed storage system of claim 11 wherein traversing the search trees is paused on a first node of the distributed storage system and resumed on a second node of the distributed storage system. 18. The distributed storage system of claim 17 wherein the one or more processors are further configured to pause traversing the search trees in response to a change in search tree ownership.

Assignees

Inventors

Classifications

  • using versioning · CPC title

  • Garbage collection, i.e. reclamation of unreferenced memory · CPC title

  • in block erasable memory, e.g. flash memory · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10402316B2 cover?
Structures and processes for garbage collection of search trees under Multi-Version Concurrency Control (MVCC). Such search trees may be used to store data within a distributed storage system. A process detects live search tree elements using tracing and then identify storage chunks having no live elements as garbage to be reclaimed. The process can be paused and resumed to reduce impact on oth…
Who is the assignee on this patent?
Emc Corp, Emc Ip Holding Co Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/2329. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Sep 03 2019 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).