Efficient capacity management for a data storage system

US11093163B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11093163-B2
Application numberUS-201916408694-A
CountryUS
Kind codeB2
Filing dateMay 10, 2019
Priority dateMay 10, 2019
Publication dateAug 17, 2021
Grant dateAug 17, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed technology generally describes separating types of data chunks in a copy-on-write/MVCC B+ tree, chunk-based data storage system, and also allocating the sizes of leaf chunks to be smaller than that of other (e.g., internal and root node) chunks. By having leaf chunks separate from node chunks, the probability of having a fully reclaimable (without copying) chunk is increased. Similarly, by having smaller sized leaf chunks relative to node chunks, the probability of having a fully reclaimable (without copying) leaf chunks is increased. The technology thus facilitates more efficient garbage collection.

First claim

Opening claim text (preview).

What is claimed is: 1. A system, comprising: a processor; and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, the operations comprising: maintaining a tree corresponding to internal node chunks and leaf chunks, comprising allocating the leaf chunks with a defined leaf chunk size based on a probability of leaves of a leaf chunk being updated; allocating the internal node chunks with a defined internal node chunk size based on a probability of an internal node chunk being updated, wherein the defined internal node chunk size is equal to twelve times the defined leaf chunk size; and reclaiming storage capacity corresponding to the leaf chunk when the leaves have been updated. 2. The system of claim 1 , wherein the allocating the leaf chunk with the defined leaf chunk size based on the probability of the leaves of the leaf chunk being updated comprises allocating the leaf chunk based on a number of leaves in the leaf chunk. 3. The system of claim 1 , wherein the allocating the leaf chunk with the defined leaf chunk size based on the probability of the leaves of the leaf chunk being updated comprises allocating the leaf chunk to correspond to a size of a chunk fragment. 4. The system of claim 1 , wherein the operations further comprise, updating a leaf in the leaf chunk. 5. The system of claim 4 , wherein the operations further comprise, in response to the updating the leaf in the leaf chunk, updating the internal node chunk of the tree, wherein the internal node chunk is a parent node to the leaf chunk. 6. The system of claim 1 , wherein the allocating the leaf chunk with the defined leaf chunk size based on the probability of the leaves of the leaf chunk being updated comprises allocating the leaf chunks with the defined leaf chunk size that is smaller than a size of the internal node chunks. 7. The system of claim 1 , wherein the operations further comprise allocating a root node with a size equal to the defined internal node chunk size. 8. The system of claim 1 , wherein the operations further comprise allocating nodes with a size based on a level of each node in a B+ tree. 9. The system of claim 1 , wherein the tree is a B+ tree. 10. A method, comprising, configuring, in a system comprising a processor, a tree comprising a root node chunk, internal node chunks and leaf chunks, the configuring comprising allocating the internal node chunks with a predetermined internal node chunk size based on a first probability of an internal node chunk being updated, and allocating the leaf chunks with a predetermined leaf chunk size based on a second probability of leaves of a leaf chunk being updated, wherein the predetermined internal node chunk size that is equal to twelve times the predetermined leaf chunk size; and maintaining the tree, comprising updating the leaf chunk, and reclaiming storage capacity of the leaf chunk in response to the leaf chunk being determined to have low capacity utilization relative to a specified low capacity utilization criterion. 11. The method of claim 10 , wherein the reclaiming the leaf chunk in response to the leaf chunk being determined to have the low capacity utilization relative to the specified low capacity utilization criterion comprises reclaiming the leaf chunk when leaves in the leaf chunk have been updated. 12. The method of claim 10 , wherein the allocating the leaf chunks with the predetermined leaf chunk size comprises allocating the leaf chunks based on a number of erasure coding data fragments. 13. The method of claim 10 , wherein the configuring further comprises allocating a root node with a size equal to the predetermined internal node chunk size. 14. A non-transitory machine-readable medium, comprising executable instructions that, when executed by a processor, facilitate performance of operations, the operations comprising: allocating storage components of a tree, comprising allocating a root node chunk, allocating internal node chunks to each have a same or substantially the same internal node chunk size, and allocating leaf chunks with a leaf chunk size that corresponds to erasure coding fragments, and the leaf chunk size is equal to one twelfth of the internal node chunk size; and maintaining the tree, comprising updating a leaf chunk, and reclaiming storage capacity of the leaf chunk when the leaf chunk has low capacity utilization relative to a specified low capacity utilization criterion. 15. The non-transitory machine-readable medium of claim 14 , wherein the reclaiming the leaf chunk when the leaf chunk has the low capacity utilization relative to the specified low capacity utilization criterion comprises reclaiming the leaf chunk when leaves in the leaf chunk have been updated. 16. The non-transitory machine-readable medium of claim 15 , wherein the reclaiming the storage capacity of the leaf chunk when the leaf chunk has the low capacity utilization relative to the specified low capacity utilization criterion comprises reclaiming the storage capacity of the leaf chunk when the leaf chunk contains no leaves that are in use. 17. The non-transitory machine-readable medium of claim 14 , wherein the root node chunk has a size equal to the internal node chunk size. 18. The non-transitory machine-readable medium of claim 14 , wherein the tree is a B+ tree. 19. The non-transitory machine-readable medium of claim 14 , wherein the operations further comprise, updating a leaf in the leaf chunk. 20. The non-transitory machine-readable medium of claim 19 , wherein the operations further comprise, in response to the updating the leaf in the leaf chunk, updating an internal node chunk of the tree, wherein the internal node chunk is a parent node to the leaf chunk.

Assignees

Inventors

Classifications

  • Saving storage space on storage systems · CPC title

  • G06F3/0652Primary

    Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • Non-volatile semiconductor memory arrays · CPC title

  • G06F3/0616Primary

    in relation to life time, e.g. increasing Mean Time Between Failures [MTBF] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11093163B2 cover?
The disclosed technology generally describes separating types of data chunks in a copy-on-write/MVCC B+ tree, chunk-based data storage system, and also allocating the sizes of leaf chunks to be smaller than that of other (e.g., internal and root node) chunks. By having leaf chunks separate from node chunks, the probability of having a fully reclaimable (without copying) chunk is increased. Simi…
Who is the assignee on this patent?
Emc Ip Holding Co Llc
What technology area does this patent fall under?
Primary CPC classification G06F3/0652. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Aug 17 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).