Issuing efficient writes to erasure coded objects in a distributed storage system via adaptive logging

US11467746B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11467746-B2
Application number: US-202017089605-A
Country: US
Kind code: B2
Filing date: Nov 4, 2020
Priority date: Apr 7, 2020
Publication date: Oct 11, 2022
Grant date: Oct 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for issuing efficient writes to an erasure coded storage object in a distributed storage system via adaptive logging are provided. In one set of embodiments, a node of the system can receive a write request for updating one or more logical data blocks of the storage object and determine whether a size of the one or more logical data blocks meets or exceeds a threshold size. Upon determining that the size of the one or more logical data blocks meets or exceeds the threshold size, the node can allocate a segment in a capacity object of the storage object, write the one or more logical data blocks via a full stripe write to the segment, and write metadata for the one or more logical data blocks to a log record in a log of a metadata object of the storage object. The metadata written to the log record can include mappings between logical block addresses (LBAs) of the one or more logical data blocks and physical block addresses (PBAs) where the one or more logical data blocks reside in the segment.
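
The large-write path described in the abstract can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the names `StorageObject`, `LogRecord`, `write_if_large`, and `allocate_segment`, the 4 KiB block size, and the four-block threshold are all assumptions made for the example, and parity computation for the full stripe write is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

BLOCK_SIZE = 4096                  # assumed logical block size in bytes
THRESHOLD_BYTES = 4 * BLOCK_SIZE   # assumed threshold: one stripe of 4 data blocks


@dataclass
class LogRecord:
    lba_to_pba: Dict[int, int]           # LBA -> PBA mappings for the written blocks
    data: Optional[List[bytes]] = None   # None means the record holds metadata only


@dataclass
class StorageObject:
    next_pba: int = 0                                         # next free PBA
    capacity: Dict[int, bytes] = field(default_factory=dict)  # capacity object: PBA -> block
    log: List[LogRecord] = field(default_factory=list)        # log of the metadata object

    def allocate_segment(self, n_blocks: int) -> int:
        """Reserve n_blocks contiguous PBAs and return the first one."""
        first = self.next_pba
        self.next_pba += n_blocks
        return first

    def write_if_large(self, start_lba: int, blocks: List[bytes]) -> bool:
        """Handle the write only if its size meets or exceeds the threshold."""
        if sum(len(b) for b in blocks) < THRESHOLD_BYTES:
            return False  # below threshold: not covered by this sketch
        first_pba = self.allocate_segment(len(blocks))
        mapping = {}
        for i, block in enumerate(blocks):
            self.capacity[first_pba + i] = block        # data goes to the segment
            mapping[start_lba + i] = first_pba + i      # remember where it landed
        self.log.append(LogRecord(lba_to_pba=mapping))  # log record has no data
        return True
```

For example, `StorageObject().write_if_large(100, [b"a" * 4096] * 4)` takes the large-write path and appends one metadata-only log record, while a single-block write returns `False` and leaves both the log and the capacity object untouched.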

First claim

Opening claim text (preview).

What is claimed is:

1. A method for issuing efficient writes to an erasure coded storage object maintained by a distributed storage system via adaptive logging, the method comprising: receiving, by a node of the distributed storage system, a write request for updating one or more logical data blocks of the storage object; determining, by the node, whether a size of the one or more logical data blocks meets or exceeds a threshold size; and upon determining that the size of the one or more logical data blocks meets or exceeds the threshold size: allocating, by the node, a segment in a capacity object of the storage object; writing, by the node, the one or more logical data blocks via a full stripe write to the segment; and writing, by the node, metadata for the one or more logical data blocks to a log record in a log of a metadata object of the storage object, the metadata include mappings between logical block addresses (LBAs) of the one or more logical data blocks and physical block addresses (PBAs) where the one or more logical data blocks reside in the segment.

2. The method of claim 1 wherein the capacity object is striped across a storage tier of the distributed storage system in accordance with an erasure coding scheme assigned to the storage object, and wherein the threshold size is based on a full stripe size of the capacity object.

3. The method of claim 1 wherein the threshold size is equal to a size of the segment.

4. The method of claim 1 wherein the log record excludes data of the one or more logical data blocks.

5. The method of claim 1 wherein the capacity object is striped across a first storage tier of the distributed storage system in accordance with an erasure coding scheme assigned to the storage object, and wherein the metadata object is mirrored across a second storage tier of the distributed storage system in a manner that enables the metadata object to achieve an equivalent level of fault tolerance as the capacity object.

6. The method of claim 1 wherein the metadata object is created and managed using a overwrite-based file system disk layout, and wherein the capacity object is created and managed using a log-structured file system (LFS) disk layout.

7. The method of claim 1 further comprising, upon determining that the size of the one or more logical data blocks does not meet or exceed the threshold size: writing data and metadata for the one or more logical data blocks to another log record in the log; placing data for the one or more logical data blocks in one or more free slots of an in-memory bank, the in-memory bank being configured to hold a predefined number of stripes of the storage object in accordance with an erasure coding scheme assigned to the storage object; in response to the placing, determining whether the in-memory bank has become full; and if the in-memory bank has become full: computing and filling one or more parity blocks for each stripe of the storage object in the in-memory bank; allocating another segment in the capacity object for holding contents of the in-memory bank; and writing contents of the in-memory bank via a full stripe write to said another segment.

8. A non-transitory computer readable storage medium having stored thereon program code executable by a node in a distributed storage system, the program code embodying a method for issuing efficient writes to an erasure coded storage object maintained by the distributed storage system via adaptive logging, the method comprising: receiving a write request for updating one or more logical data blocks of the storage object; determining whether a size of the one or more logical data blocks meets or exceeds a threshold size; and upon determining that the size of the one or more logical data blocks meets or exceeds the threshold size: allocating a segment in a capacity object of the storage object; writing the one or more logical data blocks via a full stripe write to the segment; and writing metadata for the one or more logical data blocks to a log record in a log of a metadata object of the storage object, the metadata include mappings between logical block addresses (LBAs) of the one or more logical data blocks and physical block addresses (PBAs) where the one or more logical data blocks reside in the segment.

9. The non-transitory computer readable storage medium of claim 8 wherein the capacity object is striped across a storage tier of the distributed storage system in accordance with an erasure coding scheme assigned to the storage object, and wherein the threshold size is based on a full stripe size of the capacity object.

10. The non-transitory computer readable storage medium of claim 8 wherein the threshold size is equal to a size of the segment.

11. The non-transitory computer readable storage medium of claim 8 wherein the log record excludes data of the one or more logical data blocks.

12. The non-transitory computer readable storage medium of claim 8 wherein the capacity object is striped across a first storage tier of the distributed storage system in accordance with an erasure coding scheme assigned to the storage object, and wherein the metadata object is mirrored across a second storage tier of the distributed storage system in a manner that enables the metadata object to achieve an equivalent level of fault tolerance as the capacity object.

13. The non-transitory computer readable storage medium of claim 8 wherein the metadata object is created and managed using a overwrite-based file system disk layout, and wherein the capacity object is created and managed using a log-structured file system (LFS) disk layout.

14. The non-transitory computer readable storage medium of claim 8 wherein the method further comprises, upon determining that the size of the one or more logical data blocks does not meet or exceed the threshold size: writing data and metadata for the one or more logical data blocks to another log record in the log; placing data for the one or more logical data blocks in one or more free slots of an in-memory bank, the in-memory bank being configured to hold a predefined number of stripes of the storage object in accordance with an erasure coding scheme assigned to the storage object; in response to the placing, determining whether the in-memory bank has become full; and if the in-memory bank has become full: computing and filling one or more parity blocks for each stripe of the storage object in the in-memory bank; allocating another segment in the capacity object for holding contents of the in-memory bank; and writing contents of the in-memory bank via a full stripe write to said another segment.

15. A computer system acting as a node in a distributed storage system, the computer system comprising: a processor; and a non-transitory computer readable medium having stored thereon program code that, when executed, causes the processor to: receive a write request for updating one or more logical data blocks of the storage object; determine whether a size of the one or more logical data blocks meets or exceeds a threshold size; and upon determining that the size of the one or more logical data blocks meets or exceeds the threshold size: allocate a segment in a capacity object of the storage object; write the one or more logical data blocks via a full stripe write to the segment; and write metadata for the one or more logical data blocks to a log record in a log of a metadata object of the storage object, the metadata include mappings between logical block addresses (LBAs) of the one or more logical data blocks and physical block addresse
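
The below-threshold path recited in claims 7 and 14 (log the data, stage it in an in-memory bank, and flush the bank with a full stripe write once it fills) might look something like the sketch below. Everything concrete here is an assumption for illustration: the class name `Bank`, four data blocks per stripe, two stripes per bank, an 8-byte block, and single XOR parity standing in for whatever erasure coding scheme is assigned to the storage object.

```python
from functools import reduce

DATA_PER_STRIPE = 4    # assumed data blocks per stripe
STRIPES_PER_BANK = 2   # assumed number of stripes the bank holds
BLOCK_SIZE = 8         # tiny block size, for illustration only


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks (single-parity simplification)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Bank:
    def __init__(self):
        self.log = []        # log records holding both data and metadata
        self.slots = []      # staged (lba, data) pairs awaiting a full bank
        self.segments = []   # flushed segments; each is a list of stripes

    def small_write(self, lba, data):
        # 1. Log data and metadata together.
        self.log.append({"lba": lba, "data": data})
        # 2. Place the data in the next free slot of the in-memory bank.
        self.slots.append((lba, data))
        # 3. Flush once every slot is occupied.
        if len(self.slots) == DATA_PER_STRIPE * STRIPES_PER_BANK:
            self.flush()

    def flush(self):
        segment = []
        for s in range(STRIPES_PER_BANK):
            chunk = self.slots[s * DATA_PER_STRIPE:(s + 1) * DATA_PER_STRIPE]
            data_blocks = [data for _, data in chunk]
            # Compute parity for the stripe, then append it to the data blocks.
            segment.append(data_blocks + [xor_parity(data_blocks)])
        self.segments.append(segment)  # stands in for the full stripe write
        self.slots.clear()
```

With these assumed sizes, the first seven calls to `small_write` only log and stage data; the eighth fills the bank and triggers a single flush of two stripes, each carrying four data blocks plus one parity block.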

Assignees

VMware Inc

Inventors

Classifications

  • by exceeding a count or rate limit, e.g. word- or bit count limit · CPC title

  • by allocating resources to storage systems · CPC title

  • Improving I/O performance · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • in relation to data integrity, e.g. data losses, bit errors · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11467746B2 cover?
Techniques for issuing efficient writes to an erasure coded storage object in a distributed storage system via adaptive logging are provided. In one set of embodiments, a node of the system can receive a write request for updating one or more logical data blocks of the storage object and determine whether a size of the one or more logical data blocks meets or exceeds a threshold size. Upon dete…
Who is the assignee on this patent?
VMware Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/064. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 11, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).