Decommissioning, re-commissioning, and commissioning new metadata nodes in a working distributed data storage system

US11570243B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11570243-B2
Application number  US-202117465691-A
Country             US
Kind code           B2
Filing date         Sep 2, 2021
Priority date       Sep 22, 2020
Publication date    Jan 31, 2023
Grant date          Jan 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In a running distributed data storage system that actively processes I/Os, metadata nodes are commissioned and decommissioned without taking down the storage system and without introducing interruptions to metadata or payload data I/O. The inflow of reads and writes continues without interruption even while new metadata nodes are in the process of being added and/or removed and the strong consistency of the system is guaranteed. Commissioning and decommissioning nodes within the running system enables streamlined replacement of permanently failed nodes and advantageously enables the system to adapt elastically to workload changes. An illustrative distributed barrier logic (the “view change barrier”) controls a multi-state process that controls a coordinated step-wise progression of the metadata nodes from an old view to a new normal. Rules for I/O handling govern each state until the state machine loop has been traversed and the system reaches its new normal.
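
The barrier idea in the abstract can be illustrated with a minimal sketch: a coordinator moves every metadata node through the same sequence of states and advances only after all nodes have acknowledged the current one. The state names (`NORMAL`, `PREPARE`, `TRANSFER`, `CUTOVER`), the class `ViewChangeBarrier`, and its methods are hypothetical, chosen for illustration; the abstract does not enumerate the states or the I/O rules that apply in each.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names; the abstract only says the process is multi-state.
    NORMAL = 0
    PREPARE = 1
    TRANSFER = 2
    CUTOVER = 3


class ViewChangeBarrier:
    """Advance all nodes through the state loop in lock step (illustrative only)."""

    def __init__(self, node_ids):
        self.node_ids = set(node_ids)
        self.state = State.NORMAL
        self.acks = set()

    def _enter(self, state):
        # Entering a state clears acknowledgements collected for the previous one.
        self.state = state
        self.acks = set()

    def begin_view_change(self):
        if self.state is not State.NORMAL:
            raise RuntimeError("a view change is already in progress")
        self._enter(State.PREPARE)

    def ack(self, node_id):
        """Record that node_id applies the current state's rules; return the state."""
        if node_id not in self.node_ids:
            raise KeyError(node_id)
        if self.state is State.NORMAL:
            return self.state  # nothing to acknowledge outside a view change
        self.acks.add(node_id)
        if self.acks == self.node_ids:
            # Every node has caught up, so the barrier opens to the next state.
            self._enter(State((self.state.value + 1) % len(State)))
        return self.state


barrier = ViewChangeBarrier(["node-1", "node-2"])
barrier.begin_view_change()
trace = []
while barrier.state is not State.NORMAL:
    barrier.ack("node-1")            # one node alone cannot advance the state
    trace.append(barrier.ack("node-2").name)
print(trace)
```

Because no node can run ahead of the others by more than the current state, each state only has to define I/O rules that are safe against that single state, which is one plausible reading of why the abstract pairs the barrier with per-state I/O handling rules.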

First claim

Opening claim text (preview).

What is claimed is:

1. A method for decommissioning metadata nodes within a working distributed data storage system that comprises a plurality of storage service nodes, the method comprising: by a first metadata node, receiving read requests and write requests for metadata that is associated with a first range of keys within a set of keys, wherein the first metadata node comprises a first storage service node that executes a metadata subsystem of the distributed data storage system, wherein the set of keys are unique identifiers that ensure strong consistency within the distributed data storage system, wherein each key of the set is owned by exactly one metadata node in the distributed data storage system, wherein the first metadata node: owns the first range of keys, and stores and maintains first metadata files at the first storage service node, and wherein each first metadata file is associated with the first range of keys; by a second metadata node at a second storage service node that is distinct from the first storage service node, receiving read requests and write requests for metadata that is associated with a second range of keys within the set, wherein the second range is distinct from the first range, wherein the second metadata node: owns the second range of keys, and stores and maintains second metadata files at the second storage service node, wherein each second metadata file is associated with the second range of keys, and wherein the second metadata node comprises the second storage service node that executes the metadata subsystem of the distributed data storage system; executing a distributed barrier logic at one of the plurality of storage service nodes, wherein the distributed barrier logic controls a decommissioning of the second metadata node within the distributed data storage system without interrupting servicing of read requests from and write requests to any of the plurality of storage service nodes, wherein the decommissioning re-distributes ownership of the set of keys among metadata nodes in the distributed data storage system; after the decommissioning of the second metadata node is complete, receiving, at the first metadata node, read requests and write requests for metadata associated with at least some keys in the second range of keys, wherein second metadata files associated with the at least some keys of the second range are stored at the first storage service node and maintained by the first metadata node; and wherein after the decommissioning of the second metadata node is complete, the second metadata node receives no read requests and no write requests within the distributed data storage system.

2. The method of claim 1, wherein a first instance of the distributed barrier logic is synchronized with other instances of the distributed barrier logic in the distributed data storage system, and wherein each instance of the distributed barrier logic executes in a pod subsystem that is distinct from the metadata subsystem that executes in the first metadata node and in the second metadata node.

3. The method of claim 1, wherein during the decommissioning of the second metadata node, the first metadata node becomes owner of the at least some keys of the second range and the second metadata node no longer owns the keys in the second range of keys.

4. The method of claim 1, wherein the distributed barrier logic controls the decommissioning of the second metadata node by applying a state machine to control a progression of operations at the first metadata node and at the second metadata node without causing interruptions to servicing of read requests and write requests addressed to metadata files associated with the second range.

5. The method of claim 1, wherein the decommissioning of the second metadata node comprises copying the second metadata files associated with the at least some keys of the second range to the first metadata node, and wherein the copying is performed by anti-entropy logic that executes in at least the first metadata node.

6. The method of claim 1, wherein the decommissioning of the second metadata node within the distributed data storage system is completed when (i) all read requests addressed to metadata files associated with the at least some of the keys in the second range are served by the first metadata node and not by the second metadata node, and (ii) all write requests addressed to metadata files associated with the at least some of the keys in the second range are serviced by the first metadata node and not by the second metadata node.

7. The method of claim 1 further comprising: after (i) all read requests addressed to metadata files associated with the at least some of the keys in the second range are served by the first metadata node and not by the second metadata node, and (ii) all write requests addressed to metadata files associated with the at least some of the keys in the second range are serviced by the first metadata node and not by the second metadata node: removing metadata files associated with the at least some of the keys in the second range from one or more of: the second metadata node and storage service nodes among the plurality that are not associated with the at least some of the keys in the second range.

8. The method of claim 1, wherein after the decommissioning of the second metadata node is complete, a storage identifier that uniquely identifies the second metadata node in the distributed data storage system is permanently retired.

9. The method of claim 8, further comprising: re-commissioning the second metadata node, at the second storage service node, into the distributed data storage system with a new storage identifier that is distinct from the storage identifier used by the second metadata node that was decommissioned.

10. The method of claim 1, wherein the decommissioning of the second service node includes re-distributing second metadata files associated with the second range that are stored at other storage service nodes that are distinct from the first storage service node and the second storage service node.

11. The method of claim 1, wherein payload data tracked by the second metadata files associated with the second range are not moved in the decommissioning.

12. A distributed data storage system comprising: a plurality of storage service nodes; a first metadata node that is configured to receive read requests and write requests for metadata that is associated with a first range of keys within a set of keys, wherein the first metadata node comprises a first storage service node that executes a metadata subsystem of the distributed data storage system, wherein the set of keys are unique identifiers that ensure strong consistency within the distributed data storage system, wherein each key of the set is owned by exactly one metadata node in the distributed data storage system, wherein the first metadata node: owns the first range of keys, and stores and maintains first metadata files at the first storage service node, and wherein each first metadata file is associated with the first range of keys; a second metadata node at a second storage service node that is distinct from the first storage service node, which is configured to receive read requests and write requests for metadata that is associated with a second range of keys within the set, wherein the second range is distinct from the first range, wherein the second metadata node: owns the second range of keys, and stores and maintains second metadata files at the second storage service node, wherein each second metadata file is associated with one of the keys in the second range of keys, an
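
The ownership rules recited in claims 1, 3, 8, and 9 can be sketched as a small bookkeeping structure: every key maps to exactly one owning node, decommissioning hands a node's keys to the survivors, and the departed node's storage identifier is never reused. The class `KeyOwnership`, its methods, and the round-robin reassignment policy are assumptions made for illustration. The claims do not say how redistributed keys are apportioned, and this sketch omits the metadata file copying and the barrier that the claims also require.

```python
class KeyOwnership:
    """Toy model of single-owner key routing (hypothetical, not the patented design)."""

    def __init__(self, owners):
        # owners maps key -> storage identifier, so each key has exactly one owner.
        self.owner_of = dict(owners)
        self.nodes = set(self.owner_of.values())
        self.retired = set()

    def route(self, key):
        """Return the storage identifier that serves reads and writes for key."""
        return self.owner_of[key]

    def decommission(self, storage_id):
        """Reassign storage_id's keys to the survivors and retire its identifier."""
        if storage_id not in self.nodes:
            raise KeyError(storage_id)
        survivors = sorted(self.nodes - {storage_id})
        if not survivors:
            raise ValueError("cannot decommission the last metadata node")
        moved = sorted(k for k, n in self.owner_of.items() if n == storage_id)
        for i, key in enumerate(moved):
            # Round-robin is an assumed policy; the claims leave the split open.
            self.owner_of[key] = survivors[i % len(survivors)]
        self.nodes.discard(storage_id)
        self.retired.add(storage_id)  # claim 8: the identifier is permanently retired
        return moved

    def commission(self, storage_id):
        """Admit a node; a retired identifier is rejected (claim 9 needs a new one)."""
        if storage_id in self.retired:
            raise ValueError(f"{storage_id} is retired; use a new storage identifier")
        if storage_id in self.nodes:
            raise ValueError(f"{storage_id} is already commissioned")
        self.nodes.add(storage_id)


ring = KeyOwnership({"k1": "node-1", "k2": "node-2", "k3": "node-2"})
moved = ring.decommission("node-2")
print(moved, [ring.route(k) for k in ("k1", "k2", "k3")])
ring.commission("node-2b")  # same hardware could rejoin under a fresh identifier
```

After `decommission("node-2")` no key routes to `node-2`, which mirrors the final limitation of claim 1 that the decommissioned node receives no further read or write requests.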

Assignees

Commvault Systems Inc

Inventors

Classifications

  • Departure or maintenance mechanisms

  • Distributed file systems

  • Memory management, e.g. access or allocation

  • Ensuring data consistency and integrity

  • Joining mechanisms

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11570243B2 cover?
In a running distributed data storage system that actively processes I/Os, metadata nodes are commissioned and decommissioned without taking down the storage system and without introducing interruptions to metadata or payload data I/O. The inflow of reads and writes continues without interruption even while new metadata nodes are in the process of being added and/or removed and the strong consi…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification H04L67/1046. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jan 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).