US11301144B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11301144-B2 |
| Application number | US-201916457095-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2019 |
| Priority date | Dec 28, 2016 |
| Publication date | Apr 12, 2022 |
| Grant date | Apr 12, 2022 |
Official abstract text for this publication.
A data storage system includes multiple head nodes and data storage sleds. A control plane of the data storage system designates, for a volume partition, one of the head nodes to function as a primary head node storing a primary replica of the volume partition and designates two or more other head nodes to function as reserve head nodes storing reserve replicas of the volume partition. Additionally, the primary head node causes volume data for the volume partition to be erasure encoded and stored on multiple mass storage devices in different ones of the data storage sleds.
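The write path described in the abstract and in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `HeadNode` and `PrimaryHeadNode`, the in-memory `log`, and the `flush` method are hypothetical, and erasure encoding is stubbed out as simply handing back the pending batch. The point it shows is the ordering: the primary stores the write, replicates it to the reserve head nodes, and acknowledges before any erasure encoding takes place.

```python
from dataclasses import dataclass, field


@dataclass
class HeadNode:
    """Hypothetical head node holding a write log for one volume partition."""
    name: str
    log: list = field(default_factory=list)

    def store(self, data: bytes) -> bool:
        # Append the write to this node's local storage (an in-memory list here).
        self.log.append(data)
        return True


@dataclass
class PrimaryHeadNode(HeadNode):
    """Hypothetical primary head node with its set of reserve head nodes."""
    reserves: list = field(default_factory=list)  # two or more reserve head nodes
    flushed: int = 0                              # log entries already handed to encoding

    def write(self, data: bytes) -> str:
        # 1. Write the data to the primary's own storage.
        self.store(data)
        # 2. Replicate the data to every reserve head node.
        if not all(reserve.store(data) for reserve in self.reserves):
            raise RuntimeError("replication to a reserve head node failed")
        # 3. Acknowledge. Erasure encoding has not happened yet.
        return "ack"

    def pending_flush(self) -> list:
        # Entries that were acknowledged but not yet erasure encoded.
        return self.log[self.flushed:]

    def flush(self) -> list:
        # Stub for the asynchronous step: a real system would erasure encode
        # this batch and store it on mass storage devices in the sleds.
        batch = self.pending_flush()
        self.flushed = len(self.log)
        return batch
```

In this sketch a caller would invoke `write` for each request and run `flush` later, on whatever schedule the system chooses, which mirrors the claim language that the acknowledgement is provided asynchronously with respect to erasure encoding.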
Opening claim text (preview).
What is claimed is:

1. A method, comprising: receiving a write request for a volume partition, by a head node of a data storage system acting as a primary head node for the volume partition; writing, by the head node, data included in the write request to a storage of the head node; causing, by the head node, the data included with the write request to be replicated from the head node to a set of two or more other head nodes of the data storage system acting as reserve head nodes for the volume partition; receiving, by the head node, a plurality of additional write requests for the volume partition and performing, for the additional write requests, said writing data included in the additional write requests to the storage of the head node and said causing data included in the additional write requests to be replicated to the set of two or more head nodes; providing an acknowledgement of the write request subsequent to the data being replicated to the two or more reserve head nodes; and erasure encoding respective parts of the data included in the write request and the additional write requests that is stored in the storage of the head node and causing the erasure encoded respective parts of the data to be stored in a plurality of mass storage devices of the data storage system, wherein the acknowledgement is provided asynchronously with respect to the respective parts of the data being erasure encoded.

2. The method of claim 1, comprising: measuring write latencies of the set of two or more head nodes acting as the reserve head nodes for the volume partition; and in response to a write latency for one of the set of two or more head nodes exceeding a first write latency threshold: reducing a membership of the reserve head nodes required to acknowledge replication of write data to the head node acting as the primary head node before acknowledging the write request to a client of the data storage system.

3. The method of claim 1, comprising: in response to a write latency for the one of the set of two or more head nodes exceeding a second write latency threshold or a time threshold in a reduce membership state: designating an additional head node of the data storage system as a replacement reserve head node for the volume partition; and initiating a re-mirroring operation to re-mirror volume partition data to the replacement reserve head node.

4. The method of claim 1, wherein erasure encoding the respective parts of the data comprises: generating striped columns of the data stored in the head node acting as the primary head node for the volume partition; and generating parity columns of the data stored in the head node acting as the primary head node for the volume partition, wherein the striped columns and parity columns comprise fewer copies of the data for the volume partition than are stored in the head node acting as the primary head node for the volume partition and the set of two or more head nodes acting as the reserve head nodes for the volume partition.

5. The method of claim 4, further comprising: receiving, by the head node acting as the primary head node for the volume partition, an indication that the head node has been designated as a replacement reserve head node for another volume partition; and replicating data stored in a storage of a remaining primary head node for the other volume partition to the storage of the head node.

6. The method of claim 1, further comprising: receiving an indication of one or more durability requirements for the volume partition from a client of the data storage system; and adjusting a number of reserve head nodes included in the set of two or more reserve head nodes to which write data is replicated based at least in part on the received one or more durability requirements for the volume partition.

7. A data storage system, comprising: a head node of the data storage system; wherein, based, at least in part, on receiving a write request for a volume partition, the head node, when acting as a primary head node of the data storage system for the volume partition, is configured to: write data included in the write request to a storage of the head node; cause the data included with the write request to be replicated from the head node to a set of two or more other head nodes of the data storage system, wherein the two or more other head nodes are acting as reserve head nodes for the volume partition; wherein the head node, when acting as the primary head node of the data storage system for the volume partition, is further configured to: erasure encode respective parts of the data stored in the storage of the head node for the volume partition and cause the erasure encoded respective parts of the data to be stored in a plurality of respective mass storage devices of the data storage system; and provide an acknowledgement of the write request subsequent to the data being replicated to the two or more reserve head nodes, wherein the head node is configured to provide the acknowledgement prior to the respective parts of the data being erasure encoded.

8. The data storage system of claim 7, wherein for another volume partition stored in the data storage system, the head node is configured to: receive an indication that the head node has been designated as a replacement reserve head node for the other volume partition; and replicate data stored in a storage of a remaining primary head node for the other volume partition to the storage of the head node.

9. The data storage system of claim 7, wherein the head node is configured to implement, at least in part, a control plane for the data storage system, wherein the control plane is configured to: measure write latencies with respect to the two or more other head nodes acting as reserve head nodes for the volume partition; and in response to a write latency for one of the reserve head nodes exceeding a write latency threshold: designate an additional head node of the data storage system as a replacement head node for the head node with the write latency that exceeds the write latency threshold; and initiate a re-mirroring operation to re-mirror volume partition data to the replacement head node.

10. The data storage system of claim 7, wherein the head node is configured to implement, at least in part, a control plane for the data storage system, wherein the control plane is configured to: receive in indication of one or more durability requirements for the volume partition from a client of the data storage service; and adjust a number of reserve head nodes included in the set of two or more reserve head nodes to which write data is replicated.

11. The data storage system of claim 10, wherein in response to receiving the indication of the one or more durability requirements for the volume, the control plane is also configured to: adjust the erasure encoding such that the number of parts of the data stored in the storage of the head node is stored on more or fewer of the mass storage devices of the data storage system, based, at least in part, on the one or more durability requirements for the volume partition received from the client.

12. The data storage system of claim 11, wherein the head node is configured to store another volume partition with a lower durability requirement than the volume partition, wherein for the other volume partition, the head node is configured to: write data included in a write request for the other volume partition to a storage of the head node; and cause the data included with the write request for the other volume partition to be replicated from the head node to a set of head nodes compri
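Claim 4 refers to generating striped columns and parity columns that together hold fewer copies of the data than the head-node replicas do. The sketch below illustrates that idea under a simplifying assumption: it uses a single XOR parity column, which is a stand-in chosen for brevity and not necessarily the erasure code the patent contemplates. The function names `encode` and `recover` and the parameter `k` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal striped columns plus one XOR parity column."""
    size = -(-len(data) // k)              # ceiling division: bytes per column
    padded = data.ljust(size * k, b"\0")   # zero-pad so all columns are equal length
    columns = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, columns)    # parity column = XOR of all data columns
    return columns + [parity]


def recover(columns: list, missing: int) -> bytes:
    """Rebuild one lost column by XOR-ing every surviving column."""
    survivors = [c for i, c in enumerate(columns) if i != missing]
    return reduce(xor_bytes, survivors)
```

With `k = 4`, the five stored columns occupy 5/4 of the padded data size, compared with three full copies when a primary and two reserve head nodes each hold a replica. A single XOR parity column tolerates the loss of only one column; tolerating more losses would need a stronger code with additional parity columns.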
Replication mechanisms · CPC title
by allocating resources to storage systems · CPC title
in relation to data integrity, e.g. data losses, bit errors · CPC title
in relation to response time · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title