Dynamic recovery from a split-brain failure in edge nodes
US-2019173982-A1 · Jun 6, 2019 · US
US10997028B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10997028-B2 |
| Application number | US-202016777687-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 30, 2020 |
| Priority date | Feb 2, 2018 |
| Publication date | May 4, 2021 |
| Grant date | May 4, 2021 |
Official abstract text for this publication.
The system, devices, and methods disclosed herein relate to a dynamic, robust method for choosing a “winner” in an active-active data storage network. In the systems and methods disclosed herein, two or more intelligent nodes within an active-active data storage network periodically exchange operational parameters in an ongoing negotiation regarding who should be the winner in the event of a communication failure within the network. The winner is chosen dynamically based on the operational parameters. A witness is kept apprised of the winner. In the event of a communication failure between the two nodes, the winner is chosen by the witness based on the most recently negotiated lock file reported by one or both of the nodes.
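The witness side of the abstract can be illustrated with a minimal sketch. It assumes each lock file carries a monotonically increasing negotiation counter so that "most recently negotiated" is well defined; the abstract does not say how recency is determined. The names `LockFile`, `Witness`, `report`, `resolve`, and `sequence` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LockFile:
    """Hypothetical record of a negotiated winner (field names are assumptions)."""
    winner: str    # node id of the negotiated winner
    sequence: int  # assumed negotiation counter; higher means more recent


class Witness:
    """Keeps the newest lock file reported by each node."""

    def __init__(self) -> None:
        self._reports: Dict[str, LockFile] = {}

    def report(self, node_id: str, lock: LockFile) -> None:
        # Keep only the most recent lock file per reporting node;
        # stale or duplicate reports are ignored.
        current = self._reports.get(node_id)
        if current is None or lock.sequence > current.sequence:
            self._reports[node_id] = lock

    def resolve(self) -> Optional[str]:
        # On a communication failure between the nodes, name the winner
        # recorded in the most recently negotiated lock file reported by
        # one or both nodes. Returns None if nothing was ever reported.
        if not self._reports:
            return None
        latest = max(self._reports.values(), key=lambda lf: lf.sequence)
        return latest.winner
```

In this sketch a single report is enough for the witness to decide, which mirrors the abstract's "reported by one or both of the nodes".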
Opening claim text (preview).
What is claimed is:
1. A method for dynamically assigning a winning node in an active-active data storage network comprising the steps of: sending a first operational parameter from a first node to a second node; sending a second operational parameter from the second node to the first node, the first and second nodes being part of an active-active data storage network; negotiating between the first and second nodes which would be the winner in the event a quality measurement of a communication link between the first and second nodes falls below a threshold value, wherein the negotiation includes evaluating the first and second operational parameters; choosing a winner based on at least the first and second operational parameters; recording a winner in a lock file; and transmitting the lock file to a witness.
2. The method of claim 1, wherein the first operational parameter and the second operation parameter correspond to an equivalent parameter as between the first node and the second node.
3. The method of claim 1, wherein the first operational parameter or the second operational parameter is a measure of a cache availability.
4. The method of claim 1, wherein the first operational parameter is a health measurement for the first node.
5. The method of claim 1, wherein the second operational parameter is a health measurement for the second node.
6. The method of claim 1, wherein the first operational parameter is a synchronous communication connection to a third storage node, an asynchronous communication connection to a third storage node, a bias role, a non-bias role, a data replication pathway, a memory board state, or a faulted hardware state for the first node.
7. The method of claim 1, wherein the second operational parameter is a synchronous communication connection to a third storage node, an asynchronous communication connection to a third storage node, a bias role, a non-bias role, a data replication pathway, a memory board state, or a faulted hardware state, for the second node.
8. A system for managing an active-active distributed data processing network, comprising: a first intelligent storage node and a second intelligent storage node in an active-active network configuration connected via a communication link, wherein the first intelligent storage node and the second intelligent storage node are configured to monitor a quality measurement of the communication link; a witness node communicatively coupled to the first intelligent storage node and the second intelligent storage node; and a processor having logic stored thereon configured to: send a first operational parameter from a first node to a second node; send a second operational parameter from the second node to the first node, the first and second nodes being part of an active-active data storage network; negotiate between the first and second nodes which would be the winner in the event a quality measurement of a communication link between the first and second nodes falls below a threshold value, wherein the negotiation includes evaluating the first and second operational parameters; choose a winner based on at least the first and second operational parameters; record a winner in a lock file; and transmit the lock file to a witness.
9. The system of claim 8, wherein the first operational parameter and the second operation parameter correspond to an equivalent parameter as between the first node and the second node.
10. The system of claim 8, wherein the first operational parameter or the second operational parameter is a measure of a cache availability.
11. The system of claim 8, wherein the first operational parameter is a health measurement for the first node.
12. The system of claim 8, wherein the second operational parameter is a health measurement for the second node.
13. The system of claim 8, wherein the first operational parameter is a synchronous communication connection to a third storage node, an asynchronous communication connection to a third storage node, a bias role, a non-bias role, a data replication pathway, a memory board state, or a faulted hardware state for the first node.
14. The system of claim 8, wherein the second operational parameter is a synchronous communication connection to a third storage node, an asynchronous communication connection to a third storage node, a bias role, a non-bias role, a data replication pathway, a memory board state, or a faulted hardware state for the second node.
15. A non-transitory computer readable storage medium containing software for dynamically assigning a winning node in an active-active data storage network comprising performing the steps of: sending a first operational parameter from a first node to a second node; sending a second operational parameter from the second node to the first node, the first and second nodes being part of an active-active data storage network; negotiating between the first and second nodes which would be the winner in the event a quality measurement of a communication link between the first and second nodes falls below a threshold value, wherein the negotiation includes evaluating the first and second operational parameters; choosing a winner based on at least the first and second operational parameters; recording a winner in a lock file; and transmitting the lock file to a witness.
16. The non-transitory computer readable storage medium of claim 15, wherein the first operational parameter and the second operation parameter correspond to an equivalent parameter as between the first node and the second node.
17. The non-transitory computer readable storage medium of claim 15, wherein the first operational parameter or the second operational parameter is a measure of a cache availability.
18. The non-transitory computer readable storage medium of claim 15, wherein the first operational parameter is a health measurement for the first node.
19. The non-transitory computer readable storage medium of claim 15, wherein the second operational parameter is a health measurement for the second node.
20. The non-transitory computer readable storage medium of claim 15, wherein the first operational parameter is a synchronous communication connection to a third storage node, an asynchronous communication connection to a third storage node, a bias role, a non-bias role, a data replication pathway, a memory board state, or a faulted hardware state for the first node, or the second operational parameter is a synchronous communication connection to a third storage node, an asynchronous communication connection to a third storage node, a bias role, a non-bias role, a data replication pathway, a memory board state, or a faulted hardware state for the second node.
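The node-side steps recited in claim 1 (exchange parameters, evaluate them, choose a winner, record it in a lock file, transmit it to a witness) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not define a scoring rule, so the weighted sum, the 0.0 to 1.0 scales, the tie-break on node id, the dictionary lock file layout, and all names (`OperationalParameters`, `score`, `negotiate`, `record_and_transmit`, `send_to_witness`) are hypothetical. Health and cache availability are used because claims 3 to 5 name them as possible operational parameters.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

LockFileDict = Dict[str, Union[str, int]]


@dataclass(frozen=True)
class OperationalParameters:
    """Parameters one node sends to its peer (fields are assumptions)."""
    node_id: str
    health: float              # assumed scale: 0.0 (faulted) to 1.0 (healthy)
    cache_availability: float  # assumed fraction of cache available, 0.0 to 1.0


def score(p: OperationalParameters) -> float:
    # Hypothetical weighting; the claims only require that the parameters
    # be evaluated, not how.
    return 0.6 * p.health + 0.4 * p.cache_availability


def negotiate(first: OperationalParameters, second: OperationalParameters) -> str:
    # Both nodes hold both parameter sets after the exchange, so each can
    # run the same deterministic rule and reach the same answer.
    # Ties are broken by the lexicographically larger node id.
    return max((first, second), key=lambda p: (score(p), p.node_id)).node_id


def record_and_transmit(
    first: OperationalParameters,
    second: OperationalParameters,
    sequence: int,
    send_to_witness: Callable[[LockFileDict], None],
) -> LockFileDict:
    # Record the winner in a lock file, then hand it to the witness.
    lock_file: LockFileDict = {
        "winner": negotiate(first, second),
        "sequence": sequence,  # assumed negotiation counter
    }
    send_to_witness(lock_file)
    return lock_file
```

Because the rule is symmetric in its arguments, the result does not depend on which node runs it, which is the property a split-brain tie-breaker needs.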
using centralised failover control functionality · CPC title
Management of state, configuration or failover · CPC title
Data synchronisation · CPC title
Bidirectional techniques · CPC title
by reconfiguration of node membership · CPC title