Failure recovery in a scaleout system using a matrix clock
US-11704201-B2 · Jul 18, 2023 · US
US12461826B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12461826-B2 |
| Application number | US-202318464726-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 11, 2023 |
| Priority date | Sep 11, 2023 |
| Publication date | Nov 4, 2025 |
| Grant date | Nov 4, 2025 |
Official abstract text for this publication.
Determining a failure recovery line in a distributed scaleout computing system. Each node or process of a distributed system has or is associated with a vector clock that includes a logical clock for each node in the distributed system. When failure is detected, a recovery operation may be performed using the vector clock. After the recovery operation, the vector clock is updated such that the failure recovery line is available in the computing system.
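The clock structure described in the abstract can be illustrated with a minimal sketch. It assumes the conventional vector clock rules (increment the node's own entry on every local event; on receipt, take the element-wise maximum with the sender's clock and then increment the node's own entry). The abstract does not spell out these rules, so the `Node` class, its method names, and the choice to count a send as an event are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """A node holding one logical clock per node in the system (assumed layout)."""
    name: str
    clock: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Make sure the node's own entry exists; a new node starts at zero.
        self.clock.setdefault(self.name, 0)

    def internal_event(self) -> None:
        # A local event advances only this node's own entry.
        self.clock[self.name] += 1

    def send(self) -> Dict[str, int]:
        # Assumption: sending counts as an event; a copy of the clock
        # travels with the message.
        self.clock[self.name] += 1
        return dict(self.clock)

    def receive(self, sender_clock: Dict[str, int]) -> None:
        # Merge the sender's knowledge (element-wise maximum), then
        # record the receive itself as an event on this node.
        for peer, tick in sender_clock.items():
            self.clock[peer] = max(self.clock.get(peer, 0), tick)
        self.clock[self.name] += 1
```

With two nodes `A` and `B` that both start at zero, one internal event on `A` followed by a message from `A` to `B` leaves `A` knowing only its own two events, while `B` records both of `A`'s events plus its own receive.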
Opening claim text (preview).
What is claimed is:
1. A method comprising: associating a vector clock with each node in a distributed scaleout system, wherein each of the vector clocks includes a home logical clock for a corresponding node and one or more away logical clocks for other nodes in the distributed Scaleout system; when an event occurs at a node, updating a vector clock associated with the node, wherein updating the vector clock includes, when the event is an internal event, updating the home logical clock in the vector clock of the node that experienced the internal event; determining a failure in the distributed scaleout system; performing a rollback operation using the vector clocks, wherein the vector clocks identify a failure recovery line for recovering the distributed scaleout system from the failure; and performing a recovery operation based on the failure recovery line.
2. The method of claim 1, wherein the event is an external event, further comprising updating a home logical clock in the vector clock of the node and updating the vector clock of the node based using a vector clock of a sending node that sent the event to the node.
3. The method of claim 1, further comprising performing a cascaded rollback operation using the vector clocks.
4. The method of claim 3, wherein the recovery operation includes recovering from a snapshot and replaying logs, wherein the logs store a chronological history of events in the distributed scaleout system.
5. The method of claim 4, further comprising updating the vector clocks such that the vector clocks include a failure recovery line that accounts for the recovery operation.
6. The method of claim 5, wherein some entries in the vector clocks are deleted and replaced with new entries.
7. The method of claim 1, further comprising scaling the distributed computing system, wherein a vector clock of a new node added to the distributed computing system is initialized to zeros.
8. The method of claim 1, further comprising moving multiple nodes forward from a recovery line after using the failure recovery line to identify the recovery line.
9. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform operations comprising: associating a vector clock with each node in a distributed scaleout system, wherein each of the vector clocks includes a home logical clock for a corresponding node and one or more away logical clocks for other nodes in the distributed Scaleout system; when an event occurs at a node, updating a vector clock associated with the node, wherein updating the vector clock includes, when the event is an internal event, updating the home logical clock in the vector clock of the node that experienced the internal event; determining a failure in the distributed scaleout system; performing a rollback operation using the vector clocks, wherein the vector clocks identify a failure recovery line for recovering the distributed scaleout system from the failure; and performing a recovery operation based on the failure recovery line.
10. The non-transitory storage medium of claim 9, wherein the event is an external event, further comprising updating a home logical clock in the vector clock of the node and updating the vector clock of the node based using a vector clock of a sending node that sent the event to the node.
11. The non-transitory storage medium of claim 9, further comprising performing a cascaded rollback operation using the vector clocks.
12. The non-transitory storage medium of claim 11, wherein the recovery operation includes recovering from a snapshot and replaying logs, wherein the logs store a chronological history of events in the distributed scaleout system.
13. The non-transitory storage medium of claim 12, further comprising updating the vector clocks such that the vector clocks include a failure recovery line that accounts for the recovery operation.
14. The non-transitory storage medium of claim 13, wherein some entries in the vector clocks are deleted and replaced with new entries.
15. The non-transitory storage medium of claim 9, further comprising scaling the distributed computing system, wherein a vector clock of a new node added to the distributed computing system is initialized to zeros.
16. The non-transitory storage medium of claim 9, further comprising moving multiple nodes forward from a recovery line after using the failure recovery line to identify the recovery line.
17. A method comprising: associating a vector clock with each node in a distributed scaleout system, wherein each of the vector clocks includes a home logical clock for a corresponding node and one or more away logical clocks for other nodes in the distributed Scaleout system; when an event occurs at a node, updating a vector clock associated with the node, wherein updating the vector clock includes, when the event is an external event, updating a home logical clock in the vector clock of the node and updating the vector clock of the node based using a vector clock of a sending node that sent the event to the node; determining a failure in the distributed scaleout system; performing a rollback operation using the vector clocks, wherein the vector clocks identify a failure recovery line for recovering the distributed scaleout system from the failure; and performing a recovery operation based on the failure recovery line.
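The claims recite a rollback, optionally cascaded, that uses the vector clocks to identify a failure recovery line, but they do not say how that line is computed. The sketch below is one conventional way to do it, offered as an assumption rather than as the patented procedure: a set of per-node states is treated as consistent when no node has observed more events from a peer than that peer retains after rollback. The function name `find_recovery_line` and the input layout (a list of saved vector clocks per node, oldest first, beginning with an all-zero initial state) are hypothetical.

```python
from typing import Dict, List

Clock = Dict[str, int]


def find_recovery_line(states: Dict[str, List[Clock]]) -> Dict[str, Clock]:
    """Pick one saved state per node so that the chosen states are consistent.

    states[n] lists the vector clocks saved by node n, oldest first.
    Assumptions: for a failed node the last entry is its latest surviving
    snapshot; for a healthy node the last entry is its current state; the
    first entry of every list is the all-zero initial state.
    """
    # Start from the newest available state on every node.
    chosen = {node: len(saved) - 1 for node, saved in states.items()}

    changed = True
    while changed:  # repeat until no further (cascaded) rollback is needed
        changed = False
        for i in states:
            # Number of node i's own events that survive at its chosen state.
            limit = states[i][chosen[i]].get(i, 0)
            for j in states:
                if j == i:
                    continue
                # Node j has seen events of node i that no longer exist,
                # so node j must roll back as well.
                while chosen[j] > 0 and states[j][chosen[j]].get(i, 0) > limit:
                    chosen[j] -= 1
                    changed = True

    return {node: states[node][chosen[node]] for node in states}
```

Each rollback lowers the surviving event count of the node that rolled back, which can in turn invalidate a third node's state; the outer loop repeats until the selection is stable, which models the cascaded rollback of claims 3 and 11. Replaying logs forward from the selected states (claims 4 and 8) is outside the scope of this sketch.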
in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title
Error detection or correction by redundancy in data representation, e.g. by using checking codes · CPC title
Backup scheduling policy · CPC title
Backup restoration techniques · CPC title
involving logging of persistent data for recovery · CPC title