Asynchronous distributed coordination and consensus with threshold logical clocks
US-2021018953-A1 · Jan 21, 2021 · US
US11704201B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11704201-B2 |
| Application number | US-202117456993-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 30, 2021 |
| Priority date | Nov 30, 2021 |
| Publication date | Jul 18, 2023 |
| Grant date | Jul 18, 2023 |
Official abstract text for this publication.
One example method includes performing failure recovery operations in a computing system using matrix clocks. Each node or process in the computing system is associated with a matrix clock. As events and transitions occur in the computing system, the matrix clocks are updated. The matrix clocks provide a chronological and causal view of the computing system and allow a recovery line to be determined in the event of system failure.
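The abstract does not spell out how a matrix clock is updated, so the following is only a minimal sketch of a textbook matrix clock, not the patent's own design. It assumes a fixed number of nodes, treats a node's own row as its principal vector, and uses hypothetical names throughout (`MatrixClock`, `local_event`, `stamp_outgoing`, `receive`).

```python
from typing import List


class MatrixClock:
    """Textbook matrix clock held by one node (illustrative assumption only)."""

    def __init__(self, node_id: int, num_nodes: int) -> None:
        self.node_id = node_id
        # rows[r][k] = what this node believes node r knows about node k.
        self.rows: List[List[int]] = [[0] * num_nodes for _ in range(num_nodes)]

    def principal_vector(self) -> List[int]:
        """Return a copy of this node's own row."""
        return list(self.rows[self.node_id])

    def local_event(self) -> None:
        """Record a local event by advancing this node's own counter."""
        self.rows[self.node_id][self.node_id] += 1

    def stamp_outgoing(self) -> List[List[int]]:
        """Count the send as an event and return a copy of the matrix to attach."""
        self.local_event()
        return [list(row) for row in self.rows]

    def receive(self, sender_id: int, sender_rows: List[List[int]]) -> None:
        """Merge a received matrix, then count the receive as a local event."""
        n = len(self.rows)
        # Element-wise maximum keeps the freshest knowledge for every row.
        for r in range(n):
            for k in range(n):
                self.rows[r][k] = max(self.rows[r][k], sender_rows[r][k])
        # This node now knows everything the sender knew directly.
        own = self.rows[self.node_id]
        for k in range(n):
            own[k] = max(own[k], sender_rows[sender_id][k])
        self.local_event()


a = MatrixClock(node_id=0, num_nodes=2)
b = MatrixClock(node_id=1, num_nodes=2)
a.local_event()
message = a.stamp_outgoing()
b.receive(sender_id=0, sender_rows=message)
print(b.principal_vector())
```

After the exchange, node 1's own row reflects both its own event and what it learned about node 0, which is the causal view the abstract refers to.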
Opening claim text (preview).
What is claimed is:

1. A method, comprising: detecting a first event at a first node in a distributed computing system; updating a first matrix clock associated with the first node; transitioning to a second matrix clock at a second node when the second node experiences an event; updating the second matrix clock associated with the second node; detecting a failure in the computing system; and performing a failure recovery in the distributed computing system based on primary vector of the second matrix clock, wherein the primary vector identifies a failure recovery line.

2. The method of claim 1, further comprising adding the second node to the distributed computing system and performing a snapshot of the second node, wherein the second event comprises the snapshot.

3. The method of claim 2, wherein updating the second matrix clock includes updating a principal vector of the second matrix clock to reflect the second event and updating a supporting vector of the second matrix clock to include a status of the first node.

4. The method of claim 1, wherein the first event comprises a snapshot of the first node.

5. The method of claim 1, wherein updating the first matrix clock includes updating a principal vector to reflect the first event.

6. The method of claim 5, wherein the principal vector is updated to include a generational number associated with the event.

7. The method of claim 1, further comprising detecting a failure at the second node and rolling back to a previous snapshot at the first node based on the principal vector of the second node included in the second matrix clock.

8. The method of claim 7, further comprising replaying a log from a first snapshot from the first node.

9. The method of claim 8, further comprising synchronizing the first node and the second node to the first snapshot.

10. The method of claim 9, further comprising performing a cascaded rollback.

11. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform operations comprising: detecting a first event at a first node in a distributed computing system; updating a first matrix clock associated with the first node; transitioning to a second matrix clock at a second node when the second node experiences an event; updating the second matrix clock associated with the second node; detecting a failure in the computing system; and performing a failure recovery in the distributed computing system based on primary vector of the second matrix clock, wherein the primary vector identifies a failure recovery line.

12. The non-transitory storage medium of claim 11, further comprising adding the second node to the distributed computing system and performing a snapshot of the second node, wherein the second event comprises the snapshot.

13. The non-transitory storage medium of claim 12, wherein updating the second matrix clock includes updating a principal vector of the second matrix clock to reflect the second event and updating a supporting vector of the second matrix clock to include a status of the first node.

14. The non-transitory storage medium of claim 11, wherein the first event comprises a snapshot of the first node.

15. The non-transitory storage medium of claim 11, wherein updating the first matrix clock includes updating a principal vector to reflect the first event.

16. The non-transitory storage medium of claim 15, wherein the principal vector is updated to include a generational number associated with the event.

17. The non-transitory storage medium of claim 11, further comprising detecting a failure at the second node and rolling back to a previous snapshot at the first node based on the principal vector of the second node included in the second matrix clock.

18. The non-transitory storage medium of claim 17, further comprising replaying a log from a first snapshot from the first node.

19. The non-transitory storage medium of claim 18, further comprising synchronizing the first node and the second node to the first snapshot.

20. The non-transitory storage medium of claim 19, further comprising performing a cascaded rollback.
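The rollback in claims 7 and 17 can be illustrated with a small sketch. It rests on an interpretation the claims do not confirm: that each entry of the failed node's principal vector holds the snapshot generation number it last recorded for the corresponding node, and that those entries together form the recovery line. The function name `rollback_plan` and its parameters are hypothetical.

```python
from typing import Dict, List


def rollback_plan(current_generations: List[int],
                  principal_vector: List[int]) -> Dict[int, int]:
    """Return {node_id: target_generation} for nodes ahead of the recovery line.

    current_generations[i] is the latest snapshot generation held by node i.
    principal_vector[i] is the generation the failed node last recorded for
    node i (assumed here to define the recovery line).
    """
    plan: Dict[int, int] = {}
    for node_id, (current, target) in enumerate(
            zip(current_generations, principal_vector)):
        if current > target:
            # This node advanced past what the failed node knew, so it must
            # return to the older snapshot to stay consistent.
            plan[node_id] = target
    return plan


# Node 0 is at generation 3; failed node 1 last knew node 0 at generation 2.
plan = rollback_plan(current_generations=[3, 2], principal_vector=[2, 2])
print(plan)
```

In this reading, node 0 would restore its generation 2 snapshot and could then replay its log from that point, as claims 8 and 18 describe. A node already on the recovery line does not appear in the plan.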
Restarting or rejuvenating · CPC title
with more than one idle spare processing component · CPC title
with a single idle spare processing component · CPC title
where the redundant components share neither address space nor persistent storage · CPC title
Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit · CPC title