System and method for augmenting duplexed replicated computing

US9992010B2 · US · B2

Patent metadata
Publication number: US-9992010-B2
Application number: US-201514951416-A
Country: US
Kind code: B2
Filing date: Nov 24, 2015
Priority date: Nov 24, 2015
Publication date: Jun 5, 2018
Grant date: Jun 5, 2018

Abstract

Official abstract text for this publication.

Systems and methods are disclosed herein for a replicated fault-tolerant computer system. The system includes a triplet of network elements, which each maintain a clock signal, and a clock monitor at each network element for monitoring incoming clock signals. Each network interfaces with a fault containment region (FCR). The system provides the ability to transition from a duplex system to a triplex system if one of the previously offline FCRs can be brought back online. The network elements can determine or receive notification that the previously offline FCR can be brought back online, align their respective clock signals, and synchronize the memory state of the previously offline FCR. The system can then operate in a fault-tolerant, replicated triplex operating mode.
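
The duplex-to-triplex transition summarized in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the `Element` class, the integer `clock_phase`, the list-based `memory`, and the `rejoin` function are all hypothetical names chosen for illustration, and real clock alignment and memory synchronization happen in hardware rather than by simple assignment.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    DUPLEX = "duplex"
    TRIPLEX = "triplex"


@dataclass
class Element:
    """Hypothetical stand-in for one network element and its FCR."""
    name: str
    clock_phase: int = 0                       # simplified clock state (assumption)
    memory: list = field(default_factory=list)  # simplified memory state (assumption)
    online: bool = True


def rejoin(online: list, returning: Element, can_rejoin: bool) -> Mode:
    """Attempt the transition from duplex to triplex operation.

    `can_rejoin` models the determination or notification that the
    previously offline FCR can be brought back online.
    """
    if not can_rejoin:
        return Mode.DUPLEX  # stay in duplex; the returning element is untouched

    reference = online[0]
    # Align the clock signals of every other element to the reference.
    for element in online[1:] + [returning]:
        element.clock_phase = reference.clock_phase
    # Synchronize the memory state of the previously offline FCR.
    returning.memory = list(reference.memory)
    returning.online = True
    return Mode.TRIPLEX
```

Calling `rejoin([ne1, ne2], ne3, can_rejoin=True)` on two online elements and one offline element leaves all three with the same clock phase and gives `ne3` a copy of the shared memory state before reporting triplex mode.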

First claim

Opening claim text (preview).

What is claimed is:

1. A system for providing replicated fault-tolerant computing configured to operate at least in a duplex mode and a triplex mode, the system comprising at least a first, second, and third network elements, at least one of the network elements including a processor comprising: synchronization logic including: a clock module configured to maintain a clock signal; an alignment module configured to synchronize the clock signal with a received clock signal; a data synchronization module configured to synchronize a memory state of the synchronization logic; a synchronization control module configured to control the alignment module and the data synchronization module; and a clock monitor configured to: receive a clock signal of the first network element, a clock signal of the second network element, and a clock signal of the third network element; transmit, using a switch controller of the clock monitor, a duplicated clock signal to the synchronization logic in place of the clock signal of the third network element; determine that the clock signal of the third network element is valid; and connect, using the switch controller, the clock signal of the third network element to the synchronization logic, wherein the data synchronization module synchronizes synchronization data of the first and/or second network elements with synchronization data of the third network element.

2. The system of claim 1, wherein the synchronization logic of the first network element is further configured to, upon determining that the clock signal of the third network element is valid, transmit a request to enter a recovery mode to the second network element.

3. The system of claim 2, wherein the data synchronization module of the first network element and the data synchronization module of the second network element are configured to: transmit synchronization data from each of a plurality of successive memory spaces to a data synchronization module of the third network element; receive echoed synchronization data from the data synchronization module of the third network element; and verify the echoed synchronization data.

4. The system of claim 3, wherein the data synchronization module of the first network element is further configured to: determine, by the data synchronization module of the first network element, that the echoed synchronization data does not match the synchronization data; transmit, responsive to the determination, a notification to the clock monitor that the third network element is offline; and wherein the clock monitor is further configured to: update, in response to the notification, a status of the third network element to offline in a mode register of the clock monitor; disconnect, using the switch controller, the clock signal of third network element from the alignment module; connect, using the switch controller, the duplicated clock signal to the alignment module; and notify the second network element and the third network element that the third network element is offline.

5. The system of claim 1, wherein the clock monitor is further configured to determine whether the clock signal of the third network element is valid by: determining that a duty cycle of the clock signal of the third network element falls within a predetermined range; and determining that a frequency of the clock signal of the third network element falls within a predetermined range.

6. The system of claim 5, wherein the clock monitor is further configured to determine that the clock signal of the third network element is valid by determining that the clock signal of the third network element has been valid for a predetermined period of time.

7. The system of claim 6, wherein the predetermined period is at least one clock cycle.

8. The system of claim 1, wherein the clock monitor of the first network element is configured to: detect an invalid clock pulse in the clock signal of the third network element; disconnect, using the switch controller, the clock signal of the third network element from the synchronization logic; connect, using the switch controller, the duplicated clock signal to the synchronization logic; and notify the second network element and the third network element that the third network element is offline.

9. The system of claim 8, wherein the clock monitor of the first network element is configured to disconnect the clock signal of the third network element from the synchronization logic and connect the duplicated clock signal to the synchronization logic before receiving the next clock signal from a clock module of the first network element.

10. The system of claim 1, wherein the synchronization logic of the first network element is further configured to receive, from the second network element, a notification that the third network element is offline, and wherein the clock monitor of the first network element is further configured to: disconnect, using the switch controller, the clock signal of the third network element from the synchronization logic; and connect, using the switch controller, the duplicated clock signal to the synchronization logic.

11. The system of claim 1, wherein the clock monitor of the first network element is further configured to disconnect or ignore the duplicated clock signal upon determining that the clock signal of the third network element is valid.

12. A method for providing replicated fault-tolerant computing among at least a first, second, and third network elements configured to operate at least in a duplex mode and a triplex mode, the method comprising: receiving, at a clock monitor of the first network element, a clock signal of the first network element, a clock signal of the second network element, and a clock signal of the third network element; transmitting, to synchronization logic of the first network element, a duplicated clock signal in place of the clock signal of the third network element; determining, by the clock monitor, that the clock signal of the third network element is valid; connecting, using the switch controller, the clock signal of the third network element to the synchronization logic; and synchronizing synchronization data of the first and/or second network elements with synchronization data of the third network element.

13. The method of claim 12, further comprising transmitting, by the synchronization logic upon determining that the clock signal of the third network element is valid, a request to enter a recovery mode to the second network element.

14. The method of claim 13, further comprising: transmitting, by the data synchronization module of the first network element and a data synchronization module of the second network element, synchronization data from each of a plurality of successive memory spaces to a data synchronization module of the third network element; receiving, by the data synchronization module of the first network element and the data synchronization module of the second network element, echoed synchronization data from the data synchronization module of the third network element; and verifying, by the data synchronization module of the first network element and the data synchronization module of the second network element, the echoed synchronization data.

15. The method of claim 14, further comprising: determining, by the data synchronization module of the first network element, that the echoed synchronization data does not match the synchronization data; transmitting, by the data synchronization module of the first network element, a notification to the clock
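
The clock-validity test of claims 5 to 7 and the switch-over behavior of claims 1 and 8 can be sketched in software, with the caveat that the claimed clock monitor is hardware logic and that the claims give no concrete thresholds. Everything named here (`Pulse`, `ClockMonitor`, `observe`, the range tuples, and the `"third"` and `"duplicate"` labels) is a hypothetical illustration, and the numeric ranges in the usage note are assumed values.

```python
from dataclasses import dataclass


@dataclass
class Pulse:
    """One observed cycle of the third element's clock (hypothetical record)."""
    frequency_hz: float
    duty_cycle: float  # fraction of the period spent high, 0.0 to 1.0


class ClockMonitor:
    """Toy model of a clock monitor with a switch controller."""

    def __init__(self, freq_range, duty_range, required_valid_cycles=1):
        self.freq_range = freq_range          # (low, high), assumed values
        self.duty_range = duty_range          # (low, high), assumed values
        self.required_valid_cycles = required_valid_cycles  # at least one cycle
        self.valid_streak = 0
        self.third_online = False             # stands in for the mode register

    def pulse_ok(self, pulse: Pulse) -> bool:
        """Both duty cycle and frequency must fall within their ranges."""
        f_lo, f_hi = self.freq_range
        d_lo, d_hi = self.duty_range
        return (f_lo <= pulse.frequency_hz <= f_hi
                and d_lo <= pulse.duty_cycle <= d_hi)

    def observe(self, pulse: Pulse) -> str:
        """Process one pulse and return which clock feeds the synchronization logic."""
        if self.pulse_ok(pulse):
            self.valid_streak += 1
            # Connect the real clock only after it has been valid long enough.
            if self.valid_streak >= self.required_valid_cycles:
                self.third_online = True
        else:
            # An invalid pulse switches back to the duplicated clock at once.
            self.valid_streak = 0
            self.third_online = False
        return "third" if self.third_online else "duplicate"
```

With `ClockMonitor((9.9e6, 10.1e6), (0.45, 0.55), required_valid_cycles=2)`, a first in-range pulse still selects the duplicated clock, a second consecutive in-range pulse connects the third element's clock, and a single out-of-range pulse reverts to the duplicated clock and resets the count.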

Assignees

Charles Stark Draper Laboratory Inc

Inventors

Classifications

  • where the redundant components implement processing functionality

  • using a quantum

  • Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available (error or fault processing without redundancy G06F11/0703; error detection or correction by redundancy in data representation G06F11/08; error detection or correction of the data by redundancy in operations G06F11/14; error detection or correction by redundancy in hardware G06F11/16)

  • in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46)

  • at clock signal level

Frequently asked questions

What does patent US9992010B2 cover?
Systems and methods are disclosed herein for a replicated fault-tolerant computer system. The system includes a triplet of network elements, which each maintain a clock signal, and a clock monitor at each network element for monitoring incoming clock signals. Each network interfaces with a fault containment region (FCR). The system provides the ability to transition from a duplex system to a tr…

Who is the assignee on this patent?
Charles Stark Draper Laboratory Inc

What technology area does this patent fall under?
Primary CPC classification H04L7/0008. Mapped technology areas include Electricity.

When was this patent published?
Publication date Jun 5, 2018 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).