Distributed computing fault management
US-9274902-B1 · Mar 1, 2016 · US
US9519553B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9519553-B2 |
| Application number | US-201514985740-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 31, 2015 |
| Priority date | Dec 31, 2014 |
| Publication date | Dec 13, 2016 |
| Grant date | Dec 13, 2016 |
Official abstract text for this publication.
A failure resistant distributed computing system includes primary and secondary datacenters each comprising a plurality of computerized servers. A control center selects orchestrations from a predefined list and transmits the orchestrations to the datacenters. Transmitted orchestrations include less than all machine-readable actions necessary to execute the orchestrations. The datacenters execute each received orchestration by referencing a full set of actions corresponding to the received orchestration as previously stored or programmed into the computerized server and executing the referenced full set of actions. At least one of the orchestrations comprises a failover operation from the primary datacenter to the secondary datacenter. Failover shifts performance of tasks from a set of processing nodes of the primary datacenter to a set of processing nodes of the secondary datacenter, such tasks including managing storage accessible by one or more remote clients and running programs on behalf of remote clients.
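The dispatch pattern described in the abstract can be illustrated with a minimal sketch. It assumes an in-process call in place of a real network transport, and every name in it (`Datacenter`, `Server`, `ControlCenter`, the `"failover"` identifier, and the three action names) is hypothetical rather than taken from the patent. The point it shows is that only the orchestration identifier travels from the control center, while the full action set already lives on the server.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Datacenter:
    """A datacenter reduced to a name and the tasks its nodes currently run."""
    name: str
    tasks: List[str] = field(default_factory=list)


class Server:
    """Stores the full action set for each orchestration it may be asked to run."""

    def __init__(self, primary: Datacenter, secondary: Datacenter) -> None:
        self.primary = primary
        self.secondary = secondary
        self.log: List[str] = []
        # Full action sets are held locally, keyed by orchestration identifier.
        # The action names are illustrative assumptions.
        self.actions: Dict[str, List[Callable[[], None]]] = {
            "failover": [
                self._quiesce_primary,
                self._shift_tasks,
                self._activate_secondary,
            ],
        }

    def _quiesce_primary(self) -> None:
        self.log.append(f"quiesce {self.primary.name}")

    def _shift_tasks(self) -> None:
        # Failover: move every task from the primary's nodes to the secondary's.
        self.secondary.tasks.extend(self.primary.tasks)
        self.primary.tasks.clear()
        self.log.append("shift tasks")

    def _activate_secondary(self) -> None:
        self.log.append(f"activate {self.secondary.name}")

    def receive(self, orchestration_id: str) -> None:
        # Look up the locally stored actions and run them in order.
        for action in self.actions[orchestration_id]:
            action()


class ControlCenter:
    """Selects from a predefined list and sends only the identifier."""

    PREDEFINED = ("failover",)

    def __init__(self, server: Server) -> None:
        self.server = server

    def send(self, orchestration_id: str) -> None:
        if orchestration_id not in self.PREDEFINED:
            raise ValueError(f"unknown orchestration: {orchestration_id}")
        # Only the identifier is passed; the actions themselves are not.
        self.server.receive(orchestration_id)
```

A caller would build two `Datacenter` objects, wrap them in a `Server`, and call `ControlCenter(server).send("failover")`; afterwards the primary's task list is empty and the secondary holds the shifted tasks.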
Opening claim text (preview).
What is claimed is:

1. A failure resistant network-based distributed computing system with a plurality of datacenters comprising: primary and secondary datacenters, each datacenter comprising: a computerized server comprising: a processor configured to execute a processing node; and a memory comprising instructions executable by the processor; and a control center comprising: a memory; and a transmitter, wherein the control center is programmed to perform machine-executable operations stored in its memory to: select orchestrations from a predefined list stored in its memory; and transmit, using the transmitter, an identification of the selected orchestrations to at least one of the computerized servers of the primary or secondary datacenters; and wherein: the at least one computerized server of the primary and secondary datacenters is programmed to perform machine-executable operations to, responsive to receiving identification of one of the selected orchestrations from the control center, transmit, to the control center, a request for an updated list of machine-executable actions necessary to execute an identified orchestration, and execute the identified orchestration using its processor by referencing a set of actions corresponding to the identified orchestration as previously stored or programmed into the computerized server and executing the set of actions on the server processor modified by the updated list; at least one of the machine-executable actions is to direct at least one other computerized server to execute prescribed tasks on its processor; and the predefined list of orchestrations comprises at least one machine-executable orchestration to conduct a failover operation from the primary datacenter to the secondary datacenter, the failover operation comprising shifting performance of tasks from at least one processing node of the primary datacenter to at least one processing node of the secondary datacenter, the tasks comprising: managing storage accessible by one or more clients located remotely from the datacenters; and running programs of machine-implemented operations on behalf of clients remotely located from the datacenters.

2. The system of claim 1, wherein at least one of the machine-executable actions corresponding to the identified orchestration comprises a given computerized server transferring control of execution on its processor of other actions corresponding to the identified orchestration to a different computerized server processor than the given computerized server processor.

3. The system of claim 2, wherein the action of the given computerized server transferring control comprises executing at least one of the actions on the processor corresponding to the identified orchestration on the different computerized server.

4. The system of claim 1, wherein at least one given machine-executable action corresponding to an orchestration comprises transferring control of execution of future actions corresponding to the orchestration to a different computerized server than is performing the given machine-executable action.

5. The system of claim 1, wherein: the request includes a first version of a set of actions necessary to execute the received orchestration; and the control center further comprises instructions executable by its processor to perform operations comprising: comparing the first version against a second version of a set of actions necessary to execute the received orchestration maintained by the control center; and responsive to detecting one or more differences between the first version and second version, transmitting data representing changes between the first version and the second version to the at least one computerized server.

6. A failure resistant network-based distributed computing system with a plurality of datacenter, comprising: primary and secondary datacenters, each datacenter comprising: a plurality of computerized servers, each of the computerized servers comprising: a processor; a communications port connected to a network; a memory comprising instructions executable by the processor; and a messaging queue connected via the communications port with the computerized servers of the datacenter; wherein: the processor is configured to execute a processing node; and the messaging queues of the primary and secondary datacenters are communicatively interconnected via their respective communication ports by one or more links; the system further comprising: a control center comprising: one or more digital data processing machines; a communications port; a memory; and a transmitter that communicates via signals sent over its communications port coupled to the at least one messaging queue of each datacenter, wherein the control center is programmed to perform machine-executable operations stored in its memory to: select orchestrations from a predefined list stored in its memory; and transmit, using the transmitter, an identification of the selected orchestrations to a server of the computerized servers of the primary or secondary datacenters via a respective one of the messaging queues; and wherein: each of the computerized servers of the primary and secondary datacenters is programmed to perform machine-executable operations to, responsive to receiving identification of one of the selected orchestrations from the control center via one of the messaging queues, execute the identified orchestration using its processor by referencing a full set of actions corresponding to the received orchestration as previously stored or programmed into the computerized server and executing the referenced full set of actions on the server processor; and at least one of the machine-executable actions is to direct at least one other computerized server to execute prescribed tasks on its processor; the predefined list of orchestrations comprises at least one machine-executable orchestration to conduct a failover operation from the primary datacenter to the secondary datacenter, the failover operation comprising shifting performance of tasks from a set of processing nodes of the primary datacenter to a set of processing nodes of the secondary datacenter, the tasks comprising: managing storage accessible by one or more clients located remotely from the datacenters; and running programs of machine-implemented operations on behalf of clients remotely located from the datacenters; and at least one of the computerized servers comprises instructions executable by its processor to perform operations comprising: from the control center, receiving a differences list corresponding to a given orchestration; and performing the given orchestration by executing an amended set of predefined machine-executable actions, the amended set of predefined machine-executable actions comprising the full set of predefined machine-executable actions necessary to execute the given orchestration as stored or programmed into the memory of the computerized server and further as amended according to the differences list.

7. The system of claim 1, where at least a first one of the orchestrations in the predefined list corresponds to one or more machine-executable actions to override one or more of a full set of machine-executable actions stored or programmed into one or more of the computerized servers.

8. The system of claim 1, wherein the at least one computerized server further comprises instructions executable by its processor such that the set of predefined machine-executable actions are stored or programmed into the memory of the computerized servers by: storage accessible by the computerized servers; incorporation into circuitry of the computerized servers; or incorporation into
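The version comparison in claim 5 and the differences list in claim 6 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims do not specify a diff format, so the `(start, end, replacement)` tuples, the function names `compute_differences` and `apply_differences`, and the use of Python's standard `difflib.SequenceMatcher` are all choices made here for the example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical wire format: replace stored[start:end] with `replacement`.
Change = Tuple[int, int, List[str]]


def compute_differences(server_version: List[str],
                        center_version: List[str]) -> List[Change]:
    """Control-center side: compare the server's action list with its own."""
    matcher = SequenceMatcher(None, server_version, center_version)
    return [
        (i1, i2, center_version[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # only the changed spans are transmitted
    ]


def apply_differences(stored: List[str],
                      differences: List[Change]) -> List[str]:
    """Server side: amend the locally stored action list, leaving it intact."""
    amended = list(stored)
    # Apply from the end so earlier indices stay valid after each splice.
    for start, end, replacement in reversed(differences):
        amended[start:end] = replacement
    return amended
```

For example, if a server stores `["quiesce", "shift", "activate"]` and the control center holds `["quiesce", "snapshot", "shift", "activate", "notify"]`, the differences list carries only the two inserted actions, and applying it on the server reproduces the control center's list. An empty list means the two versions already match.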
with a single idle spare processing component · CPC title
where the redundant components share neither address space nor persistent storage · CPC title
using centralised failover control functionality · CPC title
in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title
Reaction to server failures by a load balancer · CPC title