Dynamic application instance discovery and state management within a distributed system
US-9838240-B1 · Dec 5, 2017 · US
US10652076B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10652076-B2 |
| Application number | US-201715831115-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 4, 2017 |
| Priority date | Dec 29, 2005 |
| Publication date | May 12, 2020 |
| Grant date | May 12, 2020 |
Official abstract text for this publication.
Dynamic application instance discovery and state management within a distributed system. A distributed system may implement application instances configured to perform one or more application functions within the distributed system, and discovery and failure detection daemon (DFDD) instances, each configured to store an indication of a respective operational state of each member of a respective group of the number of application instances. Each of the DFDD instances may repeatedly execute a gossip-based synchronization protocol with another one of the DFDD instances, where execution of the protocol between DFDD instances includes reconciling differences among membership of the respective groups of application instances. A new application instance may be configured to notify a particular DFDD instance of its availability to perform an application function. The particular DFDD instance may be configured to propagate the new instance's availability to other DFDD instances via execution of the synchronization protocol, without intervention on the part of the new application instance.
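The propagation described in the abstract can be illustrated with a minimal sketch. It assumes each DFDD instance keeps a map from application instance ID to a state plus a version counter, and that the higher version wins when two views are reconciled. The class names, the `version` field, and the random peer selection are illustrative assumptions, not details taken from the patent text.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Entry:
    state: str    # e.g. "NEW" or "OK" (labels are assumptions)
    version: int  # hypothetical logical clock; higher wins on merge


@dataclass
class DFDD:
    name: str
    entries: dict = field(default_factory=dict)  # app instance id -> Entry

    def register(self, app_id: str) -> None:
        """A new application instance announces itself to this DFDD only."""
        self.entries[app_id] = Entry(state="NEW", version=1)

    def sync_with(self, peer: "DFDD") -> None:
        """One gossip exchange: both sides end up with the same merged view."""
        for app_id in set(self.entries) | set(peer.entries):
            mine = self.entries.get(app_id)
            theirs = peer.entries.get(app_id)
            if mine is None or (theirs is not None and theirs.version > mine.version):
                newest = theirs
            else:
                newest = mine
            # Store independent copies so later updates stay local.
            self.entries[app_id] = Entry(newest.state, newest.version)
            peer.entries[app_id] = Entry(newest.state, newest.version)


def gossip_round(daemons, rng):
    """Every DFDD instance syncs once with a randomly chosen peer."""
    for d in daemons:
        peer = rng.choice([p for p in daemons if p is not d])
        d.sync_with(peer)


def rounds_until_converged(daemons, app_id, rng, max_rounds=100):
    """Count gossip rounds until every DFDD instance knows about app_id."""
    for n in range(1, max_rounds + 1):
        gossip_round(daemons, rng)
        if all(app_id in d.entries for d in daemons):
            return n
    return None


if __name__ == "__main__":
    daemons = [DFDD(f"dfdd-{i}") for i in range(5)]
    daemons[0].register("storage-node-42")  # hypothetical instance name
    rounds = rounds_until_converged(daemons, "storage-node-42", random.Random(0))
    print("converged after", rounds, "round(s)")
```

In this sketch the new instance contacts only `dfdd-0`. Every other DFDD instance learns of it through the exchanges alone, which mirrors the "without intervention on the part of the new application instance" language of the abstract.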
Opening claim text (preview).
What is claimed is:

1. A distributed system, comprising: a plurality of computing devices configured to implement: a plurality of application instances configured to perform one or more functions; and a plurality of discovery and failure detection daemon (DFDD) instances, wherein the plurality of DFDD instances are configured to store state information for the plurality of application instances, wherein at least one of the DFDD instances is configured to update, based at least in part on information received from a respective application instance, the state information according to a state machine defining transitions between a plurality of states including a state indicating the respective application instance is newly online, a state indicating the respective application instance is operating normally, a state indicating the respective application instance has lost communication with a respective DFDD instance, a state indicating the respective application instance has failed, and a state indicating the respective application instance is subject to a network split; wherein at least one of the plurality of DFDD instances is configured to execute a peer-to-peer, gossip-based synchronization protocol with a peer instance selected from among the plurality of DFDD instances, and to execute the protocol, the peer DFDD instances are configured to exchange state information for at least one of the plurality of application instances.

2. The distributed system as recited in claim 1, wherein the plurality of application instances includes two or more different types of application instances, each type of application instance configured to perform one or more different functions of the distributed system, and wherein the state information includes global state information common to all types of application instances and specific state information for at least one type of application instance.

3. The distributed system as recited in claim 1, wherein the state information for a given application instance includes information indicating a physical or logical location of the application instance in the distributed system.

4. The distributed system as recited in claim 1, wherein at least one of the plurality of application instances is configured to report its status to a DFDD instance at regular or irregular intervals, wherein the DFDD instance is configured to update global state information indicating whether the respective application instance is operating normally or is in an abnormal state according to the status reports of the application instance.

5. The distributed system as recited in claim 1, wherein each DFDD instance is one of a daemon process configured to operate within an operating system environment or an autonomous hardware or software agent configured to operate independently from an operating system environment.

6. A method, comprising: storing, by a plurality of discovery and failure detection daemon (DFDD) instances implemented on a plurality of computing devices, state information for a plurality of application instances configured to perform one or more functions in a distributed system; updating, by one or more of the plurality of DFDD instances, and based at least in part on information received from a respective application instance, the state information according to a state machine defining transitions between a plurality of states including a state indicating the respective application instance is newly online, a state indicating the respective application instance is operating normally, a state indicating the respective application instance has lost communication with a respective DFDD instance, a state indicating the respective application instance has failed, and a state indicating the respective application instance is subject to a network split; selecting, by at least one of the plurality of DFDD instances, a peer DFDD instance of the plurality of DFDD instances; and communicating, by the at least one of the plurality of DFDD instances, state information for one or more of the plurality of application instances to the peer DFDD instance according to a peer-to-peer synchronization protocol.

7. The method as recited in claim 6, further comprising iteratively performing the selecting and the communicating.

8. The method as recited in claim 6, wherein the synchronization protocol is a gossip-based synchronization protocol.

9. The method as recited in claim 6, wherein, in the communicating, the at least one of the plurality of DFDD instances exchanges state information for the one or more of the plurality of application instances with the other DFDD instance according to the synchronization protocol.

10. The method as recited in claim 6, wherein the plurality of application instances includes two or more different types of application instances, each type of application instance configured to perform one or more different functions of the distributed system.

11. The method as recited in claim 10, wherein the state information includes global state information common to all types of application instances.

12. The method as recited in claim 10, wherein the state information includes specific state information for at least one type of application instance.

13. The method as recited in claim 6, wherein the state information for a given application instance includes information for accessing the application instance by clients of the distributed system.

14. The method as recited in claim 6, wherein the state information for a given application instance includes information indicating a physical or logical location of the application instance in the distributed system.

15. The method as recited in claim 6, wherein the state information includes global state information for at least one application instance that indicates whether the respective application instance is operating normally or is in an abnormal state.

16. The method as recited in claim 15, further comprising one or more of the application instances each periodically or aperiodically reporting its status to at least one of the plurality of DFDD instances, wherein the global state information for the respective application instance is updated according to the reported status.

17. The method as recited in claim 6, wherein the state information for at least one application instance includes information indicating that the application instance is a new application instance to be added to the distributed system.

18. The method as recited in claim 6, wherein the selecting and the communicating are performed among two or more of the plurality of DFDD instances that are configured to store state information for a respective group of the plurality of application instances.

19. A non-transitory computer-accessible storage medium storing instructions that when executed by a computer implement a discovery and failure detection daemon (DFDD) configured to: store state information for at least one of a plurality of application instances configured to perform one or more functions in a distributed system; update, based at least in part on information received from a respective application instance, the state information according to a state machine defining transitions between a plurality of states including a state indicating the respective application instance is newly online, a state indicating the respective application instance is operating normally, a state indicating the respective application instance has lost communication with a respective DFDD instance, a state i
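The five states recited in claims 1, 6, and 19 can be sketched as a small table-driven state machine. The claim text shown here names the states but does not spell out which events trigger which transitions, so the state labels, the event names (`heartbeat`, `timeout`, `split_detected`), and every entry in the transition table below are assumptions made for illustration only.

```python
from enum import Enum, auto


class State(Enum):
    NEW = auto()            # newly online
    OK = auto()             # operating normally
    INCOMMUNICADO = auto()  # lost communication with its DFDD instance
    FAIL = auto()           # failed
    NETWORK_SPLIT = auto()  # subject to a network split


# Hypothetical events and transitions; the claims name the states only.
TRANSITIONS = {
    (State.NEW, "heartbeat"): State.OK,
    (State.OK, "heartbeat"): State.OK,
    (State.OK, "timeout"): State.INCOMMUNICADO,
    (State.INCOMMUNICADO, "heartbeat"): State.OK,
    (State.INCOMMUNICADO, "timeout"): State.FAIL,
    (State.INCOMMUNICADO, "split_detected"): State.NETWORK_SPLIT,
    (State.NETWORK_SPLIT, "heartbeat"): State.OK,
}


def step(state, event):
    """Return the next state, or the current one if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run(events, start=State.NEW):
    """Apply a sequence of events to a starting state and return the result."""
    state = start
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    # A new instance reports in, then goes silent for two timeout periods.
    print(run(["heartbeat", "timeout", "timeout"]))
```

Keeping the transitions in a lookup table makes it easy to compare alternative readings of the claims by editing a single dictionary.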
Distributed indices · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
De-duplication techniques · CPC title
implemented as replicated file system · CPC title
in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title