System and method for event-driven live migration of multi-process applications

US9459971B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9459971-B1
Application number  US-201514678991-A
Country             US
Kind code           B1
Filing date         Apr 5, 2015
Priority date       Aug 26, 2005
Publication date    Oct 4, 2016
Grant date          Oct 4, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, method, and computer readable medium for asynchronous live migration of applications between two or more servers. The computer readable medium includes computer-executable instructions for execution by a processing system. Primary applications runs on primary hosts and one or more replicated instances of each primary application run on one or more backup hosts. Asynchronous live migration is provided through a combination of process replication, logging, barrier synchronization, checkpointing, reliable messaging and message playback. The live migration is transparent to the application and requires no modification to the application, operating system, networking stack or libraries.
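
The combination of checkpointing, logging, and message playback named in the abstract can be illustrated with a toy model. This is a minimal sketch and not the patented implementation: the class names, the `intercepted_add` call, and the dictionary used as application state are all hypothetical stand-ins. It shows only the general idea that a backup restored from a checkpoint can reach the primary's state by replaying the replication messages logged after that checkpoint.

```python
import copy
from dataclasses import dataclass


@dataclass
class ReplicationMessage:
    msg_id: int   # hypothetical sequence number used to order playback
    call: str     # name of the intercepted call (illustrative)
    arg: int      # argument the primary passed to the call


class Primary:
    """Toy primary: every intercepted call also emits a replication message."""

    def __init__(self):
        self.state = {"counter": 0}
        self.next_id = 0
        self.log = []  # messages recorded since the last checkpoint

    def intercepted_add(self, n):
        # Stand-in for an intercepted resource call (assumption for this sketch).
        self.state["counter"] += n
        msg = ReplicationMessage(self.next_id, "add", n)
        self.next_id += 1
        self.log.append(msg)
        return msg

    def checkpoint(self):
        # Snapshot the state and start a fresh message log.
        snapshot = copy.deepcopy(self.state)
        self.log.clear()
        return snapshot


class Backup:
    """Toy backup: restore from a checkpoint, then replay logged messages."""

    def __init__(self, checkpoint):
        self.state = copy.deepcopy(checkpoint)

    def replay(self, messages):
        # Apply messages in message-ID order, mirroring the primary's calls.
        for msg in sorted(messages, key=lambda m: m.msg_id):
            if msg.call == "add":
                self.state["counter"] += msg.arg


def migrate_demo():
    primary = Primary()
    primary.intercepted_add(5)
    snapshot = primary.checkpoint()
    primary.intercepted_add(2)
    primary.intercepted_add(3)
    backup = Backup(snapshot)
    backup.replay(list(primary.log))
    return primary.state, backup.state
```

In this sketch `migrate_demo()` returns two equal states, because the backup starts from the checkpoint and replays the two messages logged after it. Barrier synchronization and reliable messaging, which the abstract also lists, are not modeled here.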

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: one or more memory devices configured to store a primary application executing on a host with a host operating system; one or more interceptors configured to intercept calls from threads of applications to the host operating system, and configured to generate replication messages based on said intercepted calls; a barrier for said primary application that ensures that the replication messages from the primary application correspond to fully finished resource calls, by halting execution, and the corresponding replication messages are synchronized to the entry and exit of the interceptor; a checkpointing service for said primary application configured to checkpoint said primary application; one or more additional memory devices configured to store the one or more backup applications executing on one or more backup hosts each with a corresponding host operating system; one or more interceptors configured to intercept calls to said one or more backup host operating systems; a checkpointing service for each one or more backup applications configured to checkpoint-restore said one or more backup applications; and wherein live migration of said primary application to said one or more backup hosts is performed in response to an event or fault.

2. The system according to claim 1, wherein said operating system is one of Linux, UNIX or Windows.

3. The system according to claim 1, wherein said event is one of operator generated live migration event, CPU threshold event, memory threshold event, storage threshold event, SNMP event, or script generated event.

4. The system according to claim 1, wherein said fault is one of application crash; host crash, operating system fault; memory fault; storage fault, power supply fault, or general device fault.

5. The system according to claim 1, wherein said barrier is configured to halt execution inside said interceptors.

6. The system according to claim 1, wherein said barrier is configured to halt execution at the entry or at the exit to said interceptors.

7. The system according to claim 1, wherein said barrier is configured to halt execution outside said intercepted resource.

8. The system according to claim 1, wherein the system is configured to choose a backup host based on one of preconfigured backup, operator-chosen backup, or dynamically chosen backup based on available resources.

9. The system according to claim 1, wherein said event is generated external to the primary application.

10. The system according to claim 1, wherein the logging facility is configured to write each message to a log set in the log on shared storage, configured to write each checkpoint to a log set in the log on shared storage, configured to create a new log set with each new checkpoint, and configure to remove old log sets when a new log set has been created.

11. The system according to claim 10, wherein the logging facility is configured to include a pending acknowledgement queue.

12. The system according to claim 10, wherein said checkpointing service for the primary application is configured to be triggered by one of a certain amount of elapsed time since the last checkpoint, a certain number of replication messages, an operator event, a resource event, or another external event.

13. The system according to claim 10, wherein a log set is comprised of a checkpoint and all replication messages between said checkpoint and the next checkpoint.

14. The system according to claim 10, wherein said logging services are configured to store the two most recent log sets on shared storage.

15. The system according to claim 1, wherein the checkpointing services for the primary application is configured to be triggered by one of a certain amount of elapsed time since the last checkpoint, a certain number of replication messages, an operator event, a resource event, or another external event.

16. The system according to claim 1, comprising at least one of: a messaging layer for said primary application configured to transmit said replication messages to the one or more backups; a logging facility for said messaging layer; and a messaging layer for each one or more backup applications configured to provide ordered receipt of said replication messages; wherein the messaging layers are configured to transmit said messages over one of UDP, TCP, UDP using multicast, or UDP using broadcast.

17. The system according to claim 16, wherein the checkpointing service for the primary application is configured to place said checkpoints in replication messages and to assign said checkpoints message IDs.

18. The system according to claim 17, wherein the logging facility is configured to log said checkpoint replication messages.

19. The system according to claim 17, wherein the logging facility is configured to store checkpoint replication messages and to not transmit said checkpoint replication messages to the one or more backups.

20. The system according to claim 17, wherein the logging facility is configured to store checkpoint replication messages and to transmit said checkpoint replication messages to the one or more backups over the messaging layer.
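
The log-set behavior described in claims 10, 12, 13, and 14 can be sketched as a small in-memory structure. This is an illustrative sketch only: `LogSet`, `ReplicationLog`, and the `max_messages` threshold are hypothetical names, a Python list stands in for shared storage, and only the message-count trigger from claim 12 is modeled. Each checkpoint opens a new log set, messages are appended to the current set, and older sets are dropped so that the two most recent remain.

```python
from dataclasses import dataclass, field


@dataclass
class LogSet:
    # A checkpoint plus the replication messages written after it,
    # up to the next checkpoint (the grouping described in claim 13).
    checkpoint: bytes
    messages: list = field(default_factory=list)


class ReplicationLog:
    MAX_LOG_SETS = 2  # keep the two most recent log sets (claim 14)

    def __init__(self, max_messages=3):
        self.log_sets = []                # stand-in for shared storage
        self.max_messages = max_messages  # hypothetical checkpoint trigger

    def write_checkpoint(self, checkpoint):
        # A new checkpoint starts a new log set; older sets are removed.
        self.log_sets.append(LogSet(checkpoint))
        del self.log_sets[:-self.MAX_LOG_SETS]

    def write_message(self, message):
        # Append to the current log set and report whether the
        # message-count trigger for the next checkpoint has been reached.
        if not self.log_sets:
            raise RuntimeError("no checkpoint has been written yet")
        current = self.log_sets[-1]
        current.messages.append(message)
        return len(current.messages) >= self.max_messages
```

A caller would take a new checkpoint and pass it to `write_checkpoint` whenever `write_message` returns `True`. The elapsed-time, operator, and resource triggers listed in claim 12 would need additional inputs that this sketch leaves out.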

Assignees

Open Invention Network Llc

Inventors

Classifications

  • using middleware or operating system [OS] functionalities · CPC title

  • G06F9/4856 · Primary

    resumption being on a different machine, e.g. task migration, virtual machine migration (G06F9/5088 takes precedence) · CPC title

  • Intercept · CPC title

  • with a single idle spare processing component · CPC title

  • Solving problems relating to consistency · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9459971B1 cover?
A system, method, and computer readable medium for asynchronous live migration of applications between two or more servers. The computer readable medium includes computer-executable instructions for execution by a processing system. Primary applications runs on primary hosts and one or more replicated instances of each primary application run on one or more backup hosts. Asynchronous live migra…
Who is the assignee on this patent?
Open Invention Network Llc
What technology area does this patent fall under?
Primary CPC classification G06F9/4856. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 4, 2016 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).