Virtual clusters to provide fault containment
US-9444694-B1 · Sep 13, 2016 · US
US2016147569A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016147569-A1 |
| Application number | US-201414555101-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 26, 2014 |
| Priority date | Nov 26, 2014 |
| Publication date | May 26, 2016 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A distributed computing system that executes a set of long-lived jobs is described. During operation, each worker process performs the following operations. First, the worker process identifies a set of jobs to be executed and a set of worker processes that can execute the set of jobs. Next, the worker process sorts the set of worker processes based on unique identifiers for the worker processes. Then, the worker process assigns jobs to each worker process in the set of worker processes, wherein approximately the same number of jobs is assigned to each worker process, and jobs are assigned to the worker processes in sorted order. While assigning jobs, the worker process uses an identifier for each worker process to seed a pseudorandom number generator, and then uses the pseudorandom number generator to select jobs for each worker process to execute.
Opening claim text (preview).
What is claimed is: 1 . A computer-implemented method, comprising: in a distributed computing system, using a set of worker processes to execute a set of jobs that are long-lived, wherein each worker performs the following operations: identifying a set of jobs to be executed; identifying the set of worker processes that are available to execute the set of jobs, wherein each worker process is associated with a unique identifier; and sorting the set of worker processes based on the unique identifiers; assigning one or more jobs to each worker process in the set of worker processes, wherein approximately the same number of jobs is assigned to each worker process, wherein jobs are assigned to the worker processes in sorted order, and wherein assigning the one or more jobs to each worker process involves, using the unique identifier for the worker process to seed a deterministic pseudorandom number generator, and using the seeded deterministic pseudorandom number generator to select the one or more jobs for the worker process to execute. 2 . The computer-implemented method of claim 1 , wherein after a worker process assigns the one or more jobs to each worker process in the set of worker processes, the worker process commences executing the one or more jobs that are assigned to the worker process and, if necessary, stops executing jobs that are no longer assigned to the worker process. 3 . The computer-implemented method of claim 1 , wherein identifying the set of worker processes involves querying a distributed consensus service that operates by sending heartbeats to different machines in the distributed computing system to identify worker processes that are available to execute jobs. 4 . The computer-implemented method of claim 1 , wherein the method further comprises performing initialization operations for each worker process that commences execution, wherein the initialization operations include: registering the worker process with a distributed consensus service to obtain an identifier for the worker process; and subscribing to an online worker list to obtain notifications about changes to the set of worker processes that are available to execute jobs. 5 . The computer-implemented method of claim 1 , wherein using the seeded deterministic pseudorandom number generator to select the one or more jobs for each worker process to execute involves: determining a desired number of jobs that the worker process will execute by dividing a number of jobs in the set of jobs by a number of worker processes in the set of worker processes and, if necessary, adding one to account for a remainder; and repeating the following operations until the desired number of jobs is selected for the worker process to execute, obtaining a random number from the deterministic pseudorandom number generator, using the random number and a modulo operation to randomly select a job from the set of jobs, and if the job has already been selected, repeating the process of obtaining a random number and using the random number to select a job until an unselected job is selected. 6 . The computer-implemented method of claim 1 , wherein each worker process repeats the operations involved in assigning jobs to worker processes whenever the set of worker processes changes. 7 . The computer-implemented method of claim 1 , wherein the operations involved in executing a job are idempotent, thereby allowing multiple worker processes to repeat the operations without violating correctness when jobs are reassigned among worker processes. 8 . The computer-implemented method of claim 1 , wherein the set of jobs that are long-lived includes jobs that process queues of updates and deletes directed to replicated copies of data items located in different zones, wherein each zone comprises a separate geographic storage location. 9 . A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method, the method comprising: in a distributed computing system, using a set of worker processes to execute a set of jobs that are long-lived, wherein each worker process performs the following operations: identifying a set of jobs to be executed; identifying the set of worker processes that are available to execute the set of jobs, wherein each worker process is associated with a unique identifier; sorting the set of worker processes based on the unique identifiers; and assigning one or more jobs to each worker process in the set of worker processes, wherein approximately the same number of jobs is assigned to each worker process, wherein jobs are assigned to the worker processes in sorted order, and wherein assigning the one or more jobs to each worker process involves, using the unique identifier for the worker process to seed a deterministic pseudorandom number generator, and using the seeded deterministic pseudorandom number generator to select the one or more jobs for the worker process to execute. 10 . The non-transitory computer-readable storage medium of claim 9 , wherein after a worker process assigns the one or more jobs to each worker process in the set of worker processes, the worker process commences executing the one or more jobs that are assigned to the worker process and, if necessary, stops executing jobs that are no longer assigned to the worker process. 11 . The non-transitory computer-readable storage medium of claim 9 , wherein identifying the set of worker processes involves querying a distributed consensus service that operates by sending heartbeats to different machines in the distributed computing system to identify worker processes that are available to execute jobs. 12 . The non-transitory computer-readable storage medium of claim 9 , wherein the method further comprises performing initialization operations for each worker process that commences execution, wherein the initialization operations include: registering the worker process with a distributed consensus service to obtain an identifier for the worker process; and subscribing to an online worker list to obtain notifications about changes to the set of worker processes that are available to execute jobs. 13 . The non-transitory computer-readable storage medium of claim 9 , wherein using the seeded deterministic pseudorandom number generator to select the one or more jobs for each worker process to execute involves: determining a desired number of jobs that the worker process will execute by dividing a number of jobs in the set of jobs by a number of worker processes in the set of worker processes and, if necessary, adding one to account for a remainder; and repeating the following operations until the desired number of jobs is selected for the worker process to execute, obtaining a random number from the deterministic pseudorandom number generator, using the random number and a modulo operation to randomly select a job from the set of jobs, and if the job has already been selected, repeating the process of obtaining a random number and using the random number to select a job until an unselected job is selected. 14 . The non-transitory computer-readable storage medium of claim 9 , wherein each worker process repeats the operations involved in assigning jobs to worker processes whenever the set of worker processes changes. 15 . The non-transitory computer-readable storage medium of claim 9 , wherein the operations involved in executing a job are idempotent, thereby allowing multiple worker processes to repeat the operations without violating correctness when jobs are
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
considering the load · CPC title
the resource being the memory · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.