Infrastructure driven auto-scaling of workloads
US-2024419470-A1 · Dec 19, 2024 · US
US9596298B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9596298-B1 |
| Application number | US-201314144826-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 31, 2013 |
| Priority date | Dec 31, 2013 |
| Publication date | Mar 14, 2017 |
| Grant date | Mar 14, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Methods, systems, and apparatus are described for load balancing in a distributed processing system. In one aspect, a method includes monitoring, for each data processor in a distributed processing system, a rate of cursor progress for the data processor based on timestamps of data units that have been processed, wherein the rate of cursor progress for each data processor specifies a rate of change of an oldest timestamp of an oldest data unit being processed by the data processor; determining a progress threshold for the distributed processing system based on the respective rates of cursor progress of the data processors; determining, based on a first rate of cursor progress for a first data processor, that the first rate of cursor progress does not meet the progress threshold; and in response to the determination, performing one or more load balancing operations on the distributed processing system.
Opening claim text (preview).
What is claimed is: 1. A method implemented by data processing apparatus, the method comprising: obtaining, by a distributed processing system that includes a plurality of data processors, data units to be processed by the distributed processing system, each data unit having a corresponding timestamp that indicates a time the data unit was created, and wherein the time the data unit was created is prior to a time that the data unit is processed by the data processors; distributing the data units among the data processors for processing by the data processors; monitoring, for each data processor in the distributed processing system, a rate of cursor progress for the data processor based on the timestamps of the data units that have been processed by the data processor, wherein the rate of cursor progress for each data processor specifies a rate of change of an oldest timestamp of an oldest data unit being processed by the data processor based on the rate of change of the times indicated by the respective timestamps of each oldest data unit processed by the data processor; determining a progress threshold for the distributed processing system based on the respective rates of cursor progress processed by the plurality of data processors; determining, based on a first rate of cursor progress for a first data processor processed by the plurality of data processors, that the first rate of cursor progress for the first data processor does not meet the progress threshold; and performing one or more load balancing operations on the distributed processing system in response to determining that the first rate of cursor progress for the first data processor does not meet the progress threshold. 2. The method of claim 1 , wherein determining a progress threshold comprises: determining, based on the rate of cursor progress of each data processor in the distributed processing system, a distribution of cursor progress rates for the distributed processing system; and determining the progress threshold based on the distribution of cursor progress rates for the distributed processing system. 3. The method of claim 2 , further comprising: obtaining cursor progress rate history that specifies a historical distribution of cursor progress rates for the distributed processing system, and wherein the distribution of cursor progress rates for the distributed processing system is determined based on the cursor progress history. 4. The method of claim 2 , wherein the progress threshold is a multiple of a standard deviation of the distribution. 5. The method of claim 2 , further comprising: identifying an overhead for the one or more load balancing operations, the overhead specifying a cursor progress cost of performing the one or more load balancing operations, and wherein the progress threshold is further based on the overhead. 6. The method of claim 1 , wherein the one or more load balancing operations comprise one or more of: instructing one or more other data processors to process one or more data units currently being processed by the first data processor; splitting one or more data units being processed by the first data processor into multiple data units and instructing one or more other data processors to process one or more of the multiple data units; or throttling one or more other data processors of the distributed processing system. 7. A system comprising: a data processing apparatus; and a data store storing instructions that, when executed by the data processing apparatus, cause the data processing apparatus to perform operations comprising: obtaining, by a distributed processing system implemented in the data processing apparatus, the distributed processing system including a plurality of data processors, data units to be processed by the distributed processing system, each data unit having a corresponding timestamp that indicates a time the data unit was created, and wherein the time the data unit was created is prior to a time that the data unit is processed by the data processors; distributing the data units among the data processors in the distributed processing system for processing by the data processors; monitoring, for each data processor in the distributed processing system, a rate of cursor progress for the data processor based on timestamps of data units that have been processed by the data processor, wherein the rate of cursor progress for each data processor specifies a rate of change of an oldest timestamp of an oldest data unit being processed by the data processor based on the rate of change of the times indicated by the respective timestamps of each oldest data unit processed by the data processor; determining a progress threshold for the distributed processing system based on the respective rates of cursor progress processed by the plurality of data processors; determining, based on a first rate of cursor progress for a first data processor processed by the plurality of data processors, that the first rate of cursor progress for the first data processor does not meet the progress threshold; and performing one or more load balancing operations on the distributed processing system in response to determining that the first rate of cursor progress for the first data processor does not meet the progress threshold. 8. The system of claim 7 , wherein determining a progress threshold comprises: determining, based on the rate of cursor progress of each data processor in the distributed processing system, a distribution of cursor progress rates for the distributed processing system; and determining the progress threshold based on the distribution of cursor progress rates for the distributed processing system. 9. The system of claim 8 , wherein the operations further comprise: obtaining cursor progress rate history that specifies a historical distribution of cursor progress rates for the distributed processing system, and wherein the distribution of cursor progress rates for the distributed processing system is determined based on the cursor progress history. 10. The system of claim 8 , wherein the progress threshold is a multiple of a standard deviation of the distribution. 11. The system of claim 8 , wherein the operations further comprise: identifying an overhead for the one or more load balancing operations, the overhead specifying a cursor progress cost of performing the one or more load balancing operations, and wherein the progress threshold is further based on the overhead. 12. The system of claim 7 , wherein the one or more load balancing operations comprise one or more of: instructing one or more other data processors to process one or more data units currently being processed by the first data processor; splitting one or more data units being processed by the first data processor into multiple data units and instructing one or more other data processors to process one or more of the multiple data units; or throttling one or more other data processors of the distributed processing system. 13. A computer readable medium storing instructions that, when executed by a data processing apparatus, cause the data processing apparatus to perform operations comprising: obtaining, by a distributed processing system implemented in the data processing apparatus, the distributed processing system including a plurality of data processors, data units to be processed by the distributed processing system, each data unit having a corresponding timestamp that indicates a time the data unit was created, and wherein the time the data unit was created is prior to a time that the data unit is processed by the data processors; distributing th
based on parameters of servers, e.g. available memory or workload (monitoring of computer activity G06F11/30) · CPC title
Techniques for rebalancing the load in a distributed system · CPC title
considering the load · CPC title
Barrier synchronisation · CPC title
Electricity · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.