What technology area does this patent fall under?

Primary CPC classification G06F9/5083 (techniques for rebalancing the load in a distributed system). Mapped technology areas include Physics.

When was this patent published?

Publication date: May 30, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Self-adaptive control system for dynamic capacity management of latency-sensitive application servers

US9667498B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9667498-B2
Application number	US-201414450148-A
Country	US
Kind code	B2
Filing date	Aug 1, 2014
Priority date	Dec 20, 2013
Publication date	May 30, 2017
Grant date	May 30, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A self-adaptive control system based on proportional-integral (PI) control theory for dynamic capacity management of latency-sensitive application servers (e.g., application servers associated with a social networking application) are disclosed. A centralized controller of the system can adapt to changes in request rates, changes in application and/or system behaviors, underlying hardware upgrades, etc., by scaling the capacity of a cluster up or down so that just the right amount of capacity is maintained at any time. The centralized controller uses information relating to a current state of the cluster and historical information relating to past state of the cluster to predict a future state of the cluster and use that prediction to determine whether to scale up or scale down the current capacity to reduce latency and maximize energy savings. A load balancing system can then distribute traffic among the servers in the cluster using any load balancing methods.
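
The control cycle described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patent's actual control law: the textbook PI update used here, the names `PIState` and `control_cycle`, the gains `kp` and `ki`, and all numeric values are hypothetical. The operating parameter is assumed to be CPU utilization expressed as a fraction.

```python
import math
from dataclasses import dataclass


@dataclass
class PIState:
    """Controller memory carried from one control cycle to the next."""
    per_server_rate: float        # requests/s per active server chosen last cycle
    integral_error: float = 0.0   # running sum of (target - measured)


def control_cycle(state, measured, target, total_rate, kp, ki):
    """Run one hypothetical PI cycle; return (new_state, required_servers)."""
    # Positive error means servers run below target and can take more load each.
    error = target - measured
    integral = state.integral_error + error
    # Textbook PI law (an assumption): proportional term plus integral term.
    delta_rate = kp * error + ki * integral
    # Keep the per-server rate strictly positive so the division below is safe.
    new_rate = max(state.per_server_rate + delta_rate, 1e-9)
    # Required capacity: enough servers to carry the total request rate.
    required = max(1, math.ceil(total_rate / new_rate))
    return PIState(new_rate, integral), required


if __name__ == "__main__":
    start = PIState(per_server_rate=100.0)
    _, servers = control_cycle(start, measured=0.7, target=0.5,
                               total_rate=10_000.0, kp=50.0, ki=10.0)
    print(servers)
```

With these made-up numbers the measured utilization is above target, so the sketch lowers the per-server request rate and asks for more active servers; a measurement below target does the opposite.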

First claim

Opening claim text (preview).

What is claimed is:
1. A system, comprising: a processor and memory; a controller deployed in a cluster, the controller being a part of a self-adaptive feedback control system, the controller configured to periodically: determine a current value of an operating parameter based on values of the operating parameter from servers in an active mode in the cluster, wherein an amount of servers in the active mode represents a current capacity, determine a total request rate for the cluster, determine a change in a per server request rate to cause a next value of the operating parameter to approach a target value of the operating parameter, wherein the change in the per server request rate is determined as a function of at least the current value of the operating parameter, a value of the per server request rate in a previous control cycle of the self-adaptive feedback system, and control parameters, wherein the per server request rate is different from the total request rate, and determine, based at least in part on the change in per server request rate and the total request rate, a required capacity, wherein the controller is further configured to adjust the current capacity to match the required capacity for optimization of power and latency; and a load balancer configured to distribute request traffic among servers that remain in the active mode in the cluster following adjustment of the current capacity to match the required capacity.
2. The system of claim 1, wherein to adjust the current capacity to match the required capacity, the controller is further configured to: compare the current capacity to the required capacity to determine an excess capacity; and deallocate an amount of servers corresponding to the excess capacity from the active mode to an inactive mode to decrease the current capacity to match the required capacity.
3. The system of claim 1, wherein to adjust the current capacity to match the required capacity, the controller is configured to: compare the current capacity to the required capacity to determine an insufficient capacity; and allocate an amount of servers corresponding to the insufficient capacity to the active mode from an inactive mode to increase the current capacity to match the required capacity.
4. The system of claim 1, wherein the cluster includes an amount of servers in an inactive mode and the amount of servers in the active mode, wherein at least some of the amount of servers in the inactive mode are in an idle state for energy savings and rest of the amount of servers in the inactive mode are turned off or placed in a deep sleep mode for additional energy savings or used for processing non-latency sensitive jobs.
5. The system of claim 4, wherein the controller is further configured to determine, based on a historical trend in changes in per server request rate, how many of the amount of servers in the inactive mode are to be maintained in the idle state so as to reduce set up time when the current capacity needs to be increased to match the required capacity.
6. The system of claim 1, wherein the cluster includes an amount of servers in an inactive mode and the amount of servers in the active mode and wherein the controller maintains all servers in the inactive mode in idle state for energy savings.
7. The system of claim 1, wherein the operating parameter includes CPU utilization.
8. The system of claim 1, wherein the operating parameter includes latency.
9. The system of claim 1, wherein the controller is based on a Proportional-Integral (PI) controller and the control parameters include proportional and integral gains.
10. A method performed on a computer system, comprising: determining a current value of an operating parameter based on information from a current number of active servers in a server pool; determining a deviation between the current value of the operating parameter and a target value of the operating parameter; determining a total request rate for the server pool; utilizing a feedback controller to determine change in per server request rate so as to enable a next value of the operating parameter to converge to a vicinity of the target value of the operating parameter, wherein the change in the per server request rate is determined based at least in part on the deviation, a value of the per server request rate in a previous control cycle of the feedback controller and control parameters, and wherein the per server request rate is different from the total request rate; determining, based at least in part on the change in per server request rate and the total request rate, a required number of active servers in the server pool; adjusting the current number of active servers in the server pool based on the required number of active servers to optimize energy savings and latency; and distributing, by a load balancer, incoming requests among the active servers in the server pool.
11. The method of claim 10, wherein the feedback controller is a proportional-integral (PI) controller and the control parameters include a proportional gain and an integral gain.
12. The method of claim 10, wherein the feedback controller is a proportional-integral-derivative (PID) controller and the control parameters include a proportional gain, an integral gain and a derivative gain.
13. The method of claim 10, wherein the operating parameter includes any one of: CPU utilization or the latency.
14. The method of claim 10, wherein adjusting the current number of active servers in the server pool based on the required number of active servers includes: determining that the current number of active servers in the server pool is greater than the required number of active servers; and in response, scaling down the current number of active servers in the server pool by transitioning a number of active servers in the server pool into inactive servers so that the adjusted number of active servers in the server pool matches the required number of active servers.
15. The method of claim 10, wherein adjusting the current number of active servers in the server pool based on the required number of active servers includes: determining that the current number of active servers in the server pool is smaller than the required number of active servers; and in response, scaling up the current number of active servers in the server pool by transitioning a number of inactive servers in the server pool into active servers so that the adjusted number of active servers in the server pool matches the required number of active servers.
16. The method of claim 14, wherein an inactive server accepts no request traffic and is placed in an idle state, a powered off state, a deep sleep state or used for processing asynchronous jobs.
17. The method of claim 14, further comprising: maintaining a number of the inactive servers in the server pool in an idle state so that the inactive servers in the idle state can be transitioned into active servers without delay, wherein the number of the inactive servers to be maintained in the idle state is determined based on a historical trend of request rates.
18. The method of claim 14, further comprising: establishing a model for a server type in the server pool by using empirical data and a linear fitting method to estimate correlation between the operating parameter and request rate.
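
Claim 18 refers to a model built from empirical data with "a linear fitting method". The claim does not say which method, so the sketch below assumes ordinary least squares on paired samples of per-server request rate and operating parameter. The function names `fit_linear_model` and `rate_for_target` and the sample data are hypothetical and exist only for illustration.

```python
def fit_linear_model(request_rates, parameter_values):
    """Least-squares fit of: parameter = slope * request_rate + intercept."""
    n = len(request_rates)
    if n < 2 or n != len(parameter_values):
        raise ValueError("need at least two paired samples")
    mean_x = sum(request_rates) / n
    mean_y = sum(parameter_values) / n
    # Spread of the request rates around their mean.
    sxx = sum((x - mean_x) ** 2 for x in request_rates)
    if sxx == 0:
        raise ValueError("request rates must not all be identical")
    # Co-variation of request rate and operating parameter.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(request_rates, parameter_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rate_for_target(slope, intercept, target):
    """Invert the fitted line: request rate at which the parameter hits target."""
    if slope == 0:
        raise ValueError("a flat model cannot be inverted")
    return (target - intercept) / slope


if __name__ == "__main__":
    # Made-up samples for one server type: requests/s vs. CPU utilization (%).
    rates = [100.0, 200.0, 300.0, 400.0]
    cpu = [15.0, 25.0, 35.0, 45.0]
    slope, intercept = fit_linear_model(rates, cpu)
    print(slope, intercept, rate_for_target(slope, intercept, 50.0))
```

A fitted line of this kind gives a rough per-server request rate for a chosen target value, which could serve as a starting point for a feedback controller on a new server type.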

Assignees

Facebook Inc

Classifications

Code	CPC title
G06F9/5083 (Primary)	Techniques for rebalancing the load in a distributed system
H04L43/08	Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
H04L43/0817	by checking functioning
H04L41/0896 (Primary)	Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities (flow or congestion control using dynamic resource allocation, e.g. in-call renegotiation, H04L47/76)
H04L47/726	Reserving resources in multiple paths to be used simultaneously (by balancing the load H04L47/125)

Patent family

Related publications grouped by family.

Patent family ID: 53401334

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9667498B2 cover?

A self-adaptive control system based on proportional-integral (PI) control theory for dynamic capacity management of latency-sensitive application servers (e.g., application servers associated with a social networking application) are disclosed. A centralized controller of the system can adapt to changes in request rates, changes in application and/or system behaviors, underlying hardware upgra…

Who is the assignee on this patent?

Facebook Inc