Dynamically identifying target capacity when scaling cloud resources

US9722945B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9722945-B2
Application number: US-201414307745-A
Country: US
Kind code: B2
Filing date: Jun 18, 2014
Priority date: Mar 31, 2014
Publication date: Aug 1, 2017
Grant date: Aug 1, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are directed to preventing flapping when auto-scaling cloud resources. In one scenario, a computer system accesses information specifying a target operational metric that is to be maintained on a plurality of cloud resources. The computer system determines a current measured value for the target operational metric for at least some of the cloud resources. The computer system further calculates a scaling factor based on the target operational metric and the current measured value, where the scaling factor represents an amount of variance between the target operational metric and the current measured value. The computer system also calculates a delta value representing a modified quantity of cloud resources modified by the calculated scaling factor and determines whether a scaling action is to occur based on the calculated delta value.
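The arithmetic summarized in the abstract can be illustrated with a minimal sketch. It follows the per-capita formulation of the first claim (scaling factor = per-capita load / per-capita target; delta = n times the scaling factor, minus n). The function name `scaling_delta`, its parameter names, and the choice to round the target instance count up are assumptions made for illustration and are not taken from the patent text.

```python
import math


def scaling_delta(total_load: float, n: int, pctt: float) -> int:
    """Return how many instances to add (positive) or remove (negative).

    total_load: measured metric summed over all instances (hypothetical input)
    n:          number of instances currently serving the load
    pctt:       per-capita target threshold for the same metric
    """
    if n <= 0 or pctt <= 0:
        raise ValueError("n and pctt must be positive")
    pcl = total_load / n              # per-capita load
    scaling_factor = pcl / pctt       # > 1 means overloaded, < 1 means underloaded
    # Assumption: round the target count up so capacity never falls below load.
    return math.ceil(n * scaling_factor) - n


if __name__ == "__main__":
    print(scaling_delta(320, 4, 50))  # overloaded: scale out
    print(scaling_delta(320, 7, 50))  # same load after scaling: no further action
    print(scaling_delta(100, 4, 50))  # underloaded: scale in
```

Because the delta is derived from the measured load rather than from a fixed step size, re-running the calculation after the scaling action with an unchanged load yields zero, which is one way to read the "no flapping" goal.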

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method for auto-scaling cloud resources in order to increase or decrease the number of virtual machine (VM) instances used to meet a current computing load being handled by a service, the computer-implemented method being performed by one or more processors executing computer executable instructions for the computer-implemented method, and the computer-implemented method comprising acts of: periodically accessing information at given intervals, wherein the accessed information comprises: a determination of how many VM instances (n) are currently being used to meet a computing load of a service; and measured performance metrics that quantify the computing load the service is currently handling; for at least one given measured performance metric, obtaining a per-capita load (PCL) by dividing the given measured performance metric by the number of VM instances being used to meet the computing load of the service; accessing a user-defined per-capita target threshold (PCTT) representing an amount of the computing load each VM instance is to handle for the given performance metric; determining a scaling action to be taken, but without flapping the VM instances by causing them to enter an undesirable cycle of alternately removing and then adding the same number of VM instances until the current computing load changes, by performing the following: calculating a scaling factor based on dividing the PCL by the PCTT for the given performance metric, wherein the scaling factor represents an amount of variance between the at least one given measured performance metric and the user-defined per-capita target threshold (PCTT) representing an amount of the computing load each VM instance is to handle for the given performance metric; calculating the number of VM instances required to scale to the scaling factor by determining a delta value based on the difference between i) the number of VM instances (n) currently being used to meet the computing load of the service and ii) n times the scaling factor; determining whether a scaling action is to occur based on the calculated delta value; and when determined, performing the scaling action as indicated by the calculated delta value.

2. The computer-implemented method of claim 1, wherein the scaling action comprises adding virtual machines when the delta value is positive or removing virtual machines when the delta value is negative.

3. The computer-implemented method of claim 1, wherein the scaling action comprises increasing or decreasing the size of at least one virtual machine.

4. The computer-implemented method of claim 1, wherein at least one of the performance metrics is prioritized higher than at least one other performance metric.

5. The computer-implemented method of claim 4, wherein health of the virtual machines is prioritized over cost savings.

6. The computer-implemented method of claim 4, wherein the scaling factor is calculated for each of the measured performance metrics and wherein a specified scaling factor is selected based on the prioritization.

7. The computer-implemented method of claim 6, wherein the specified scaling factor comprises a maximum scaling factor that represents the highest amount of variance between the at least one given measured performance metric and the user-defined per-capita target threshold (PCTT).

8. The computer-implemented method of claim 1, further comprising: determining that at least one performance metric, if implemented, would cause at least one service level agreement (SLA) to no longer be enforceable; and generating a notification that the selected operational metric would result in the SLA no longer being enforceable.

9. The computer-implemented method of claim 1, wherein upon determining that a scaling action is to occur based on the calculated delta value, the computer-implemented method further comprises calculating a projected impact the scaling action would have on the total capacity of the VM instances currently being used to meet the computing load of the service.

10. A computer-implemented method for auto-scaling cloud resources in order to increase or decrease the number of virtual machine (VM) instances used to meet a current computing load being handled by a service, the computer-implemented method being performed by one or more processors executing computer executable instructions for the computer-implemented method, and the computer-implemented method comprising acts of: periodically accessing information at given intervals, wherein the accessed information comprises: a determination of how many VM instances (n) are currently being used to meet a computing load of a service; and measured performance metrics that quantify the computing load the service is currently handling; for at least one given measured performance metric, obtaining a per-capita load (PCL) by dividing the given measured performance metric by the number of VM instances being used to meet the computing load of the service; accessing a user-defined per-capita target threshold (PCTT) representing an amount of the computing load each VM instance is to handle for the given performance metric; determining a scaling action to be taken, but without flapping the VM instances by causing them to enter an undesirable cycle of alternately removing and then adding the same number of VM instances until the current computing load changes, by performing the following: calculating a scaling factor based on dividing the PCL by the PCTT for the given performance metric, wherein the scaling factor represents an amount of variance between the at least one given measured performance metric and the user-defined per-capita target threshold (PCTT) representing an amount of the computing load each VM instance is to handle for the given performance metric; calculating the number of VM instances required to scale to the scaling factor by determining a delta value based on the difference between i) the number of VM instances (n) currently being used to meet the computing load of the service and ii) n times the scaling factor; and determining whether a scaling action is to occur based on the calculated delta value, and when a scaling action is determined, performing the following: projecting the impact the scaling action will have on total capacity of the service to meet the current computing load; and if the projected impact indicates that the scaling action will result in a new total capacity of the service that is unable to handle the current computing load, reducing the magnitude of the scaling action until the new total capacity will be able to serve the current computing load.

11. The computer-implemented method of claim 10, wherein the scaling action comprises adding virtual machines when the delta value is positive or removing virtual machines when the delta value is negative.

12. The computer-implemented method of claim 10, wherein the scaling action comprises increasing or decreasing the size of at least one virtual machine.

13. The computer-implemented method of claim 10, wherein at least one of the performance metrics is prioritized higher than at least one other performance metric.

14. The computer-implemented method of claim 13, wherein health of the virtual machines is prioritized over cost savings.

15. The computer-implemented method of claim 13, wherein the scaling factor is calculated for each of the measured performance metrics and wherein a specified scaling factor is selected based on the prioritization.

16. The computer-implemented method of claim 15, wherein t
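Two of the refinements in the claims lend themselves to a short sketch: selecting the maximum scaling factor across several metrics (claims 6 and 7) and shrinking a scale-in whose projected capacity could not serve the current load (claim 10). The names `max_scaling_factor`, `guarded_delta`, and `per_instance_capacity`, and the idea that capacity is a fixed amount per instance, are hypothetical and are not part of the claim text.

```python
from typing import Mapping


def max_scaling_factor(
    loads: Mapping[str, float], targets: Mapping[str, float], n: int
) -> float:
    """Compute PCL / PCTT for every metric and keep the largest value.

    loads:   total measured value per metric (hypothetical input shape)
    targets: per-capita target threshold (PCTT) per metric
    n:       number of instances currently serving the load
    """
    return max((loads[metric] / n) / targets[metric] for metric in targets)


def guarded_delta(
    delta: int, n: int, current_load: float, per_instance_capacity: float
) -> int:
    """Reduce the magnitude of a scale-in until projected capacity covers the load.

    Assumption: total capacity is modeled as instance count * per_instance_capacity.
    Scale-out actions (delta >= 0) only add capacity, so they pass through unchanged.
    """
    while delta < 0 and (n + delta) * per_instance_capacity < current_load:
        delta += 1  # remove one fewer instance and re-project
    return delta


if __name__ == "__main__":
    factor = max_scaling_factor(
        {"cpu": 320, "requests": 900}, {"cpu": 50, "requests": 300}, n=4
    )
    print(factor)
    print(guarded_delta(-3, 5, current_load=350, per_instance_capacity=100))
```

Taking the maximum factor means the most overloaded metric drives the decision, which matches the stated preference for instance health over cost savings in claims 5 and 14.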

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • G06F9/5072 (Primary)

    Grid computing · CPC title

  • by checking functioning · CPC title

  • to enhance reliability, e.g. reduce downtime · CPC title

  • Techniques for rebalancing the load in a distributed system · CPC title

  • Centralised allocation of resources · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9722945B2 cover?
Embodiments are directed to preventing flapping when auto-scaling cloud resources. In one scenario, a computer system accesses information specifying a target operational metric that is to be maintained on a plurality of cloud resources. The computer system determines a current measured value for the target operational metric for at least some of the cloud resources. The computer system further…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F9/5072. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 1, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).