Multi-tenancy interference model for scaling in container orchestration systems
US-2023104787-A1 · Apr 6, 2023 · US
US12566639B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12566639-B2 |
| Application number | US-202117513777-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 28, 2021 |
| Priority date | Oct 28, 2021 |
| Publication date | Mar 3, 2026 |
| Grant date | Mar 3, 2026 |
Official abstract text for this publication.
Techniques are disclosed for automated and dynamic compute resource allocation in an infrastructure-as-a-service (IaaS) environment. A system may determine a load threshold value corresponding to a maximum throughput of allocated resources and an active load of processing occurring at those resources. The threshold and load are compared to determine if throttling is occurring at the allocated resources. A specified range of permissible resource allocations is determined. Based on the range of permissible resource allocations, the threshold load value and the active load, the allocated resources may be modified. The modification may be a ramp-up of allocated resources to handle a throttling load or a ramp-down to reduce inefficient resource utilization and processing overhead. The ramp-up or ramp-down may be performed in periodic increments over periodic increments of time to reduce system stress and handle dynamically changing loads. A recommended permissible allocation range may be suggested.
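The comparison and bounded adjustment described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each resource contributes a fixed throughput (`unit_capacity`), so the load threshold is the resource count times that capacity. The names `AllocationRange`, `is_throttled`, `target_allocation`, and `unit_capacity` are hypothetical and do not appear in the publication.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationRange:
    """Permissible allocation range (hypothetical container)."""
    lower: int  # lower limit on the number of resources
    upper: int  # upper limit on the number of resources


def is_throttled(active_load: float, load_threshold: float) -> bool:
    # Throttling is assumed when the active load exceeds the maximum
    # throughput (load threshold) of the currently allocated resources.
    return active_load > load_threshold


def target_allocation(current: int, active_load: float,
                      unit_capacity: float, limits: AllocationRange) -> int:
    """Return a resource count that fits the load, kept inside the range.

    Assumption: every resource handles `unit_capacity` units of load.
    """
    if unit_capacity <= 0:
        raise ValueError("unit_capacity must be positive")
    load_threshold = current * unit_capacity
    needed = math.ceil(active_load / unit_capacity)
    if is_throttled(active_load, load_threshold):
        # Ramp up, but never beyond the upper limit.
        return min(needed, limits.upper)
    # Ramp down (or hold), but never below the lower limit.
    return max(needed, limits.lower)


limits = AllocationRange(lower=2, upper=10)
print(target_allocation(4, 950.0, 100.0, limits))  # throttled case
print(target_allocation(8, 150.0, 100.0, limits))  # under-utilized case
```

In this sketch the range acts purely as a clamp: the load decides the direction of the change, and the limits decide how far it may go.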
Opening claim text (preview).
What is claimed is:
1. A method, comprising: receiving, by a computing device via a user interface, 1) a user-defined upper limit value indicating an upper bound of a permissible allocation range for dynamically increasing a number of computing resources in a first set of computing resources that are allocated to a client, and 2) a user-defined lower limit value indicating a lower bound of the permissible allocation range for dynamically decreasing the number of computing resources in the first set of computing resources that are allocated to the client; monitoring, by the computing device over time, computing resource loads corresponding to the client; determining, by the computing device and based at least in part on the monitoring, a first load threshold value and an active load value, the active load value corresponding to a level of activity of a first set of computing resources of the client during a first time period; comparing, by the computing device, the active load value and the first load threshold value; determining, by the computing device and based at least in part on the comparison between the active load value and the first load threshold value, a throttle state of the first set of computing resources during the first time period; in response to determining the throttle state, selecting, by the computing device, a modification threshold value, wherein the user-defined upper limit value is selected for the modification threshold value when the throttle state indicates that the first set of computing resources are being throttled, and wherein the user-defined lower limit value is selected for the modification threshold value when the throttle state indicates that the first set of computing resources are not being throttled; comparing, by the computing device, the first load threshold value and the modification threshold value; and during the monitoring, modifying, by the computing device and based at least in part on the determining a difference value between the active load value and the modification threshold value, the number of computing resources in the first set of computing resources of the client, wherein modifying the number of computing resources in the first set of computing resources causes the number of computing resources to be 1) increased while adhering to the user-defined upper limit value of the permissible allocation range or 2) decreased while adhering to the user-defined lower limit value of the permissible allocation range.
2. The method of claim 1, wherein the first load threshold value corresponds to a maximum throughput capability of the first set of computing resources and the active load value corresponds to a rate of data being sent to the first set of computing resources.
3. The method of claim 1, wherein: the active load value is greater than the first load threshold value; and the throttle state indicates that the first set of computing resources is being throttled during the first time period.
4. The method of claim 3, wherein selecting the modification threshold value comprises selecting the user-defined upper limit value of the permissible allocation range.
5. The method of claim 3, wherein modifying the first set of computing resources comprises: determining, based at least in part on the difference value, a second set of computing resources that are associated with a second load threshold value; and adding the second set of computing resources to the first set of computing resources.
6. The method of claim 1, wherein: the active load value is less than or equal to the first load threshold value; and the throttle state indicates that the first set of computing resources is not being throttled during the first time period.
7. The method of claim 6, wherein selecting the modification threshold value comprises selecting the user-defined lower limit value of the permissible allocation range.
8. The method of claim 6, wherein modifying the first set of computing resources comprises: determining a first subset of the first set of computing resources associated with a second load threshold value less than the difference value; and removing the first subset of the first set of computing resources from the first set of computing resources.
9. The method of claim 1, wherein modifying the first set of computing resources comprises altering the number of computing resources at one or more time intervals of a second time period occurring subsequent to the first time period, and wherein altering the number of computing resources comprises adding or removing a static number of computing resources during each of the one or more time intervals of the second time period.
10. The method of claim 1, wherein the static number of computing resources is generated based at least in part on a throttling value corresponding to a predefined proportion of throttling.
11. The method of claim 1, wherein the number of computing resources is incrementally increased or decreased over subsequent time periods based at least in part on the difference value between the active load value and the modification threshold value.
12. A non-transitory computer-readable storage medium storing a plurality of instructions executable by one or more processors of a server computer, the plurality of instructions when executed by the one or more processors cause the one or more processors to perform operations comprising: receiving, by a computing device via a user interface, 1) a user-defined upper limit value indicating an upper bound of a permissible allocation range for dynamically increasing a number of computing resources in a first set of computing resources that are allocated to a client, and 2) a user-defined lower limit value indicating a lower bound of the permissible allocation range for dynamically decreasing the number of computing resources in the first set of computing resources that are allocated to the client; monitoring, by the computing device over time, computing resource loads corresponding to the client; determining, by the computing device and based at least in part on the monitoring, a first load threshold value and an active load value, the active load value corresponding to a level of activity of a first set of computing resources of the client during a first time period; comparing, by the computing device, the active load value and the first load threshold value; determining, by the computing device and based at least in part on the comparison between the active load value and the first load threshold value, a throttle state of the first set of computing resources during the first time period; in response to determining the throttle state, selecting, by the computing device, a modification threshold value, wherein the user-defined upper limit value is selected for the modification threshold value when the throttle state indicates that the first set of computing resources are being throttled, and wherein the user-defined lower limit value is selected for the modification threshold value when the throttle state indicates that the first set of computing resources are not being throttled; comparing, by the computing device, the first load threshold value and the modification threshold value; and during the monitoring, modifying, by the computing device and based at least in part on the determining a difference value between the active load value and the modification threshold value, the number of computing resources in the first set of computing resources of the client wherein modifying the number of computing resources in the first set of computing resources causes the number of computing
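Claim 9 describes altering the resource count by a static number at each of several time intervals rather than in one jump. A minimal Python sketch of such a schedule follows. It is an illustration under stated assumptions, not the claimed method: the function name `ramp_schedule` and its parameters are hypothetical, and clipping the final step so the count lands exactly on the target is an assumption of this sketch.

```python
from typing import List


def ramp_schedule(current: int, target: int, step: int) -> List[int]:
    """Return the resource count after each interval of a gradual ramp.

    `step` is the static number of resources added or removed per interval.
    Assumption: the last step is shortened if a full step would overshoot.
    """
    if step <= 0:
        raise ValueError("step must be a positive integer")
    schedule: List[int] = []
    count = current
    while count != target:
        if count < target:
            # Ramp-up interval: add `step` resources, stopping at the target.
            count = min(count + step, target)
        else:
            # Ramp-down interval: remove `step` resources, stopping at the target.
            count = max(count - step, target)
        schedule.append(count)
    return schedule


print(ramp_schedule(4, 10, 2))  # [6, 8, 10]
print(ramp_schedule(8, 3, 2))   # [6, 4, 3]
```

A caller would typically pass a target that already respects the user-defined upper and lower limits, so every intermediate count in the schedule also stays inside the permissible range.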
Workload threshold · CPC title
Mechanisms to release resources · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title
considering the load · CPC title