Trajectory-based hierarchical autoscaling for serverless applications

US12020036B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12020036-B2
Application number: US-202117513246-A
Country: US
Kind code: B2
Filing date: Oct 28, 2021
Priority date: Oct 28, 2021
Publication date: Jun 25, 2024
Grant date: Jun 25, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes monitoring, during a first time interval, traffic associated with one or more applications executed by a cluster of compute nodes and determining, in view of the traffic associated with the one or more applications during the first time interval, that the traffic is predicted to exceed a capacity threshold of the cluster of compute nodes at an end of a second time interval. The method further includes initiating startup of an additional compute node to be added to the cluster of compute nodes for executing replicas of the one or more applications.
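The prediction step in the abstract can be illustrated with a minimal sketch. It assumes a plain least-squares linear fit as the extrapolation method, which is only one possible reading of "predicted"; the patent does not prescribe a specific model. The function names (`fit_line`, `predicted_traffic`, `should_start_node`), the `(timestamp, traffic)` sample format, and the example numbers are all hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, total cluster traffic)


def fit_line(samples: Sequence[Sample]) -> Tuple[float, float]:
    """Ordinary least-squares fit of traffic against time.

    Returns (slope, intercept). This is an assumed stand-in for whatever
    statistical model an implementation would actually use.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    slope = cov / var_t
    return slope, mean_y - slope * mean_t


def predicted_traffic(samples: Sequence[Sample], horizon_s: float) -> float:
    """Extrapolate the fitted trend to horizon_s seconds past the last sample."""
    slope, intercept = fit_line(samples)
    t_end = samples[-1][0] + horizon_s
    return slope * t_end + intercept


def should_start_node(samples: Sequence[Sample], horizon_s: float,
                      capacity: float) -> bool:
    """True if traffic is predicted to exceed capacity at the end of the horizon."""
    return predicted_traffic(samples, horizon_s) > capacity


# Traffic observed during the monitoring interval: rising by 2 requests/second.
observed = [(0.0, 100.0), (10.0, 120.0), (20.0, 140.0), (30.0, 160.0)]
# Hypothetical node startup time of 120 s and cluster capacity of 350 requests.
start = should_start_node(observed, horizon_s=120.0, capacity=350.0)
```

Setting the horizon to the node's startup time means that a node started now should become ready at about the moment the extrapolated traffic reaches the threshold.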

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: monitoring, during a first time interval, traffic associated with one or more applications executed by a cluster of compute nodes, wherein each compute node of the cluster of compute nodes comprises a virtual machine; scaling a number of replicas of the one or more applications based on the traffic associated with the one or more applications over a second time interval, the second time interval corresponding to an amount of time associated with instantiating a replica of the one or more applications; determining, by a processing device, in view of the traffic associated with the one or more applications during the first time interval, that the traffic is predicted to exceed a capacity threshold of the cluster of compute nodes at an end of a third time interval, wherein the third time interval is longer than the second time interval; and initiating startup of an additional compute node comprising an additional virtual machine to be added to the cluster of compute nodes for executing replicas of the one or more applications based on the traffic being predicted to exceed the capacity threshold of the cluster of compute nodes at the end of the third time interval, wherein the third time interval corresponds to an amount of time associated with starting up the additional compute node.

2. The method of claim 1, wherein the one or more applications comprise one or more serverless applications.

3. The method of claim 1, wherein the traffic comprises a number of concurrent requests received by each of the one or more applications executed by the cluster of compute nodes.

4. The method of claim 1, wherein monitoring the traffic associated with the one or more applications executed by the cluster of compute nodes comprises: scraping one or more traffic metrics from each of the one or more applications; and determining a total traffic level of the cluster during the first time interval in view of the one or more traffic metrics from each of the one or more applications.

5. The method of claim 4, wherein determining that the traffic associated with the one or more applications is predicted to exceed the capacity threshold of the cluster of compute nodes at the end of the third time interval comprises: extrapolating the total traffic level of the cluster during the first time interval over the third time interval.

6. The method of claim 5, wherein extrapolating the total traffic level of the cluster comprises: applying a statistical analysis or machine learning model on the total traffic level of the cluster during the first time interval to estimate a future total traffic level of the cluster at the end of the third time interval.

7. A system comprising: a memory; and a processing device, operatively coupled to the memory, the processing device to: monitor, during a first time interval, traffic associated with one or more applications executed by a cluster of compute nodes, wherein each compute node of the cluster of compute nodes comprises a virtual machine; scale a number of replicas of the one or more applications based on the traffic associated with the one or more applications over a second time interval, the second time interval corresponding to an amount of time associated with instantiating a replica of the one or more applications; determine, in view of the traffic associated with the one or more applications during the first time interval, that the traffic is predicted to exceed a capacity threshold of the cluster of compute nodes at an end of a third time interval, wherein the third time interval is longer than the second time interval; and initiate startup of an additional compute node comprising an additional virtual machine to be added to the cluster of compute nodes for executing replicas of the one or more applications based on the traffic being predicted to exceed the capacity threshold of the cluster of compute nodes at the end of the third time interval, wherein the third time interval corresponds to an amount of time associated with starting up the additional compute node.

8. The system of claim 7, wherein the one or more applications comprise one or more serverless applications.

9. The system of claim 7, wherein the traffic comprises a number of concurrent requests received by each of the one or more applications executed by the cluster of compute nodes.

10. The system of claim 7, wherein to monitor the traffic associated with the one or more applications executed by the cluster of compute nodes, the processing device is to: scrape one or more traffic metrics from each of the one or more applications; and determine a total traffic level of the cluster during the first time interval in view of the one or more traffic metrics from each of the one or more applications.

11. The system of claim 10, wherein to determine that the traffic associated with the one or more applications is predicted to exceed the capacity threshold of the cluster of compute nodes at the end of the third time interval, the processing device is to: extrapolate the total traffic level of the cluster during the first time interval over the third time interval.

12. The system of claim 11, wherein to extrapolate the total traffic level of the cluster, the processing device is to: apply a statistical analysis or machine learning model on the total traffic level of the cluster during the first time interval to estimate a future total traffic level of the cluster at the end of the third time interval.

13. A non-transitory computer-readable storage medium including instructions that, when executed by a processing device, cause the processing device to: monitor, during a first time interval, traffic associated with one or more applications executed by a cluster of compute nodes, wherein each compute node of the cluster of compute nodes comprises a virtual machine; scale a number of replicas of the one or more applications based on the traffic associated with the one or more applications over a second time interval, the second time interval corresponding to an amount of time associated with instantiating a replica of the one or more applications; determine, by the processing device, in view of the traffic associated with the one or more applications during the first time interval, that the traffic is predicted to exceed a capacity threshold of the cluster of compute nodes at an end of a third time interval, wherein the third time interval is longer than the second time interval; and initiate startup of an additional compute node comprising an additional virtual machine to be added to the cluster of compute nodes for executing replicas of the one or more applications based on the traffic being predicted to exceed the capacity threshold of the cluster of compute nodes at the end of the third time interval, wherein the third time interval corresponds to an amount of time associated with starting up the additional compute node.

14. The non-transitory computer-readable storage medium of claim 13, wherein the one or more applications comprise one or more serverless applications.

15. The non-transitory computer-readable storage medium of claim 13, wherein the traffic comprises a number of concurrent requests received by each of the one or more applications executed by the cluster of compute nodes.

16. The non-transitory computer-readable storage medium of claim 13, wherein to monitor the traffic associated with the one or more applications executed by the cluster of compute nodes, the processing device is to: scrape one or more traffic metrics from each of
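
The two-level structure recited in the independent claims (replica scaling over a short horizon, node startup over a longer one) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the extrapolation simply extends the average slope of the observed totals, and every name and number here (`plan`, `Plan`, `per_replica`, `replicas_per_node`, the startup times) is hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Plan:
    replicas: int     # replica count targeted for the short horizon
    start_node: bool  # whether to begin starting an additional node now


def total_traffic(per_app: Dict[str, float]) -> float:
    """Sum the per-application metrics from one scrape into a cluster total."""
    return sum(per_app.values())


def extrapolate(history: List[float], step_s: float, horizon_s: float) -> float:
    """Extend the average slope of the history horizon_s seconds ahead.

    history holds totals sampled every step_s seconds (an assumption).
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    if len(history) < 2:
        return history[-1]
    slope = (history[-1] - history[0]) / ((len(history) - 1) * step_s)
    return max(0.0, history[-1] + slope * horizon_s)


def plan(scrapes: List[Dict[str, float]], step_s: float,
         replica_startup_s: float, node_startup_s: float,
         per_replica: float, replicas_per_node: int, nodes: int) -> Plan:
    """Make the fast (replica) and slow (node) scaling decisions together."""
    if node_startup_s <= replica_startup_s:
        raise ValueError("node horizon must be longer than replica horizon")
    history = [total_traffic(s) for s in scrapes]
    short_term = extrapolate(history, step_s, replica_startup_s)
    long_term = extrapolate(history, step_s, node_startup_s)
    max_replicas = nodes * replicas_per_node
    # Replicas are sized for the short horizon, bounded by what the nodes hold.
    replicas = max(1, min(math.ceil(short_term / per_replica), max_replicas))
    # A node is started early if the long-horizon forecast exceeds capacity.
    return Plan(replicas, long_term > max_replicas * per_replica)


scrapes = [{"a": 40, "b": 60}, {"a": 50, "b": 70}, {"a": 60, "b": 80}]
decision = plan(scrapes, step_s=10, replica_startup_s=5, node_startup_s=120,
                per_replica=10, replicas_per_node=10, nodes=3)
```

In this example the short-horizon forecast fits within the existing three nodes, while the long-horizon forecast exceeds their combined capacity, so the sketch asks for more replicas now and a new node in advance.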

Assignees

Inventors

Classifications

  • for systems · CPC title

  • where the computing system component is a software system · CPC title

  • Grid computing · CPC title

  • with limitation or expansion of the discovery scope · CPC title

  • Peer-to-peer [P2P] networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12020036B2 cover?
A method includes monitoring, during a first time interval, traffic associated with one or more applications executed by a cluster of compute nodes and determining, in view of the traffic associated with the one or more applications during the first time interval, that the traffic is predicted to exceed a capacity threshold of the cluster of compute nodes at an end of a second time interval. Th…
Who is the assignee on this patent?
Red Hat Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/5083. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 25, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).