Coordinated predictive autoscaling of virtualized resource groups

US11249810B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11249810-B2
Application number: US-201916362545-A
Country: US
Kind code: B2
Filing date: Mar 22, 2019
Priority date: Mar 22, 2019
Publication date: Feb 15, 2022
Grant date: Feb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are described for optimizing the allocation of computing resources provided by a service provider network—for example, compute resources such as virtual machine (VM) instances, containers, standalone servers, and possibly other types of computing resources—among computing workloads associated with a user or group of users of the service provider network. A service provider network provides various tools and interfaces to help businesses and other organizations optimize the utilization of computing resource pools obtained by the organizations from the service provider network, including the ability to efficiently schedule use of the resources among workloads having varying resource demands, usage patterns, relative priorities, execution deadlines, or combinations thereof. A service provider network further provides various graphical user interfaces (GUIs) to help users visualize and manage the historical and scheduled uses of computing resources by users' workloads according to user preferences.
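
To make the scheduling idea in the abstract concrete, here is a minimal sketch of placing several workloads into spare pool capacity according to priority and deadline. It is not the patented method: the `Job` fields, the `schedule` function, the "higher priority first, then earliest deadline" ordering, and the hourly time slots are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Job:
    name: str
    instances: int     # compute instances needed while the job runs
    duration: int      # estimated run time, in time slots
    deadline: int      # job must finish by this slot index
    priority: int = 0  # assumption: larger value is scheduled first


def schedule(free: List[int], jobs: List[Job]) -> Dict[str, int]:
    """Greedily place jobs into spare capacity.

    `free[t]` is the number of idle instances expected in slot t.
    Returns a mapping of job name -> start slot; jobs that cannot meet
    their deadline are left out of the result.
    """
    remaining = list(free)
    placed: Dict[str, int] = {}
    # Assumed ordering: higher priority first, then earliest deadline first.
    for job in sorted(jobs, key=lambda j: (-j.priority, j.deadline)):
        last_start = min(job.deadline, len(remaining)) - job.duration
        for start in range(last_start + 1):
            window = range(start, start + job.duration)
            if all(remaining[t] >= job.instances for t in window):
                for t in window:
                    remaining[t] -= job.instances  # reserve the capacity
                placed[job.name] = start
                break
    return placed


jobs = [
    Job("report", instances=3, duration=2, deadline=6, priority=1),
    Job("reindex", instances=3, duration=2, deadline=4),
    Job("backfill", instances=2, duration=3, deadline=3),
]
plan = schedule([4, 4, 4, 4, 4, 4], jobs)
print(plan)  # "backfill" is absent: it cannot finish by slot 3 once "report" is placed
```

A greedy pass like this is simple but can leave a low-priority job unscheduled even when a different ordering would fit everything; a production scheduler would likely need backtracking or an optimization step.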

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining historical data indicating a respective number of compute instances of a compute instance pool used by one or more first computing workloads over time, the compute instance pool representing an amount of compute capacity reserved by a user of a service provider network for computing workloads associated with one or more users of the service provider network; generating, based on the historical data, a prediction indicating a number of available compute instances of the compute instance pool that will be unused by the one or more first computing workloads during a future interval of time; determining requirements for executing a second computing workload, the requirements including a number of required compute instances over an estimated execution duration and a deadline for completing execution of the second computing workload; determining that the prediction indicating the number of available compute instances and the future interval of time is capable of satisfying the requirements for executing the second computing workload; and scheduling execution of the second computing workload during the future interval of time, including scheduling use of the number of required compute instances from among the compute instance pool to be used by the second computing workload.

2. The computer-implemented method of claim 1, wherein the second computing workload is one of a plurality of additional computing workloads each associated respective estimated execution durations and respective execution deadlines, and wherein the scheduling includes determining when to execute each of the plurality of additional computing workloads by utilizing at least some of the available compute instances to run the plurality of additional computing workloads for their respective estimated execution durations before their respective execution deadlines.

3. The computer-implemented method of claim 1, wherein the compute instance pool includes a fixed number of compute instances, and wherein the first computing workload uses a varying number of compute instances of the compute instance pool over time.

4. A computer-implemented method comprising: obtaining historical data indicating a respective amount of computing resources of a computing resource pool used by one or more first computing workloads over time; generating, based on the historical data, a prediction indicating an amount of available computing resources of the computing resource pool that will be unused by the one or more first computing workloads during a future interval of time; determining requirements for executing a second computing workload, the requirements including an amount of computing resources over an estimated execution duration and a deadline for completing execution of the second computing workload; determining that the prediction regarding indicating the amount of available computing resources and the future interval of time is capable of satisfying the requirements for executing the second computing workload; and scheduling execution of the second computing workload during the future interval of time, including scheduling use of the amount of computing resources of the computing resource pool by the second computing workload.

5. The computer-implemented method of claim 4, wherein the second computing workload is one of a plurality of additional computing workloads each associated with respective estimated execution durations and respective execution deadlines, and wherein the scheduling includes determining when to execute each of the plurality of additional computing workloads by utilizing at least some of the predicted amount of unused computing resources to run the plurality of additional computing workloads for their respective estimated execution durations before their respective execution deadlines.

6. The computer-implemented method of claim 4, wherein the computing resource pool includes a fixed amount of computing resources, and wherein the first computing workload uses a cyclically varying amount of computing resources from the computing resource pool over time.

7. The computer-implemented method of claim 4, wherein execution of the first computing workload is managed by one of a batch processing service, a container execution service, a MapReduce service, and a queue service.

8. The computer-implemented method of claim 4, wherein the respective amount of computing resources of a computing resource pool used by one or more first computing workloads over time is first historical data, and wherein the prediction regarding available computing resources that will be unused by the one or more first computing workloads during a future interval of time is generated using a recurrent neural network (RNN) trained based on second historical data related to the computing resource pool.

9. The computer-implemented method of claim 4, further comprising scaling an amount of computing resources from the computing resource pool used by a third computing workload that is not associated with an execution deadline based on the amount of computing resources used by the one or more first computing workloads and the second computing workload over time.

10. The computer-implemented method of claim 4, wherein the computing resource pool is a compute instance pool, wherein the compute instance pool includes a plurality of virtual machine (VM) instances or a plurality of container instances.

11. The computer-implemented method of claim 4, further comprising scheduling use of an amount of computing resources of the computing resource pool by a third computing workload during a time period in the future, wherein the scheduling is based in part on respective priorities assigned to the second computing workload and the third computing workload.

12. The computer-implemented method of claim 4, wherein the first computing workload and the second computing workload are associated with users that are part of a same organization.

13. The computer-implemented method of claim 4, wherein the scheduling the amount of computing resources of the computing resource pool to be used in the future by the second computing workload is determined in part by information indicating an amount of warm-up time associated with the first computing workload.

14. The computer-implemented method of claim 4, further comprising causing display of a graphical user interface (GUI) displaying a representation of the historical data indicating a respective amount of computing resources of the computing resource pool used by the one or more first workloads over time.

15. A system comprising: a capacity forecasting and scheduling service implemented by a first one or more electronic devices, the capacity forecasting and scheduling service including instructions that upon execution cause the capacity forecasting and scheduling service to: obtain historical data indicating a respective number of compute instances of a compute instance pool that were used by one or more first computing workloads over time, the compute instance pool representing an amount of compute capacity available to computing workloads associated with one or more users of a service provider network; generate, based on the historical data, a prediction indicating a number of available compute instances of the compute instance pool that will be unused by the one or more first computing workloads during a future interval of time; determine requirements for executing a second computing workload, the requirements including a number of required compute instances over an estimated e…
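
The steps recited in claims 1 and 4 (historical usage, a prediction of unused capacity, workload requirements, a feasibility check, then scheduling) can be sketched roughly as follows. This is an illustrative sketch, not the claimed implementation: claim 8 mentions a recurrent neural network for the prediction, whereas the code below swaps in a much simpler seasonal average. The names `Workload`, `forecast_unused`, and `find_start`, and the use of integer hourly slots, are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Workload:
    name: str
    instances: int  # number of required compute instances
    duration: int   # estimated execution duration, in slots
    deadline: int   # execution must complete by this slot of the forecast


def forecast_unused(history: List[int], pool_size: int,
                    period: int, horizon: int) -> List[int]:
    """Predict unused instances for the next `horizon` slots.

    Stand-in predictor (assumption): each future slot is forecast as the
    mean usage observed at the same position in earlier periods, rounded
    up so the estimate of free capacity stays conservative.
    """
    if len(history) < period:
        raise ValueError("need at least one full period of history")
    unused = []
    for step in range(horizon):
        phase = (len(history) + step) % period
        samples = history[phase::period]  # same slot in each past period
        predicted_used = -(-sum(samples) // len(samples))  # ceiling division
        unused.append(max(0, pool_size - predicted_used))
    return unused


def find_start(unused: List[int], job: Workload) -> Optional[int]:
    """Return the earliest start slot where the job fits and meets its deadline."""
    latest_start = min(job.deadline, len(unused)) - job.duration
    for start in range(latest_start + 1):
        window = unused[start:start + job.duration]
        if all(free >= job.instances for free in window):
            return start
    return None  # the predicted idle capacity cannot satisfy the requirements


# Two past periods of four slots each, for a pool of 10 instances.
history = [8, 8, 2, 2, 8, 6, 2, 4]
unused = forecast_unused(history, pool_size=10, period=4, horizon=4)
start = find_start(unused, Workload("nightly-batch", instances=5, duration=2, deadline=4))
print(unused, start)  # [2, 3, 8, 7] 2
```

Returning `None` when no window qualifies keeps the feasibility check ("capable of satisfying the requirements") separate from the scheduling decision, so a caller can fall back to other capacity or reject the request.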

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Priority · CPC title

  • Learning methods · CPC title

  • G06F9/505 · Primary

    considering the load · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11249810B2 cover?
Techniques are described for optimizing the allocation of computing resources provided by a service provider network—for example, compute resources such as virtual machine (VM) instances, containers, standalone servers, and possibly other types of computing resources—among computing workloads associated with a user or group of users of the service provider network. A service provider network pr…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/505. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 15, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).