Multi-tenant throttling approaches

US9413680B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9413680-B1
Application number: US-201213627278-A
Country: US
Kind code: B1
Filing date: Sep 26, 2012
Priority date: Sep 26, 2012
Publication date: Aug 9, 2016
Grant date: Aug 9, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An opportunistic throttling approach can be used for customers of shared resources in a multi-tenant environment. Each customer can have a respective token bucket with a guaranteed fill rate. When a request is received for an amount of work to be performed by a resource, the corresponding number of tokens are obtained from, or charged against, a global token bucket. If the global bucket has enough tokens, and if the customer has not exceeded a maximum work rate or other such metric, the customer can charge less than the full number of tokens against the customer's token bucket, in order to reduce the number of tokens that need to be taken from the customer bucket. Such an approach can enable the customer to do more work and enable the customer's bucket to fill more quickly as fewer tokens are charged against the customer bucket for the same amount of work.
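The charging rule described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `TokenBucket` and `admit`, the fixed `discount` factor, and the choice to drain the global bucket to zero when it cannot cover the full cost are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float   # maximum number of tokens the bucket can hold
    fill_rate: float  # tokens added per second
    tokens: float     # current fill level

    def refill(self, elapsed_seconds: float) -> None:
        """Add tokens for the elapsed time, capped at capacity."""
        self.tokens = min(self.capacity,
                          self.tokens + self.fill_rate * elapsed_seconds)


def admit(work: float,
          global_bucket: TokenBucket,
          customer_bucket: TokenBucket,
          under_max_rate: bool,
          discount: float = 0.5) -> bool:
    """Return True if the request may proceed, False if it is throttled.

    `discount` is a hypothetical fraction of the full cost charged to the
    customer when spare global capacity exists; the abstract only says the
    customer is charged "less than the full number of tokens".
    """
    # Opportunistic path: global bucket can cover the work and the
    # customer has not exceeded its maximum work rate.
    opportunistic = under_max_rate and global_bucket.tokens >= work
    customer_cost = work * discount if opportunistic else work

    if customer_bucket.tokens < customer_cost:
        return False  # not enough customer tokens: throttle

    customer_bucket.tokens -= customer_cost
    # Assumption: the global bucket is always charged the full cost,
    # floored at zero when it cannot cover it.
    global_bucket.tokens = max(0.0, global_bucket.tokens - work)
    return True
```

With a full global bucket, a 10-token request costs the customer only 5 tokens in this sketch, so the customer bucket drains more slowly and refills sooner for the same amount of work.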

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of managing access to shared resources in a multi-tenant environment, comprising: receiving, by one or more computer systems comprising at least a processor and a non-transitory storage medium storing code executable by the processor, a request associated with a customer, the request requesting an amount of usage of at least one shared resource; determining, by the processor of the one or more computer systems, a threshold amount based at least in part on an available amount of the at least one shared resource; comparing the amount of usage in the request to the threshold; when the amount of usage in the request exceeds the threshold, delaying processing of the request; when the amount of usage in the request is below the threshold, charging a number of tokens for the amount of usage against a global token bucket stored in the one or more computer systems, the tokens in the global token bucket associated with a unit of usage by a plurality of users in the multi-tenant environment; determining, by the one or more computer systems, a customer fill rate based at least in part upon a current fill level of the global token bucket, the customer fill rate indicating a rate associated with the customer at which a customer token bucket is filled with tokens, the customer fill rate being set to a maximum rate value when the current fill level is at or above a global maximum threshold and being set to a minimum rate value when the current fill level is at or below a global minimum threshold; determining a portion of the number of tokens for the request to be charged against the customer token bucket based at least in part upon the customer fill rate and a token utilization rate, the token utilization rate indicating a rate at which the customer utilizes the tokens for usage of the at least one shared resource, wherein the portion of the number of tokens is an improvement over the token utilization rate; charging the portion of the number of tokens against the customer token bucket; and allowing the usage of the at least one shared resource for the request when the number of tokens at least meeting the portion of the number of tokens for the request is available in the customer token bucket.

2. The computer-implemented method of claim 1, further comprising, when the current fill level is below the global maximum threshold but above the global minimum threshold, setting the customer fill rate to a rate value between the maximum rate value and the minimum rate value.

3. The computer-implemented method of claim 2, wherein the rate value between the maximum value and the minimum value is a function of the current fill level between the global minimum threshold and the global maximum threshold.

4. The computer-implemented method of claim 1, further comprising: causing the global token bucket to be refilled at a global fill rate and the customer bucket to be refilled at a token bucket refill rate.

5. The computer-implemented method of claim 1, further comprising: throttling the request while the number of tokens sufficient for the amount of usage are unable to be charged against the customer bucket.

6. The computer-implemented method of claim 1, wherein the global minimum threshold is zero.

7. A computer-implemented method, comprising: receiving, by a computing system comprising a processor and a non-transitory storage medium storing code executable by the processor, a request associated with a customer, the request requesting usage of at least one resource; determining, by the processor of the computing system, a threshold amount based at least in part on an available amount of the at least one resource; comparing the usage in the request to the threshold; when the usage in the request exceeds the threshold, delaying processing of the request; when the usage in the request is below the threshold, determining, by the processor of the computing system, a number of tokens needed to process the request, the tokens associated with an amount of usage of a resource in a multi-tenant environment; charging the number of tokens against a global token bucket associated with a plurality of users for the resource, the global token bucket being stored in storage; determining, via a processor, a portion of the number of tokens to be charged against a customer token bucket associated with the customer for the request based at least in part upon a fill level of the global token bucket and a token utilization rate, the token utilization rate indicating a rate at which the customer utilizes the tokens for usage of the at least one resource; and allowing the usage of the at least one shared resource for the request when at least the portion of the number of tokens are available in the customer token bucket.

8. The computer-implemented method of claim 7, wherein tokens are able to be obtained up to a maximum rate for the customer when the fill level of the global bucket at least meets a global fill threshold.

9. The computer-implemented method of claim 8, wherein tokens are able to be obtained up to an intermediate rate for the customer when the fill level of the global bucket is less than the global fill threshold but above a global minimum threshold.

10. The computer-implemented method of claim 9, further comprising: determining the intermediate rate using a function of a fill rate of the global bucket between the global fill threshold and the global minimum threshold.

11. The computer-implemented method of claim 10, wherein the global fill threshold is 50% of a capacity of the global bucket, and wherein the function of the fill rate is a linear function.

12. The computer-implemented method of claim 7, wherein the resource is capable of being accessed by the plurality of users, the plurality of users capable of having a respective customer bucket.

13. The computer-implemented method of claim 12, further comprising: assigning a capacity and a fill rate for the respective customer bucket and the global bucket.

14. The computer-implemented method of claim 13, wherein the capacity of the global bucket is at least as great as the capacity of the customer buckets associated with the resource.

15. The computer-implemented method of claim 7, wherein at least a portion of a capacity of the global bucket is dedicated to resource management traffic.

16. The computer-implemented method of claim 15, wherein tokens are further able to be obtained from a background-opportunistic bucket.

17. The computer-implemented method of claim 7, wherein the request is received to an application programming interface (API) for a type of resource for processing the request.

18. The computer-implemented method of claim 7, wherein the global bucket is one of a plurality of global buckets, the plurality of global buckets associated with a respective set of resources in the multi-tenant environment.

19. The computer-implemented method of claim 18, wherein a plurality of customer buckets are allocated to the customer, the plurality of customer buckets associated with a respective set of resources and one of the plurality of global buckets in the multi-tenant environment.

20. A computing system, comprising: at least one processor; and at least one memory device including instructions that, when executed by the at least one processor, cause the computing system to: receive a request associated with a customer, the request requesting usage of at least one resource; determine a threshold amount
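Claims 1 to 3 describe a customer fill rate that is clamped to a maximum above one global fill threshold, clamped to a minimum below another, and set by a function of the fill level in between. The sketch below shows one way this could look, assuming a linear function (the form claim 11 names for the intermediate rate). The function name and parameter names are hypothetical, and the claims do not fix any numeric values.

```python
def customer_fill_rate(global_level: float,
                       global_min: float,
                       global_max: float,
                       min_rate: float,
                       max_rate: float) -> float:
    """Map the global bucket's current fill level to a customer fill rate.

    At or above `global_max` the customer gets `max_rate`; at or below
    `global_min` the customer gets `min_rate`; in between, the rate is
    linearly interpolated (an assumed choice of function).
    """
    if global_max <= global_min:
        raise ValueError("global_max must be greater than global_min")
    if global_level >= global_max:
        return max_rate
    if global_level <= global_min:
        return min_rate
    # Fraction of the way from the minimum to the maximum threshold.
    fraction = (global_level - global_min) / (global_max - global_min)
    return min_rate + fraction * (max_rate - min_rate)
```

For example, with a global bucket of capacity 100, a maximum threshold at 50% of capacity (as in claim 11), and a minimum threshold of zero (as in claim 6), a fill level of 25 gives a rate halfway between `min_rate` and `max_rate`. How the resulting rate and the token utilization rate determine the portion charged to the customer bucket is not spelled out in the claims preview, so it is not modeled here.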

Assignees

Amazon Tech Inc

Inventors

Classifications

  • H04L47/762 (Primary)

    triggered by the network

  • H04L69/321 (Primary)

    Interlayer communication protocols or service data unit [SDU] definitions; Interfaces between layers

  • in the network layer [OSI layer 3], e.g. X.25 (H04L69/16 takes precedence)

  • for controlling access to devices or network resources

  • Notification aspects

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9413680B1 cover?
An opportunistic throttling approach can be used for customers of shared resources in a multi-tenant environment. Each customer can have a respective token bucket with a guaranteed fill rate. When a request is received for an amount of work to be performed by a resource, the corresponding number of tokens are obtained from, or charged against, a global token bucket. If the global bucket has eno…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification H04L47/762. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Aug 9, 2016 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).