This was written via Markdown. Apologies if it's unclear, or your glasses are missing 😅.
Given:
- a bucket `b` of a specific size `n`.
- a token `i` representing a unit of consumption, e.g. each token may represent an object/element such as # of API requests, # of TCP packets, # of order basket item quantities, integer prices etc. Basically, what you want “consumed” from the recipient.
- a token refill rate which determines how fast `rr` tokens are added to the bucket per `t` seconds (can be larger time periods e.g. minutes, hours, days).
- a consumption rate `r` which determines how fast `i` tokens are consumed from the bucket. This can be an API request rate per second, purchase requests per second etc., each having a given number of “tokens” they want to consume per given time period.
The formula is (as far as I understand) as follows:
- the bucket `b` is initialized with `n` number of tokens. If not initialized with initial tokens, all initial requests will fail until at least 1 token is available for consumption, given requirement #3 below.
- For every consumption request, a given number of tokens `#i` is consumed from the bucket (here `n` is the number of tokens currently in the bucket).
  - if `#i` is less than or equal to `n`, the tokens are consumed, and the remainder (`n - #i`) is calculated and saved back into bucket `b` as `n` for this request. The request passes.
  - if `#i` is more than `n`, the request is ideally denied, but other mechanisms can be applied e.g. waiting for `i` tokens to be available. So `n` remains untouched for this specific request. The request fails.
- Given the rate which determines how fast tokens are refilled, i.e. every time a period of `t/rr` seconds passes, a single token is added to the bucket, up to a maximum of size `n`:
  - if the rate of refill is 5 tokens every 10 seconds, divide 5 tokens / 10 seconds, and you get 1 token every 2 seconds on best effort (i.e. depends on availability of resources for every action/request). So if your consumption rate is 10 tokens every 10 seconds, that makes your consumption rate `r` 10 tokens/10 seconds, thus `r` == 1/s. The bucket can only refill at effectively 5 tokens/10 seconds, making `rr/t` == 0.5/s i.e. 1 token every 2 seconds, thus roughly half your requests will probably fail.
  - if the rate of refill is 20 tokens every 5 seconds, divide 20 tokens / 5 seconds, and you get 4 tokens every 1 second on best effort. Thus if your consumption rate is 10 tokens every 10 seconds, that makes your consumption rate `r` effectively 10 tokens/10 seconds, thus `r` == 1/s. The bucket can be refilled at 20 tokens every 5 seconds, which makes the effective refill rate 4 tokens/1 second i.e. `rr/t` == 4/s. All your requests should probably go through.
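To make the consume and refill rules above a bit more concrete, here is a minimal Python sketch (my own illustration, not a reference implementation). A request passes only when the bucket holds enough tokens, and tokens are topped up at `rr/t` per second, capped at the bucket size. The names `TokenBucket`, `try_consume` and `simulate` are made up for this example. The bucket size of 1 is an assumption, picked only to keep bursts out of the picture, and time is passed in explicitly so the two scenarios can be replayed without waiting.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical token bucket driven by an explicit clock, in seconds."""
    capacity: float            # n: maximum number of tokens the bucket can hold
    refill_tokens: float       # rr: tokens added ...
    refill_period: float       # t: ... every this many seconds
    tokens: float = 0.0        # tokens currently in the bucket
    last_refill: float = 0.0   # time of the last refill calculation

    def _refill(self, now: float) -> None:
        # Add (elapsed seconds * rr/t) tokens, never exceeding the capacity.
        elapsed = max(0.0, now - self.last_refill)
        rate = self.refill_tokens / self.refill_period
        self.tokens = min(self.capacity, self.tokens + elapsed * rate)
        self.last_refill = max(self.last_refill, now)

    def try_consume(self, amount: float, now: float) -> bool:
        # The request passes only if enough tokens are available right now.
        self._refill(now)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False  # denied: the bucket is left untouched


def simulate(bucket: TokenBucket, requests: int, interval: float, amount: float = 1) -> int:
    """Send one request every `interval` seconds and count how many pass."""
    passed = 0
    for k in range(requests):
        if bucket.try_consume(amount, now=k * interval):
            passed += 1
    return passed


# Scenario 1: refill 5 tokens / 10 s (0.5/s), consumption 1 token/s for 20 s.
slow = TokenBucket(capacity=1, refill_tokens=5, refill_period=10, tokens=1)
print(simulate(slow, requests=20, interval=1.0))  # 10 of 20 pass

# Scenario 2: refill 20 tokens / 5 s (4/s), same consumption rate.
fast = TokenBucket(capacity=1, refill_tokens=20, refill_period=5, tokens=1)
print(simulate(fast, requests=20, interval=1.0))  # 20 of 20 pass
```

With a larger bucket size the first scenario would let an initial burst through before settling at roughly every other request, which is exactly the burst behaviour not covered here.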
Disclaimer
- This has NOT taken into account bursts! The above is an ideal situation which will happen as part of daily life, but definitely expect the bursts! I might add that section later.
- Race conditions are also possible, so beware!
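On the race condition point: one common way to guard against it inside a single process is to put the refill-and-consume step behind a lock, so two requests can never both spend the same token. Below is a rough Python sketch under that assumption. `LockedTokenBucket` and its `clock` parameter are hypothetical names for this example, and a bucket shared across several machines would need a different mechanism (for example an atomic operation in a shared store), which this does not cover.

```python
import threading
import time


class LockedTokenBucket:
    """Hypothetical token bucket whose state is protected by a lock."""

    def __init__(self, capacity, refill_tokens, refill_period, clock=time.monotonic):
        self.capacity = capacity                   # n: bucket size
        self.rate = refill_tokens / refill_period  # rr/t: tokens per second
        self.tokens = capacity                     # start with a full bucket
        self._clock = clock                        # injectable for testing
        self._last = clock()
        self._lock = threading.Lock()

    def try_consume(self, amount=1):
        # Refill and consume as one atomic step, so concurrent callers
        # cannot both see (and spend) the same tokens.
        with self._lock:
            now = self._clock()
            elapsed = max(0.0, now - self._last)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self._last = now
            if amount <= self.tokens:
                self.tokens -= amount
                return True
            return False


# Usage: 8 threads fire 50 requests each at a bucket holding 100 tokens.
# The clock is frozen here so no refill happens during the run.
bucket = LockedTokenBucket(capacity=100, refill_tokens=1, refill_period=1,
                           clock=lambda: 0.0)
results = []


def worker():
    for _ in range(50):
        results.append(bucket.try_consume())


threads = [threading.Thread(target=worker) for _ in range(8)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(sum(results))  # 100 requests pass, the other 300 are denied
```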
Why do I like this algorithm? Don’t be greedy 😁. BUT! When you have an audience whose usage you can determine or predict beforehand, this algorithm makes great sense to me! Payments APIs, rate limiting for uploads, static content which might be exposed, any requests for sensitive info etc.



