Rate limiting

Rate limiting made simple

As I write this, Amazon's Simple Email Service has a maximum send rate of 5 emails per second. That might seem like a lot, but it really isn't. It means you can hit the limit, and we all know that if you can hit a limit, you will hit it. It's a specialized version of Murphy's law.

So it wasn't all that unexpected when we bumped into that limit as well. While doing some refactoring last year, I realized this wasn't uncommon at all: many of the APIs we were using had rate limits. It's just that implementing something to gracefully handle or, even better, avoid those situations was more work than we were willing to do.

Guava's RateLimiter

I was happy to find that Guava has a class called RateLimiter, which to some extent does what you want. You construct an instance of that class, passing the number of requests you want to allow per second, and then, whenever you're about to make a call, you first acquire() a permit.

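import com.google.common.util.concurrent.RateLimiter

val rateLimiter = RateLimiter.create(20) // 20 requests per second

def doRateLimitedStuff(): Unit = {
    rateLimiter.acquire() // blocks until a permit is available
    // do the thing that is limited
}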

There are a couple of disadvantages, though. For instance, you cannot transparently swap a rate-limited version of your call for a non-rate-limited one.

A function for rate-limited functions

Wouldn't it be nice if you could just make any function rate limited? I was wondering about that, and ended up using something similar to the code below. I'm still playing around with it, so it certainly isn't the final solution, but perhaps you'll find it useful.

import nl.flotsam.rate._
import scala.concurrent._
import scala.concurrent.duration._

// For sake of example, we're using the global context
import ExecutionContext.global

def times2(i: Int) = i * 2

val limit = RateLimit(2 per (5 seconds))

val times2Limited = limit(times2 _)

times2Limited(2) // Future(4)
times2Limited(3) // Future(6)
times2Limited(4) // Will not start executing for a while

The RateLimit object contains the actual RateLimiter. That way, you can create rate-limited versions of various functions, all guarded by that same limit. In this case, I'm limiting a times2 function.

The rate-limited version of this function returns a Future. The call itself doesn't block, so your code can just continue, but the Future will only start executing once a permit is available.
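To make that a bit more concrete, here is a minimal sketch of how such a wrapper could work. It is not the actual nl.flotsam.rate code: SimpleRateLimit and IntervalLimiter are names I made up for illustration, and IntervalLimiter is just a standard-library stand-in for a blocking limiter such as Guava's RateLimiter, so the example runs without extra dependencies.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Stand-in for a blocking limiter such as Guava's RateLimiter (assumption):
// hands out one permit every `intervalMillis` milliseconds.
final class IntervalLimiter(intervalMillis: Long) {
  private var nextFree = System.currentTimeMillis()

  def acquire(): Unit = {
    // Reserve the next free slot under the lock, then sleep outside of it.
    val waitMillis = synchronized {
      val now = System.currentTimeMillis()
      val slot = math.max(now, nextFree)
      nextFree = slot + intervalMillis
      slot - now
    }
    if (waitMillis > 0) Thread.sleep(waitMillis)
  }
}

// Hypothetical wrapper: turns any A => B into an A => Future[B]
// that waits for a permit before running the original function.
final class SimpleRateLimit(acquire: () => Unit)(implicit ec: ExecutionContext) {
  def apply[A, B](f: A => B): A => Future[B] =
    (a: A) => Future {
      acquire() // waits on a pool thread, not on the caller's thread
      f(a)
    }
}

val limiter = new IntervalLimiter(100) // roughly 10 permits per second
val limit = new SimpleRateLimit(() => limiter.acquire())(ExecutionContext.global)

val times2: Int => Int = _ * 2
val times2Limited = limit(times2)

// The first call finds a free permit and completes right away.
val result = Await.result(times2Limited(21), 1.second) // 42
```

With a real Guava limiter you would pass () => rateLimiter.acquire() instead. Because the limiter is only a constructor argument, several functions can share one limit, and the wrapped function has the same shape whichever limiter sits behind it.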


One of the issues with this approach, and with Guava's RateLimiter in general, is that it blocks a thread. That is, for every request coming in, if there's no permit available, a thread will just block until there is one. That could mean a lot of threads being blocked, which can starve the thread pool. So I certainly wouldn't use any of this for situations where you run the risk of seriously overrunning the number of requests you can make. For the occasional overrun of just a few requests, it could be okay though.
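If blocking is a concern, one possible alternative is to fail fast instead of waiting. The sketch below is only an illustration of that idea, not something I've battle-tested: failFast and RateLimitExceeded are hypothetical names, and tryAcquire is assumed to return immediately, which is what Guava's RateLimiter.tryAcquire() does.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical exception type, used only to signal "no permit right now".
final class RateLimitExceeded extends RuntimeException("rate limit exceeded")

// `tryAcquire` must not block: true means a permit was taken, false means none was free.
// With Guava this could be () => rateLimiter.tryAcquire().
def failFast[A, B](tryAcquire: () => Boolean)(f: A => B)(implicit ec: ExecutionContext): A => Future[B] =
  (a: A) =>
    if (tryAcquire()) Future(f(a))            // permit available: run as usual
    else Future.failed(new RateLimitExceeded) // no permit: fail without tying up a thread
```

No thread ever sits waiting for a permit this way, but the caller now has to decide what to do with a failed Future: retry later, drop the request, or queue it somewhere.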