Dispatch dissected
The more difficult something is to achieve, the more people like it
-- Susan M. Weinschenk
On APIs
In my more cynical moods, I wonder if this explains why there is so much traction behind some Scala libraries. Don't get me wrong, I also use them, and eventually came to appreciate them, but many of them fail at least a couple of rules from Joshua Bloch's API Design by Bumper Stickers talk:
- APIs should be self-documenting: It should rarely require documentation to read code written to a good API. In fact, it should rarely require documentation to write it.
- Obey the principle of least astonishment: Every method should do the least surprising thing it could, given its name. If a method doesn't do what users think it will, bugs will result.
- Names matter: Strive for intelligibility, consistency, and symmetry. Every API is a little language, and people must learn to read and write it. If you get an API right, code will read like prose.
I remember Barbara Liskov saying that APIs should be optimized for readability rather than for compactness. Every Scala programmer should have that printed in capitals right next to his computer.
Anyway, if the API is not so self-documenting, is sometimes astonishing, and its names appear to be chosen randomly, documentation does help; it helps you to get into that elusive group of people who master the API, and being part of that group - according to the theory of cognitive dissonance - will be sufficiently motivating to plough through the documentation (like reading this article), and tell your mind that it was completely worth the effort.
Use a Java API?
Some people would argue that you might as well use a non-Scala API, but hold on: that violates another principle coined by Josh Bloch:
When in Rome, do as the Romans do: APIs must coexist peacefully with the platform, so do what is customary. It is almost always wrong to transliterate an API from one platform to another.
Dispatch dissected
Dispatch underpinnings
Dispatch is bolted on top of the HTTP Client of Apache's HTTP Components project. That is important to remember, since everything under the hood deals with the raw primitives of that library. It also means that if you ever consider extending Dispatch yourself, you will quickly run into this library.
HttpExecutor, Request and Handler, the golden braid
From the outside, there are three important abstractions: HttpExecutor, Handler and Request.
HttpExecutor
The HttpExecutor is the trait that defines the contract for objects responsible for sending the requests over the network and passing the responses back to something handling the response, the Handler that I will discuss in a bit.
There are four methods you would have to implement in order to turn the trait into something useful: execute, executeWithCallback, consumeContent and shutdownClient. The trait itself just defines those operations, and expresses a couple of higher order operations in terms of these primitives. In fact, you will almost never call execute, executeWithCallback or consumeContent directly.
The operations that you do call directly are these:
apply[T](hand: Handler[T]): HttpPackage[T]
x[T](hand: Handler[T]): HttpPackage[T]
when[T](chk: Int => Boolean)(hand: Handler[T]): HttpPackage[T]
apply[T](callback: Callback[T]): HttpPackage[T]
More on the mysterious HttpPackage[T] in a minute. For now, it's important to remember that these four methods are really (in normal circumstances) the methods you use, and they all delegate to execute and executeWithCallback. Let's look at some examples. In all of these examples, assume that executor is an instance of a concrete subclass of HttpExecutor.
// Get the content at www.google.com as a string. But only if the
// status code is some value between 200 and 204.
executor(url("www.google.com") as_str)
// Identical, but a little bit more verbose, and not very idiomatic
executor.apply(url("www.google.com") as_str)
// The same, but ignore the status code, which means also handle
// the request in case of a 500 error, etc.
executor.x(url("www.google.com") as_str)
// The same, but setting specific conditions for when the response
// should be handled. (In this case, only if the status code is
// 200.)
executor.when(_ == 200)(url("www.google.com") as_str)
Handler
All of the methods listed above accept a single type of argument: a so-called Handler.
A Handler[T] is a case class combining the definition of the request with a function acting upon the response (returning a value of type T) and a listener defining what should be done in case of exceptions.
Since it's a case class, it's not extended by inheritance. However, there are a bunch of operators that allow you to extend the Handler by specializing either the definition of the request, the function acting upon the response or the exception listener.
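To make the "specialize by operator instead of by inheritance" idea concrete, here is a minimal sketch. It is a toy model, not Dispatch's actual source: SketchRequest, SketchHandler and the method names andThen, onError and withHeader are all made up for illustration, and Dispatch's real operators are symbolic and differ in detail.

```scala
object HandlerSketch {
  // Hypothetical stand-in for the request definition.
  final case class SketchRequest(
      host: String,
      path: String = "/",
      headers: Map[String, String] = Map.empty
  )

  // Hypothetical stand-in for Handler[T]: request + response function + exception listener.
  final case class SketchHandler[T](
      request: SketchRequest,
      block: (Int, String) => T,              // receives status code and body
      listener: Throwable => Unit = _ => ()   // default: ignore exceptions
  ) {
    // Specialize the response function by post-processing its result.
    def andThen[R](after: T => R): SketchHandler[R] =
      SketchHandler[R](request, (code: Int, body: String) => after(block(code, body)), listener)

    // Specialize the exception listener.
    def onError(l: Throwable => Unit): SketchHandler[T] =
      copy(listener = l)

    // Specialize the request definition.
    def withHeader(name: String, value: String): SketchHandler[T] =
      copy(request = request.copy(headers = request.headers + (name -> value)))
  }
}
```

Every operator returns a new immutable value, so a base handler can be shared and specialized in different directions without the variants affecting each other.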
Request
As I said, a Handler combines a number of things, including the definition of the request. In fact, normally the Handler is built by calling methods on the Request object, the object defining just the request details.
Under the hood, the Request object carries around all data required by Apache's HTTP client to create the request: the headers to be sent, the host and port number, the path, the query parameters, etc. On top of that it supports the operators to turn this request definition into a full-blown Handler.
Back to the big picture
So, just to clarify the way you normally use Dispatch:
- You create a Request object using the factory methods available, such as url(...).
- You create a Handler from that Request object, by calling one of its operators, in this case as_str, which will grab the contents as a String and return that. (Check the periodic table of dispatch operators for all other operators.)
- You pass this to the apply, x or when operation of an appropriate HttpExecutor. One of these implementations is the dispatch.Http object, but there are others, and it's important to understand the differences between them. More on that later.
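The three steps above can be sketched in a few lines of plain Scala. This is a toy model under stated assumptions, not Dispatch's implementation: ToyRequest, ToyHandler, ToyStatusCode, ToyExecutor and CannedExecutor are hypothetical names, the executor serves canned responses instead of touching the network, and the simplified execute signature is mine. What it does mirror from the description above is that apply, x and when differ only in the status check they apply before handing the response to the handler.

```scala
object ToyDispatch {
  // Step 1: the request definition (hypothetical, only a URL here).
  final case class ToyRequest(url: String) {
    // Step 2: an operator in the spirit of as_str turns the request into a handler.
    def as_str: ToyHandler[String] = ToyHandler[String](this, body => body)
  }

  final case class ToyHandler[T](request: ToyRequest, block: String => T)

  // Stand-in for the exception thrown when the status check fails.
  final case class ToyStatusCode(code: Int)
      extends RuntimeException(s"unexpected status code $code")

  trait ToyExecutor {
    // The single primitive that subclasses implement (simplified signature).
    def execute[T](req: ToyRequest)(f: (Int, String) => T): T

    // Step 3: the operations you actually call; all of them funnel into execute.
    def when[T](chk: Int => Boolean)(hand: ToyHandler[T]): T =
      execute(hand.request) { (code, body) =>
        if (chk(code)) hand.block(body) else throw ToyStatusCode(code)
      }
    def apply[T](hand: ToyHandler[T]): T = when(code => code >= 200 && code <= 204)(hand)
    def x[T](hand: ToyHandler[T]): T = when(_ => true)(hand)
  }

  // Canned responses keyed by URL, so the sketch runs without a server.
  class CannedExecutor(responses: Map[String, (Int, String)]) extends ToyExecutor {
    def execute[T](req: ToyRequest)(f: (Int, String) => T): T = {
      val (code, body) = responses(req.url)
      f(code, body)
    }
  }
}
```

With an executor built from a map that answers one URL with status 200 and another with status 500, calling the executor directly succeeds only for the first URL, while x returns the body for both.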
Intermezzo: the mysterious HttpPackage[T]
So, what is that mysterious HttpPackage[T] set as the return type of almost all HttpExecutor operations? The answer is easy, but also a little disappointing: it's undefined. HttpPackage[T] is just an abstract type member that has yet to be defined by subclasses of HttpExecutor. If that sounds weird, then perhaps it helps to consider the different ways in which your client could deal with requests it needs to send.
- In some cases, a client might be able to send the request and then forget about the response.
- In other cases, a client might be able to send out a request, and then continue doing a bunch of things, checking for the response at some later point in time.
- In some cases, there might simply not be anything else left to do. The client should just wait for the results to be returned, then wake up and continue processing the results.
The HttpExecutor trait aims to cater for all of these cases. However, you can imagine that there is a difference between the way the API would ideally look in all of these cases.
Introducing the HttpPackage[T] type alias allows Dispatch to specialize the return type based on the specific subtype of the HttpExecutor. The HttpExecutor implementation for blocking calls defines HttpPackage[T] to be just T. A thread-safe, asynchronous implementation of HttpExecutor defines HttpPackage[T] to be a Future[T]. That leaves the clients of that HttpExecutor the choice to continue work on other things or to block for the results to arrive.
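The mechanism is an ordinary Scala abstract type member. Here is a minimal sketch of it, assuming made-up names (MiniHandler, MiniExecutor, BlockingMini, FutureMini) and using scala.concurrent.Future as a stand-in for whatever future type Dispatch itself uses; the fetch method fakes the network call so the sketch is self-contained.

```scala
import scala.concurrent.{ExecutionContext, Future}

object HttpPackageSketch {
  // Hypothetical handler: a path plus a function over the response body.
  final case class MiniHandler[T](path: String, parse: String => T)

  trait MiniExecutor {
    // Deliberately left abstract: each implementation decides what callers get back.
    type HttpPackage[T]

    // Pretend network call, so the sketch runs without a server.
    protected def fetch(path: String): String = s"body of $path"

    def apply[T](hand: MiniHandler[T]): HttpPackage[T]
  }

  // Blocking flavour: the "package" is simply the value itself.
  class BlockingMini extends MiniExecutor {
    type HttpPackage[T] = T
    def apply[T](hand: MiniHandler[T]): T = hand.parse(fetch(hand.path))
  }

  // Asynchronous flavour: the "package" is a Future of the value.
  class FutureMini(implicit ec: ExecutionContext) extends MiniExecutor {
    type HttpPackage[T] = Future[T]
    def apply[T](hand: MiniHandler[T]): Future[T] =
      Future(hand.parse(fetch(hand.path)))
  }
}
```

The call site looks identical for both executors; only the static type of the result changes, which is what lets the same apply, x and when signatures serve both blocking and non-blocking clients.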
HttpExecutor Class Hierarchy
For educational purposes, let's take a look at the genealogy of the HttpExecutor family:
- BlockingCallback: I mentioned the executeWithCallback method before, and considered it one of the methods that you normally would not call directly. In fact, it's questionable whether your calls will ever hit this method in everyday use of HTTP. That method accepts a Callback implementation that will not only asynchronously fire the request, but also incrementally handle the results. (That is, if some HTTP content is getting passed in, it will call your Callback implementation to handle it. Instead of waiting for the entire response to have arrived, it processes chunks of data of the response whenever these chunks become available.)
- BlockingHttp is the subtype of HttpExecutor from which almost all other traits inherit. It will execute the request using a non-threadsafe HTTP client, and block for the results of the handler to become available.
- dispatch.Http is a concrete subclass of BlockingHttp. BlockingHttp itself is abstract, and its definition of HttpPackage[T] is still undefined. dispatch.Http turns it into something that is no longer abstract, by defining HttpPackage[T] to mean simply T. If that sounds a little too abstract for your taste, then simply remember that you can create an instance of this class, and then have an executor that will act in a blocking way, without being threadsafe.
- Safety is the trait that allows you to mix thread safety into subtypes of BlockingHttp. dispatch.Http is also an object, extending the class dispatch.Http with Safety. That means you have something that you could use throughout your entire application, without running into too much trouble, unless the number of threads exceeds the number of threads defined by Safety.
- Future is a subtype of Safety that extends thread safety with asynchronicity. Effectively, this means that subclasses of BlockingHttp extended with Future will have apply, when and x operations that return a Future, instead of the result of the function defined by the Handler. In other words, in case of the Http object, calling Http(url(...) as_str) would normally return a String. However, if you, instead of using Http, would use an instance of dispatch.thread.Http, then that same call would return a Future[String] immediately.
- dispatch.thread.Http is a concrete subclass of BlockingHttp and Safety, offering the behaviour I outlined a second ago.
So where does this lead us?
Hopefully, this will give you some clues how to extend Dispatch to mix in the behaviour you want. At Udini, we needed a couple of things in addition to what Dispatch normally provides:
- A retry policy: sometimes, the remote services we call fail. In those cases, we want to retry the call. We implemented that by having a new trait called RetryPolicy that extends dispatch.Http, overriding the execute operation with something that has the retry behaviour. (You cannot override apply, x or when, since they are final. And even if they had not been final, it would still have implied repeating code in all of these methods. By sticking it into execute, it works for all of these operations.)
- Failover: in some cases, for some services we have information on replicas being available. (This mostly applies to services within our own EC2 environment, in which we don't have an internal load balancer offering failover capabilities.) In those cases, we need to implement failover using replica awareness inside the client. However, we want to do it in a DRY way. Adding another trait called ReplicaAwareness allowed us to do just that.
I have to say that in that particular case, rather than implementing execute in an alternative way, we decided to change the client and have replica awareness inside a DefaultHttpClient subclass returned by the make_client call on BlockingHttp subtypes. That doesn't mean it couldn't or shouldn't have been done differently; it just reflects our understanding of Dispatch at that time.
- Detailed error logging: the default implementation of the HttpExecutor throws StatusCode exceptions in case the status code is different from what your calls expect. Unfortunately, these StatusCode exceptions don't carry a lot of detailed information on the original request, which means our logging missed out on important details. By implementing another trait, DetailedStatusCodeReporting, and overriding the definition of the execute operation, we were able to replace the StatusCode exception with a more detailed version of that exception, carrying information on the request that failed.
In summary, we could eventually implement all of the additional executor behaviour we needed by having a trait override the definition of HttpExecutor's execute operation.
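As an illustration of why overriding execute is enough, here is a rough sketch of the retry idea as a stackable trait. It is not the Udini code and does not use Dispatch's real signatures: Executor, FlakyExecutor, the simplified execute signature and the maxAttempts setting are all assumptions made for the example. The only thing borrowed from the description above is the shape: the public operation is final and delegates to execute, and the trait wraps execute.

```scala
import java.io.IOException
import scala.util.{Failure, Success, Try}

object RetrySketch {
  // Hypothetical, much simplified executor: one primitive, one final public operation.
  trait Executor {
    def execute[T](url: String)(f: String => T): T
    final def apply[T](url: String)(f: String => T): T = execute(url)(f)
  }

  // Test double: fails the first `failures` calls with an IOException, then succeeds.
  class FlakyExecutor(failures: Int) extends Executor {
    var attempts = 0
    def execute[T](url: String)(f: String => T): T = {
      attempts += 1
      if (attempts <= failures) throw new IOException(s"attempt $attempts failed")
      f(s"body of $url")
    }
  }

  // Stackable trait: wraps whatever execute it is mixed into.
  trait RetryPolicy extends Executor {
    def maxAttempts: Int = 3 // assumed default, not taken from the article

    abstract override def execute[T](url: String)(f: String => T): T = {
      def loop(attempt: Int): T =
        Try(super.execute(url)(f)) match {
          case Success(value)                                  => value
          case Failure(_: IOException) if attempt < maxAttempts => loop(attempt + 1)
          case Failure(other)                                  => throw other
        }
      loop(1)
    }
  }
}
```

Mixing it in with new FlakyExecutor(2) with RetryPolicy gives an executor whose final apply picks up the retry behaviour without being overridden itself. Note that this sketch also re-runs the response function f when f itself throws an IOException, which a production version would probably want to distinguish.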
Let me know if you found this explanation useful. Some of what I wrote here will end up in a new edition of the little Dispatch book I wrote. Let me know what you're missing, and I will stick that in as well.
(Expect a more detailed explanation of the retry policy, failover and detailed error logging traits in some future blog posts.)