The more difficult something is to achieve, the more people like it
-- Susan M. Weinschenk
In my more cynical moods, I wonder if this explains why there is so much traction behind some Scala libraries. Don't get me wrong, I also use them, and eventually came to appreciate them, but many of them fail at least a couple of rules from Joshua Bloch's API Design by Bumper Stickers talk:
- APIs should be self-documenting: It should rarely require documentation to read code written to a good API. In fact, it should rarely require documentation to write it.
- Obey the principle of least astonishment: Every method should do the least surprising thing it could, given its name. If a method doesn't do what users think it will, bugs will result.
- Names matter: Strive for intelligibility, consistency, and symmetry. Every API is a little language, and people must learn to read and write it. If you get an API right, code will read like prose.
I remember Barbara Liskov saying that APIs should be optimized for readability rather than for compactness. Every Scala programmer should have that printed in capitals right next to his computer.
Anyway, if the API is not so self-documenting, is sometimes astonishing, and has names that appear to be chosen randomly, documentation does help. It helps you get into that elusive group of people who master the API, and being part of that group, according to the theory of cognitive dissonance, will be sufficiently motivating to plough through the documentation (like reading this article) and tell your mind that it was completely worth the effort.
Use a Java API?
Some people would argue that you might as well use a non-Scala API, but hold on: that violates another principle coined by Josh Bloch:
When in Rome, do as the Romans do: APIs must coexist peacefully with the platform, so do what is customary. It is almost always wrong to transliterate an API from one platform to another.
Dispatch is bolted on top of the HTTP Client of Apache's HTTP Components project. That is important to remember, since everything under the hood deals with the raw primitives of that library. It also means that if you would ever consider extending Dispatch yourself, you will quickly run into this library.
HttpExecutor, Request and Handler, the golden braid
From the outside, there are three important abstractions.
The HttpExecutor is the trait that defines the contract for objects responsible for sending the requests over the network and passing the responses back to something handling the response: the Handler, which I will discuss in a bit.
There are four methods you would have to implement in order to turn the trait into something useful:
shutdownClient. The trait itself just defines those operations, and expresses a couple of higher order operations in terms of these primitives. In fact, you will almost never call
The operations that you do call directly are these:
apply[T](hand: Handler[T]): HttpPackage[T]
x[T](hand: Handler[T]): HttpPackage[T]
when[T](chk: Int => Boolean)(hand: Handler[T]): HttpPackage[T]
apply[T](callback: Callback[T]): HttpPackage[T]
More on the mysterious HttpPackage[T] in a minute. For now, it's important to remember that these four methods are the ones you use in normal circumstances, and they all delegate to executeWithCallback. Let's look at some examples. In all of these examples, assume that executor is a concrete subclass of
// Get the content at www.google.com as a string, but only if the
// status code is some value between 200 and 204.
executor(url("http://www.google.com/") as_str)

// Identical, but a little bit more verbose, and not very idiomatic.
executor.apply(url("http://www.google.com/") as_str)

// The same, but ignore the status code, which means the response is
// also handled in case of a 500 error, etc.
executor.x(url("http://www.google.com/") as_str)

// The same, but setting specific conditions for when the response
// should be handled (in this case, only if the status code is 200).
executor.when(_ == 200)(url("http://www.google.com/") as_str)
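To make the relationship between apply, x and when a little more tangible, here is a minimal, self-contained sketch. It is not Dispatch's actual source: Handler, StatusCode, Executor and Canned are simplified stand-ins I made up for illustration, the single execute primitive is an assumption, and the transport is faked so the example runs without a network. The only thing taken from the text above is the behaviour: apply accepts status codes 200 to 204, x accepts anything, and when takes the check as an argument.

```scala
object ExecutorSketch {
  // Hypothetical stand-in for a handler: a URL plus a function over the response body.
  final case class Handler[T](url: String, block: String => T)

  // Hypothetical exception raised when the status check rejects a response.
  final case class StatusCode(code: Int, body: String)
      extends Exception(s"Unexpected response code: $code")

  trait Executor {
    // The one primitive to implement: perform the request, return (status code, body).
    protected def execute(url: String): (Int, String)

    // Run the handler's function only if the status check passes.
    final def when[T](chk: Int => Boolean)(hand: Handler[T]): T = {
      val (code, body) = execute(hand.url)
      if (chk(code)) hand.block(body) else throw StatusCode(code, body)
    }

    // Default behaviour: only accept status codes 200 to 204.
    final def apply[T](hand: Handler[T]): T = when(c => c >= 200 && c <= 204)(hand)

    // Accept any status code.
    final def x[T](hand: Handler[T]): T = when(_ => true)(hand)
  }

  // A canned executor that always answers with the same status code and body.
  final class Canned(code: Int, body: String) extends Executor {
    protected def execute(url: String): (Int, String) = (code, body)
  }

  def main(args: Array[String]): Unit = {
    val bodyLength = Handler("http://example.com/", (body: String) => body.length)
    val healthy = new Canned(200, "hello")
    val broken = new Canned(500, "oops")

    println(healthy(bodyLength))                      // handled: status is within 200 to 204
    println(broken.x(bodyLength))                     // handled anyway: x ignores the status
    println(scala.util.Try(broken(bodyLength)))       // rejected: a Failure wrapping StatusCode
  }
}
```

Because the three public operations are final and funnel through one primitive, any behaviour you add to that primitive applies to all of them at once.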
All of the methods listed above accept a single type of argument: a so-called Handler[T], a case class combining the definition of the request with a function acting upon the response (returning a value of type T) and a listener defining what should be done in case of exceptions.
Since it's a case class, it's not extended by inheritance. However, there are a bunch of operators that allow you to extend the Handler by specializing either the definition of the request, the function acting upon the response or the exception listener.
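As a rough illustration of "a case class that is specialized through operators rather than inheritance", consider the sketch below. Everything in it is an assumption on my side: the Request and Handler classes are simplified models, and the method names (withHeader, asString, andThen, onError) are hypothetical and are not Dispatch's real operators. It only shows the shape of the idea: each operator returns a new, slightly more specific value.

```scala
object HandlerSketch {
  // Hypothetical request definition: just the data needed to build a request.
  final case class Request(host: String,
                           path: String = "/",
                           headers: Map[String, String] = Map.empty) {
    // Specialize the request definition by adding a header.
    def withHeader(name: String, value: String): Request =
      copy(headers = headers + (name -> value))

    // Turn the request definition into a handler returning the body as a String.
    def asString: Handler[String] =
      Handler[String](this, body => body, _ => ())
  }

  // Request definition + function over the response body + exception listener.
  final case class Handler[T](request: Request,
                              block: String => T,
                              listener: Throwable => Unit) {
    // Specialize the function acting upon the response.
    def andThen[R](after: T => R): Handler[R] =
      Handler[R](request, block.andThen(after), listener)

    // Specialize the exception listener.
    def onError(newListener: Throwable => Unit): Handler[T] =
      copy(listener = newListener)
  }

  def main(args: Array[String]): Unit = {
    val handler = Request("example.com")
      .withHeader("Accept", "text/plain")
      .asString
      .andThen(_.length)
      .onError(e => println(s"request failed: ${e.getMessage}"))

    // No network involved: feed a canned body straight into the handler's function.
    println(handler.block("hello"))
    println(handler.request.headers)
  }
}
```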
As I said, a Handler combines a number of things, including the definition of the request. In fact, normally the Handler is built by calling methods on the Request object, the object defining just the request details.
Under the hood, the Request object carries around all data required by Apache's HTTP client to create the request: the headers to be sent, the host and port number, the path, the query parameters, etc. On top of that, it supports the operators to turn this request definition into a full-blown
Back to the big picture
So, just to clarify the way you normally use Dispatch:
- You create a Request object using the factory methods available, such as
- You create a Handler from the Request object by calling one of its operators, in this case as_str, which will grab the contents as a String and return that. (Check the periodic table of dispatch operators for all other operators.)
- You pass this to the when operation of an appropriate HttpExecutor. One of these implementations is the dispatch.Http object, but there are others, and it's important to understand the differences between them. More on that later.
Intermezzo: the mysterious HttpPackage[T]
So, what is that mysterious HttpPackage[T] set as the return type of almost all HttpExecutor operations? The answer is easy, but also a little disappointing: it's undefined. HttpPackage[T] is just a type alias for something that has yet to be defined by subclasses of HttpExecutor. If that sounds weird, then perhaps it helps to consider the different ways in which your client could deal with requests it needs to send.
- In some cases, a client might be able to send the request and then forget about the response.
- In other cases, a client might be able to send out a request, and then continue doing a bunch of things, checking for the response at some later point in time.
- In some cases, there might simply not be anything else left to do. The client should just wait for the results to be returned, then wake up and continue processing the results.
The HttpExecutor trait aims to cater for all of these cases. However, you can imagine that the API would ideally look different in each of these cases. The HttpPackage[T] type alias allows Dispatch to specialize the return type based on the specific subtype of HttpExecutor. An HttpExecutor implementation for blocking calls defines HttpPackage[T] to be just T. A thread-safe implementation can define HttpPackage[T] to be a Future[T]. That leaves the clients of that HttpExecutor the choice to continue working on other things or to block until the results arrive.
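If the idea of a return type that is left open until a subclass fills it in sounds exotic, the following sketch may help. It is a toy model under my own assumptions, not Dispatch code: the Executor, Blocking and Async names and the run method are hypothetical, and I use the standard library's scala.concurrent.Future here rather than whatever future type Dispatch itself uses.

```scala
object PackageSketch {
  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration._

  trait Executor {
    // Left abstract: each implementation decides what a "packaged" result looks like.
    type HttpPackage[T]

    // Hypothetical operation standing in for "send the request and handle the response".
    def run[T](work: () => T): HttpPackage[T]
  }

  // Blocking flavour: the package is simply the result itself.
  class Blocking extends Executor {
    type HttpPackage[T] = T
    def run[T](work: () => T): T = work()
  }

  // Asynchronous flavour: the package is a Future of the result.
  class Async(implicit ec: ExecutionContext) extends Executor {
    type HttpPackage[T] = Future[T]
    def run[T](work: () => T): Future[T] = Future(work())
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global

    val blocking = new Blocking()
    val async = new Async()

    // Same call shape, different static return types.
    val direct: String = blocking.run(() => "body")
    val later: Future[String] = async.run(() => "body")

    println(direct)
    println(Await.result(later, 5.seconds))
  }
}
```

The caller's code looks the same in both cases; only the type that comes back differs, and the compiler knows which one it is from the executor you picked.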
HttpExecutor Class Hierarchy
For educational purposes, let's take a look at the genealogy of the HttpExecutor family:
BlockingCallback: I mentioned the executeWithCallback method before, and considered it one of the methods that you normally would not call directly. In fact, it's questionable whether your calls will ever hit this method in everyday use of HTTP. That method accepts a Callback implementation that will not only asynchronously fire the request, but also incrementally handle the results. (That is, if some HTTP content is getting passed in, it will call your Callback implementation to handle it. Instead of waiting for the entire response to arrive, it processes chunks of the response as they become available.)
BlockingHttp is the subtype of HttpExecutor from which almost all other traits inherit. It will execute the request using a non-thread-safe HTTP client, and block until the results of the handler become available.
dispatch.Http is a concrete subclass of BlockingHttp, which itself is abstract and leaves the definition of HttpPackage[T] undefined. dispatch.Http turns it into something that is no longer abstract, by defining HttpPackage[T] to mean simply T. If that sounds a little too abstract for your taste, then simply remember that you can create an instance of this class, and then have an executor that will act in a blocking way, without being thread-safe.
Safety is the trait that allows you to mix thread safety into subtypes of
dispatch.Http is also an object, extending the class dispatch.Http and mixing in Safety. That means you have something that you can use throughout your entire application without running into too much trouble, unless the number of threads exceeds the number of threads defined by
Future is a subtype of Safety that extends thread safety with asynchronicity. Effectively, this means that subclasses of
x operations that return a Future, instead of the result of the function defined by the Handler. In other words, in case of the
Http(url(...) as_str) would normally return a String. However, if you were to use an instance of dispatch.thread.Http instead of Http, then that same call would return a
dispatch.thread.Http is a concrete subclass of Safety, offering the behaviour I outlined a second ago.
So where does this lead us?
Hopefully, this will give you some clues how to extend Dispatch to mix in the behaviour you want. At Udini, we needed a couple of things in addition to what Dispatch normally provides:
- A retry policy: sometimes, the remote services we call fail. In those cases, we want to retry the call. We implemented that by having a new trait called
dispatch.Http, overriding the execute operation with something that has the retry behaviour. (You cannot override when, since it's final. And even if it had not been final, it would still have implied repeating code in all of these methods. By sticking it into execute, it works for all of these operations.)
- Failover: in some cases, for some services we have information on replicas being available. (This mostly applies to services within our own EC2 environment, in which we don't have an internal load balancer offering failover capabilities.) In those cases, we need to implement failover using replica awareness inside the client; however, we want to do it in a DRY way. Adding another trait called ReplicaAwareness allowed us to do just that.
I have to say that in that particular case, rather than implementing execute in an alternative way, we decided to change the client and have replica awareness inside a DefaultHttpClient subclass returned by the make_client call on BlockingHttp subtypes. That doesn't mean it couldn't or shouldn't have been done differently; it just reflects our understanding of Dispatch at that time.
- Detailed error logging: the default implementation of the
StatusCode exceptions in case the status code is different from what your calls expect. Unfortunately, these StatusCode exceptions don't carry a lot of detailed information on the original request, which means our logging missed important details. By implementing another trait, DetailedStatusCodeReporting, and overriding the definition of the execute operation, we were able to replace the StatusCode exception with a more detailed version of that exception, carrying information on the request that failed.
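The retry and error-reporting items above both come down to the same Scala technique: a trait that overrides one primitive operation and delegates to super, so it can be stacked on top of an existing executor. Below is a minimal sketch of that technique. It is not our production code and not Dispatch's API: Executor, Retrying, DetailedErrors, Flaky and both exception classes are hypothetical, and execute is reduced to a String => String function so the example runs on its own.

```scala
object StackableSketch {
  import scala.util.{Failure, Success, Try}

  // Hypothetical exceptions: a bare status code, and a version that names the request.
  final case class StatusCode(code: Int)
      extends Exception(s"Unexpected response code: $code")
  final case class DetailedStatusCode(url: String, code: Int)
      extends Exception(s"Request to $url failed with response code $code")

  trait Executor {
    // The single primitive that the public operation funnels through.
    def execute(url: String): String
    // Final, like the public operations described above: behaviour is added via execute.
    final def apply(url: String): String = execute(url)
  }

  // Retries a failing call up to maxRetries additional times, then rethrows.
  trait Retrying extends Executor {
    def maxRetries: Int = 2
    abstract override def execute(url: String): String = {
      def attempt(remaining: Int): String =
        Try(super.execute(url)) match {
          case Success(body)                 => body
          case Failure(_) if remaining > 0   => attempt(remaining - 1)
          case Failure(error)                => throw error
        }
      attempt(maxRetries)
    }
  }

  // Replaces the bare StatusCode exception with one that carries the request URL.
  trait DetailedErrors extends Executor {
    abstract override def execute(url: String): String =
      try super.execute(url)
      catch { case StatusCode(code) => throw DetailedStatusCode(url, code) }
  }

  // Fake transport: fails the first `failures` calls with a 503, then succeeds.
  class Flaky(failures: Int) extends Executor {
    var calls = 0
    def execute(url: String): String = {
      calls += 1
      if (calls <= failures) throw StatusCode(503) else s"ok after $calls calls"
    }
  }

  def main(args: Array[String]): Unit = {
    // DetailedErrors wraps Retrying, which wraps the fake transport.
    val recovering = new Flaky(2) with Retrying with DetailedErrors
    println(recovering("http://example.com/"))

    val hopeless = new Flaky(10) with Retrying with DetailedErrors
    println(Try(hopeless("http://example.com/")))
  }
}
```

The order of the mixins matters: with Retrying listed before DetailedErrors, the retries happen first, and only the final failure is translated into the more detailed exception.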
In summary, we could eventually implement all of the additional executor behaviour we needed by having a trait override the definition of
Let me know if you found this explanation useful. Some of what I wrote here will end up in a new edition of the little Dispatch book I wrote. Let me know what you're missing, and I will stick that in as well.
(Expect a more detailed explanation of the retry policy, failover and detailed error logging traits in some future blog posts.)