The Missing Link

Thursday May 9, 2013

Monkeyman with Amazon S3

File name extension. This is a very common one. "cgi", even ".html" is something which will change. You may not be using HTML for that page in 20 years time, but you might want today's links to it to still be valid. The canonical way of making links to the W3C site doesn't use the extension.
-- Cool URIs don't change

I create my blog using Monkeyman. Or should I say I create Monkeyman using my blog? Too hard to tell. In any case, once everything was generated, I always had to upload it to S3 manually. I used s3sync, which I think is no longer maintained. It worked all right, with one exception: every file name needed the extension associated with its MIME type.

Annoying

That has been annoying me for quite a while. Take this post, for instance. In the source Markdown file, I set the title to this:

# The Missing Link

As a result, the generated file would be named the-missing-link.html, a nice and descriptive URL. However, I would have preferred a name without the .html part. Unfortunately, with the tools I was using, the extension was the only way to set the MIME type of a file sent to S3. And if the MIME type isn't set, S3 won't serve the file with the proper Content-Type header, which leaves the browser confused about what you're actually feeding it.
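To make the dependency on the extension concrete, here is a minimal Python sketch. It uses Python's standard mimetypes module as a stand-in for any extension-based lookup; it is not what s3sync or Monkeyman actually runs.

```python
import mimetypes

# Extension-based lookup: it only works while the file name keeps its suffix.
with_suffix, _ = mimetypes.guess_type("the-missing-link.html")
without_suffix, _ = mimetypes.guess_type("the-missing-link")

print(with_suffix)     # text/html
print(without_suffix)  # None
```

Once the suffix is gone, the lookup has nothing to go on, so the type has to be supplied some other way.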

How to solve it?

I added a new option --omit-html-suffix that will drop the extension of generated HTML files. So if you start the Monkeyman server like this:

monkeyman server --omit-html-suffix

… it will offer this page at http://localhost:4567/the-missing-link, without the extension. And if you run monkeyman generate with the same option, you get the file without the extension as well.
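The effect of the option on a generated path can be sketched in a few lines of Python. The function name published_path and its parameter are hypothetical, chosen for illustration; Monkeyman's real implementation may look quite different.

```python
from pathlib import PurePosixPath


def published_path(generated: str, omit_html_suffix: bool = False) -> str:
    """Return the path a generated file is published under.

    Hypothetical helper: only a trailing .html is dropped, and only
    when the option is switched on. Other files are left untouched.
    """
    path = PurePosixPath(generated)
    if omit_html_suffix and path.suffix == ".html":
        return str(path.with_suffix(""))
    return generated


print(published_path("the-missing-link.html", omit_html_suffix=True))
print(published_path("css/site.css", omit_html_suffix=True))
```

Stylesheets, images and other assets keep their extensions, so only the HTML pages lose the ability to be typed by name.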

However, that solves only part of the problem. The next question is: how do I get the file onto S3 with a proper MIME type, now that the extension is gone?

I looked at this a number of times. One option was to keep using the old upload mechanism (s3sync) and then have Monkeyman post-process the uploaded objects to set the MIME type correctly. It might have been efficient, but the user experience would have been pretty lame.

So instead, I decided to bring the upload to S3 into Monkeyman. If you specify an alternative target location like this:

-o s3:[access key]:[secret key]:[bucket]

Then all files will be uploaded to the given S3 bucket, and the MIME type and permissions will be set correctly.
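A rough Python sketch of the idea, not Monkeyman's actual code: split the target string into its parts, then decide the content type from the source file name (which still has its extension) while uploading under the extensionless key. The names S3Target, parse_target and upload_params are made up for this example, and the "public-read" permission is an assumption about what "set correctly" means for a public blog. The dictionary keys mirror the parameter names of boto3's put_object call, but nothing is sent over the network here.

```python
import mimetypes
from typing import NamedTuple


class S3Target(NamedTuple):
    access_key: str
    secret_key: str
    bucket: str


def parse_target(spec: str) -> S3Target:
    """Parse a target of the form s3:[access key]:[secret key]:[bucket]."""
    scheme, access_key, secret_key, bucket = spec.split(":")
    if scheme != "s3":
        raise ValueError(f"not an S3 target: {spec!r}")
    return S3Target(access_key, secret_key, bucket)


def upload_params(target: S3Target, source_name: str, key: str, body: bytes) -> dict:
    """Build the upload parameters for one generated file.

    The type comes from the source name, not from the key, so the key
    is free to drop its extension.
    """
    content_type, _ = mimetypes.guess_type(source_name)
    return {
        "Bucket": target.bucket,
        "Key": key,
        "Body": body,
        "ContentType": content_type or "application/octet-stream",
        "ACL": "public-read",  # assumption: the blog is meant to be public
    }


target = parse_target("s3:EXAMPLEKEY:examplesecret:my-blog-bucket")
params = upload_params(target, "the-missing-link.html", "the-missing-link", b"<html></html>")
print(params["Key"], params["ContentType"])
```

The point of the sketch is the decoupling: the object key and the Content-Type are set independently, which the extension-based tools could not do.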

Current state

This version of Monkeyman hasn't been merged into a release yet. I did merge this feature into the develop branch though, so if you feel like it, it's ready to be taken for a spin. And – as you can tell by the URL of this page – I also updated my blog to finally have the links that I had been dearly missing for such a long time.

Disclaimer

It works, but it's not fast. Nothing is optimized in Monkeyman, and this is where it's most noticeable. One thing I could do to prevent Monkeyman from continuously uploading the same files is to check MD5 hashes to see whether the files actually changed, but that will have to wait for some other time.
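For what it's worth, the MD5 check could look roughly like the Python sketch below. It is not implemented in Monkeyman; the function names are hypothetical, and the sketch assumes the hashes of the objects already in the bucket are available as a plain dictionary. For simple, non-multipart uploads S3 typically reports such a hash as the object's ETag, but treat that as an assumption to verify.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


def changed_keys(local: dict[str, bytes], remote_md5: dict[str, str]) -> list[str]:
    """Return the keys whose content is new or differs from the remote copy.

    local maps object keys to file contents; remote_md5 maps object keys
    to the MD5 hex digests already stored remotely (assumed input).
    """
    return sorted(
        key for key, data in local.items()
        if remote_md5.get(key) != md5_hex(data)
    )


local_files = {"the-missing-link": b"<html>new</html>", "about": b"<html>same</html>"}
remote = {"the-missing-link": md5_hex(b"<html>old</html>"), "about": md5_hex(b"<html>same</html>")}
print(changed_keys(local_files, remote))
```

Only the keys returned by changed_keys would need to be uploaded again; everything else can be skipped.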