Living in a state of accord.

Automation and Selecting Web Hosts

One of the biggest challenges with selecting a web host is that it’s very difficult to determine the quality of a provider without actually setting everything up and seeing how it goes for a while. Generally that means that you either wind up avoiding the very low cost providers out of fear they won’t be reliable and possibly paying too much, or spending a lot of time and effort setting up with a provider only to then discover they’re unreliable or the performance isn’t as good as you expected.

Virtual private servers are a particular problem because hosting providers advertise specs like CPU cores and RAM, but don’t advertise whether those CPUs are oversubscribed or what the contention on disk access is.

Fortunately, now that tools like Puppet and Chef have appeared and matured, it’s quite easy to automate the setup and configuration of a server so you can easily switch hosts if needed. Even more so now that web hosting companies are increasingly becoming cloud providers – the difference primarily being per-hour pricing.

Which is a roundabout way of saying – welcome to the new Symphonious, completely re-homed and rebuilt using automated scripts. Hopefully you didn’t notice the difference. The move saves me about $20/month and, because the entire build was automated, if I have issues here or find a better deal elsewhere I can now move effortlessly.

Default to Development Settings

Most systems have multiple configuration profiles – one for production, one for development and often other profiles for staging, testing, etc. Minimising differences between these configurations is critical, but there are inevitably some things that just have to be different. This then leaves the question: what should the default settings be?

There are three main options:

  • Default to production values
  • Default to development values
  • Refuse to start unless an environment has been explicitly chosen

My preference is to default to development values. Why?

Development values should be “safe” in terms of any external integrations. So a developer isn’t going to accidentally start sending real buy or sell instructions to your stock broker.

There are more developers than production systems. If you default to production values, every developer needs to remember to switch to development mode whenever they set up a new checkout. Defaulting to development mode means it just works for the most common case.

Checking authentication credentials for external systems into your source control system is generally considered a bad security practice, so the default values are unlikely to actually work in production anyway.

The downside of defaulting to development is that it’s possible to accidentally deploy to production with development values, causing an outage. This can be pretty easily prevented with automated deployments or by using tools like RPM, where files can be marked as config so that updates don’t overwrite them.

Refusing to start is the worst of all worlds – every developer has to specify a configuration mode and you still risk production outages by not specifying a mode.
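
As a rough illustration of the "default to development" approach, here is a minimal sketch in Python. The `APP_ENV` and `BROKER_URL` variable names, the profile table and the fake broker URL are all hypothetical, made up for this example rather than taken from any real system.

```python
import os

# Hypothetical profile table: the keys and values are illustrative only.
PROFILES = {
    "development": {
        "broker_url": "http://localhost:8080/fake-broker",  # safe local stub
        "send_real_orders": False,
    },
    "production": {
        "broker_url": None,  # not checked in; supplied at deploy time
        "send_real_orders": True,
    },
}


def load_settings(environ=None):
    """Return the settings for the selected profile, defaulting to development."""
    environ = os.environ if environ is None else environ
    # No APP_ENV set means development, so a fresh checkout just works.
    profile = environ.get("APP_ENV", "development")
    if profile not in PROFILES:
        raise ValueError(f"Unknown APP_ENV: {profile!r}")
    settings = dict(PROFILES[profile])
    if profile == "production":
        # Production must supply its own endpoint; a missing value raises
        # KeyError instead of silently falling back to development values.
        settings["broker_url"] = environ["BROKER_URL"]
    return settings
```

With nothing set, `load_settings()` returns the development values. An automated deployment would set `APP_ENV=production` and `BROKER_URL` itself, which is what guards against shipping development values by accident.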

Amazon EC2 As A Webhost Redux

Back in 2007 I looked at EC2 for a web server and while it wound up being feasible it had a number of drawbacks:

Those familiar with EC2 won't be surprised to hear that we won't be going with the service for three reasons:

  1. It's at least as expensive as the dedicated server we'd need.
  2. The filesystem gets reset every time the server reboots (S3 provides a REST API to store and retrieve data, not a filesystem)
  3. The server gets a new IP address every time it reboots.

Since then Amazon have rolled out new services that solve problems 2 and 3, and reserved instances to help with 1. What surprises me, though, after a couple of years running a single EC2 instance with an app that uses S3 for storage, is just how stable it has been.

Remember that EC2’s original point in life was scalability, not running one single instance for a really long time. They’ve done tons of upgrades to their infrastructure over the last couple of years as well, which would normally mean downtime and migrations. You can imagine my surprise when I checked how long it’s been since the instance rebooted:

e2wiki:~# uptime
 04:28:45 up 499 days, 19:14,  1 user,  load average: 0.33, 0.10, 0.03

Just shy of 500 days since a reboot for any reason. I can’t say that about any other hosting service I’ve ever used so even if EC2 is more expensive, it’s seriously reliable.

Now we just need to fix the memory leak in the app we’re running on that server – it up and dies a couple of times a week. That said, the script that automatically restarts it is so effective that the external monitoring tools don’t even notice so it’s probably not worth the effort.
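
For the curious, a restart script of this kind can be very small. The following is not the actual script running on that server, just a minimal sketch of the idea in Python. The `keep_alive` name, its parameters and the example command are all assumptions made for illustration.

```python
import subprocess
import sys
import time


def keep_alive(command, run=subprocess.call, max_restarts=None, delay_seconds=1.0):
    """Run `command`, starting it again each time it exits.

    `run` is injectable so the loop can be exercised without a real process.
    With max_restarts=None the loop never returns; otherwise it returns the
    number of restarts performed once the limit has been reached.
    """
    restarts = 0
    while True:
        exit_code = run(command)  # blocks until the process exits
        print(f"process exited with code {exit_code}", file=sys.stderr)
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay_seconds)  # brief pause to avoid a tight crash loop
```

Something like `keep_alive(["/usr/local/bin/wiki-server"])` (a made-up path) would then bring the app back within about a second of each crash, which is quick enough that a monitor polling every minute or so rarely sees the gap.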

I Hate Deployment

Deployment ruins everything. So many cool technologies that let you develop more rapidly and do awesomely cool stuff fall down at the last hurdle of deployment. Sometimes it’s because they haven’t thought it through properly so it’s just plain too hard, but often it’s just that it’s too hard to convince people that it won’t be another headache for them.

The latest in my deployment-caused frustrations is CouchDB. I have a few use cases that I think CouchDB would be perfect for and it would save me heaps of development effort and headaches. The trouble is, while CouchDB may be of the web, it really isn’t of the enterprise IT architecture.

That’s not to say it wouldn’t fit in perfectly well. It’s not to say there’s anything wrong with CouchDB. It doesn’t even mean that it would be hard to deploy or hard to maintain or anything to do with CouchDB at all. It’s just not part of the enterprise plan for “stuff we put on our servers”. Database stuff goes in Oracle or DB2 and we really don’t need a second database instance running. The fact that CouchDB is an entirely different type of database and has completely different strengths and weaknesses making it perfect for this particular use case doesn’t get a hearing.

When you have big enterprise as your clients, the cool toys always seem just out of reach¹.

I’d wish that I worked in an environment that was more relaxed and we could deploy tons of different systems based on what was the best fit for this particular job, except that I have that problem within Ephox and it’s not so much fun either.

1 – like Java versions above 1.4…

200 Means OK!

While many web visionaries are busy advocating the correct use of ETags, URIs and so on, I just wish people could get the very basics of HTTP right. I’m not even talking about MIME types here; just status codes would be a really good place to start.

If you’re returning the page as requested, use 200.

If you’re returning a server error, use 500.

If the requested page doesn’t exist, use 404.

If you can follow even just those three rules, you’ll make life so much easier for user agents. Anything above that is great too, but whatever you do, please get these basics right.
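
To show how little it takes, here is a minimal sketch of those three rules using Python’s standard `http.server` module. The `PAGES` table and the `resolve` helper are hypothetical names invented for this example, and the `/broken` page exists only to force an error.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical page table: each path maps to a function that renders the page.
PAGES = {
    "/": lambda: "<h1>Home</h1>",
    "/broken": lambda: 1 / 0,  # deliberately fails to demonstrate the 500 case
}


def resolve(path):
    """Map a request path to a (status, body) pair following the three rules."""
    renderer = PAGES.get(path)
    if renderer is None:
        # The requested page doesn't exist.
        return HTTPStatus.NOT_FOUND, "Page not found"
    try:
        # The page was returned as requested.
        return HTTPStatus.OK, renderer()
    except Exception:
        # Something went wrong on the server side.
        return HTTPStatus.INTERNAL_SERVER_ERROR, "Something went wrong"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Note: self.path includes any query string; this sketch ignores that.
        status, body = resolve(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it, pass `Handler` to `http.server.HTTPServer(("127.0.0.1", 8000), Handler)` and call `serve_forever()`. The point is that the error page and the missing page still get a friendly body, but the status line tells the user agent the truth.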

Epic fail to IIS and IBM Portal on these points.  Something is seriously wrong with the internet…