Below you will find pages that utilize the taxonomy term “System Administration”
Automation and Selecting Web Hosts
One of the biggest challenges with selecting a web host is that it’s very difficult to determine the quality of a provider without actually setting everything up and seeing how it goes for a while. Generally that means that you either wind up avoiding the very low cost providers out of fear they won’t be reliable and possibly paying too much, or spending a lot of time and effort setting up with a provider only to then discover they’re unreliable or the performance isn’t as good as you expected.
Default to Development Settings
Most systems have multiple configuration profiles – one for production, one for development and often other profiles for staging, testing etc. Minimising differences between these configurations is critical but there are inevitably some things that just have to be different. This then leaves the question, what should the default settings be?
There are three main options:
- Default to production values
- Default to development values
- Refuse to start unless an environment has been explicitly chosen
My preference is to default to development values. Why?
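As a rough illustration of the second option, here is a minimal Python sketch of a settings loader that falls back to the development profile when no environment has been chosen. The `APP_ENV` variable, the profile contents and the `load_settings` helper are all hypothetical names made up for this example, not taken from any particular system.

```python
import os

# Hypothetical profiles; the setting names and values are illustrative only.
PROFILES = {
    "development": {"debug": True, "database_url": "sqlite:///dev.db"},
    "production": {"debug": False, "database_url": "postgresql://db.internal/app"},
}

# The profile used when nothing has been explicitly chosen.
DEFAULT_PROFILE = "development"


def load_settings(environ=None):
    """Return (profile_name, settings), defaulting to development values."""
    environ = os.environ if environ is None else environ
    # APP_ENV is an assumed variable name; production must opt in explicitly.
    name = environ.get("APP_ENV", DEFAULT_PROFILE)
    if name not in PROFILES:
        raise ValueError(f"Unknown profile {name!r}; expected one of {sorted(PROFILES)}")
    # Return a copy so callers cannot mutate the shared profile table.
    return name, dict(PROFILES[name])
```

With this shape, a fresh checkout runs with development values straight away, while a production deployment has to set the variable deliberately.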
Amazon EC2 As A Webhost Redux
Back in 2007 I looked at EC2 for a web server and while it wound up being feasible it had a number of drawbacks:
Those familiar with EC2 won’t be surprised to hear that we won’t be going with the service for three reasons:
- It’s at least as expensive as the dedicated server we’d need.
- The filesystem gets reset every time the server reboots (S3 provides a REST API to store and retrieve data, not a filesystem)
- The server gets a new IP address every time it reboots.
Since then Amazon have rolled out new services that solve problems 2 and 3 and reserved instances to help with 1. What surprises me after a couple of years running a single EC2 instance with an app that’s using S3 for storage though is just how stable it has been.
I Hate Deployment
Deployment ruins everything. So many cool technologies that let you develop more rapidly and do awesomely cool stuff fall down at the last hurdle of deployment. Sometimes it’s because they haven’t thought it through properly so it’s just plain too hard, but often it’s just that it’s too hard to convince people that it won’t be another headache for them.
The latest in my deployment-caused frustrations is CouchDB. I have a few use cases that I think CouchDB would be perfect for and it would save me heaps of development effort and headaches. The trouble is, while CouchDB may be of the web, it really isn’t of the enterprise IT architecture.
200 Means OK!
While many web visionaries are busy advocating the correct use of ETags and URIs etc etc, I just wish people could get the very basics of HTTP right. I’m not even talking about MIME types here, just status codes would be a really good place to start.
If you’re returning the page as requested, use 200.
If you’re returning a server error, use 500.
If the requested page doesn’t exist, use 404.
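The three rules above can be sketched as one small Python function. The `status_for` helper and its parameters are hypothetical, invented for this example; only the `http.HTTPStatus` constants come from the standard library.

```python
from http import HTTPStatus


def status_for(page_found, server_error=False):
    """Pick the HTTP status code for a response, following the three rules above."""
    if server_error:
        # Something went wrong on the server while handling the request.
        return HTTPStatus.INTERNAL_SERVER_ERROR
    if not page_found:
        # The requested page doesn't exist.
        return HTTPStatus.NOT_FOUND
    # The page is being returned as requested.
    return HTTPStatus.OK
```

The point of routing every response through something like this is that an error page never goes out with a 200 by accident.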
I Love mod_proxy
After my amazingly successful use of mod_proxy to provide clean URLs in an IWWCM instance, it’s been added to my bag of useful tricks to know about. When you realize you can proxy differently based on the current virtual host it’s a very powerful solution.
My latest use for it was to add name based virtual host support to two completely separate virtual machines. One machine runs IBM WCM and the other runs Quickr. Both use the same port, and in the future there will be more VMs with different versions as well, so while it would be possible to assign different port numbers, I’d prefer to not have to remember which VM is using which port etc. The firewall however can only forward connections on a given port to one VM.
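To make the idea concrete, here is a small Python sketch of the routing decision a name-based proxy makes: look at the Host header and pick a backend VM. This is not mod_proxy configuration, just an illustration of the lookup; the hostnames, addresses and port are hypothetical placeholders.

```python
# Hypothetical mapping from public hostname to the VM that should serve it.
# Both backends listen on the same port, which is the situation described above.
BACKENDS = {
    "wcm.example.com": "http://10.0.0.11:10038",
    "quickr.example.com": "http://10.0.0.12:10038",
}


def backend_for(host_header, path):
    """Return the backend URL for a request, or None if the host is unknown."""
    # Drop any ":port" suffix and normalise case before the lookup.
    host = host_header.split(":", 1)[0].strip().lower()
    base = BACKENDS.get(host)
    if base is None:
        return None
    return base + path
```

Adding another VM then only means adding another hostname to the table, rather than remembering another port number.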
Exporting and Importing a Portal WCM Library
I’m going to need this soon and I’ll never find the link again in the IBM forums so I’m putting it here.
Exporting and Importing a Web Content Library
It should let you move web content (minus drafts and previous versions unfortunately) from one IWWCM server to another.
Ant SCP/SSH Task Hangs Or Never Disconnects
If you’re using the scp or ssh tasks with ant, you may run into a problem where the scp task hangs part way through the upload, or the ssh task never disconnects after the command completes. There are a couple of possible causes:
- The scp problem is almost certainly caused by using ant 1.7.0 or below and jsch 0.1.30 or above. You could upgrade to the latest nightly of ant but it’s probably easier to just drop back to jsch 0.1.29 which is what ant was developed against and works nicely. Bug 41090 contains the gory details.
- If the command you’re executing with the ssh task starts a background service or otherwise leaves a process running, that may be the cause of the problem. You can add ‘shopt -s huponexit’ to your /etc/profile, .bashrc or somewhere like that. I must admit, I’m somewhat vague on the exact details of what that does but the basic idea seems to be to signal any background processes that bash is exiting and then not wait for them to complete (which allows your ssh connection to close). If you’re starting a server it will probably ignore the HUP signal that bash sends; if it doesn’t, start it with the nohup command.
Hopefully that will be the last I’ll see of that issue.
Tomcat Startup Issues
I was so close to having everything working… EC2, S3, automatically pulling down the latest build and deploying it, Tomcat 5.5 with the native APR libraries, SSL support and using iptables to forward ports 80 and 443 directly over to Tomcat. Everything ready to go. Except Tomcat isn’t so keen on starting.
It usually starts, though it can take over half an hour to do so and on a couple of occasions it’s just flat out sat there and done nothing for multiple hours on end. At startup it outputs the log message:
Hosting on Amazon EC2
I’ve done a fair bit more investigation into using EC2 for web hosting and it seems to be something that people do with a fair bit of success. In addition to Geert, who commented on my last post and whose site rifers.org is hosted directly on EC2, there’s also hanzoweb.com and www.gumiyo.com, all of which just have their DNS pointing at an EC2 instance.
I still wish Amazon had a preconfigured solution that acted as the web front end and load balancer with a static IP, but it appears that it’s quite feasible to just point your DNS at EC2 and your server seems to stay put.
Solr Search Index Backups?
If you have a massive set of documents that you’re using Solr to search (let’s say a few million HTML pages) how much should you worry about losing the search index?
It is of course always possible to reindex the original documents, but that would take a fair while, so should you keep a backup of the search index? If you restored the backup, how would you identify which documents needed updating?
Amazon EC2 As A Webhost
We need to move our company wiki and JIRA instance to a server with more RAM and CPU to spare as they’re pretty slow on the current overloaded virtual server, so we’ve been looking at a few different options. One that came up was using Amazon’s EC2 and S3 services. We knew straight off that we didn’t need the scalability they offered but getting some experience using them could be beneficial and we really didn’t know anything about what they actually offered so it was worth a quick look.
Server Problems Here And With Some Ephox Sites
In case people are wondering, there was a major failure at our hosting provider which is causing downtime. Both this server and the server that hosts the Ephox release blog, LiveWorks!, people.ephox.com and the internal Ephox wiki and JIRA installations have been affected. While (obviously) this site is back up, the Ephox sites didn’t fare so well and are still down.
We’ll get them back up and running as soon as possible. In the meantime, if you see any problems here please let me know. For a short while after the system came back WordPress switched to the default theme (but with all other settings intact) so I really don’t know what else might have been damaged.
Moving WordPress To A New URL
Every so often I want to play around with something new on my blog without trashing the public site. I have a local instance of WordPress that I do most of my playing around on but it generally doesn’t have the same database configuration as symphonious.net so I can’t be sure that things I develop there will work here.
It is however simple to completely clone a WordPress instance to a new URL – but I never remember precisely how, so this is a note to myself so that next time I’ll remember.
Server Down-Time
I felt brave this evening and upgraded this server to run Debian Etch since it’s been marked stable now. The upgrade was not without its flaws and the server experienced some downtime, so if you found the site unavailable in the past few hours, now you know why.
I still have no idea what kernel I’m supposed to use with the fancy virtualization – I suspect that it doesn’t really matter since the virtualization software seems to handle that level of things. Either way, the server rebooted and came back with everything working so I don’t really care.
Most Annoying Bug Ever
I’ve just spent the past three or four hours setting up Apache, Subversion, all my browsers etc etc to use SSL connections and client certificates for authentication with my Debian stable server. I’m sure the mod_ssl devs already know what’s coming here and are either chuckling gleefully or ripping their hair out right now. Anyway, the joke for all those who are mod_ssl devs, is that you can’t get subversion to use client certificates with a Debian stable server because Debian stable has Apache 2.0.54, complete with everybody’s favorite mod_ssl bug. It’s fixed in Apache 2.2, but not in 2.0.
Auto-Updating Systems via Subversion
One technique that I’ve started to use a lot around the different systems here is to store everything in subversion. It’s a pretty commonly recommended technique and the nicest benefit is that if your changes don’t do what you wanted you can easily roll back to an earlier version.
I’ve found though, that my favorite benefit is that it makes it easy to set up automatic updates for systems. Generally I just add an ‘svn update’ as the first step in running the appropriate system. When that’s not suitable, a simple cron job does the job just as well. For our cruise control server we actually have a “config” project which builds every 30 minutes and all it does is update cruise control’s config files.
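As a minimal sketch of the “update first, then run” approach, assuming the system lives in a Subversion working copy and the `svn` client is on the path, a Python wrapper might look like this. The `run_with_update` helper and its `runner` parameter are hypothetical names for this example; `runner` exists only so the function can be exercised without a real repository.

```python
import subprocess


def run_with_update(working_copy, command, runner=subprocess.run):
    """Bring the working copy up to date, then run the system from it."""
    # Step 1: pull the latest committed configuration and scripts.
    # check=True stops here if the update fails, so stale or half-updated
    # files are never run.
    runner(["svn", "update", working_copy], check=True)
    # Step 2: start the actual system from inside the working copy.
    return runner(command, cwd=working_copy, check=True)
```

The same function could be called from a cron job when running the update at startup is not suitable.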
Tomcat 4 and mod_jk
I’ve learnt way more about mod_jk in the last week than I ever wanted to know. Apparently the configuration is completely different between Debian 3.1 with Tomcat 4 and the current Debian testing with Tomcat 5 (point something). Why the mod_jk package doesn’t just do the configuration for you is beyond me, or at least have a debconf wizard to do it.
Anyway, with Tomcat 4, the magical directive that makes deciding what to delegate and when simple goes like:
Getting On Top Of Spam
I spent some time this afternoon trying to reduce the amount of spam that gets downloaded and dumped into my spam folder. Between SpamAssassin and Mail.app’s spam filters there’s basically no spam that makes it through to my inbox, but the sheer amount of spam being sent to symphonious.net, then downloaded from there to my home IMAP server before finally being processed by SpamAssassin is overwhelming and takes up a lot of bandwidth and processing time. Besides, with that much spam going into my spam folder I haven’t bothered reviewing it, so if there are any false positives they are pretty much doomed.