Amazon EC2 As A Webhost

We need to move our company wiki and JIRA instance to a server with more RAM and CPU to spare, as they're pretty slow on the current overloaded virtual server, so we've been looking at a few different options. One that came up was using Amazon's EC2 and S3 services. We knew straight off that we didn't need the scalability they offer, but getting some experience with them could be beneficial, and we really didn't know much about what they actually provide, so it was worth a quick look.

Those familiar with EC2 won't be surprised to hear that we won't be going with the service for three reasons:

  1. It's at least as expensive as the dedicated server we'd need.
  2. The filesystem gets reset every time the server reboots (S3 provides a REST API to store and retrieve data, not a filesystem).
  3. The server gets a new IP address every time it reboots.

The cost only applies to us because we don't need scalability – our needs are really quite consistent, so EC2 wouldn't be saving us from purchasing large amounts of redundant hardware. We also have the ability to just pay a hosting company to set up one dedicated server for us instead of setting up our own server farm. If you were offering software as a service, however, Amazon's offering is likely to save you a lot of money.

The filesystem resetting is a challenge for deploying most existing web applications, but not for software designed to run with S3. For instance, it's pretty easy to imagine a wiki implementation that uses S3 as its "database" for storing data directly (probably with some local caching etc). Wikis are somewhat ideal for this because search is about the only query you perform on the data – otherwise you just retrieve pages by name, which S3 is perfectly suited for. The fact that so many wikis use flat files instead of databases is an indication that it'd work pretty well. There would be a few hurdles but nothing insurmountable.
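As a rough sketch of that idea, here is what a wiki page store backed by a key-value service with a local cache might look like. Everything here is hypothetical: `InMemoryBlobStore` is a stand-in for S3 so the example runs without a network, and the `WikiPages` class, its method names and the `pages/` key prefix are invented for illustration rather than taken from any real wiki or from Amazon's API.

```python
from typing import Dict, Optional


class InMemoryBlobStore:
    """Hypothetical stand-in for an S3-like store: get/put whole objects by key."""

    def __init__(self) -> None:
        self._objects: Dict[str, str] = {}
        self.reads = 0  # counts remote reads, to show when the cache is used

    def get(self, key: str) -> Optional[str]:
        self.reads += 1
        return self._objects.get(key)

    def put(self, key: str, body: str) -> None:
        self._objects[key] = body


class WikiPages:
    """Stores wiki pages by name in the remote store, with a local cache.

    The cache lives only in this process, so losing it (for example when
    the machine's filesystem is reset) costs nothing but some re-fetching.
    """

    def __init__(self, store: InMemoryBlobStore, prefix: str = "pages/") -> None:
        self._store = store
        self._prefix = prefix
        self._cache: Dict[str, str] = {}

    def save(self, name: str, text: str) -> None:
        # Write through to the remote store first, then update the cache.
        self._store.put(self._prefix + name, text)
        self._cache[name] = text

    def load(self, name: str) -> Optional[str]:
        # Serve from the local cache when possible.
        if name in self._cache:
            return self._cache[name]
        text = self._store.get(self._prefix + name)
        if text is not None:
            self._cache[name] = text
        return text
```

Creating a second `WikiPages` object over the same store roughly simulates a fresh instance coming up with an empty local disk: pages saved earlier are still retrievable by name. Search is deliberately left out, since that is the one query this layout doesn't handle on its own.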

The dynamic IP, however, is a real pain. There are examples of using dynamic DNS to work around it, but the lag in DNS updates seems like a problem to me. The better solution would of course be to have a load balancer in front of your EC2 instances – the load balancer would have a static IP address and the EC2 instances would just register with it when they start up. Unfortunately this means you have to have a server outside of EC2 to do the load balancing, which means another hosting provider to work with, and it just seems odd to have the load balancing server in a different data center from the rest of the servers. If Amazon added an option to build an EC2 machine that could only ever have one instance but had a guaranteed IP address, it would be the perfect solution.
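The "instances register with the load balancer when they start up" part can be sketched in a few lines. This is only an illustration of the bookkeeping under my own assumptions: the `BackendRegistry` class and its methods are hypothetical, the addresses are made up, and a real load balancer would also need health checks and an actual network-facing registration endpoint.

```python
from typing import List, Optional


class BackendRegistry:
    """Hypothetical registry a load balancer could keep of live backends."""

    def __init__(self) -> None:
        self._backends: List[str] = []
        self._next = 0  # index used for simple round-robin selection

    def register(self, address: str) -> None:
        # Called by an instance at start-up with whatever IP it was given.
        if address not in self._backends:
            self._backends.append(address)

    def deregister(self, address: str) -> None:
        # Called when an instance shuts down or is found to be dead.
        if address in self._backends:
            self._backends.remove(address)

    def pick(self) -> Optional[str]:
        # Round-robin over the currently registered backends.
        if not self._backends:
            return None
        address = self._backends[self._next % len(self._backends)]
        self._next += 1
        return address
```

Because clients only ever see the load balancer's static address, an instance coming back with a new IP just calls `register` again and there is no DNS change to wait for.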

It's certainly something interesting to play with – I'll have to chase up a corporate credit card and see if I can get access to do some experimentation some time.

14 Responses to “Amazon EC2 As A Webhost”

  1. Matt Brubeck Says:

    Do you mean EC2?


  2. Adrian Sutton Says:

    ah, yes. sorry.


  3. Geert Bevin Says:

    I’m curious which dedicated hosting company you’re using. I can’t find any that offers the same server hardware features together with the same backup facilities as what you get with EC2/S3, for a comparable price. I’ve been using EC2 as a webhost for several months now and it works great. Using dnsmadeeasy.com for the dynamic DNS works well. I never reboot the server though, so the dynamic IP address really isn’t an issue, and neither is the filesystem reset. You’re wrong about this BTW: the file system is only reset when your machine is shut down, not when you restart it.


  4. Steve Loughran Says:

    For a single instance, it doesn’t make sense… I went with Rimuhosting instead. Cost plus a persistent filesystem. The selling point of EC2 is if you need 10-50 boxes on demand, and are prepared to re-engineer your back end so it persists to S3. You also need some way of routing workload to the farm of boxes.

    As Geert Bevin points out, things are a bit more stable than you think, but there is nothing in the SLA that says ‘we can’t drop a live image in an emergency’; a h/w failure can trigger that. Only S3 has availability.


  5. Adrian Sutton Says:

    Geert,
    S3 provides excellent data reliability, but EC2 doesn’t and you need to implement your own backup procedures for your data in EC2. For EC2 it’s even worse than normal because any hardware problem which causes your node to restart also deletes all your data – normally you need a hard drive fault to get that problem. Oh and my use of restart is probably inaccurate – I’m thinking of when the server crashes for some reason, with Amazon you couldn’t depend on it keeping your data around. It is good to know that it generally does keep the transient data around though. It does help to relieve my concern about DNS update lags since more stable servers mean fewer DNS changes so it’s less likely to be an issue. It could still add a painful amount of downtime to any hardware faults though.

    I did get the impression that instances tend to keep running without problems for long periods but there’s certainly no way you could count on an EC2 instance preserving your data.

    As Steve said, and as I was trying to suggest – EC2 and S3 are a very powerful combination for handling things that may need to scale rapidly or have to deal with spikes in usage, so there’s a lot going for them. I suspect there would be a lot of interest in EC2 having a load balancer module for web apps, given the number of people that are using it for that. If nothing else, it’s just good to have that technology in one place.

    Anyway, it will be interesting to actually play around with it when I find time (and that corporate credit card….)


  6. Geert Bevin Says:

    Adrian, in those price ranges, I know of no web hosting company that gives you any guarantees wrt the protection of your data against hardware faults. Unless you go for managed dedicated hosting with a hardware SLA (which easily costs you 10 times as much), you still have to figure out a backup plan in case things go wrong. Thinking you don’t have to and just trusting commodity hardware is kidding yourself. I’ve had far too many hard drives fail to be trusting them with my data, even if they keep it. As it turns out, backing up to S3 is by far the easiest and cheapest backup procedure I’ve worked with. Also over the last 6 months, EC2 has been rock solid, at least as stable as regular dedicated hosts that cost me *a lot* more.


  7. Adrian Sutton Says:

    Geert,
    You slightly misunderstand. The recovery mechanism for EC2 in the event of *any* hardware failure is to reset your filesystem and reboot on another node. Most hosting providers will give you backup space – manual for unmanaged hosts and automated for managed hosts. If you combine that with a regular download of your important data so that it is completely off-site and completely in your control, you have quite a good backup strategy. It’s not as good as using S3 but you always have the option of doing backups to S3 as well if you wanted.

    With EC2 though, you are far more likely to need to use your backups, which makes the currency of your backups far more critical. If you occasionally have to restore from backup it’s ok to lose even a day’s worth of data, but if you expect to regularly restore from backup you need your backups to happen at least every hour, which not all systems can handle. I’m not saying that EC2 and S3 don’t provide better backup facilities, just that adequate backup is available from most hosting providers and is easier to set up.

    If, however, you have a system that you can tailor to use S3 for its persistent storage, EC2 and S3 are pretty clearly an awesome proposition (but you still need to work around the dynamic IP one way or another).


  8. Geert Bevin Says:

    Adrian, I’m telling you as a user of EC2 for web hosting, I never had to restore anything in over 6 months of usage. The only time I did was to see if my incremental backup strategy worked. I advocate that it is much easier to recover from any system failure using EC2. I have hourly + daily incremental and weekly full backups of my data and I make a full live image of the running operating system (excluding any of the data) as an AMI (EC2 image). Whenever anything goes wrong, I just boot up another instance of the image and have a system up and running in seconds. Transferring the data backups from S3 is amazingly fast from EC2. So restoring the data takes from 10 mins to 30 mins depending on where I am in the week (i.e. how many incremental backups I need to apply). Then again, I never had to resort to this since the service has just been so stable. I do feel much more protected than ever before since I know that whatever happens, I’ll always be able to get a new machine up and running in 30 mins, worst case scenario. If I need more guarantees than that, I’d just set up a second EC2 instance as a mirror that’s ready to take over in a matter of seconds, but in my case this isn’t needed.

    Btw, you do realize that EC2 doesn’t run on the crappy commodity hardware you get from ‘cheap’ dedicated hosts? They are virtual hosts running on a cluster of commodity hardware that’s designed for failure. They expect individual machines to fail regularly, without virtual instances having to go down. Again, to me that feels much more secure.


  9. Adrian Sutton Says:

    hmmm, interesting. I don’t know if I’d ever feel comfortable with that but it sounds like I probably should. As I’ve said, I’m keen to play around with the service and find out what it’s actually like which should help a fair bit. Right now I’m running on essentially just initial perceptions which is never a particularly solid base.


  10. Symphonious » Hosting on Amazon EC2 Says:

    [...] done a fair bit more investigation into using EC2 for web hosting and it seems to be something that people do with a fair bit of success. In addition to Geert who [...]


  11. DIGITALISTIC » Blog Archive » links for 2007-09-25 Says:

    [...] Amazon EC2 As A Webhost (tags: amazon ec2 hosting scaling webhost) [...]


  12. Ratnesh Kumar Deepak Says:

    @Adrian Sutton.

    I’ve been using a GoDaddy dedicated server for the past 2 years for my corporate clients and recently started thinking of using EC2 for expansion. Could you please write up your experience experimenting with EC2, and what you think of the service if you are using it?


  13. Nicholas Sherlock Says:

    Just like to point out for any future Googlers: Problems 2 and 3 can now be mitigated with new Amazon features. EBS (Elastic Block Store) is a block device that you can attach to and detach from instances; the data stored there is persistent. You don’t need to worry about the complexity of using S3.

    There are now Elastic IP addresses – you can reserve an IP address which you get forever, and can instantly route traffic going to that IP to any instance you choose, so you can maintain the same IP address to the world across instance reboots (or changes of instances!).


  14. Thomas Mon Says:

    Thank you for this great information. We are looking to migrate to S3 + EC2 from a grid computing service that just seems to be going downhill faster and faster as the months go by. I found your discussions here very informative! Luckily I have the corporate CC already and will be playing with it as soon as I can!

