Symphonious

Living in a state of accord.

Cache Synchronization With Jabber

Yesterday afternoon Suneth and I took on a research project to see how feasible it was to keep server caches up to date by using XMPP to notify the other servers in the cluster of a change. Imagine a web server with some latency between it and the resources it's serving (e.g. it's using S3). To improve performance you'd want to cache the recently used or most commonly used resources locally on the server. But if you need to scale up to a cluster of servers and the resources are being changed, that cache becomes a problem: each server's local copy can go stale when another server changes the resource.

The simplest solution to this is to limit the length of time that resources can be cached, but I'd prefer to avoid latency in updates as much as possible.
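To make the trade-off concrete, here is a minimal sketch of a time-limited cache. The `TtlCache` class, its `max_age` parameter and the injectable `clock` are illustrative names, not part of anything we actually built. The point it shows is that a changed resource can keep being served stale for up to `max_age` seconds.

```python
import time


class TtlCache:
    """Cache whose entries expire max_age seconds after they were stored."""

    def __init__(self, max_age, clock=time.monotonic):
        self._max_age = max_age
        self._clock = clock          # injectable so the expiry can be tested
        self._entries = {}           # resource id -> (value, time stored)

    def put(self, resource_id, value):
        self._entries[resource_id] = (value, self._clock())

    def get(self, resource_id):
        entry = self._entries.get(resource_id)
        if entry is None:
            return None
        value, stored_at = entry
        if self._clock() - stored_at >= self._max_age:
            # Too old: drop it so the caller re-fetches from the slow store.
            del self._entries[resource_id]
            return None
        return value
```

A shorter `max_age` narrows the window for stale data but sends more requests to the slow backing store, which is exactly the latency the cache was meant to hide.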

The alternative is to have some kind of notification between the servers in the cluster so that all of them rapidly find out when a change has been made, much like database replication. This is what Suneth and I wanted to play around with and we thought XMPP would be a quick way to get such a notification system going.

It's worth noting neither of us really knew anything about XMPP other than that it was the protocol underlying Jabber instant messaging and it was a generic XML messaging system. We figured we could use the generic nature of it to send whatever messages we needed and just take advantage of the existing servers to do the marshalling. Fortunately, our lack of knowledge led us right into a much simpler solution – just use Jabber.

We grabbed the Smack API and started playing with it and quickly discovered that sending and receiving messages was ridiculously easy. It turns out that the absolute simplest way you can minimize stale data in your caches is to simply have all the servers join a preconfigured chat room. Whenever they save a change to a resource they send a message to the room with the unique ID of that resource and whenever they receive a message from the room they assume it's a unique ID and remove any cached versions of that resource.

It also makes testing easy: you simply open a Jabber client and join the chat room yourself.
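The chat-room scheme can be sketched without any XMPP library at all. In the sketch below, `ChatRoom` is a hypothetical in-memory stand-in for the Jabber room (it is not the Smack API), `CachingServer` is an illustrative name, and a plain dict plays the part of the slow shared storage. It assumes the room delivers every message to all members, including the sender.

```python
class ChatRoom:
    """In-memory stand-in for the Jabber chat room (hypothetical, not Smack)."""

    def __init__(self):
        self._members = []

    def join(self, member):
        self._members.append(member)

    def send(self, body):
        # Assumption: every member, including the sender, receives the message.
        for member in self._members:
            member.on_message(body)


class CachingServer:
    """One server in the cluster, with a local cache in front of shared storage."""

    def __init__(self, storage, room):
        self._storage = storage      # shared, slow store (a dict stands in for S3)
        self._cache = {}             # local cache: resource id -> value
        self._room = room
        room.join(self)

    def get(self, resource_id):
        if resource_id not in self._cache:
            self._cache[resource_id] = self._storage[resource_id]
        return self._cache[resource_id]

    def save(self, resource_id, value):
        self._storage[resource_id] = value
        self._room.send(resource_id)         # announce the changed resource's ID

    def on_message(self, body):
        # Treat every message as a resource ID and drop any cached copy of it.
        self._cache.pop(body, None)


storage = {"page-1": "old"}
room = ChatRoom()
server_a = CachingServer(storage, room)
server_b = CachingServer(storage, room)

server_b.get("page-1")               # server_b now has "old" cached
server_a.save("page-1", "new")       # server_a announces the change to the room
print(server_b.get("page-1"))        # prints "new": the stale copy was evicted
```

Because the message carries only the ID, a server that receives it does no work beyond an eviction; the new content is fetched lazily the next time someone asks for it.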

  • Stephan Schmidt says:

    Thought about that too some time ago but didn’t carry through. What server do you use? Does it scale to large sites with many cache writes?

    Peace
    -stephan


    Stephan Schmidt :: stephan@reposita.org
    Reposita Open Source – Monitor your software development
    http://www.reposita.org
    Blog at http://stephan.reposita.org – No signal. No noise.

    September 8, 2007 at 11:05 pm
  • Franck says:

    And what about JMS ?

    http://www.enterpriseintegrationpatterns.com/ObserverJmsExample.html

    September 9, 2007 at 7:37 am
  • Adrian Sutton says:

    Stephan,
    We used OpenFire (the open source version of WildFire), mostly because we’re developing a Java webapp and we already have an instance set up internally anyway. I honestly have no idea how well it would scale. I imagine there’s a limit there somewhere, but if you can keep the messages simple enough and keep it simple to remove things from the cache, it should scale a long way. For this particular app I imagine we’ll first need to split the business and view logic out to its own server and have the storage layer on a dedicated box (as most database setups are done); we can then have multiple front ends pulling from the one storage box. Later we’ll need multiple storage boxes and this synchronization layer. Having separated the storage component out first should mean fewer instances of the storage server and thus less strain on the Jabber connections.

    Naturally there’s a lot of profiling to be done in all this – we were just experimenting with something that seemed cool.

    September 9, 2007 at 7:41 am
  • Adrian Sutton says:

    Franck,
    JMS is probably a very good solution too, in fact it’s probably better – but it’s just not as cool. :) We weren’t seriously adding this to our product, just experimenting. Thanks for pointing it out though, I’ll keep it in mind if we ever do need to implement something like this.

    September 9, 2007 at 7:57 am
  • Stephan Schmidt says:

    @Adrian, thanks, I think I’ll play with that idea too. (Because it looks good, easy and cool).

    September 10, 2007 at 4:56 pm
  • Adrian Sutton says:

    Oh, and I should note: one other big advantage of Jabber is that it handles firewalls really well. Assuming, of course, that the Jabber server is on the open internet, you can have all your clients behind firewalls without problems. That’s particularly nice if, as in our case, you have an Australian office that doesn’t much like waiting for light to travel across the Pacific and back for every request. You can easily set up a local “mirror” which is actually completely up to date with the main server.

    September 10, 2007 at 6:44 pm
  • Abhinav Singh says:

    Interesting, but after reading this I wonder whether everything would be best done using XMPP. Why not apply it to everything we are using today, from mail protocols to replication methods?

    Or who knows, maybe this is the future ;)

    January 23, 2009 at 2:29 am
