Cache Synchronization With Jabber
Yesterday afternoon Suneth and I took on a research project to see how feasible it would be to keep server caches up to date by using XMPP to notify the other servers in the cluster of a change. Imagine a web server with some latency between it and the resources it's serving (e.g. it's using S3). To speed things up you'd want to cache the recently used or most commonly used resources locally on the server, but if you need to scale up to a cluster of servers and the resources are being changed, that cache becomes a problem: each server can end up serving its own stale copy.
The simplest solution to this is to set limits on the length of time that resources can be cached, but I'd prefer to avoid latency in updates as much as possible.
The other alternative is to have some kind of notification between the servers in the cluster so that all of them rapidly find out when a change has been made, much like database replication. This is what Suneth and I wanted to play around with, and we thought XMPP would be a quick way to get such a notification system going.
It's worth noting neither of us really knew anything about XMPP other than that it was the protocol underlying Jabber instant messaging and it was a generic XML messaging system. We figured we could use the generic nature of it to send whatever messages we needed and just take advantage of the existing servers to do the marshalling. Fortunately, our lack of knowledge led us right into a much simpler solution – just use Jabber.
We grabbed the Smack API and started playing with it and quickly discovered that sending and receiving messages was ridiculously easy. It turns out that the absolute simplest way you can minimize stale data in your caches is to have all the servers join a preconfigured chat room. Whenever they save a change to a resource, they send a message to the room with the unique ID of that resource, and whenever they receive a message from the room, they assume it's a unique ID and remove any cached versions of that resource.
It also makes testing easy: you simply open a Jabber client and join the chat room yourself.
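To make the chat room idea concrete, here is a minimal sketch of the invalidation logic in Java. It is not the code we wrote and it deliberately doesn't touch Smack: `InvalidationRoom`, `InMemoryRoom` and `CachingServer` are hypothetical names invented for illustration, and the in-memory room only stands in for a real Jabber chat room (assumed here to deliver each message to every occupant, including the sender). The shared `storage` map plays the part of the slow backing store.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

class CacheSyncDemo {

    /** Hypothetical stand-in for the chat room; a real one would wrap an XMPP library. */
    interface InvalidationRoom {
        void join(Consumer<String> onMessage);
        void send(String resourceId);
    }

    /** In-memory room: every message reaches every member, including the sender. */
    static class InMemoryRoom implements InvalidationRoom {
        private final List<Consumer<String>> members = new CopyOnWriteArrayList<>();

        @Override
        public void join(Consumer<String> onMessage) {
            members.add(onMessage);
        }

        @Override
        public void send(String resourceId) {
            for (Consumer<String> member : members) {
                member.accept(resourceId);
            }
        }
    }

    /** One server in the cluster, with a local cache in front of shared storage. */
    static class CachingServer {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Map<String, String> storage; // assumption: shared, slow store
        private final InvalidationRoom room;

        CachingServer(Map<String, String> storage, InvalidationRoom room) {
            this.storage = storage;
            this.room = room;
            // Any message from the room is assumed to be a resource ID to evict.
            room.join(resourceId -> cache.remove(resourceId));
        }

        String read(String resourceId) {
            // Serve from the local cache, loading from storage on a miss.
            return cache.computeIfAbsent(resourceId, storage::get);
        }

        void save(String resourceId, String content) {
            storage.put(resourceId, content);
            room.send(resourceId); // tell the whole cluster to drop its copy
        }
    }

    public static void main(String[] args) {
        Map<String, String> storage = new ConcurrentHashMap<>();
        InvalidationRoom room = new InMemoryRoom();
        CachingServer a = new CachingServer(storage, room);
        CachingServer b = new CachingServer(storage, room);

        a.save("page-42", "v1");
        System.out.println(b.read("page-42")); // prints "v1" (now cached on b)
        a.save("page-42", "v2");
        System.out.println(b.read("page-42")); // prints "v2" (b evicted its copy)
    }
}
```

The message carries only the resource ID, never the content, so a server that misses a message is merely stale for that one resource, and the next read after an eviction goes back to storage.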

September 8th, 2007 at 11:05 pm
Thought about that too some time ago but didn’t carry through. What server do you use? Does it scale to large sites with many cache writes?
Peace
-stephan
–
Stephan Schmidt :: stephan@reposita.org
Reposita Open Source – Monitor your software development
http://www.reposita.org
Blog at http://stephan.reposita.org – No signal. No noise.
September 9th, 2007 at 7:37 am
And what about JMS ?
http://www.enterpriseintegrationpatterns.com/ObserverJmsExample.html
September 9th, 2007 at 7:41 am
Stephan,
We used OpenFire (the open source version of WildFire), mostly because we’re developing a Java webapp and we already have an instance set up internally anyway. I honestly have no idea how well it would scale. I imagine there’s a limit there somewhere, but if you can keep the messages simple enough and keep it simple to remove things from cache, it should scale a long way. For this particular app I imagine we’ll first need to split the business and view logic out to its own server and have the storage layer on a dedicated box (like most database setups are done), so that we can have multiple front ends pulling from the one storage box. Later we’ll need multiple storage boxes and this synchronization layer. Having separated the storage component out first should mean fewer instances of the storage server and thus less strain on the Jabber connections.
Naturally there’s a lot of profiling to be done in all this – we were just experimenting with something that seemed cool.
September 9th, 2007 at 7:57 am
Fanck,
JMS is probably a very good solution too, in fact it’s probably better – but it’s just not as cool. :) We weren’t seriously adding this to our product, just experimenting. Thanks for pointing it out though, I’ll keep it in mind if we ever do need to implement something like this.
September 10th, 2007 at 4:56 pm
@Adrian, thanks, I think I’ll play with that idea too. (Because it looks good, easy and cool).
September 10th, 2007 at 6:44 pm
Oh, and I should note one other big advantage of Jabber: it handles firewalls really well. Assuming, of course, that the Jabber server is on the open internet, you can have all your clients behind firewalls without problems. That’s particularly nice if, as in our case, you have an Australian office that doesn’t so much like waiting for light to travel across the Pacific and back for every request. You can easily set up a local “mirror” which is actually completely up to date with the main server.
January 23rd, 2009 at 2:29 am
Interesting, but after reading this I wonder whether everything would be best done using XMPP. Why not apply it to everything we are using today, from mail protocols to replication methods?
Or who knows, maybe this is the future ;)