The Problem With Atom

I’ve always liked the Atom spec. It’s neat and tidy, with strict rules about what’s valid and what’s not, and with all those rough corners and incompatibilities of RSS sorted out (well, mostly). If I run into one of the silly sites that offer both RSS and Atom, I pick Atom just because it feels right, even though both would work perfectly well for me. So it came as quite a surprise to me to discover a major weakness in the Atom spec - it’s a right pain to generate. Let me explain…

For various reasons, I’ve spent a fair bit of time over the last few months converting LiveWorks! from being hosted on WordPress to being a “real” website running on Lotus Web Content Management (IWWCM - yes I know the acronym doesn’t match the current name).

Anyway, one of the things that’s baked into WordPress because of its blogging culture is the concept of RSS/Atom feeds. IWWCM comes from an entirely different world, so the concept of feeds is quite foreign to it. Fortunately, it’s a very flexible system and has some simple tools you can throw at the problem. All you do is use a “menu”, which allows you to select content items from the repository according to certain criteria and order them any way you want. In this case, we simply order by published date. Menus in IWWCM have a head, a bit that’s repeated for each item, and a footer. So you throw all the XML from before the first item in the head, a template for each item in the repeated bit, and close off any remaining tags in the footer. Simple enough.

The problem is, Atom doesn’t follow that structure. Atom includes one key piece of the content items below in the head: the most recent update time. In order to generate that head section of the XML, you have to have the metadata about the most recent item in the system. In most cases that’s fine, and having that element makes parsing an awful lot simpler, but it’s a surprisingly annoying requirement. The IWWCM menu structure simply can’t handle this and so can’t generate a valid Atom feed.

RSS, on the other hand, wasn’t as nicely designed to make detecting changes simple. It doesn’t have to have that updated element in the head section, and so it’s perfectly suited to the IWWCM menu structure. So LiveWorks! has now switched over to RSS 2.0 and, apart from an implausible date due to some weird leaps around time zones and an article publishing a day early, its feed validates.
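To make the fit concrete, here is a minimal Python sketch of the head/item/footer model producing an RSS 2.0 feed. Everything in it is an assumption made for illustration: the names `HEAD`, `ITEM`, `FOOTER` and `render_menu` are hypothetical, and real IWWCM menus are configured inside the product rather than written as code. The point is only that no slot needs to know anything about an item other than the one it is currently rendering.

```python
from email.utils import format_datetime
from xml.sax.saxutils import escape

# Hypothetical stand-ins for the three menu slots (not IWWCM's real syntax).
HEAD = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<rss version="2.0">\n<channel>\n'
    "<title>{title}</title>\n<link>{link}</link>\n"
    "<description>{description}</description>\n"
)
ITEM = (
    "<item>\n<title>{title}</title>\n<link>{link}</link>\n"
    "<pubDate>{pub_date}</pubDate>\n</item>\n"
)
FOOTER = "</channel>\n</rss>\n"


def render_menu(channel, items):
    """Emit the head once, the item template per item, then the footer.

    `channel` is a dict with title, link and description strings.
    `items` is a list of dicts with title, link and a datetime `published`.
    """
    # The head only uses channel-level values, never item data.
    parts = [HEAD.format(**{key: escape(value) for key, value in channel.items()})]
    # Order by published date, newest first, as the menu does.
    newest_first = sorted(items, key=lambda item: item["published"], reverse=True)
    for item in newest_first:
        parts.append(ITEM.format(
            title=escape(item["title"]),
            link=escape(item["link"]),
            pub_date=format_datetime(item["published"]),  # RFC 2822 style date
        ))
    parts.append(FOOTER)
    return "".join(parts)
```

Because the head is rendered before any item is looked at, this shape has nowhere to put a value derived from the newest item, which is exactly what Atom asks for.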

This isn’t the first time I’ve cursed Atom for this either - at least for me, it just seems so natural to follow the simple iterator pattern that IWWCM’s menus use, so I’ve run into this a few times. Mostly it just takes an extra if statement or similar to special-case the first item, but every so often it requires some major reworking and, in cases like this, it’s just about impossible to do.
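For the easy cases, the extra if statement might look something like the Python sketch below. It is an illustration under assumptions, not how any particular system does it: `render_atom` and its parameters are hypothetical names, entries are assumed to arrive newest first with timezone-aware `updated` datetimes, and using the generation time for an empty feed is simply a choice made here.

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def render_atom(feed_title, feed_id, author, entries):
    """Yield chunks of an Atom feed while iterating over the entries once.

    `entries` is an iterable of dicts with id, title and a timezone-aware
    datetime `updated`, ordered newest first (an assumption of this sketch).
    """
    yield '<?xml version="1.0" encoding="UTF-8"?>\n'
    yield '<feed xmlns="http://www.w3.org/2005/Atom">\n'
    yield f"<title>{escape(feed_title)}</title>\n"
    yield f"<id>{escape(feed_id)}</id>\n"
    yield f"<author><name>{escape(author)}</name></author>\n"

    first = True
    for entry in entries:
        stamp = entry["updated"].isoformat()  # RFC 3339 compatible timestamp
        if first:
            # Special case: the newest entry also supplies the feed-level
            # updated element, emitted before any entry element.
            yield f"<updated>{stamp}</updated>\n"
            first = False
        yield (
            "<entry>\n"
            f"<id>{escape(entry['id'])}</id>\n"
            f"<title>{escape(entry['title'])}</title>\n"
            f"<updated>{stamp}</updated>\n"
            "</entry>\n"
        )

    if first:
        # No entries at all: fall back to the time the feed was produced.
        yield f"<updated>{datetime.now(timezone.utc).isoformat()}</updated>\n"
    yield "</feed>\n"
```

This only works when the code doing the iterating can emit something extra on the first pass. A template with a fixed per-item slot has no such hook.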

That’s trade-offs for you though. If that updated element wasn’t required, there’s a whole bunch of cool stuff on the consumer side that wouldn’t work. Oh well.

6 Responses to “The Problem With Atom”

  1. Sam Ruby Says:

    There would not be anything invalid about producing an Atom feed where feed/updated is always time.now (as in, the time the feed was produced).

    A possibly related question: does IWWCM produce ETags and/or Last-Modified dates, or does it recompute the entire feed on every request?


  2. Adrian Sutton Says:

    Yeah time.now isn’t available either but that’s good to know for most other situations. I’m sure I could do some more Apache regex foo as well but RSS 2 is simpler and works just as well for this use case.

    IWWCM has a whole swag of caching stuff that I haven’t delved into yet, but by default it recomputes the feed every time. The feed IWWCM generates is only fed into FeedBurner, and FeedBurner is smart enough to do its own caching etc., so the impact is minimal. The bigger problem is the lack of caching on a number of resources like JavaScript files etc. - at least the theme images are served directly from Apache so are fairly cacheable. Overall though, the site is a pretty spectacular example of bad caching practices - just like it was when it was running on WordPress.


  3. Asbjørn Ulsberg Says:

    How can the IWWCM menu structure not handle the “updated” element? Can’t you drive it through XSLT or otherwise manipulate the XML structure somehow? The “updated” element is there, as you write yourself, for easy consumption.


  4. Adrian Sutton Says:

    You could post-process it using a reverse proxy server type setup etc., but IWWCM itself literally has 3 fields: start, item and end. You can only refer to specifically named content items in the start and end sections, and the item section is repeated for every content item that gets output. You can generate the updated element for each item because in that field you have access to the current item, but you don’t have access to the first item in the start field.

    You also can’t create the updated element in the item field because then every content item would generate it, giving you something like:

    I think you could do it by adding a second menu that only ever output 1 item (the most recent), and you could include that menu by name in the start section. With the way IWWCM iterates through content to find items to include in menus though, that would be almost equivalent to generating the entire feed twice for each request. It’s also way more complex than you’d want just so that you can have an updated element.

    Oh and there’s no XML structure in IWWCM - you’re just outputting text that happens to be XML when it all comes out. IWWCM doesn’t know it’s generating XML - and in fact it declares the content type as text/html which is rather unfortunate.


  5. Asbjørn Ulsberg Says:

    You make me feel like the happiest man alive for not having to poke around with IWWCM. Sounds like a pretty horrifying CMS to me. It sure seems enterprisey, though. Whether that’s a positive or negative label is in the eye of the beholder, I guess. Personally, I hate everything purporting to be “Enterprise Software”. Which is why I’d choose WordPress (or Drupal) any day over multi-million dollar solutions. At least they’re FOSS, so if they don’t match my needs exactly, I can tweak and bend them in any direction and way I want.


  6. Adrian Sutton Says:

    Having used both WordPress and WCM for the same site, there are trade-offs both ways. WCM doesn’t have feeds baked in, but WordPress lacks workflow. WCM is difficult to set up but extremely flexible. WordPress is fast to get up and running, but in the end it took as much effort as WCM to get some of the elements of the site working, and they were far less maintainable. I’d have the same horror story for you about getting dynamic sidebar menus in WordPress, or the plugins listing on the front page, or the changelog on the early access stuff. I’ve essentially had to develop plugins and custom code for both WCM and WordPress.

    Drupal is probably a good middle ground and is likely to be the ideal type of system for LiveWorks!, but we do a huge amount of partnering with IBM and want to dogfood our integration with their products. Besides which, I’ll sound a lot more intelligent talking to WCM clients now that I know how the whole system works in the real world, not just the editor bit of it.

