Ship Anytime - Is It Worth It?

May 15th, 2006

XP has the concept of keeping the code base in a state that you can ship at any time. That seems like a good idea as it allows you to quickly ship a new version to preempt a competitor's release or when marketing suddenly wants to attend a trade show and have something new to talk about. In other words, allow the business requirements to drive the release schedule instead of it being purely driven by the engineering team's schedule.

The problem, though, is that some features require a lot of work to make them shippable while they are still in development - and they take a long time to develop, so you want to integrate them into the project gradually. A typical solution is to create a branch in version control and develop the feature there, but that goes against the concept of continuous integration and causes significant pain when you try to merge the branches back together.

So you're left with a choice - either get the feature finished as quickly as possible, but not be able to ship while it's being developed, or potentially waste time keeping your code in a shippable state when your current plans don't require it to ship until after the feature will be complete. If we don't over-engineer our code by accommodating things we think we might require in the future, why should we over-engineer to accommodate maybe having to ship earlier than expected?

There is something of a middle road though. If you can identify what would be required to make the product shippable sooner, you can keep it as a backup plan. If for some reason you need to ship sooner than expected, you know you're two weeks' work away from making this feature shippable - but at least you don't have to complete the entire feature, which may take much longer. As long as you communicate that to the business, you can leave it up to the business to judge how important it is to be able to ship at the drop of a hat. Perhaps they will want you to complete part of the work to make it shippable now and leave part of it until it is actually required.

In the end it comes down to keeping your options open so that you can choose the path of most value when the time comes.

Pet Hate In Http Servers

May 13th, 2006

Pages created on .Mac however are the source of a never-ending headache. Indeed, whenever one requests a page on an account that no longer exists (such as the former .Mac FJZone account), the Apple servers dutifully serve a tri-lingual error page… all the while returning a “200 Found” code. In other words, as far as robots are concerned, .Mac pages live forever.

Francois Joseph de Kermadec

This has to be my single most hated server misconfiguration. The problem is much more serious than unwanted pages turning up in Google searches - any program that tries to download resources from the server without explicit user interaction gets bitten, because the server delivers a 404 page instead of the expected file without any warning. The client-side program can then only assume the file is corrupt on the server and give up.

This bites us from time to time with our spelling definitions file, which is downloaded from a URL specified in our configuration file. Being a configuration file, it's obviously possible for the system administrator to enter the URL incorrectly, and we get a 404. Unfortunately, it's also possible for the system administrator to configure us not to check for updates (after all, how often do you change dictionaries?), so we get a corrupt file and are then told never to check whether it has changed, even once the system administrator has put the file in the right place.

The other option is worse: we detect that the file is corrupt and assume it was corrupted in transit, so we try downloading again - only to get another corrupt file. You can put an upper limit on the number of attempts, but it would be so much easier if the server didn't lie to you.
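To make the retry problem concrete, here's a rough Python sketch of the client-side logic. The names (`download_with_limit`, `fetch`, `looks_valid`) are made up for illustration and aren't from our product; `fetch` is assumed to return a `(status, body)` pair and `looks_valid` is whatever sanity check you can run on the file.

```python
def download_with_limit(fetch, looks_valid, max_attempts=3):
    """Download a resource, trusting the status code first and the content second.

    fetch       -- hypothetical callable returning (http_status, body_bytes)
    looks_valid -- hypothetical callable that sanity-checks the body
    """
    for _ in range(max_attempts):
        status, body = fetch()
        if status != 200:
            # An honest server lets us fail fast with a precise reason.
            raise LookupError(f"server reported HTTP {status}")
        if looks_valid(body):
            return body
        # Status said 200 but the body is wrong: corrupted in transit,
        # or an error page served with a success code. We can't tell
        # which, so all we can do is try again.
    raise ValueError(f"gave up after {max_attempts} invalid downloads")
```

With an honest 404 the first branch fires on the first request. With a server that sends its error page as a 200, every attempt is wasted before we can give up.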

The biggest culprit for this misconfiguration is IIS - any time you set it up with a URL to use as a 404 page, it sends that page with a 200 OK message instead of a 404. Instead, you can select a local file (which can still be an ASP script etc) to use as your 404 page and it will correctly report a 404 error. I can only assume that it exhibits the same behavior if you set a custom URL for any of the other error codes.

The Challenge Of Intuitive WYSIWYG HTML

May 12th, 2006

I stumbled across the article This Is What You See, This Is What You Get the other day and it points out a number of common pitfalls for HTML editors that have relatively simple solutions, as well as repeating a number of common misconceptions about WYSIWYG editors - primarily that Word or Outlook should be considered good examples of how to do it.

Perhaps an obvious point. At least, the web is not WYSIWIG. What you see on your browser is almost certainly not what I see on mine due to many factors. Differing font sets, typographic capabilities of the OS, use of subpixel rendering, browser rendering engine/version, user display preferences such as screen resolution/depth, display gamma, as so on.

Actually, for most people things render pretty close to the same. Most people don't notice any significant difference between a document that uses subpixel rendering and one that doesn't, and screen resolution and display gamma are consistent enough not to cause problems anyway. Besides, none of these things are specific to the web: try transferring plain text documents between Windows and the Mac and you'll see the same differences, and the same goes for Word documents, pictures and pretty much any other type of file. The same problems occur when you try to print. The fact is that whenever you change display devices you are going to see things slightly differently - heck, even the time of day and lighting conditions will cause differences on the same device. That doesn't mean you can't edit in a WYSIWYG editor and be satisfied with the results.

The fact is, WYSIWYG makes it easy to get close to what you wanted and that's close enough for most people. I edit all my posts for this blog in a WYSIWYG editor and view the result on many different devices, with many different browsers and operating systems and have never had a problem with the way it looks. If I had a reason to be exceptionally pedantic about the way things came out I wouldn't be using HTML.

However, consider the case where a user has their default typeface set to Arial Bold, and the author of a page has chosen to use the same typeface for emphasis. In this case the emphasised words are no longer visually distinct from the remainder of the body text, and hence the emphasis is removed, potentially changing the message.

The user could just as easily make this mistake without a WYSIWYG editor, and in fact they are just as likely to - most people tend to use the I tag instead of EM because they want italic and not emphasis. Any decent HTML editor will use (or at least have an option to use) EM when the user clicks the italic button, thus preserving intent and displaying correctly in nearly every situation.

This is why HTML emphasises structural markup. You, as an author of a web page, have to understand the difference between using a bold typeface and the correct markup for emphasised text: the former is one possible representation of the latter. Web authoring is not word processing. The more you make your web authoring environment look like a word processor, the more likely it is that users will treat it as such.

Web authoring is content creation and styling, just as word processing is content creation and styling. You'll find Word more pleasant to use if you use its styles features instead of manually specifying the way you want things to look. Similarly, if you use the (CSS) styles features of any good HTML editor, you'll find it's easier and you'll get semantic markup instead of mixing style and content together.

You may be thinking OK, so why don’t we get wise to this structural markup stuff, then adopt a visual editor (I’m going to shy away from the term WYSIWYG at this point) because although we understand the concepts, we still don’t like all the angle brackets. Can we justify a visual editor in this case? In other words an editor that allows manipulation of structural markup without requiring the user to delve into the markup language syntax; sometimes known as WYSIWYM. I am hopeful, and there are some promising developments, but I have yet to see it done.

This is in fact precisely what any good HTML editor does - it uses structural markup whenever possible and designs its user interface so that applying the right markup is intuitive for the user. Even better are editors that allow you to remove UI elements that are display specific (things like the font face and font size selectors) and limit the options to semantic markup only.

It’s not an easy ask. The reason is that some markup elements affect the visual representation in subtle ways. Consider the difference between the <br /> and <p> constructs in XHTML for example. Each is a good way of terminating a line of text, but it’s not always easy to see on the screen which is actually present, under the covers. A more obvious example might be the <a /> tag which has no visual representation.

Actually, these problems are both easy to solve - give them a visual representation. In the first case, make enter insert a new paragraph - that's nearly always what users want when they hit enter. Make shift-enter insert a br. Then provide an option to display markers that make it clear what type of line break was used. Note that these markers are not hidden markup tags in the document. The document model should never include hidden markup, and it's quite simple to avoid this if you move away from thinking in terms of a DOM to thinking in terms of an attributed character array. The markers are provided just to make clear what kind of break the displayed whitespace represents - most users will never need or want them, but they should be available just in case.
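Here's a minimal Python sketch of that key handling, just to show the shape of it. The function names and the choice of characters to stand for each kind of break are my own assumptions for the example, not how any particular editor stores them.

```python
from html import escape

PARAGRAPH_BREAK = "\n"   # inserted by Enter (assumed representation)
LINE_BREAK = "\u2028"    # inserted by Shift+Enter (Unicode LINE SEPARATOR)


def handle_enter(text, caret, shift=False):
    """Return (new_text, new_caret) after Enter or Shift+Enter at caret."""
    ch = LINE_BREAK if shift else PARAGRAPH_BREAK
    return text[:caret] + ch + text[caret:], caret + 1


def with_markers(text):
    """Display-only view: the markers are drawn, never stored in the document."""
    return text.replace(PARAGRAPH_BREAK, "¶\n").replace(LINE_BREAK, "↵\n")


def to_html(text):
    """Serialize: paragraph breaks become <p> elements, line breaks become <br />."""
    paragraphs = text.split(PARAGRAPH_BREAK)
    return "".join(
        "<p>" + "<br />".join(escape(line) for line in p.split(LINE_BREAK)) + "</p>"
        for p in paragraphs
    )
```

The point is that the document only ever holds two kinds of break character. The markers exist purely in the display function, so turning them on or off can't change what gets saved.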

The a tag is even easier - give it a visual representation. We use a dashed blue underline and users seem to understand it immediately. Again, there is no hidden markup for the a tag anywhere; it exists only as attributes attached to the characters it wraps around. If it is an empty tag, as in this example, display a glyph there so the user can see it. WYSIWYG is not and has never really been What You See Is EXACTLY What You Get, it's simply about making the display match the user's mental model of the document instead of making the user visualize the effects of arcane markup in their head.
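For anyone wondering what an attributed character array looks like in practice, here's a small Python sketch. It's an illustration under my own assumptions (the `make_doc`, `apply_attr` and `to_html` names and the tuple-based attributes are invented for the example) rather than our actual document model.

```python
from html import escape
from itertools import groupby


def make_doc(text):
    """A document is a list of (character, attributes) pairs; no tags are stored."""
    return [(ch, frozenset()) for ch in text]


def apply_attr(doc, start, end, attr):
    """Attach attr, a (name, value) tuple, to the characters in [start, end)."""
    return [
        (ch, attrs | {attr}) if start <= i < end else (ch, attrs)
        for i, (ch, attrs) in enumerate(doc)
    ]


def to_html(doc):
    """Markup only appears at serialization time, one element per attribute run."""
    out = []
    for attrs, run in groupby(doc, key=lambda cell: cell[1]):
        html = escape("".join(ch for ch, _ in run))
        for name, value in sorted(attrs, key=lambda a: a[0]):
            if name == "a":
                html = f'<a href="{escape(value, quote=True)}">{html}</a>'
            else:
                html = f"<{name}>{html}</{name}>"
        out.append(html)
    return "".join(out)
```

A link is then just an attribute on four characters:

```python
doc = make_doc("see docs now")
doc = apply_attr(doc, 4, 8, ("a", "http://example.com/"))
print(to_html(doc))
```

The display layer can draw anything carrying the `a` attribute with a dashed blue underline without the caret ever having to step over an invisible tag.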

These types of problems multiply when it comes time to edit. Consider the (probably) most widely-used HTML editor today: Microsoft Outlook.

Ugh. Let's not hold Outlook up as if it were even a reasonable attempt at an editor. It's probably the most awful editing experience you are ever likely to find. If you use Outlook, stick to plain text emails. Regarding the hyperlink complaint, that's most likely because Outlook automatically applies hyperlinks when you type a URL - this annoys a lot of people, so they made it easy to remove the hyperlink again by hitting backspace at the end of it. This is just caused by the fact that Outlook's editor hasn't been carefully thought through and is just a bad example of a WYSIWYG editor - there's no reason it has to be like that.

An article at atpm.com describes some other limitations of WYSIWYG editors, including certain descriptive types of layout that can only be achieved by a powerful markup language. I’m sure it is still the case that the 15 year-old LaTeX can do things that state-of-the-art WYSIWYG page layout tools like Adobe InDesign cannot do. I think the situation is similar for XHTML editing; the advanced techniques described at a list apart (for example) aren’t going to be available in a visual editing environment any time soon.

It's true, hand coding HTML allows you to do more than WYSIWYG editors can. So what? The vast majority of users don't care. If you are editing a wiki page, a business document, a blog entry or pretty much any of the really common content creation tasks, you are concerned about content, not advanced layout techniques. The fact is that most users can do more with a WYSIWYG editor than with any markup language - the fact that technically minded people could do more doesn't matter, because the majority of people aren't technically minded and don't read A List Apart.

If you do however want to step outside the capabilities of the WYSIWYG editor, switch to the code tab and edit the HTML by hand. Outlook doesn't let you do it, but any half decent editor will.

So far I’ve just talked about the limitations of single-user visual editing. Remember this topic came up in the context of Wikis, where, like most multi-user authoring systems, differencing and merging are required operations. The types of problems mentioned previously become significantly more difficult when multiple users are editing at the same time.

Diffing HTML is hard, very, very hard. However, it's not the HTML that makes it hard - it's the fact that the content is generally natural language. I don't think I've ever seen a wiki that can do a decent diff of content - they don't understand the natural language well enough to determine what the intent of the changes was and display them appropriately. Amusingly, the best diff ability I've seen is in a WYSIWYG editor - Microsoft Word. It has track changes, so it knows exactly what you changed and how it happened, and it displays the changes very accurately without losing their meaning. So if you want to improve the diff capabilities of your wiki, try an editor that will track changes to a document while it's being edited and forget about trying to diff after the fact. Your users will thank you for it.
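As a rough illustration of tracking changes while editing, here's a toy Python sketch. It has nothing to do with how Word implements track changes; `Change` and `TrackedDocument` are hypothetical names, and a real editor would also record who made each change and when.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    kind: str       # "insert" or "delete"
    position: int   # offset in the text at the time of the edit
    text: str       # the text that was inserted or removed


@dataclass
class TrackedDocument:
    text: str = ""
    changes: list = field(default_factory=list)

    def insert(self, position, text):
        self.text = self.text[:position] + text + self.text[position:]
        self.changes.append(Change("insert", position, text))

    def delete(self, position, length):
        removed = self.text[position:position + length]
        self.text = self.text[:position] + self.text[position + length:]
        self.changes.append(Change("delete", position, removed))
```

Replacing "cat" with "dog" is recorded as exactly that, one deletion and one insertion at the same spot, instead of leaving a diff tool to guess which characters the two versions have in common:

```python
doc = TrackedDocument("The cat sat")
doc.delete(4, 3)
doc.insert(4, "dog")
print(doc.text)
print(doc.changes)
```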

I am typing this using the markdown syntax which I find very natural easy to read and write, but there are lots of others and not all of them are as elegant. Certainly more work is needed but I think the progress will be quicker in this area than with WYSIWYM editing tools.

Actually, even a very simplified syntax is going to turn users away. It's all about The Suck Threshold. If you force your users to learn a new syntax before they can use your wiki properly, they will feel incompetent and avoid it. If you give them a friendly user interface that they can immediately pick up and start creating great looking documents, you hoist them over the suck threshold and into the kick-ass range really quickly. Advanced users can still flick to the code view to do more, but everyone can get going without thinking about it. No learning time trumps a low learning curve. In fact, the original idea for wikis was to lower the barrier to entry - now the wiki syntax is the biggest barrier and it's simple to remove by putting in a good quality WYSIWYG editor. Just don't think you can put any old editor in and get the same benefit. It has to be reliable, intuitive and just do what the user means.

Clever text syntax rules only get you so far. Obviously the full range of possible XHTML tags can’t be easily, or intuitively, represented by text syntax. So it’s not a general solution to the problem of editing XHTML. However for fragments of XHTML embedded within a larger content management system, and for which a site-wide styling is provided, providing only a restricted subset of XHTML for authors is actually a good thing.

Apart from the fact that not being able to exercise the full power of XHTML was considered a weakness of WYSIWYG earlier in the article, most WYSIWYG editors that are designed to be embedded in content management systems allow you to select what is available to the user - you can remove everything except the CSS styles selector and force the user to just use your predefined CSS styles if you want. It's also generally a lot easier to tailor an editor to the specific needs of a system than to design a syntax that includes what you need and nothing more.

Ransom note typography is rampant amongst the technical documents that I see on most days. By providing maximum flexibility the existing WYSIWYG tools are also providing no incentive for the authors to use the tools within the boundaries of readability, or good taste. Sometimes it’s unintentional, as when copy-n-pasting from other sources.

Fortunately most WYSIWYG editors have options about how to paste content - you can paste as plain text and strip out the formatting if you want. In fact, with editors designed to be embedded in content management systems (as opposed to desktop apps where what the user wants, the user gets), you can usually configure them to strip formatting when pasting (but preserve structural markup, or just paste as plain old text).
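A paste filter of that kind can be sketched in a few lines of Python with the standard library's `html.parser`. The tag list and the `clean_paste` name are assumptions for the example and not any real editor's configuration. It's also not a security sanitizer: the contents of a pasted script or style element would come through as plain text.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist of structural tags to keep when pasting.
STRUCTURAL_TAGS = frozenset(
    {"p", "br", "em", "strong", "ul", "ol", "li", "h1", "h2", "h3", "blockquote"}
)
VOID_TAGS = frozenset({"br"})  # elements that take no closing tag


class PasteCleaner(HTMLParser):
    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Keep allowed tags, but always drop attributes (style, class, face...).
        if tag in self.allowed:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.allowed and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text always survives, re-escaped for output.
        self.out.append(escape(data, quote=False))


def clean_paste(fragment, allowed=STRUCTURAL_TAGS):
    """Strip formatting from pasted HTML, keeping only the allowed tags."""
    parser = PasteCleaner(allowed)
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

Passing an empty set for `allowed` gives you the paste as plain old text option, while the default keeps paragraphs, lists and emphasis but throws away fonts, colours and inline styles.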

I’m sure there are places in the world for WYSIWYG editors. But not for documents that are intended to be delivered to the web first (as most should be by now). Hopefully when WYSIWYM editors become more mainstream we’ll see WYSIWYG die out except for use by professional typesetters. One can only hope.

The distinction between WYSIWYG and WYSIWYM is an awful lot smaller than you think. In fact, I can't think of a single example of strict WYSIWYG - every editor I know has some form of special marking that doesn't display in the final output. I think you'll find there is already a very strong movement towards using structural markup in WYSIWYG editors over just formatting operations, and particularly towards making it easier for users to use styles, which is where the really big payoff comes. The problem, however, is not entirely with the tools. If users don't understand the benefits of separating content from display, they aren't going to put in the effort to do it, no matter how small a hurdle we make it. The only chance then is to make it unavoidable, and that requires mind reading, so while we may get it right some or even most of the time, it's not going to be perfect. The trade-off, though, is that if you insist on it users won't contribute content at all, so what's worse - content that is mixed with formatting or no content at all?

Getting Rid Of 5s Feels Good

May 11th, 2006

We've adopted a point based estimating system as part of rolling out XP, with 1 point being simple, 2 moderate, 3 difficult and 5 meaning too hard to estimate (either too big or too technically difficult). One large feature for the next release has been plagued with 5 ratings, both because the feature is big and hard to break down and because it's technically very difficult so we're not always sure parts are possible, let alone how long they'll take.

Today we reached a major milestone - we laid out all the critical stories that have to be done to make the bare minimum of the feature complete and managed to assign estimates to them that weren't just a bunch of 5s. Even better, a lot of them were given 0 ratings (too small to be a 1, but a number of 0s add up to a 1) because we can leverage work we've already done. I think we wound up with about 34 points of work to do (once the 0s were grouped together into 1s).

The new problem is that we don't know what our velocity is. Velocity is deceptively difficult to work out because unless your estimates are consistent your velocity is worthless and we're just starting to get used to estimating this way. It's made worse by the fact that there are so many other projects that go on in our engineering team and thus change our velocity from week to week, but we're making progress in managing it all better and working out how long it will all take.

Regardless, it's really good to know that the feature is technically possible, will be quicker to complete than we initially thought and we have some idea of how much longer it will take - about 34 points worth to get the basics.
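Once we do have a few iterations of consistent estimates, the arithmetic itself is simple. Here's a Python sketch where the completed-points history is entirely made up for illustration, since we don't have a real velocity yet.

```python
import math


def iterations_remaining(points_remaining, completed_per_iteration):
    """Estimate remaining iterations from the average points completed so far."""
    velocity = sum(completed_per_iteration) / len(completed_per_iteration)
    return math.ceil(points_remaining / velocity)


# Hypothetical history: 6, 8 and 7 points completed in the last three iterations.
print(iterations_remaining(34, [6, 8, 7]))
```

With that invented history the velocity is 7 points per iteration, so 34 points rounds up to 5 iterations. The hard part is making the inputs trustworthy, not doing the division.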

What Happened To The Delta Web?

May 9th, 2006

A while back I commented that I should look into the Delta Web project - I'm doing some work in this area now but nothing more seems to have happened. The mock schema for tracking deltas only works on an element level so is useless for describing changes to XHTML documents which is a shame.

Anyone know of any further progress or other related projects in this area?

Oops, forgot the direct link to the Delta Web proposal by Andy Roberts.