The Problem With OpenID

August 17th, 2007

Flow|State has an excellent semi-rant about how poor the user experience is when using OpenID - both signing up and logging in. In particular, the question of what happens to all your accounts when your OpenID provider disappears is a good one.

It so happens that I was looking into this just today, since I needed a user-friendly but secure authentication mechanism. OpenID seemed like a natural choice: I was effectively starting from scratch anyway, so why not use a standard? The main problem I had with OpenID didn't really come through clearly in the Flow|State article though - OpenID requires users to log in twice. The first login requires them to enter a URI, the second requires them to enter their password (or in some cases a username and a password). It's bad enough that most URIs are much longer than most usernames or even email addresses, but there's actually a page reload between the URI and the password. When was the last time you saw a webapp display the username and password fields on separate pages?

You can do a lot to make things easier on users of OpenID, but the whole protocol is based around the idea of having an extra page load between the URI and the password. You can sometimes avoid the problem if the OpenID provider's session hasn't expired yet, but you can't have those sessions hanging around too long or they become a security hole.
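To see where the extra page load comes from, here is a minimal sketch of the redirect step, assuming OpenID 1.1 style parameter names. The provider endpoint, identity URL and site URLs are made-up placeholders, and discovery of the provider from the user's URI is left out entirely.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_checkid_redirect(provider_endpoint, identity_url, return_to, trust_root):
    """Build the URL the site sends the browser to after the user types a URI.

    The password is only asked for on the provider's page that this URL
    loads, which is why the URI and the password end up on separate pages.
    """
    params = {
        "openid.mode": "checkid_setup",
        "openid.identity": identity_url,
        "openid.return_to": return_to,
        "openid.trust_root": trust_root,
    }
    # The endpoint may already carry a query string of its own.
    separator = "&" if "?" in provider_endpoint else "?"
    return provider_endpoint + separator + urlencode(params)


# Hypothetical values, for illustration only.
redirect_url = build_checkid_redirect(
    provider_endpoint="https://provider.example/openid/server",
    identity_url="https://alice.example/",
    return_to="https://site.example/openid/finish",
    trust_root="https://site.example/",
)

# Decode the query string again to show what the provider receives.
query = parse_qs(urlparse(redirect_url).query)
print(query["openid.mode"], query["openid.return_to"])
```

Page one collects the URI, the browser is redirected to the provider for the password, and only then does it come back to the return_to address. No amount of polish on the site's own form removes that middle hop.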

So in the end, the best thing we can do is to ignore OpenID and use the old-fashioned username and password per site. If we want to have a consistent online identity, provide a way to set your homepage URL and link to it wherever the username appears. You can still impersonate people, but OpenID doesn't really stop that anyway - worst case, use OpenID just to verify that the person's homepage is really theirs.

Link Incest

August 17th, 2007

I have to agree with "Jon": Gruber (John with an h) does have a habit of linking to sites that link to him, regardless of how good they are. For instance, this link list entry was a complete waste of time…

I wonder if there's any limit to it?

Solr Is Cool

August 17th, 2007

I've struggled with Lucene before and failed to configure it properly, resulting in absolutely horrendous search results, so a recent need to implement search functionality wasn't something I particularly wanted to take on. In fact, I was prepared to do almost anything to avoid delving back into Lucene filters and parsers and tokenizers and "stuff". This tends to be problematic given that Lucene is the dominant search library - so popular in fact that it's been ported to other languages.

So I took a look at Solr - a web services front end to Lucene. Exchanging Lucene APIs for HTTP requests seems like a good tradeoff to me, and Solr comes with a pretty decent configuration for Lucene right out of the box.
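As a rough illustration of what "HTTP requests instead of Lucene APIs" looks like, here is a small sketch that builds a search URL for Solr's select handler. The host, port and the title field are assumptions for the example, not details of my setup.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance; adjust to wherever yours runs.
SOLR_BASE = "http://localhost:8983/solr"


def select_url(query, rows=10, start=0):
    """Return the URL for a search against Solr's select handler."""
    params = {"q": query, "rows": rows, "start": start, "wt": "json"}
    return f"{SOLR_BASE}/select?{urlencode(params)}"


# "title" is a hypothetical field name from an imagined schema.
url = select_url("title:lucene")
print(url)
```

Fetching that URL with any HTTP client returns the results. The whole search API is a query string, which is a lot less to get wrong than a pile of analyzers and query parsers.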

As it turns out, Solr's default configuration isn't just pretty decent, it's also surprisingly well commented. Combined with some reasonable documentation, it was pretty straightforward to get Solr to do what I want and provide good search results without much effort. With a bit more effort I should be able to get search highlighting working as well, which takes search results to a whole new level.

Two things really made me appreciate the choice to use Solr:

  1. It has a DirectSolrConnection class that removes the need for actual HTTP requests. As a bonus, it still uses the HTTP URLs and returns the same responses, so if you later need to split Solr out onto its own server you just have to implement the HTTP stuff and not change the rest of your processing.
  2. It caches searches automatically.
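The first point is really a design idea: keep the request format identical and make only the transport swappable. Here is a sketch of that idea in Python. It is not Solr's Java API, and fake_in_process is just a stand-in for an embedded connection.

```python
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen

# A transport takes a path-plus-query string and returns the raw response body.
Transport = Callable[[str], str]


def http_transport(base_url: str) -> Transport:
    """Transport that sends the same path and query to a remote server."""
    def send(path_and_query: str) -> str:
        with urlopen(base_url + path_and_query) as response:
            return response.read().decode("utf-8")
    return send


def fake_in_process(path_and_query: str) -> str:
    """Stand-in for an embedded connection; it just echoes the request."""
    return f"<response for {path_and_query}>"


def search(transport: Transport, query: str) -> str:
    """Application code: builds the request and never cares how it is sent."""
    return transport("/select?" + urlencode({"q": query}))


result = search(fake_in_process, "solr")
print(result)
```

Moving the search engine to its own server then means passing http_transport("http://some-host/solr") instead of the in-process function. Nothing that calls search has to change.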

Caching is just so cool to see in action. Using the built-in search from Jackrabbit (which also uses Lucene), it was too slow to include the output of a search with each page (think related links, etc.). With Solr's caching this isn't a problem anymore.
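The effect is easy to picture with a toy example. This is not how Solr implements its caches, just a sketch of why a repeated search (such as the related links shown on every view of a page) stops costing anything. The slow_search function is a hypothetical stand-in for the real index lookup.

```python
from functools import lru_cache

# Counts how often the expensive lookup actually runs.
calls = {"count": 0}


def slow_search(query: str) -> tuple:
    """Hypothetical stand-in for an expensive index search."""
    calls["count"] += 1
    return (f"result for {query}",)


@lru_cache(maxsize=512)
def cached_search(query: str) -> tuple:
    """Same search, but repeated queries are answered from memory."""
    return slow_search(query)


# The same related-links query issued on two page views.
first = cached_search("related:openid")
second = cached_search("related:openid")
print(calls["count"], cached_search.cache_info().hits)
```

The second call never reaches slow_search. A real cache also has to be thrown away when the index changes, which is the part I'm happy to leave to Solr.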

There's still a bunch of learning to do so that I can get really optimal results. Getting searches to work over multiple fields properly, so that I can weight the results based on which field matched, would be good. I can see Solr can do it - I'm just not quite sure how to make it all happen. Still, the current search is way better than anything I've managed to do before. Thanks to the Solr and Lucene teams!
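One route that looks promising for the field weighting is Solr's dismax query handling, where a qf parameter lists the fields to search along with a boost for each. Here is a sketch of building such a request. The title and body field names and the boost values are made up, and I'm assuming a handler registered under the name dismax and selected with qt (newer Solr versions may select the parser with defType instead).

```python
from urllib.parse import urlencode


def weighted_query_params(user_query, field_boosts):
    """Build query-string parameters for a search across boosted fields."""
    # qf is a space-separated list of field^boost pairs.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return urlencode({"qt": "dismax", "q": user_query, "qf": qf})


# Hypothetical schema: a match in the title counts three times as much.
params = weighted_query_params("openid login", {"title": 3, "body": 1})
print(params)
```

I haven't tuned anything like this yet, so treat the boosts as placeholders to experiment with rather than recommended values.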