Symphonious

Living in a state of accord.

How Much Bandwidth Do Search Engines Take Up?

There are an awful lot of search engines out there and they all try to index as much of the web as they can, as quickly as they can.  For this site, search engines seem to cause more traffic than anything else:

Top 20 of 720 Total User Agents
 #   Hits  % of hits  User Agent
 1   6866    12.63%   msnbot/1.0 (+http://search.msn.com/msnbot.htm)
 2   5600    10.30%   Googlebot/2.1 (+http://www.google.com/bot.html)
 3   3546     6.52%   Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/
 4   2633     4.84%   Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://sp.ask.com
 5   1706     3.14%   sna-0.0.1 mikemuzio@msn.com
 6   1120     2.06%   Mozilla/5.0 (compatible; BecomeBot/2.3; MSIE 6.0 compatible;
 7   1103     2.03%   NetNewsWire/2.0 (Mac OS X; http://ranchero.com/netnewswire/)
 8    806     1.48%   Krell-GeoScraper/0.1 libwww-perl/5.79
 9    754     1.39%   aipbot/1.0 (aipbot; http://www.aipbot.com; aipbot@aipbot.com)
10    750     1.38%   Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
11    746     1.37%   Planet HUMBUG +http://planet.humbug.org.au/ Planet/1.0~pre1 +
12    745     1.37%   Planet Linux Australia http://planet.linux.org.au Planet/0.2
13    744     1.37%   Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET
14    726     1.34%   Planet Apache +Unconfigured Planet Planet/1.0~pre1 +http://ww
15    702     1.29%   NewsGatorOnline/2.0 (http://www.newsgator.com; 1 subscribers)
16    673     1.24%   Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net
17    655     1.21%   NetNewsWire/2.0b45 (Mac OS X; http://ranchero.com/netnewswire
18    648     1.19%   Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gec
19    621     1.14%   NewsGator/2.0 (http://www.newsgator.com; Microsoft Windows NT
20    601     1.11%   Mozilla/5.0 (compatible; BecomeBot/1.86; MSIE 6.0 compatible;
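
For anyone wanting to produce a similar ranking from a raw access log, here is a minimal sketch in Python. It assumes the server writes the common "combined" log format, where the user agent is the last quoted field on each line; the function name `top_user_agents` and the sample log lines are made up for illustration and are not taken from this site's logs.

```python
import re
from collections import Counter

# Assumption: combined log format, which ends with "referer" "user-agent".
LOG_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def top_user_agents(lines, limit=20):
    """Return (agent, hits, percent_of_hits) tuples, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if match:  # lines that do not parse are skipped
            counts[match.group("agent")] += 1
    total = sum(counts.values())
    return [
        (agent, hits, round(100.0 * hits / total, 2))
        for agent, hits in counts.most_common(limit)
    ]


# Hypothetical sample lines, for illustration only.
MSNBOT = "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
GOOGLEBOT = "Googlebot/2.1 (+http://www.google.com/bot.html)"
sample_lines = [
    '192.0.2.1 - - [06/Jun/2005:20:30:00 +1000] "GET / HTTP/1.1" 200 5120 "-" "%s"' % MSNBOT,
    '192.0.2.1 - - [06/Jun/2005:20:31:00 +1000] "GET /a HTTP/1.1" 304 - "-" "%s"' % MSNBOT,
    '192.0.2.1 - - [06/Jun/2005:20:32:00 +1000] "GET /b HTTP/1.1" 304 - "-" "%s"' % MSNBOT,
    '192.0.2.2 - - [06/Jun/2005:20:33:00 +1000] "GET / HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT,
]

for agent, hits, percent in top_user_agents(sample_lines):
    print(f"{hits:6d} {percent:6.2f}%  {agent}")
```

Note that this counts hits, not bytes, so like the table it measures request volume rather than bandwidth as such.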

Admittedly, a lot of those hits will result in 304 Not Modified responses, but when you extend this to every site on the internet, that’s still a lot of HTTP requests being fired around.
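
How many of a crawler's hits really are Not Modified responses can be checked from the same kind of log. The sketch below is only an illustration under assumptions: it again presumes the combined log format, the helper name `not_modified_share` is hypothetical, and the sample lines are invented rather than taken from this site.

```python
import re

# Assumption: combined log format, i.e.
#   ... "request" status bytes "referer" "user-agent"
LOG_TAIL = re.compile(
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def not_modified_share(lines, agent_substring):
    """Fraction of hits from matching user agents that got a 304 response."""
    hits = 0
    not_modified = 0
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match or agent_substring not in match.group("agent"):
            continue
        hits += 1
        if match.group("status") == "304":
            not_modified += 1
    # Avoid dividing by zero when no line matches the agent.
    return not_modified / hits if hits else 0.0


# Hypothetical sample lines, for illustration only.
BOT = "Googlebot/2.1 (+http://www.google.com/bot.html)"
log_sample = [
    '192.0.2.2 - - [06/Jun/2005:20:30:00 +1000] "GET / HTTP/1.1" 200 5120 "-" "%s"' % BOT,
    '192.0.2.2 - - [06/Jun/2005:21:30:00 +1000] "GET / HTTP/1.1" 304 - "-" "%s"' % BOT,
    '192.0.2.2 - - [06/Jun/2005:22:30:00 +1000] "GET / HTTP/1.1" 304 - "-" "%s"' % BOT,
    '192.0.2.2 - - [06/Jun/2005:23:30:00 +1000] "GET / HTTP/1.1" 304 - "-" "%s"' % BOT,
    '192.0.2.9 - - [06/Jun/2005:23:31:00 +1000] "GET / HTTP/1.1" 200 5120 "-" "SomeBrowser/1.0"',
]

print(not_modified_share(log_sample, "Googlebot"))
```

A 304 response carries no body, so a high share here means the crawler costs many requests but comparatively little bandwidth.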

It also shows the power of syndication – almost no one actually reads this site directly (except the bots), yet lots of people link to, comment on or mention my posts (though I’m certainly no A-, B- or even C-list blogger). That keeps the bandwidth requirements down and still gets the message out – can’t complain about that.

  • Iain says:

    Similar statistics on my site (funny, that; it’s the same server ;-) ). Incidentally some of the hits (I’m looking at you, Yahoo Slurp!) are for the same documents over and over – and, largely in my case, they return 404 for a damned good reason, i.e. they don’t exist.

    June 6, 2005 at 8:30 pm
