At the start of April, a Reuters news article came out which quoted Bob Visse, director of Marketing for MSN:
“We do view Google more and more as a competitor. We believe that we can provide consumers with a better product and a better user experience.”
Sounds ominous, doesn’t it? Many people took that to mean Microsoft would create their own search engine (instead of just using LookSmart, Inktomi and Direct Hit), but it seems things have happened a bit quicker than expected!
Yep, a few people have noticed a new robot or crawler indexing the Internet, and at the moment all signs point back to Microsoft.
Whilst it hasn’t yet hit my blog, I have been hit by it on one of my other sites with the following details:
131.107.163.49 - - [20/Apr/2003:12:54:56 +0100] "GET /robots.txt HTTP/1.1" 200 763 "-" "MicrosoftPrototypeCrawler (please report obnoxious behaviour to newbiecrawler@hotmail.com)"
The IP address 131.107.163.49 falls within the 131.107.0.0-131.107.255.255 netblock (in other words, 131.107.0.0/16), which is allocated to a certain Microsoft Corp of One Microsoft Way, Redmond, Washington, 98052, USA.
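If you want to check your own visitors against that range, here is a minimal sketch using Python's standard `ipaddress` module. The function name `in_netblock` and the constant `MICROSOFT_NETBLOCK` are my own labels for illustration; only the /16 range itself comes from the whois lookup described above.

```python
import ipaddress

# The /16 netblock quoted in the post (131.107.0.0 - 131.107.255.255).
MICROSOFT_NETBLOCK = ipaddress.ip_network("131.107.0.0/16")


def in_netblock(address: str, network=MICROSOFT_NETBLOCK) -> bool:
    """Return True if the given IP address string lies inside the network."""
    return ipaddress.ip_address(address) in network


if __name__ == "__main__":
    # First and last address of the block, to confirm the range.
    print(MICROSOFT_NETBLOCK[0], MICROSOFT_NETBLOCK[-1])
    # The address that fetched robots.txt.
    print(in_netblock("131.107.163.49"))
```

A /16 simply means the first 16 bits (131.107) are fixed, which leaves 65,536 possible addresses in the block.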
Using that information, I was then able to look at the logs again and saw quite a few page requests (I stopped counting after the 200th request made in the first 9 hours of today) from the IP address 131.107.65.225 (also owned by Microsoft) with the “Browser User-Agent” of “Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+.NET+CLR+1.1.4322)”.
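For anyone wanting to repeat this on their own logs, the search can be sketched roughly as follows. This assumes an Apache-style "combined" access log with plain ASCII quotes; the names `LOG_PATTERN` and `count_hits` are made up for this example, and the second sample line (from the reserved documentation address 192.0.2.10) is invented purely to show a non-matching visitor.

```python
import ipaddress
import re
from collections import Counter

# Rough pattern for one line of an Apache "combined" format access log.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_hits(lines, netblock="131.107.0.0/16"):
    """Count requests per (IP, user-agent) pair coming from the netblock."""
    network = ipaddress.ip_network(netblock)
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # not a combined-format line; skip it
        try:
            ip = ipaddress.ip_address(match["ip"])
        except ValueError:
            continue  # host field is a hostname rather than an IP address
        if ip in network:
            hits[(match["ip"], match["agent"])] += 1
    return hits


sample_lines = [
    # The robots.txt request quoted earlier in the post.
    '131.107.163.49 - - [20/Apr/2003:12:54:56 +0100] "GET /robots.txt HTTP/1.1" '
    '200 763 "-" "MicrosoftPrototypeCrawler (please report obnoxious behaviour '
    'to newbiecrawler@hotmail.com)"',
    # Invented line from an unrelated visitor, for contrast only.
    '192.0.2.10 - - [20/Apr/2003:12:55:01 +0100] "GET / HTTP/1.1" '
    '200 1024 "-" "ExampleBrowser/1.0"',
]

if __name__ == "__main__":
    for (ip, agent), count in count_hits(sample_lines).most_common():
        print(count, ip, agent)
```

Grouping by user-agent as well as IP is what makes the disguised requests stand out: the same netblock shows up under both the crawler's own name and an ordinary-looking browser string.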
So, it would appear Microsoft has launched a new spider/robot out on the Internet, and its name is MicrosoftPrototypeCrawler. For now, though, Microsoft seems to want to keep it slightly quiet: most of its requests hide behind a user-agent string (which states what sort of computer and web browser you are using) claiming to be Microsoft Internet Explorer 6 on Windows NT 5.2. Windows XP claims to be Windows NT 5.1, so I would guess the new crawler is pretending to be on Windows .NET Server / Windows Server 2003.
It remains to be seen whether the results from the crawler will be made public (or whether they are just for internal Microsoft development for some reason), and what effect it’ll have on the Internet and the way people search, especially considering that, according to Alexa Research, MSN.com is the 2nd most popular site worldwide (Google is only 5th). But I’m wondering why MSN/Microsoft is so concerned about trying to semi-hide the crawler for now, and why they are using a @hotmail.com address instead of a @microsoft.com one (the former doesn’t really command a lot of “respect” on the Internet, since anybody can get one for free).