Thursday, February 14, 2008

Crawler Improvements for Live Search announced

Microsoft announced several improvements to its Live Search crawler that make it more efficient by reducing the bandwidth it uses while crawling and indexing a site.

HTTP Compression: HTTP compression allows faster transmission time by compressing static files and application responses, reducing network load between your servers and our crawler.
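
To illustrate what serving a compressed response involves on the web server side, here is a minimal Python sketch. It is not taken from the announcement: the function name `maybe_compress` is hypothetical, and it assumes the client advertises gzip support through the standard `Accept-Encoding` request header. Quality values such as `gzip;q=0` are ignored for brevity.

```python
import gzip


def maybe_compress(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers) for a response body.

    Hypothetical helper: gzip-compresses the body only when the
    client's Accept-Encoding header lists gzip. Quality values
    (e.g. "gzip;q=0") are deliberately not evaluated in this sketch.
    """
    # Split "gzip, deflate;q=0.5" into bare, lower-cased coding names.
    codings = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in codings:
        payload = gzip.compress(body)
        # Vary tells caches that the response depends on Accept-Encoding.
        return payload, {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    # Client did not ask for gzip: send the body unchanged.
    return body, {}
```

Repetitive markup usually compresses well, which is where the bandwidth saving comes from; in practice most web servers offer this as a configuration option rather than application code.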

Conditional Get: We support conditional GET as defined by RFC 2616 (Section 14.25); generally, we will not download the page unless it has changed since the last time we crawled it. As per the standard, our crawler will include the "If-Modified-Since" header and the time of last download in the GET request and, when available, the "If-None-Match" header and the ETag value. If the content hasn't changed, the web server will respond with a 304 HTTP response.
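
The decision a web server makes for a conditional GET can be sketched roughly as follows. This is an illustrative Python sketch, not code from Microsoft or from any particular web server: the function name `conditional_get_status` and its parameters are hypothetical, ETags are compared as plain strings, and the sketch assumes that `If-None-Match`, when present, is evaluated instead of `If-Modified-Since`.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(request_headers, etag, last_modified):
    """Return 304 if the client's copy is still current, else 200.

    Hypothetical helper. `request_headers` is a dict of header values,
    `etag` is the resource's current ETag string (quotes included),
    and `last_modified` is a timezone-aware UTC datetime.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # Assumption: the ETag check replaces the date check when present.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in candidates or "*" in candidates else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # Unparseable date: treat the request as unconditional.
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304
    return 200


# Example: the crawler sends back the date of its last download.
page_changed = datetime(2008, 2, 1, 12, 0, 0, tzinfo=timezone.utc)
crawler_headers = {"If-Modified-Since": format_datetime(page_changed, usegmt=True)}
status = conditional_get_status(crawler_headers, '"v1"', page_changed)
```

A 304 response carries no body, so an unchanged page costs only the headers of one request and one response.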

Microsoft also updated the crawler's user agent to reflect the changes; it is now "msnbot/1.1".
The post "Announcing Crawler Improvements for Live Search" on the Live Search Webmaster Center blog gives more details.
