Saturday, June 12, 2004

Microsoft Introduces New Robots.txt Command

First there was GoogleGuy, then Tim, and now MSNdude has made an appearance at the WebmasterWorld forums. After an introductory comment about the politeness of MSN's spiders, addressed to the many users whose bandwidth was being pounded by msnbot, MSNdude made his first post:

"I also want to make folks aware of a feature that MSNBot supports, but which is not yet documented. We do support what we call a crawl delay. Basically it allows you to specify via robots.txt an amount of time (in seconds) that MSNBot should wait before retrieving another page from that host. The syntax in your robots.txt file would look something like:

User-Agent: msnbot
Crawl-Delay: 20

This instructs MSNBot to wait 20 seconds before retrieving another page from that host. If you think that MSNBot is being a bit aggressive, this is a way to have it slow down on your host while still making sure that your pages are indexed."

My lord, talk about an opening post. Think about it: they are suggesting you change your robots.txt file to suit their bot. Alexa, for instance, lets you pass certain website information by leaving a text file in the root of your domain for retrieval, and Grub works the same way.
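
For the curious, here is a minimal sketch of how a polite crawler might honor a Crawl-Delay directive, using Python's standard-library robotparser (its crawl_delay() helper is available in recent versions). The host, page list, and agent string below are made up for illustration:

import time
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Hypothetical host and page list, purely for illustration.
HOST = "http://example.com"
PAGES = ["/page1.html", "/page2.html", "/page3.html"]
AGENT = "msnbot"

# Read the site's robots.txt; the parser also records any Crawl-Delay.
robots = RobotFileParser(HOST + "/robots.txt")
robots.read()

# crawl_delay() returns the delay in seconds, or None if none is set.
delay = robots.crawl_delay(AGENT) or 0

for path in PAGES:
    if not robots.can_fetch(AGENT, HOST + path):
        continue  # robots.txt disallows this path for our agent
    page = urlopen(HOST + path).read()  # fetch the page
    # ... hand the page off for indexing here ...
    time.sleep(delay)  # wait Crawl-Delay seconds before the next request

The appealing part of the design is that the delay lives in robots.txt, so the site owner, not the crawler, decides the pace.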
