USCITY.NET has blocked Overture's AlltheWeb.com crawlers over questionable activity and for not playing by the rules.
"USCITY.NET has been crawled 24/7 for the past 3 months by the spiders of AlltheWeb.com, owned by Overture," said USCITY.NET's Mary Crawford. "Despite all of their interest in our site, USCITY.NET fails to show up in a search of their databases."
USCITY.NET has logged over 22 gigabytes of data pulled by AlltheWeb's crawlers attempting to harvest USCITY.NET's member links. AlltheWeb's dysfunctional crawlers also generated over 12,000 errors on USCITY.NET servers scavenging for pages that don't exist. Repeated attempts to contact AlltheWeb and Overture about its questionable crawling yielded no response.
"We've had enough," said Ms. Crawford. "We have no choice but to block AlltheWeb until it cleans up its act."
Read the full story at CBS Marketwatch
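Totals like the ones quoted in the release (data pulled by a crawler, errors generated by requests for missing pages) can be tallied from a web server's access log. The following is only a rough sketch of that idea, not the method USCITY.NET used. It assumes the Apache/NCSA "combined" log format, and the user-agent marker "ExampleCrawler" and the sample log lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: the server writes the Apache/NCSA "combined" log format.
# Adjust the pattern if your server logs differently.
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical sample lines; "ExampleCrawler" is a placeholder user agent.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2000:10:00:00 +0000] "GET /links.html HTTP/1.0" 200 5120 "-" "ExampleCrawler/1.0"',
    '192.0.2.10 - - [01/Jan/2000:10:01:00 +0000] "GET /missing.html HTTP/1.0" 404 - "-" "ExampleCrawler/1.0"',
    '198.51.100.7 - - [01/Jan/2000:10:02:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/4.0"',
]


def summarize_crawlers(lines, agent_markers):
    """Tally requests, bytes sent, and 4xx/5xx responses per crawler marker."""
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0, "errors": 0})
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for marker in agent_markers:
            if marker.lower() in agent:
                entry = totals[marker]
                entry["requests"] += 1
                size = match.group("size")
                entry["bytes"] += 0 if size == "-" else int(size)
                if int(match.group("status")) >= 400:
                    entry["errors"] += 1
                break  # count each log line against one marker only
    return dict(totals)


if __name__ == "__main__":
    for marker, stats in summarize_crawlers(SAMPLE_LOG, ["ExampleCrawler"]).items():
        print(marker, stats)
```

Matching on a user-agent substring is the simplest choice, but user agents can be spoofed, so a site owner might also want to compare the requesting IP addresses against the crawler operator's published ranges where those exist.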
Along Come the Spiders
There seems to be an ongoing pattern of activity from the now-defunct spiders of the search engines that Yahoo has acquired. Even the FAST Enterprise crawler has been crawling into places it never had before. Although the FAST Enterprise spider has nothing to do with Yahoo, its previous affiliation with AllTheWeb may mean it still has some residual contracts to fulfill.
Just as in the AllTheWeb incident at USCITY.NET, the FAST Enterprise crawler harvested one page per minute (as per their crawler FAQ) for a solid day and a half at the Band of Gonzos website. Attempts to contact FAST went unanswered, and the spider was blocked.
In a WebMasterWorld thread last month, the question was raised about the FAST Enterprise crawler showing up in numerous logs. The thread confused the FAST Enterprise division of FAST Search with AllTheWeb, now a Yahoo acquisition, and its spider, which was controlled by FAST Search.
Tim Mayer of Yahoo responded by saying, "You are confusing the crawlers from the old web/ATW division of fast which is now owned by Overture/Yahoo and the enterprise Division which is still a Norwegian Company. They are two different companies with different crawlers and end products." In the FAST Corporate FAQ, though, they list Overture as a "key customer". They do not call it the web for nothing, and Mayer's usual responses never confirm or deny anything; they tend to leave you with more questions than answers.
It is deeply concerning that there is a rise in spider activity from companies that no longer exist (on paper) or from companies that would not normally crawl a site that has not paid for inclusion. Finding out who owns the errant spiders, and what purpose they serve, is tough. Information on the FAST crawler was easy enough to find last month, but the page for it has now disappeared.
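The post does not say how the spider was blocked. One common first step is a robots.txt rule, and Python's standard-library `urllib.robotparser` can confirm that such a rule does what is intended. This is a minimal sketch under that assumption; "ExampleCrawler" is a placeholder token rather than any real spider's name, and `example.com` stands in for the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one crawler, leave everyone else alone.
# "ExampleCrawler" is a placeholder user-agent token.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleCrawler/1.0", "Mozilla/4.0"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://example.com/members/"))
```

robots.txt is advisory only. A crawler that ignores it has to be refused at the server or firewall instead, for example by user agent or IP address. For scale, one page per minute for a day and a half works out to roughly 2,160 requests.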
Read and discuss this story at Band Of Gonzos Forum