Friday, September 3, 2004

Yahoo Black Hole: From Slurp to the Yahoo Index

I just reviewed the August logs at the Band of Gonzo Forums for the Yahoo Slurp spider and compared those results with the pages that actually made it into the search index. What I found was absolutely disgusting; I could not believe my eyes.

Last month the Slurp spider crawled 1196 pages in total. Out of all those pages, only 17 made it into the index at Yahoo Search. Of those 17 pages, only one was an actual forum topic page.
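For anyone who wants to run the same comparison on their own logs, here is a rough Python sketch of the counting. It is not the exact method used for the numbers above. It assumes the server writes the Apache "combined" log format, that the crawler can be recognized by the substring "Slurp" in the user-agent field, and that "pages crawled" means distinct paths fetched with a 200 status. The function names are made up for illustration. The indexed count still has to come from checking Yahoo Search by hand.

```python
import re

# Assumption: Apache "combined" log format. Adjust the pattern for other servers.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawled_paths(lines, agent_token="Slurp"):
    """Return the distinct paths the crawler fetched with a 200 status.

    agent_token is an assumed marker for the spider's user-agent string.
    """
    paths = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and agent_token in m.group("agent") and m.group("status") == "200":
            paths.add(m.group("path"))
    return paths


def inclusion_rate(crawled, indexed):
    """Percentage of crawled pages that ended up in the index."""
    return 100.0 * indexed / crawled if crawled else 0.0


# The August figures from this post: 1196 crawled, 17 indexed.
print(f"{inclusion_rate(1196, 17):.2f}% of crawled pages were indexed")
# prints: 1.42% of crawled pages were indexed
```

To use it on a real log, pass an open file to crawled_paths() and feed len() of the result into inclusion_rate() along with the number of pages you actually find in the index.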

Logs of other sites that I monitor are showing similar results: a whole lot of crawling, but the pages disappear into a black hole at Yahoo. Some pages get crawled more than once, yet they still do not make it into the search index.

According to Yahoo's information on inclusion into the index:

When Yahoo! Slurp crawls pages from your site, the pages are not instantly put into the Yahoo! search index. Once crawled, the documents will be considered for inclusion at the next database update. Pages that are indexed will be able to be seen at the conclusion of the update process.

What in the hell does "considered" mean? I know an update just occurred, because I can see some pages that they were awfully kind enough to "consider". Of course I answered "no" to the question "Is this enough information?"

Some pages can bear the delay in getting into Yahoo's search index (if they get in at all). But some pages cannot. Time-sensitive pages, such as information on current issues (SP2 troubleshooting, virus removal, etc.), need to be fresh in the search index -- after a delay of 3 or 4 weeks, entry into the index is a moot point.

I also deal with several commercial sites. New products and their entry into the index involve intense competition in the first few weeks of sales. If you are in there on Day 0 and nobody else is -- well, you know where that can hurt you. After the media blitz on the product dies down, you may be lucky to have one page in the index -- again, what is the point? The traffic is no longer there and you lost out on the lion's share of the blitz.

This raises questions about relevancy at Yahoo Search. It also calls the freshness and the credibility of search into question. What do they deem worthy of being "considered"? I have read the guidelines at Yahoo and, as far as I know, I have not broken any of the rules; those pages go above and beyond their quality requirements. Yet doorway pages seem to get a free pass through the black hole and go directly into the index.

I know this is not just a problem here; all the forums are abuzz with the same stories. Yahoo is broken and they are not moving very quickly to fix the situation.

I am keeping an eye on the sites in my sector to see if there is a pattern. You can bet your money on that. So far I have not seen one, except that spam appears to make it into the index quicker than legit pages do.

Am I suggesting spamming the Yahoo Index? Not yet. I will tell you one thing though: if that is what you have to do to get ahead at Yahoo Search -- you can fill in the rest here. It seems that if you cannot beat them -- join them.

My view on this does not necessarily reflect the views of the BoG Forums. But if you want to chime in on this, you can do so here.
