It soon became clear that John's Sitemap was part of the problem. It was in conflict with his robots.txt file. Skitzzo of SEO Refugee discovered the differences between a cached version (saved here also) of the file and the now drastically altered version.
The old robots.txt file disallowed Googlebot from his archived monthlies, feeds, trackbacks, files ending in .php and .xhtml, and any URL containing a question mark (?).
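Based on that description, the old rules would have looked roughly like the following. This is a reconstruction of the directives as described, not the literal cached file; the exact paths and the User-agent line are assumptions.

User-agent: Googlebot
# archived monthlies (the year/month paths here are assumed)
Disallow: /2006/
Disallow: /2007/
# feeds and trackbacks
Disallow: /feed/
Disallow: /trackback/
# anything ending in .php or .xhtml (Googlebot honors the * and $ pattern extensions)
Disallow: /*.php$
Disallow: /*.xhtml$
# any URL containing a question mark
Disallow: /*?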
What prompted the change in the first place? Jez found another article of John's on how to get your pages out of Google's supplemental index. At the time of that post, John had 1,790 supplemental results. After the robots.txt tweak, he has managed to remove 10 of those pages from the index. Good job John!
More importantly, the robots.txt tweak had another nasty side effect. Not only were pages being removed from the supplemental index, he was losing regular indexed pages as well. John had 3,190 pages in the index in total. The robots.txt change effectively wiped out 340 regular (non-supplemental) pages, and he is now down to 2,840. Excellent job John!
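It is easy to see how that happens. On a typical WordPress install (hypothetical URLs here, not necessarily John's actual ones), the homepage is served as /index.php and plenty of ordinary content hangs off query strings, so broad cleanup patterns catch pages you very much want indexed:

User-agent: Googlebot
# meant to kill supplemental junk, but on many WordPress installs the homepage
# itself resolves to /index.php, so this pattern drops it too
Disallow: /*.php$
# meant for duplicate query-string URLs, but it also matches /?p=123 permalinks,
# on-site search results, and anything else carrying a query string
Disallow: /*?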
But, John does not mention the robots.txt change in his post. Nor is there an update to the supplemental index post. Instead, he is trying to milk his secret for everything it is worth. And in John's case, he is looking for more money.
In John's own words: "I was going to use this post to explain exactly what I did to restore my number one ranking. However, after reading Kumiko’s comments in my Taipei 101 to number 1 post, I’ve decided against it. I think everyone will agree that this kind of information is extremely valuable - some “SEO Guru” tried to take me for $4,000 by saying he knew the answer (which I highly doubt since he made no guarantee)."

Whether this change in the robots file was the reason for John's return to number one, or whether it was just the Google update process taking a few days to settle down, is not the issue. People will probably be debating that for weeks to come.
What is an issue is that John seems to think he is onto something; I genuinely believe that. But I also know that John knows how dangerous his supplemental index post is and is afraid to admit it. Meanwhile that supplemental post is wrecking Google results for everyone who hangs on John's every word. John is not only evil, he is an egotistical bastard who obviously does not care about his readership. Grade-A job John!
Thanks for the mention. However, I have to disagree with you pretty strongly on one point. John's post about changing the robots.txt is not at all dangerous. Not only is changing your robots file OK, it's probably a good idea; you just have to be careful about what you eliminate. And just because he dropped the number of indexed pages does not mean it's a bad thing. When you start dropping pages that are RANKING and bringing you traffic, that's when it's a problem. In this case John managed to drop his index page, which is pretty much the worst case scenario.
ReplyDelete"In this case John managed to drop his index page which is pretty much the worst case scenario."
ReplyDeleteSo, that is dangerous then? And also a good point. How many of his readers do you think dropped their index pages as well?
Dropping over 10% of your non-supplemental indexed pages is not necessarily bad? Since when? I don't think you believe that for one moment.
I don't know, skitzzo. John still has not changed the supplemental post or recanted with another. And there are people out there who just wanna be like John. Sounds dangerous to me.
Hm... it all sounds pretty shady to me. And why not share such valuable information with readers? Am I missing something here? I thought blogging was about sharing knowledge and resources so that others may benefit. It's the whole "get rich helping others get rich" thing, the premise of which is what helped J. Paul Getty earn his fortune beyond what he was already born into.
Oh drat! I disallow the googlebot from my monthly index files at my knitting site. I did that because I used to post zillions of "how to's" and I wanted google to send people to the specific article they are actually seeking.
Is this sort of thing losing me PR?
Yes, you are losing PR by not allowing Googlebot to index your internal pages. Every internal link will be evaluated by the bot that crawls it.
Ignore the fact that it is supplemental. The reality is that it is in the index. Losing the potential of hitting on a search term is bad. Blocking Googlebot from indexing any of your pages is just bad practice.