09 Sep 2005

Search Engine Results Scraper Sites

Earlier today I was showing someone a bunch of new sites (1 week to 3 months old). We were analyzing backlinks using Yahoo’s Linkdomain and found the usual. Each site showed a few "real" backlinks, and each also showed a handful of pages with Yahoo caches that were "Search Engine Results Scraper Sites" (with Big Ole AdSense on top (and side, and bottom)). None of these scraper pages had a Google cache. (I realize Google has issues too, but ya gotta say they do much better than Yahoo. Yahoo has a heftier amount of dmoz scraper shit and regurgitated search results shit.)

It’s so ironic that Google buys and sells advertising space on the shittiest of pages, and it’s partly at Yahoo’s expense, because Yahoo doesn’t do a very good job of filtering out tons of this regurgitated shit. These are pages that Google allows AdSense on but won’t allow into its own index?? I can just see the engineers at Google laughing at Yahoo: "Let them find our advertising on your indexed pages of crap! We’ll even pay people to spam your results".

I guess, though, that it’s not Google’s fault if Yahoo’s eating these pages up and serving them. After all, Yahoo wants the "biggest index" (20 billion pages??), which includes a few billion AdSense spam pages.

Like I said earlier, wouldn’t an easy filter be "kill any page with 3 dots at the end of each main paragraph"? (Not that that’s going to solve all problems, but it would clean up a lot at the moment.)
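Just to make the idea concrete, here is a rough Python sketch of what a filter like that might look like. To be clear, this is my own guess at it, not anything Yahoo or Google is known to run. The function names, the "at least 5 words counts as a main paragraph" cutoff, and the threshold are all made up for illustration.

```python
import re

# A paragraph "trails off" if it ends in three or more dots or a single
# ellipsis character. Two dots deliberately do NOT match.
ELLIPSIS_END = re.compile(r"(?:\.{3,}|…)\s*$")


def split_paragraphs(text):
    """Split plain text into non-empty paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def truncated_ratio(text, min_words=5):
    """Fraction of 'main' paragraphs (min_words or more words) ending in an ellipsis.

    min_words is an arbitrary cutoff chosen for this sketch.
    """
    main = [p for p in split_paragraphs(text) if len(p.split()) >= min_words]
    if not main:
        return 0.0
    hits = sum(1 for p in main if ELLIPSIS_END.search(p))
    return hits / len(main)


def looks_like_scraper(text, threshold=1.0, min_words=5):
    """Flag a page when the truncated share reaches the threshold.

    threshold=1.0 means every main paragraph must trail off, which is the
    literal reading of "3 dots at the end of each main paragraph".
    """
    return truncated_ratio(text, min_words) >= threshold
```

Feed it the visible text of a page and it returns True only when every main paragraph trails off. Lowering the threshold catches more junk but also more legit pages that simply shorten their quotes, so a real filter would presumably want other signals too.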

Actually, the reason for starting this post was that I just finished editing one of the sites we work on, which has a page of testimonials for that company. Several of the paragraphs in the testimonials had been shortened with the use of "…" and I just changed them all to 2 dots instead of 3, in case any filters do come this way ;)

--- Tool of the day: We Build Pages Cool Cache Tool ---
