Google Spam / Content Farm Filter

Submitted by tomo on January 21, 2011 - 3:06pm

There's been a lot of talk about the decrease in quality of Google search results over the years due to spammers / content farms with strong SEO skills. I'm glad I'm not the only one who's been annoyed by this.

Google should know which sites are spam, content farms, or duplicated content. That they aren't properly filtering or demoting them could be due to a conflict of interest - they make money from the ads on those crap sites.

But we, as individuals, can easily distinguish the spam results from the quality ones and we do so every day. If only there were a way to stop duplicating this effort.

If Google won't do this for us, then we can do this ourselves.

Here's what I want:
1. When I've been tricked into opening an ad-filled page without meaningful content, I want to go back to Google and mark that link as "spam", have that noted somewhere in the cloud so I can access it from any computer, and have future search queries filter out that link.

2. I probably don't want to see any pages from that domain show up on any other queries.

3. I probably don't want to see any pages that my friends have also marked as spam.

4. I probably don't want to see any pages that friends of my friends have also marked as spam.

5. I may even want to befriend / "follow" strangers just because they're good at marking spam.
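
To make the wish list concrete, here is a rough sketch of how points 1 through 5 could fit together. It is only an illustration, not an existing service: the names `marks`, `follows`, `blocked_domains` and `filter_results` are made up for this example, the sample users and `.example` domains are hypothetical, and treating the bare hostname as "the domain" is a simplification.

```python
from collections import deque
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def blocked_domains(user, marks, follows, max_hops=2):
    """Collect domains marked as spam by `user` and by anyone within
    `max_hops` steps of the follow graph (1 = friends, 2 = friends of friends)."""
    seen = {user}
    queue = deque([(user, 0)])
    blocked = set()
    while queue:
        person, hops = queue.popleft()
        # Points 1 and 2: one marked link blocks its whole domain.
        blocked.update(domain_of(u) for u in marks.get(person, ()))
        # Points 3 to 5: walk outward through friends / followed people.
        if hops < max_hops:
            for friend in follows.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    blocked.discard("")  # ignore marks that had no parseable hostname
    return blocked


def filter_results(results, blocked):
    """Drop every search result whose domain is in the blocked set."""
    return [url for url in results if domain_of(url) not in blocked]


# Hypothetical data: who marked which link, and who follows whom.
marks = {
    "me": ["http://spam.example/page1"],
    "alice": ["http://www.farm.example/a"],
    "bob": ["http://copy.example/x"],
    "carol": ["http://far.example/y"],
}
follows = {"me": ["alice"], "alice": ["bob"], "bob": ["carol"]}

blocked = blocked_domains("me", marks, follows)
results = [
    "http://good.example/article",
    "http://farm.example/other-page",
    "http://far.example/y",
]
clean = filter_results(results, blocked)
```

With `max_hops=2`, a mark by carol (three hops away) is ignored, so how far to trust the graph stays a single adjustable number.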

In spite of all the talk about the social graph and Facebook search undermining Google, I think that using one's social graph for negative search is more useful. It's more likely that your friends will mark the few content farms they run into rather than "like" every useful page on the Internet.

When I first wanted this, I couldn't find anything doing what I wanted. Google's discontinued SearchWiki would have been a good start. Now with all the recent discussion I know there's a Chrome plugin that will let you filter based on a blacklist that you manage. (Search Engine Blacklist for Chrome - https://chrome.google.com/extensions/detail/jiicbcimbjppjbckmoknagndlhjbeohb)

What's still missing is the collaborative / social aspect of the blacklist. This could be built independently of the existing plugin and still be useful, I think.

As an experiment, I've created this Google Docs spreadsheet that anyone can access and edit. A future step could be to hack the SE Blacklist plugin to add to this spreadsheet...

https://spreadsheets0.google.com/ccc?key=t3savC7CU4UDUsRbglF6UjA&hl=en#gid=0

I will slowly add to this list, then see about integrating it with browser plugins, and eventually Facebook.
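
As a sketch of what that integration might look like, the snippet below merges a shared list with a personal one. It assumes, purely for illustration, that the shared list has been exported as CSV text with one domain in the first column of each row; the function names are invented here, and fetching the sheet itself is left out.

```python
import csv
import io


def parse_shared_list(csv_text):
    """Read domains from the first column of CSV text, skipping blank rows."""
    domains = set()
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[0].strip():
            domains.add(row[0].strip().lower())
    return domains


def merge_blacklists(local, shared):
    """Combine a personal blacklist with the shared one, without duplicates."""
    return sorted(set(local) | set(shared))


# Hypothetical export of the shared sheet (second column is a free-form note).
sample = "ehow.com\nExperts-Exchange.com\n\n fixya.com ,note\n"
merged = merge_blacklists(["ehow.com", "spam.example"], parse_shared_list(sample))
```

Lowercasing and trimming each entry keeps the merged list stable even when many people edit the sheet by hand.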

admirza12 (not verified)

I hope you have a nice day! Very good article, well written and very thought out. I am looking forward to reading more of your posts in the future
casino games

tomo

Thanks Andreas. I saw the list on Techcrunch and added it to the spreadsheet.

acmelab68 (not verified)

Hi,
nice try ;-) no - seriously! I'm just too dumb to paste into your list. Here is what blekko.com already filters for you:

  • ehow.com
  • experts-exchange.com
  • naymz.com
  • activehotels.com
  • robtex.com
  • encyclopedia.com
  • fixya.com
  • chacha.com
  • 123people.com
  • download3k.com
  • petitionspot.com
  • thefreedictionary.com
  • networkedblogs.com
  • buzzillions.com
  • shopwiki.com
  • wowxos.com
  • answerbag.com
  • allexperts.com
  • freewebs.com
  • copygator.com

See also the link I've posted, it's also heavily related.
Thanks for your work and please keep this up.

Regards
Andreas

Lion Lotek (not verified)

I like it, Tomo. Make it happen!

© 2010-2014 Saigonist.