Google and Technorati blog search rewards sploggers
I find it ironic that in a search of who is linking to Mashable’s report about StumbleUpon, a splogger is there using MY blog post (a splogger is someone who uses a system to automatically copy blog posts from other people). But notice that neither Google nor Technorati has my own blog post displayed right now. Here, take a look. Technorati at 12:18 a.m. (Pacific Time) has Naik’s News. But not me. That’s a splog, and it is copying my content. I don’t care about that. But I do care that blog search engines reward jerks like this.
Google’s blog search does it too. Naik is there WITH MY FREAKING POST but I’m not.
Ask gets it right. Yeah Ask!
Why hasn’t Technorati fixed its splogging problems yet? Can’t they see that duplicate content is coming off of my blog? And, Google, come on. It’s amazing that with all your PhDs over there you can’t do a better job with blogs than you’ve done so far. (I keep hearing that the engineers at Google are bored with blogs and RSS and stuff like that and would rather work on video compression systems or something “sexy.” That attitude will get me to switch to your competitors, especially when they do a dramatically better job like Ask is doing.)
Please Matt Cutts, get rid of Naik. Same for you Dave Sifry. Thanks.

December 14th, 2006 at 2:09 am
Looks like Technorati will get sued in the near future over the PigeonRank infringement?
December 14th, 2006 at 2:20 am
I’m a little surprised. I thought this is what technologies such as Google Page Rank were supposed to resolve.
Personally, I take a lot of time to go out and take photos, conduct interviews and capture video. It takes a long time to do! I care if my work is acknowledged!
December 14th, 2006 at 2:25 am
Ah, I’m very familiar with Naik. What’s interesting about Naik is he has a self portrait icon in his Technorati profile. And he’s smiling!
December 14th, 2006 at 2:34 am
I use blog search engines heavily to find content to put up on my blog, and Technorati has got to be one of the best. Bloglines and Blogdigger are my top two.
Sphere is GRADE A shit
Google blog search is good too, not a lot of spam, but not a lot of content either, sigh.
December 14th, 2006 at 2:43 am
Maybe you should try this cloaking technique for sploggers. Worked great for me!
Here are two useful links (especially check the second one):
http://www.plagiarismtoday.com/?p=287
http://seoblackhat.com/2006/07/14/ip-delivery-to-stop-rss-content-thieves/
December 14th, 2006 at 3:10 am
Even this post has made it to Naik’s site. And it is listed at the top on Technorati.
December 14th, 2006 at 3:10 am
Ooopsss, probably Matt Cutts and the guys have to tweak PageRank with algorithms to take care of plagiarism… speaking of next-generation search engines, probably this is what we need…
December 14th, 2006 at 3:54 am
It might be because Naik is pinging those services on every RSS grab; that’s why his content is listed on those services before yours is crawled.
December 14th, 2006 at 4:00 am
[...] Rick Hurst: am I a splogger? By Yoda A new term for my web buzzword vocabulary today: “splogger” - a term I just saw on Scobleizer, apparently used for “someone who uses a system to automatically copy blog posts from other people”. I am doing something similar with the “skatevine” page on dfr skate zine, but I prefer the term “news aggregator”. I’m not profiting from this as I don’t currently carry advertising on DFR - my reasons for doing it were basically:- [...]
December 14th, 2006 at 5:31 am
I try not to just copy people’s content outright but to include maybe a one- or two-sentence quote and a link back to their post when referencing someone else’s blog. I figure I should be sending the traffic their way.
December 14th, 2006 at 5:41 am
Why not do an article on what is considered fair use for archiving, personal use, offline reading, etc etc. instead of pissing and moaning about someone copying content.
Better yet come up with a creative proactive solution to keep it from happening in the first place; then, only a creative hacker could grab your content.
Just a thought… with all those six-figure PhDs running around. Put the creative juices to work on creative fixes.
Please read comment through a softening filter before responding.
December 14th, 2006 at 5:47 am
Robert,
Hope you will be able to make it out to Demo at the end of January. We launch Iwerx and Sentinel there. Sentinel would monitor your content and pin-point the sploggers for you.
Now, if we could get some weight with Technorati and Google to have them use our blacklist via API to prune their lists, life would be good.
December 14th, 2006 at 5:59 am
What Would Niall Kennedy do? :)
In all seriousness though… Have you phoned him or contacted him in some way (He’s studying in Orlando according to his own profile page)? I’m sure you’ve rec’d (or know someone who has rec’d) a cease and desist. Serve him with one of those if you’re feeling particularly strong about this to defend your material. (The bad precedent you might be setting is a lack of defending your copyright. Maybe you’ve already done all this and court’s not particularly high on your to do list. Dunno?).
Looks like he’s grubbing a lot of other content too, from Engadget, etc. What’s the temperature of the other sources he’s filching from? This topic might make a good future post to flesh out some of the issues.
December 14th, 2006 at 6:11 am
“I keep hearing that the engineers at Google are bored with blogs and RSS and stuff like that and would rather work on video compression systems or something “sexy.” ”
I hope this isn’t true, because I think this is what has led to most of Microsoft’s problems.
December 14th, 2006 at 6:12 am
I have been wondering when people will realize that Google’s ranking system leaves much to be desired.
For example, try a search for “Michigan” or “Michigan news”. Now tell me that the site at AbsoluteMichigan.com doesn’t match either of those searches a whole lot better than 7 different University of Michigan departments or the Michigan Indymedia Collective, which has been defunct for a year.
Don’t get me wrong, Google is still very good. It is (however) subject to being gamed by sploggers and favors old sites over newer sites.
December 14th, 2006 at 6:45 am
One way to look at this is that his site is no different than Google Reader.
Or your own link blog via Google Reader.
December 14th, 2006 at 7:34 am
I know how you feel, but we live in a Web 2.0 world now and most content on the Internet is stolen. Most technology blogs are just reproducers of content copied from Digg.
December 14th, 2006 at 8:12 am
@Ross
Google Reader doesn’t ping other services with the RSS content though. Just as well really, or you’d get some pretty nasty recursion. :)
Pinging update services with an exact copy of an existing post is pretty pointless; it’s just duplication.
December 14th, 2006 at 8:18 am
[...] Robert Scoble is up in arms about how some of this is handled. How is this splogger monetizing his traffic? AdSense does not appear to be present, which can be a root cause. [...]
December 14th, 2006 at 9:45 am
The problem isn’t the different search engines, so much as the advertising revenue which usually propels the creation of such sites. Search phrase “google must shape up” provides more info:
http://www.google.com/search?q=%22google+must+shape+up%22
This Naik’s site doesn’t get direct ad revenue, but instead uses a blogroll to boost the PageRank of a chain of auto websites. (“Blogroll” was a technique used by early A-listers to boost the traffic of friends, regardless of any currently useful content they might offer; it would have been good if spiders had filtered out such manipulations instead of rewarding them.)
Removing the bad incentives seems a surer cure than patching the various search engines to handle each new hack as it arises.
There’s another problem in this Naik’s case: the hotlinking of images back to the source site. Why should Gizmodo pay this guy’s bandwidth costs? (For the record, Niall Kennedy made me laugh out loud last week.)
jd
December 14th, 2006 at 9:57 am
Interesting that a lot of you don’t understand the difference between Google’s main search engine (at google.com) which uses PageRank and Google’s blog search engine (at blogsearch.google.com) which does NOT use Pagerank. Blogsearch is about speed. Web search is about relevancy.
The thing is that it’s weird that blog search engines see a copier before they see the original.
Anyway, I really don’t care to chase people like Naik all over the Internet. I’m sure he’ll see my comment here if he cares about his reputation.
December 14th, 2006 at 10:05 am
I write a blog for LockerGnome and this happens quite a bit to me, even with a fairly new blog there. It is annoying, not about being copied so much as giving them revenue or whatever it is anyone thinks they get from Adsense. I’ve been reading that sites which are splogs and have no real content other than Adsense will be banned from some web directories. I think this is great. They should be banned, they offer no original content.
December 14th, 2006 at 12:03 pm
Hey Robert: I caught one guy using the WordPress platform splogging my posts and collecting AdSense dollars (one disgusting splog was about the recent death of my mother). I joined the forum at WP about the same splogger; their techs say there is really nothing they can do when someone uses their platform, but certainly they could make an agreement with WP users! WP doesn’t like it either; read the vitriol. The WP forum on this is fervent. WP suggested going after their hosts and stated that Google does not give a c**p! A string of code that catches sploggers would be smart. Hello, Google PhDs! It’s copyright infringement at its best, and it’s disgusting when the splogger is collecting ad revenue from venerable web behemoths like Google AdSense for it. Google is effectively getting revenue for stolen content; it must police those who sign up for AdSense better and bind them up front. Then if they break copyright law, Google can pull their advertising agreement. That should kill sploggers’ motivation. WP also suggests victims pursue the advertisers. That’ll get Google’s attention!
December 14th, 2006 at 12:13 pm
Passed this one on to the blogsearch folks.
December 14th, 2006 at 12:57 pm
I’ve been pushing the word ‘sprophylactic’ to describe the kind of measures that search engines could take: http://www.reevoo.com/blogs/bengriffiths/2006/12/07/sprophylactic/
December 14th, 2006 at 4:03 pm
I wrote about this back in September… and even got to chat with someone (wish I could remember who) from Technorati. They could easily tie in to one of the splog reporting sites:
http://www.douglaskarr.com/2006/09/28/splog-your-way-to-a-higher-technorati-rank/
Perhaps Technorati should only rank sites based on connections from other ‘claimed’ blogs on Technorati.
Doug
December 14th, 2006 at 9:51 pm
Thanks Matt! I owe you a nice steak dinner.
December 14th, 2006 at 10:36 pm
That site should now be gone from your future results. This is a constant battle between freshness and relevance, something that we’re working on daily. Thanks for the report.
Dave
December 14th, 2006 at 11:37 pm
By the way, are you sure that you have pinging set up correctly Robert? We aren’t seeing any pings from you via scobleizer.com or via scobleizer.wordpress.com…
Dave
December 15th, 2006 at 12:55 am
David: I don’t know. I’ll check into it. Thank you so much for removing Naik!
December 15th, 2006 at 1:42 am
Mr. David Sifry,
Can you point to a full explanation of “pinging set up correctly?”
Thanks.
December 15th, 2006 at 11:39 pm
Didn’t Google also just finish dancing? Does that make a difference on the Google side of this irksome search results issue? I just wish shaunblog.com would not come up on Google at #2 in a search for Shawn Honnick!
February 28th, 2008 at 7:01 am
“That attitude will get me to switch to your competitors”
This is damn funny. How would that help you, and why would Google care if your small blog is in their index or not? Considering that Google has something like 70-80% of the search engine market, good luck going to competitors.