Mike Arrington notes that Google is walling off its news garden and keeping other services from spidering it.
Mike is right to point this out as hypocrisy. Google is making money off of other people’s work and wants to have some exclusivity.
Imagine if they did this with Blogger. To tell you the truth, I’m shocked Google hasn’t behaved like this earlier.
Here’s why walled gardens are important to companies (and why we hate them).
Let’s say we were going to develop a competitor to Google News or TechMeme.
Now, I’ve been reading quite a few feeds. Probably about 1/4 as many as the Google News algorithm, but enough to understand how to build a competitor.
There are a few stages to figuring out what is important news.
1) Parse the various parts of a post and put them into buckets. For instance, look at this item about GigaOm getting a new COO. There’s the headline and the text. Then let’s look at the component parts. There’s who wrote it: “CEO Smack.” There’s the subject matter: “GigaOm” and “new COO.” Then there’s the body, where you can probably find a variety of other important terms: “San Francisco,” “Blog,” “Om Malik,” “Growth,” “Paul Walborsky,” “sales, operations, conferences,” “Hercules Technology Growth Capital,” and a link to TechCrunch.
2) Look at the inbound links. In this case there are none, but this is a ranking mechanism. More links means more important news.
3) Look at the comments. Are there any? How fast were they received? How many?
4) Use human news judgment on the source of the news (is CEO Smack more or less important than, say, News.com or TechCrunch?).
5) Look at how many times this story has been written about on blogs elsewhere in the past few hours.
6) Look for “news” words and phrases like “just released” or “new” or “beta” or “exclusive.”
7) Look for “news nouns” like “Microsoft,” “Google,” “Apple,” “Facebook” (or whatever company you want to track).
Are there other things to study? Not many.
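To make those seven stages concrete, here’s a rough sketch in Python of how they might roll up into one score. Everything in it is my own assumption for illustration: the Post fields, the word lists, the weights, and the source_weights table are made up, and I have no idea how Google News or TechMeme actually weigh any of this.

```python
import re
from dataclasses import dataclass

# Hypothetical word lists for steps 6 and 7; a real system would tune these.
NEWS_WORDS = ("just released", "new", "beta", "exclusive")
NEWS_NOUNS = ("microsoft", "google", "apple", "facebook")


@dataclass
class Post:
    # Step 1: the "buckets" a post gets parsed into.
    headline: str
    body: str
    source: str
    inbound_links: int = 0   # step 2
    comments: int = 0        # step 3
    related_posts: int = 0   # step 5: other blogs on the same story recently


def count_hits(terms, text):
    """Count how many of the terms appear in the text as whole words."""
    return sum(
        1 for term in terms
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )


def score(post, source_weights):
    """Combine the signals into one number. The weights are arbitrary guesses."""
    text = f"{post.headline} {post.body}".lower()
    signals = (
        2.0 * post.inbound_links
        + 1.0 * post.comments
        + 1.5 * post.related_posts
        + 0.5 * count_hits(NEWS_WORDS, text)
        + 0.5 * count_hits(NEWS_NOUNS, text)
    )
    # Step 4: human judgment about the source, expressed as a multiplier.
    return source_weights.get(post.source, 1.0) * signals


if __name__ == "__main__":
    weights = {"TechCrunch": 3.0, "CEO Smack": 1.0}
    item = Post("GigaOm gets a new COO", "Om Malik hires Paul Walborsky.", "CEO Smack")
    print(score(item, weights))
```

The point of the sketch is that none of the inputs are secret. The hard part is collecting them at scale and tuning the weights.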
So, how do you get a better display of news than anyone else?
1) Get readers to vote. AKA Digg.
2) Get readers to add comments. AKA new Google stuff.
3) Track readers’ clicking behavior. If you know everyone is clicking on Paris Hilton stories instead of some new software for Facebook, wouldn’t that be valuable to you?
4) Get readers to send in their own news.
Are there many other ways to get a better display?
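Here’s a small sketch of how those reader signals could reorder a front page. The story ids, the three counters, and the 3/2/1 weights are all hypothetical. Reader-submitted news (item 4) would simply show up as extra story ids in the list.

```python
from collections import Counter


def rerank(stories, votes, comments, clicks):
    """Order story ids by combined reader signals, strongest first.

    Counters return 0 for stories nobody has touched yet, so new
    submissions can be ranked without special handling.
    """
    def reader_score(story_id):
        # Arbitrary weights: a vote counts more than a comment or a click.
        return 3 * votes[story_id] + 2 * comments[story_id] + clicks[story_id]

    return sorted(stories, key=reader_score, reverse=True)


if __name__ == "__main__":
    stories = ["facebook-app", "paris-hilton"]
    votes = Counter({"facebook-app": 2})
    comments = Counter()
    clicks = Counter({"paris-hilton": 50, "facebook-app": 5})
    print(rerank(stories, votes, comments, clicks))
```

Notice that every input here comes from your own readers. A competitor can’t compute this ranking unless they can see your votes, comments, and clicks, or copy the finished ordering off your page.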
Now, how do you keep a better display? Easy. Wall it off. Keep your competitors from using the stuff that makes your display special. That way they’ll have to figure out a way to get it on their own rather than just spidering your display results and using that as a bootstrap to build upon.
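One common way to tell spiders to stay out is a robots.txt file. I don’t know what mechanism Google is actually using here, so treat the following as a sketch under my own assumptions: the rules, the “HouseSpider” and “RivalSpider” names, and the example.com URLs are invented. It uses Python’s standard urllib.robotparser to show how a well-behaved spider would read such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the site's own spider may crawl everything,
# every other spider is kept out of the /news/ section.
ROBOTS_TXT = """\
User-agent: HouseSpider
Disallow:

User-agent: *
Disallow: /news/
"""


def can_spider(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the rules allow this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("HouseSpider", "RivalSpider"):
        print(agent, can_spider(agent, "https://example.com/news/top"))
```

Keep in mind that robots.txt is only a request. It keeps out spiders that choose to honor it, so a real wall usually needs more than this.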
Personally, the more I look at it, the more I understand what Google’s doing.
How about you?
The reason we hate walled gardens? Exactly that: we can’t bootstrap off of them and build something better.
Oh, and we don’t like it that companies are making profits off of our work. It +is+ our work that is building TechMeme and Google News, isn’t it? Yes. So why not share the profits back with the people who are helping make the system special?
Evil or not? That is the question.