We need better statistics…

Almost every entrepreneur I talk to lately whines privately about the stats they see on places like Compete.com, Comscore, and Alexa. Today Tom Conrad of Pandora told me that the numbers those services report for Pandora are extremely low. He says his service requires registration, so he has very accurate stats of who’s signed into Pandora, and he can’t figure out why the stats services are so far off from the real numbers.

Marshall Sponder, over on Web Metrics Guru, looks into Comscore’s stats of Second Life’s users and finds the same problem.

The thing is these services rely on toolbars (I can’t even use any of the toolbars on the Macintosh for some reason, and how many of you even have one of these folks’ toolbars loaded? None of my friends do and I’ve been checking). Or they rely on “panels” of Web users that they survey regularly. Do you know the selection mechanisms? How do they know they are getting a representative sample? Clearly very few people who run Web companies find their stats accurate. Yet we’re supposed to believe in them?

Also, there are lots of sites that seem to have more traffic than, say, my blog, but they get fewer comments on every post and if we both link to someone new, the new site gets a lot more traffic from me. I have such a site in mind, but I don’t want to get into an argument with that site. Translation: the engagement levels on some blogs are quite different, but advertisers are being sold on these stats companies and on pure “uniques.”

I don’t know what the solution is, though. What stats do you think are the most important? What’s the most accurate way to measure your sites’ visitors? What will advertisers insist on seeing in the future?

Oh, and in the future people aren’t going to visit your page at all. Most of PodTech’s traffic comes from its embeddable gadget. So, are you visiting a blog that has our gadget embedded when you watch one of my videos or are you visiting PodTech? I bet most normal people will answer “a blog.” That’d mean that PodTech’s traffic will get way underrepresented in these services (which matches what we’re seeing in our server logs when we compare our real traffic with what Alexa/Compete/Comscore are telling us).

Comments

  1. Absolutely agree: the stats services don’t even agree with each other, and neither do any analytics scripts embedded on webpages.

    I personally think a way of measuring blogs (or perhaps most all services) should be how many “repeat” customers you get. Exactly the opposite of uniques then.

    That is how all service businesses like restaurants, training centers and even seminars are measured, and I don’t see why it shouldn’t be a way of measuring blogs.

    What is the churn rate of the readers versus the uniques (not uniques by themselves)? What is the average stay length per visit? Most importantly, how many of those people come back consistently over a month and actually participate in creating value for the service?

    For blogs, that’s with user comments; for Wikipedia and other communities, it’s in extensions.
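
To make the repeat-versus-uniques idea above concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the visit log, the visitor names, and the choice to call someone a "repeat" visitor once they show up on at least two distinct days. A real site would feed in its own server logs or registration data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical visit log: (visitor_id, day of visit).
visits = [
    ("alice", date(2007, 5, 1)),
    ("alice", date(2007, 5, 8)),
    ("alice", date(2007, 5, 15)),
    ("bob", date(2007, 5, 2)),
    ("carol", date(2007, 5, 3)),
    ("carol", date(2007, 5, 20)),
]


def repeat_visitor_rate(visits, min_days=2):
    """Share of unique visitors seen on at least `min_days` distinct days."""
    days_seen = defaultdict(set)
    for visitor, day in visits:
        days_seen[visitor].add(day)
    uniques = len(days_seen)
    if uniques == 0:
        return 0.0
    repeats = sum(1 for days in days_seen.values() if len(days) >= min_days)
    return repeats / uniques


rate = repeat_visitor_rate(visits)
print(f"{rate:.0%} of uniques came back")
```

Churn over the month would be one minus this rate under the same assumed definition.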

  3. It’s not just startups that are hurting though. I have yet to see anyone that owns/runs/manages a website happy with the stats on 3rd party sites.

    The only solution is ISP-side tracking (which is outrageous), opt-in programs for webmasters (via JS), or OS-level trackers.

    The most realistic option is for a company that does hosted webstats (like FeedBurner, Google, Performancing, etc.) to publish its results.

  5. It also allows for heavy cheating. Yeah, you can cheat, who would have known. I never did but I considered making an app that would allow sites to boost their scores, by submitting hits directly to the sites across a subnet via a bot.

    Maybe I should code that app after all.

    All it takes is Ethereal and some socket programming. You can even have the bot sign up accounts. I know for a fact some of the sites on the top 100 have homebrew versions of these applications and use them.

  7. Value on the internet is rarely about reality, it’s about perceived reality, and those who can push that win online.

    A quick Google search reveals there are already many PHP versions available for Alexa:
    http://onlinehoster.com/blog/alexa-rank-cheater/
    http://www.dailysofts.com/program/703/37178/Alexa_Ranking_Booster.html

    As for Netcraft and the others, that’s more complex, because they limit their “hit ratings” to limited user groups. Hard, but if you have money, still not impossible. At any rate, the cheaper ones like Amazon’s Alexa and others are quite easy to emulate through a proxy server list or even a subnet you may have access to. Larger sites have large subnets, plus they can use the proxies as well.
    I would suggest you cheat, because all the other sites are doing it anyway.

  9. Not to sound naive, but at the end of the day traffic is less important than conversions/sales, at least for those of us with e-commerce web sites. For ad-supported sites, it seems like the site’s own server stats should suffice. Measure and publish the actual stats, not a bunch of imperfect proxies. Sort of like in politics: the opinion polls stop mattering once the actual vote is held. Maybe there is room for a 3rd-party audit service that verifies self-published stats.

  11. Andrew, the point of the Alexa, Netcraft, etc. deception is that they give stats about websites inaccurately, thus boosting the market value of a website unfairly.

    Of course this will not affect e-commerce sites, but it can boost the value of something like YouTube.com from 100 million to 2.5 billion in a matter of a couple months. Large sites use this fact to their advantage, and from what I’ve heard they mostly cheat.

    Even boosting the value of a website by 1% on a billion dollar scale comes out to ten million bucks. That’s a LOT of money. And that’s why these people are insanely rich. They are the best at making people perceive value in something that may not have very much. Manipulating Alexa and other sites like that is only one tool in the toolkit.

    I don’t have the means to get an infrastructure to perform something like that. If I did, I’d probably know all the dirty tricks they use to push it all the way up.

  13. I’ve recently felt like I was one of the few “small” bloggers that cared about stats. I check mine on Google Analytics and Feedburner daily. However, there are only two that really matter, in my opinion:

    1. Pageviews per visit – how involved is my reader? Are they just checking the front page or are they looking through the archives or clicking the “Read the rest of this story” link?

    2. Average time spent on my site – this is again leading back to involvement.

    Comments are a good indicator, as well. In my opinion, and from a marketing standpoint, I think that I’d rather have 500 involved readers who actually “experience” my content rather than 1000 people who pull up my front page, don’t even scroll down, and then leave.

    Of course I don’t make any money to speak of just yet, but from a personal satisfaction standpoint, that’s what’s important to me.
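
For anyone who wants to pull the two numbers from the comment above out of raw logs instead of a hosted stats package, here is a minimal Python sketch. The hit log is invented, and the 30-minute idle cutoff for ending a visit is an assumption (a common convention, not something any particular service is confirmed to use here).

```python
from datetime import datetime, timedelta

# Assumed cutoff: a visit ends after 30 minutes without a pageview.
SESSION_GAP = timedelta(minutes=30)


def sessionize(hits):
    """Group (visitor_id, timestamp) hits into visits, one list of timestamps each."""
    sessions = []
    open_session = {}  # visitor_id -> index of that visitor's latest visit
    for visitor, ts in sorted(hits, key=lambda h: h[1]):
        idx = open_session.get(visitor)
        if idx is not None and ts - sessions[idx][-1] <= SESSION_GAP:
            sessions[idx].append(ts)
        else:
            sessions.append([ts])
            open_session[visitor] = len(sessions) - 1
    return sessions


def engagement(hits):
    """Return (pageviews per visit, average time per visit)."""
    sessions = sessionize(hits)
    if not sessions:
        return 0.0, timedelta(0)
    pages_per_visit = sum(len(s) for s in sessions) / len(sessions)
    total_time = sum((s[-1] - s[0] for s in sessions), timedelta(0))
    return pages_per_visit, total_time / len(sessions)


# Hypothetical hits: visitor "a" reads three pages, leaves, and returns later.
t0 = datetime(2007, 5, 1, 9, 0)
hits = [
    ("a", t0),
    ("a", t0 + timedelta(minutes=5)),
    ("a", t0 + timedelta(minutes=12)),
    ("b", t0 + timedelta(minutes=1)),
    ("a", t0 + timedelta(hours=3)),
]
pages, avg_time = engagement(hits)
print(f"{pages:.2f} pageviews per visit, {avg_time} average time on site")
```

One caveat with this approach: a visit with a single pageview counts as zero seconds, because the log has no record of when the reader left, so average time on site is understated for one-page visits.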

  15. Some advertising services also limit access to websites with a minimum traffic number, and if Alexa et al aren’t counting all the traffic your site gets, you may not qualify if they’re using Alexa as the benchmark.

    In my case, over 50% of my traffic comes from non-IE users (Firefox, Netscape, Safari, etc.), which Alexa doesn’t count. So Alexa shows my traffic lower than it was a month ago, while Google Analytics and Sitemeter show it 50% higher. I wish one of those free services would implement an opt-in traffic rating and ranking system.

  17. Fred Wilson had a good discussion on a similar topic late last year (http://avc.blogs.com/a_vc/2006/10/whose_numbers_a.html).

    Having worked at IRI, a retail point-of-sale (POS) data tracking company founded by the CEO of Comscore, I can offer some insight from an industry that tried to crack this same kind of problem. Here are some thoughts for consideration:

    When Nielsen and IRI historically posted retail POS data to their clients using statistical sample-projection methods, they had an ace in the hole. They had state and federally required information on total retail sales, so invariably, they could push and pull individual sales data by product and store around to ensure the totals matched. They would use panels to overlay user details on retail sales to create a user-sales analysis that was an interesting add-in of ‘color commentary’ to the sales trends.

    Over time, as Coke and Pepsi or WalMart and Target matched their own data against these 3rd party vendors, the truth showed the sample-projection methods to be inaccurate even at a national level, let alone for an item or store. This led the POS data industry to go after store-level data in as many cases as possible to collect all information. As individual retailers’ internal data became more available at the store level, and as stores aggregated from regional to national, folks like WalMart found working with outside data vendors less valuable and became uninterested in fully providing their data to benefit other retailers, resulting in greatly reduced effectiveness of 3rd party POS reporting (and an admission that national analytics for retail POS weren’t as accurate as we had been led to believe).

    What does this mean for web traffic sample-projection methods? Combining appropriate sample selection with the sample size needed for proper per-site analytics, down to the week or day, is a massive undertaking. While these methods can report an accurate portrait over longer periods (quarterly or annually), one should be wary of using this information at a level as detailed as weekly reports or smaller individual sites. As a parallel to Comscore’s goals, even when combining user information with state/federal retail sales, statisticians in retail POS couldn’t easily nail per-item sales per week, which is much the same frustration web analytics seekers are running into today.

    Comscore appears to have a system in place that can provide very insightful information at a high level. Comscore’s reporting [over longer time periods] combined with very strong in-house metrics makes for a robust analytic approach that looks across the horizon and focuses on a solid return on marketing investment within a specific business.
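
The sample-size point in the comment above can be illustrated with a back-of-the-envelope Python sketch. All the numbers are hypothetical (a 100,000-person panel projected onto 200 million users), and the sketch assumes the best case of a simple random sample with a normal-approximation margin of error. Real toolbar panels are self-selected, so their error would be larger than this, and the approximation itself gets shaky when only a handful of panelists visit a site.

```python
import math


def panel_estimate(panel_size, panel_visitors, population):
    """Project panel visitors onto the population.

    Returns (estimated visitors, approximate 95% margin of error),
    assuming a simple random sample and a normal approximation.
    """
    p = panel_visitors / panel_size
    standard_error = math.sqrt(p * (1 - p) / panel_size)
    estimate = p * population
    margin = 1.96 * standard_error * population
    return estimate, margin


# Hypothetical panel and population sizes.
population = 200_000_000
panel_size = 100_000

# A big site, a mid-sized site, and a small site, by panelists who visited.
for visitors in (10_000, 100, 5):
    est, margin = panel_estimate(panel_size, visitors, population)
    print(f"{visitors:>6} panelists -> {est:>12,.0f} "
          f"+/- {margin:,.0f} ({margin / est:.0%})")
```

Under these assumptions the relative margin grows from roughly 2% for the big site to roughly 20% for the mid-sized one and close to 90% for the small one, which is the "fine at a high level, unreliable per site" pattern described above.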

  19. @4

    “For ad-supported sites, it seems like the site’s own server stats should suffice. Measure and publish the actual stats, not a bunch of imperfect proxies.”

    I’m taking some time off coding to reread your comment. Tons of good stuff here.

    If people would accept people’s word on it, Jupiter Research, Alexa, Netcraft, etc. would not make much sense.
    Just like a company reporting earnings, people want an audit of some kind by an independent party. Consider that the SEC rules are a little more strict than just posting what you believe are accurate server stats.

    The imperfection comes from the fact that these companies cannot wiretap. There are privacy laws that say that an ISP cannot simply record and publish statistics about their users. Jupiter Research, Alexa, and others don’t EVEN HAVE that level of access.

    They typically pull a test case market, either via toolbar or by other means, and do some math to scale the results back up to the general population. Not only is this totally wrong, but as I wrote in #5, most companies that can afford to cheat do so to blow up their value. Especially Cali SF startups living on VC money and a prayer. They don’t want to get yelled at in the next board meeting, you see.

    “Sort of like in politics: the opinion polls stop mattering once the actual vote is held.”

    On the internet there is never a real vote held. There are no real metrics outside of stats held by ISPs, which will not be published.

    AOL recently released ONLY search data:
    http://www.securityfocus.com/brief/277

    And it had people completely up in arms. Imagine if they had released their data from all their internet users as well? It would have meant disaster for them. Sadly, they and other ISPs are the only people that could semi-accurately do metrics like this.
    BTW: I still have the AOL search data because I had downloaded it before they took it offline.

  21. I do not trust any of the existing toolbar-based analytics, for many reasons including the ones you gave. But Google has all the data we need, thanks to Adwords, Google toolbars in various browsers, etc. The company also has solid analytics products. But unless I am mistaken, data on the Web at large is not available to the general public. What can we do to get Google to give us the data we need, that we know it has?

  23. I think each person has their own metrics that they find important to them, which is what makes it very difficult to create more general stats. For example, you’re using comments as a metric, but it looks like Wil Wheaton of Star Trek fame gets a lot more comments than you do. Should he be ranked higher? What about John Gruber of Daringfireball, who doesn’t even allow comments on his site? How should he be ranked?

    Here’s how all fare in Alexa.

    That’s how I would expect the rankings to be, btw.

    Or what about outbound links? Should a blogger be penalized for not doing as many links as another? Kottke just has a running list of links, and mostly of general interest and humor, so I wouldn’t be surprised if he was generating more outbound traffic than you.

    I don’t think we’re ever going to find a set of stats that are going to be truly satisfactory for everyone. Maybe we wise up and stop relying on ranking to determine our worth.

    Or maybe we should just have a swimsuit competition to settle the whole thing.

    Whichever. :)

  25. Ugh, thank you for posting something about it. The reference to stats and analytics from companies in web 2.0 drives me bananas because I know it’s inaccurate. Ajax and video are only going to make it worse.

    I think we should expect better analytics in the future – our CTO at StyleDiary has an interesting perspective on this.

  27. And I’d like to add that most really good analytics should probably be a cocktail of all the analytic sites until a better solution comes out. With the direction IP convergence is going to take the web, though, I think something more general will be the standard in the future.

  29. I couldn’t agree more.

    These sites only give a representation of the ‘hitshare’ for the user sample that is those users with the required tool installed.

    It’s not too far from how TV stations say how many people are watching their shows. They use incredibly small samples (in the thousands; in the UK, see BARB) with special boxes in panelists’ living rooms, where they are required to tap in which channel they are watching.

    Our whole TV scheduling and commissioning efforts are based on these ridiculously low samples.

    It’s a similar problem with podcasts, where there are going to be fewer actual listens than downloads.

    But then, I bet there’s plenty of people who have a TV set on and leave the room. Hmmm.

    This is a really interesting discussion for the whole media industry. Whatever the medium and platform.

  31. Robert, Mac users are always saying that there isn’t a toolbar for them, but they can always use the Search Status plugin with Firefox.

    It is actually fairly useful.

    Also, if one 3rd party developer can create a suitable toolbar, others could as well, focusing on providing different data or functions.

  33. Obviously Google has a pretty good idea of where people are going, and many sites rely on Google Analytics to track their stats. Perhaps Google can combine those two sources (Adsense, too?) into some sort of stats index on an opt-in basis.

  35. I’ve loved metrics since I was a public auditor back in the early 90s. I remember beta testing Webtrends I and thinking the same thing then that I do now about how the stats just ain’t right.

    I started this InteractionMetrics web site because, as Patricia points out, Ajax and other non-loading technologies hurt stats badly.

    If you are interested in joining me, http://www.interactionmetrics.com – there is nothing there yet but a wiki platform but my hope is that we can create a real standard for metrics just as we have for HTML, for Microformats, for lots of other technologies. I hear people moan that the large agencies won’t go with anything new. I say hogwash and believe the agencies will go with whatever is presented.

    You have to remember, and it’s probably not wise for me to say this but many want to keep it the way it is. If we change, their ability to generate revenue from the current may decrease.

  37. Hi Robert,

    Excellent post. We also do metrics for digital media consumption; it’s not for websites or blogs, but for digital media (music, movies, books, television shows) which is consumed by people all around the world. There are no metrics for this, and we provide this information to music labels, movie studios, publishers, and advertisers for effective marketing, syndication and other purposes. It is interesting to see how the metrics demands and requirements are so different from the traditional approaches.

    Anyways check out our website at http://divinitymetrics.com

    Cheers,

    Vishal

  39. Personally, I think stats are way overrated. Don’t get me wrong… I love Google Analytics (well, the new version; I was using StatCounter previously for accessible daily stats-viewing). But I think that our entire industry worries way too much about public accountability.

    Are pageviews important? For advertisers, yes. But why is the public pre-disclosure necessary? If it’s pageview-based, why not this:

    1) Website posts their own metrics (they’re likely to know better than 3rd party services anyway!).
    2) Impressed advertiser goes, wow, 2 million pageviews a week! Great, we’ll pay you $x for 2 million pageviews/week. If pageviews are reduced by more than 100,000, then we can get out of our contract with no penalties AND you’ll owe us $y/CPM for the shortage.

    Otherwise, aren’t RESULTS more important? What’s the quality of the mail service like? How many sales is the company making? How many new subscribers are they getting to their for-pay newsletter?

    With ajax’d pages, the pageview and raw traffic numbers are, IMHO, simply a stupid metric in many cases. We need to get over our obsession with false quantifications (“Gimme numbers, any numbers!!!!!!1”) and start caring more about the quality of the user experience, the power of the brand, the conversions, and so on.

    Sorry, Hitwise. Sorry Compete. I just don’t find your public stats to be all that useful in the overall scheme of things… even if they were 100% “accurate.”
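
The contract terms proposed in comment 39 can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name `shortfall_penalty`, its parameters, and the dollar figure standing in for the comment's `$y` are hypothetical, and "$y/CPM" is read here as a rate per thousand missing pageviews.

```python
def shortfall_penalty(promised, delivered, rate_per_thousand, tolerance=100_000):
    """Return the amount the site owes the advertiser for a pageview shortfall.

    Assumed reading of the comment: nothing is owed while the shortfall stays
    within the tolerance (100,000 pageviews in the comment); beyond that, the
    site owes the per-thousand rate on the whole shortage.
    """
    shortfall = max(promised - delivered, 0)
    if shortfall <= tolerance:
        return 0.0
    return shortfall / 1000 * rate_per_thousand


# Hypothetical week: 2 million pageviews promised, 1.8 million delivered,
# with an illustrative rate of $5 per thousand pageviews.
owed = shortfall_penalty(2_000_000, 1_800_000, 5.0)
print(f"Site owes advertiser ${owed:,.2f}")
```

Whether the penalty should apply to the whole shortage or only to the part beyond the tolerance is a contract detail the comment leaves open; the sketch picks the former.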

  41. Great thought-provoking post, Robert, and some great comments too. On my own blog I use four different types of measurement, but none are very useful.

    1. Feedburner RSS reader stats, which I hate because they fluctuate significantly from day to day, but they do allow others to compare a variety of blogs displaying Feedburner numbers.

    2. Google Analytics/SiteMeter. I use these to measure my page impressions, visitors, etc., but they are not perfect, as previous commenters have said above.

    3. Technorati Authority: My Technorati authority is 526 and Robert, yours is 5,731 – this is a good measurement, but not everyone displays their authority number.

    4. Conversational Index is my personal favourite measurement. Being a blog, it should be a “naked” conversation, and Stowe Boyd recently talked about measuring the number of comments/posts. On my site my Conversational Index is around 4.8.

    I guess on this blog it would be nearer 30+. Of course the CI number is self-published, and if it were to become an industry measurement it would need to be trusted/audited.

    I guess for most bloggers, publishing their Technorati Authority or CI number would be sufficient, but for the few bloggers wanting to monetise their traffic with advertising, right now the only metric advertisers find interesting is page impressions to measure CPM.
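
As a rough illustration of the Conversational Index described in comment 41, the ratio can be computed as comments divided by posts. This follows the commenter's wording ("comments/posts") rather than any official definition, and the function name and the sample counts below are hypothetical.

```python
def conversational_index(total_comments, total_posts):
    """Comments per post, following the comments/posts reading in the comment above."""
    if total_posts <= 0:
        raise ValueError("need at least one post to compute the index")
    return total_comments / total_posts


# Hypothetical counts chosen to reproduce the commenter's figure of about 4.8.
ci = conversational_index(total_comments=480, total_posts=100)
print(f"Conversational Index: {ci:.1f}")
```

Because the inputs are self-reported, a figure like this would still need the auditing the commenter mentions before advertisers could rely on it.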

  43. Don’t lump in Hitwise with the others. Hitwise’s methodology is not based on toolbars or surveys or log books, it’s raw numbers they get from deep inside ISP networks. I’d trust Hitwise far more.

  45. For those who are saying “stats are overrated” and so forth, the sad reality for an advertising-supported site that sells its own ads is that your ranking on Comscore in particular has a HUGE impact on the likelihood that you’ll be able to close top-tier accounts. If Comscore says you don’t have the traffic to handle a campaign that wants to see a million uniques, then you’re not going to win that business. At least for me, that’s why this issue is so sensitive.

  47. Ummm, Nielsen Television Index “won” this debate back in 1950. We’ve been there, done that.

    The only ‘sure way’ is to pass a governmental law requiring all ISPs/Networks to give up data, and then some special commission can be appointed to filter it out. Police state spyware, but hey, accurate ratings.

  49. Companies complaining that widget views don’t count as pageviews is ridiculous. When was the last time syndicated content counted as traffic for the Associated Press?

  51. I just wanted to point readers to a suggestion I made about a month ago that pertains to Robert’s timely post. In a nutshell, the suggestion is that Google allow webmasters to make public a subset of their Analytics information.

    http://techfold.com/2007/04/03/how-google-can-kill-alexa-in-one-simple-step/

    “Adding a “Sharing” option to Google Analytics and surfacing stats in “site:” searches (for those site owners who have elected the sharing option in their Analytics account) would do the job nicely. Let site owners control the degree of information shared, keep everything opt-in, and rock and roll. I know I’d share my high-level views & visits stats in a second. In addition to providing all of the value Alexa does, it would also add a layer of transparency to making ad-buys – something else I would appreciate.”

  53. @33 Rod: the same thought occurred to me while doing some competitive research, but guess what, you can easily game GA results too! If you can mess with Alexa (see comment 4, Chris), what prevents you from doing stuff on your own site to make the urchin script go nuts? As Chris says in comment 10, you really can’t trust the site’s own metrics. So that option is out too…

    ISPs hold probably the only data that could count. But what is their incentive?

  55. Great Question.

    I’m actually researching this now to justify a corporate initiative. I’m arguing to use a combination of the following:

    Authority – based on links

    Conversation Index – See Stowe Boyd’s blog

    Feed subscriptions – much like a conversion rate in sales

    Am I nuts?

    Regards,

    Mark Krupinski

  57. I think there are three issues here that are being lumped together, and in so doing it’s creating more confusion.

    1. General Stats – The Alexas of the web will always be off because their methodologies rely on panel or toolbar data. The only way to shore it up is to create a better system that can prevent cheating…but this isn’t going to happen soon.

    2. Web Analytics – Obviously in this case there are a host of competitors ranging from free to really expensive and also varying wildly in their offering. This category is the definitive source to track everything going on at your local server…but it’s not going to solve the blogger’s dilemma. And it’s hard to get two systems to report the same number…but at the end of the day it doesn’t matter, since you’re looking for trending and directional information for the most part.

    3. Blog Analytics – I recently did a post on my blog about the complete void within the blog analytics space…since that post I’ve heard that Google will be releasing Measure Map, a blog-specific tool that seems quite powerful and takes a step beyond the current web analytics tools out there. The Measure Map UI was lifted for the new version of Google Analytics and is quite slick.

  59. Hi Robert – Take a look at Quantcast. Bunch of really smart engineers using statistics and computer science to produce really accurate, interesting results. They also have sites (including WordPress and Facebook) using their tracking JavaScript code. Here’s an example:

    http://www.quantcast.com/facebook.com

  61. Speaking of SEO traffic, do not consider redzee a source of viable click-throughs. We did a campaign with them and none of the traffic ever went past the first page. I think they are doing some shady stuff. We own a printing company, AREA Printing & Design http://www.areaprinting.com, and we have instant online chat; none of the clicks ever requested a chat session. It was a waste of money.

  63. I’ve always used Hitwise and it has worked out well; still, the absolute best is Google Analytics. It takes a bit of time to tweak it, but it is all worth it in the end.
