Friday, 14 September 2012

"Sampling is bad, limited, awful ..."

Here's something I get asked a lot - why does Google Analytics sample?

"That must be bad, right?!?! A 16% sample? That means you're ignoring 84% of the data?!?!?!"

That could be bad. But it most likely isn't. People spending their time investigating non-problems is a crime against efficient time management, at the least, if not a capital crime :)

A typical analysis - "16% is bad!"

OK, so let's increase the precision (and make the web slower).

And now we have (years later...)

What is that extra 0.07 pages/visit of accuracy going to tell you? In plain English, probably bugger all. Sampling has cost you about 1% (probably less) through inaccuracy.
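If you'd rather convince yourself than take my word for it, here is a rough sketch in Python. It makes up some traffic (the 500,000 visits and the page-count distribution are my assumptions, not real Google Analytics data, and this is a plain random sample, not necessarily how Google does it), takes a 16% sample, and compares pages/visit from the sample against the full data set.

```python
import random


def pages_per_visit(visits):
    """Average number of pages viewed per visit."""
    return sum(visits) / len(visits)


def sample_error(visits, rate, seed=0):
    """Compare pages/visit on a simple random sample against the full data.

    Returns (full_value, sampled_estimate, relative_error).
    """
    rng = random.Random(seed)
    sample_size = max(1, int(len(visits) * rate))
    sample = rng.sample(visits, sample_size)
    full = pages_per_visit(visits)
    estimate = pages_per_visit(sample)
    return full, estimate, abs(estimate - full) / full


# Hypothetical traffic: 500,000 visits, each with at least one page view.
# The distribution is an assumption chosen purely for illustration.
traffic_rng = random.Random(42)
visits = [1 + int(traffic_rng.expovariate(0.4)) for _ in range(500_000)]

full, estimate, rel_error = sample_error(visits, rate=0.16)
print(f"full data:   {full:.3f} pages/visit")
print(f"16% sample:  {estimate:.3f} pages/visit")
print(f"difference:  {rel_error:.2%}")
```

With this much traffic the sampled figure should land well within a couple of percent of the full one. Shrink the visit count and the gap grows, which is the real thing to watch: the size of the sample, not the percentage.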

P.S. If that precision is truly, truly critical to your business, do consider a more 'premium' solution. But then you probably aren't reading this blog.

