Sep 24
2009

It’s like the fresh smell of spring in the air. A customer has just installed Web Analytics code on their site. Smiles and starry-eyed looks fill the room as management and team gawk at the stunning graphs and the endless variety and combination of numbers. Ohhhh those numbers! “What actionable insights can we get from our data?” they ask. As a heavenly peace descends upon the room, all is well in the world. Nothing can go wrong now…it’s all smooth sailing from here on in. All business decisions can now be based on solid data.

As the glow in the room reaches epic proportions, a faint sound begins resonating in the distance. What is that wailing sound? Someone please make it stop! Is that a human voice? As a hush falls over the crowd, the voice, now clearly identifying itself as human, comes through loud and clear. “Are you sure your data is accurate?” It’s the voice of that most dreaded of all phenomena…the web analyst!

As our trusty friend the web analyst points out, ensuring your data is accurate is just as important as having the data in the first place. What’s the point of having all that cool data if it’s not accurate? Would you want to make a business decision or drive some marketing effort based on misleading or inaccurate data?

It’s crucial to invest time and resources in checking and re-checking your data to ensure it’s as accurate as possible. Let me share a recent example to illustrate the importance of data accuracy to the wonderful world of web analytics.

I got a call from “Customer X” recently questioning why traffic to their site was so low in their Google Analytics account. After a couple of minutes of discussing the issue, we came to understand that there had been roughly a 30% decline in visits over the last week. The date that the visits dropped coincided with a site outage that required a backup version of the site to be restored. We did our due diligence and made sure the GA code was present and that the site was functioning properly, but didn’t find any other code-related issues.

We asked the standard questions to determine whether any change in offline marketing (or the end of some campaign) could have caused a decrease in traffic, but the answer was no.

After poking around on the site, we found that any users going to the non-www version of the domain were seeing a slightly different version of the site. This version of the site had no GA code on it. Apparently there was a DNS problem which was sending visitors to a staging environment instead of the live site. This was resolved quickly, but traffic did not recover in any tangible way. We still saw a roughly 30% decline compared with previous weeks and months.

So, faced with no obvious conclusions, we started diving into GA and found something interesting. It appeared that visits from IE users had dropped by two-thirds compared with the previous week. We thought we had it nailed! Our assumption was that something in IE was preventing the GA code from loading under certain circumstances. We tested IE inside and outside, right side up and upside down, but found no anomalies whatsoever. This led us to believe this issue was just a symptom of a deeper problem.

Finally, after exhausting all normal troubleshooting procedures, a colleague of mine suggested we look at the problem from the reverse angle. Instead of assuming the data was accurate before the restore, let’s assume the data is accurate after the restore, and perhaps it was messed up before. We dove backwards in time through the data and found a date 4 months prior where the data had shot up 30%. Hmmmmm!!! After consulting with the customer again, we were informed that another site outage had occurred on the exact date that traffic had shot up 30%, and this outage also required a site restore from backup. The traffic levels before the first site restore and after the second restore were the same. The anomaly was the 4 months between the two site restores. Now the problem data was isolated, but what the heck was causing the data to be so off?

We asked the customer to restore a backup of the site (from the time period in question) to a separate environment for further analysis. Upon doing so, we continued our investigation. On a whim, we started checking some common code files, and to no one’s surprise, we found two versions of the same GA code being loaded. Sure enough, the second version of the GA code was in a design template file that was used on only about 30% of the site, so pages built on that template were being tracked twice.
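A check like the one that finally caught this can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the regular expressions assume the pageview call looks like the classic Google Analytics snippets (`pageTracker._trackPageview()`, `_gaq.push(['_trackPageview'])`, or the older `urchinTracker()`), and the function names and sample pages are made up for illustration. It is not the tool we used on the customer's site.

```python
import re
from typing import Dict, List

# Assumption: pageview calls look like one of the classic GA snippets.
PAGEVIEW_PATTERNS = [
    re.compile(r"\._trackPageview\s*\("),             # pageTracker._trackPageview()
    re.compile(r"""\[\s*['"]_trackPageview['"]"""),   # _gaq.push(['_trackPageview'])
    re.compile(r"\burchinTracker\s*\("),              # legacy urchin.js
]


def count_pageview_calls(html: str) -> int:
    """Count how many pageview calls appear in one page's HTML source."""
    return sum(len(pattern.findall(html)) for pattern in PAGEVIEW_PATTERNS)


def find_double_tagged(pages: Dict[str, str]) -> List[str]:
    """Return the names of pages that fire more than one pageview call."""
    return sorted(name for name, html in pages.items()
                  if count_pageview_calls(html) > 1)


def find_untagged(pages: Dict[str, str]) -> List[str]:
    """Return the names of pages with no pageview call at all."""
    return sorted(name for name, html in pages.items()
                  if count_pageview_calls(html) == 0)


if __name__ == "__main__":
    # Hypothetical rendered pages; in practice these would be fetched or read from disk.
    sample = {
        "home.html": "<script>pageTracker._trackPageview();</script>",
        "promo.html": "<script>pageTracker._trackPageview();</script>"
                      "<script>_gaq.push(['_trackPageview']);</script>",
        "about.html": "<p>No tracking code here</p>",
    }
    print("Double tagged:", find_double_tagged(sample))
    print("Untagged:", find_untagged(sample))
```

Running it against the rendered HTML of each page, rather than the raw template files, matters here: the duplicate only showed up once the design template and the common code file were combined on the same page.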

So, after hours and hours of troubleshooting, we were able to nail down the problem for this customer. Many high-fives and manly grunts ensued and we all lived happily ever after…well, at least until next time! :)

Data accuracy is the core of Web Analytics. Don’t take it lightly.

Here are some practical suggestions for assessing the accuracy of your web analytics data:

  • Conduct a periodic audit on your web analytics code.
  • Review your Google Analytics account configuration. Check your profiles, filters, segments and goals to make sure they are set up correctly.
  • Review external tagging.
  • Question the data (especially for sudden ups and downs).
  • Speak with other departments in your organization to confirm/deny what the data is indicating. This will help give your data some context.
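
The "question the data" suggestion can be partly automated. Below is a minimal Python sketch, assuming you have already exported daily visit counts into a plain list; the 7-day window, the 20% threshold, the function names and the sample numbers are all illustrative assumptions rather than recommended settings. It compares the average of the week before each day with the average of the week starting on that day and reports where the level jumps or drops.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def find_level_shifts(daily_visits: Sequence[float],
                      window: int = 7,
                      threshold: float = 0.2) -> List[Tuple[int, float]]:
    """Return (day_index, relative_change) wherever the average of the
    `window` days starting at day_index differs from the average of the
    `window` days before it by at least `threshold`."""
    shifts = []
    for i in range(window, len(daily_visits) - window + 1):
        before = mean(daily_visits[i - window:i])
        after = mean(daily_visits[i:i + window])
        if before == 0:
            continue  # relative change is undefined against a zero baseline
        change = (after - before) / before
        if abs(change) >= threshold:
            shifts.append((i, change))
    return shifts


def largest_shift(daily_visits: Sequence[float],
                  window: int = 7,
                  threshold: float = 0.2) -> Optional[Tuple[int, float]]:
    """Return the single biggest shift, or None if nothing crosses the threshold."""
    shifts = find_level_shifts(daily_visits, window, threshold)
    if not shifts:
        return None
    return max(shifts, key=lambda item: abs(item[1]))


if __name__ == "__main__":
    # Made-up data: two steady weeks, then a sudden 30% jump that persists.
    visits = [1000] * 14 + [1300] * 14
    print(largest_shift(visits))
```

Several neighbouring days usually get flagged around a single step, which is why the sketch also picks out the largest one. A flagged date is only a prompt to go and ask what happened on the site that day; it says nothing about which side of the step is the accurate one.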


2 Responses to “What do you mean my data isn’t accurate?”

  1. Great post,

    It is, however, possible to get too hung up on accuracy. Check out my post on the same topic:

    http://actionable-analytics.com/2009/08/essential-guide-to-data-accuracy-in-web-analytics/

  2. Kevin says:

    If you have the time / patience / API, you can automate this as well & monitor quality daily!
