The Big Data Dilemma

January 26, 2012

Collecting and storing all your digital data is a luxury that most businesses today do not have. How do you sort need-to-know data from nice-to-know data?

Question: What data is needed, and for how long?

Answer: All of it, forever.

That was a rhetorical question posed during the opening customer keynote at the Teradata Partners conference in October 2011. While the question and answer were hypothetical in context, they immediately set off a flood of tweets, and as I looked around the room packed with over 3,500 attendees, I could almost see the thought bubbles emerging from the heads of many...

  • How can my business determine which data is most valuable?
  • How long should we store this "important" data?
  • What are the cost implications of collecting everything and storing it forever?
  • Is it even legal to store data in perpetuity?
  • Who in the world is going to go back multiple years and begin conducting new analysis on really old data?

Collecting and storing all your digital data is a luxury that most businesses today do not have. While data storage options are becoming more economical all the time, most businesses cannot - and should not - collect every bit, byte, or petabyte that flows through their enterprise applications. However, as the importance of digital channels grows, so too do our massive quantities of digital data. Here are a few tips to help you think through the big data dilemma in a rational manner:

Need to know vs. nice to know data. Businesses operating in the digital age have an understandable inclination to collect every piece of information possible, especially because we have the means to do so with advanced data collection tools, massively scalable storage environments, and options that parse data to disparate systems across the enterprise. Yet most businesses don't effectively use the data they already collect, and adding more information to the mix doesn't help matters.

Understanding what data matters to your business requires empathizing with business stakeholders, examining marketing programs, and getting to the mission-critical values of the organization. In my experience, simply asking business stakeholders what metrics or KPIs are most important to them is a futile endeavor. For starters, they don't speak our language of analytics, and even those who do are hard-pressed to articulate their business needs in neat, metric-sized bites. The task of discerning which data matters involves investigation, collaboration, and refinement. Oftentimes this requires stepping away from your daily grind to see the big picture or bringing in an outsider to help you see what data is right in front of you. Either way, the goal is to interpret the business needs and translate them by packaging them up in a way that makes the business salivate for your data, because they not only need it, but they thrive on it!

Archived vs. accessible data. The next big thing to consider after you've determined which data matters is how long you really need to keep it active for your analysis, marketing, and business intelligence applications. Even the most egregious hoarders of digital data typically roll off data at specified intervals so that they can work with a manageable set of data and reduce processing times and storage costs. Whether this happens after 24, 36, or 60 months depends on how you're using your data and, in some cases, on the legal requirements for your industry.
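To make the roll-off idea concrete, here is a minimal Python sketch of splitting records into an active set and an archive set by age. The record layout, the function names, and the 24-month window are all illustrative assumptions, not a description of any particular product or of how a given company does it.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from earlier to later (day of month is ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def split_by_retention(records, today, retention_months):
    """Partition records into (active, archived) based on their age in months.

    Assumes each record is a dict with a "date" key; this layout is hypothetical.
    """
    active, archived = [], []
    for record in records:
        age = months_between(record["date"], today)
        if age < retention_months:
            active.append(record)
        else:
            archived.append(record)
    return active, archived


# Hypothetical sample records, for illustration only.
records = [
    {"id": 1, "date": date(2009, 3, 15)},
    {"id": 2, "date": date(2010, 11, 2)},
    {"id": 3, "date": date(2011, 9, 30)},
]

active, archived = split_by_retention(
    records, today=date(2011, 10, 6), retention_months=24
)
print([r["id"] for r in active])    # ids still inside the 24-month window
print([r["id"] for r in archived])  # ids that would roll off to the archive
```

Changing `retention_months` to 36 or 60 is the only adjustment needed to try a longer window; any legal retention rules for your industry would have to be layered on top of a sketch like this.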

A few other things to consider are who's using the data and for what purpose. If your teams are conducting digital click-stream analysis to evaluate the usability of your digital properties, each time you redesign or modify your online destinations you're introducing new variables that make historic data less comparable. In most cases, maintaining the processed high-level data is sufficient for trending purposes. Some analytics pros will make a case that data exploration requires raw data sets and volumes of historic data, and they are in fact correct. Yet my experience tells me that only a handful of companies have the time or resources to "play" with data and explore trends and anomalies, because getting out the weekly report or answering a top-priority request usually trumps deep data diving for fun.
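As a rough illustration of what "processed high-level data" might look like, the Python sketch below rolls raw click events up into monthly page-view counts. The event fields (`date`, `page`) and the function name are assumptions made for the example; real click-stream data carries far more detail.

```python
from collections import Counter
from datetime import date


def monthly_page_views(events):
    """Roll raw click events up to view counts keyed by (year, month, page).

    Assumes each event is a dict with "date" and "page" keys (hypothetical layout).
    """
    counts = Counter()
    for event in events:
        key = (event["date"].year, event["date"].month, event["page"])
        counts[key] += 1
    return dict(counts)


# Hypothetical raw events, for illustration only.
events = [
    {"date": date(2011, 8, 3), "page": "/home"},
    {"date": date(2011, 8, 19), "page": "/home"},
    {"date": date(2011, 8, 21), "page": "/pricing"},
    {"date": date(2011, 9, 1), "page": "/home"},
]

summary = monthly_page_views(events)
for (year, month, page), views in sorted(summary.items()):
    print(year, month, page, views)
```

A summary like this is enough for trend lines and much smaller than the raw events, but it cannot answer questions nobody thought to aggregate for, which is the trade-off the raw-data advocates are pointing at.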

Complexity vs. simplicity. One of the overarching themes at the Teradata Partners conference was the simple fact that complexity becomes more and more apparent as we grow our digital data stockpiles. This complexity stems from having multiple systems collecting data and processing information across myriad customer touch points that fire off responses at real-time speeds. To do anything less is insufficient and oftentimes futile. Yet consumers, for the most part, don't deal well with complexity. They need a simplified experience that masks the complexity of the big data business world.

Thus, your challenge becomes simplifying an incredibly complex environment by shielding the customer from an overwhelming world of statistics, algorithms, and business logic to present seamless online experiences. While showing the mechanics of a precision watch may be fascinating to some, most people just want to know what time it is. Therein lies the challenge of meeting consumer demands in an empowered, customer-driven ecosystem. The best that you can do is to offer an experience that reveals answers in an instant and offers multiple levels of depth for those who request more.

As the thought bubbles rose to the ceiling and more information was delivered, it wasn't long after the opening customer keynote that the real answer to the hypothetical question arrived. The CEO, Mike Koehler, stated to the audience: "Not all data is created equal. Some data is more valuable than others."

This column was originally published Oct. 6, 2011.



John Lovett

John Lovett is a veteran industry analyst and expert consultant who has spent the past decade helping organizations to measure their digital marketing activities. As a senior partner at Web Analytics Demystified, Lovett regularly consults with leading enterprises to offer strategic guidance for building innovative digital measurement programs. Lovett is also a trusted advisor to vendors within the digital measurement community. His deep industry knowledge and forward-thinking perspective help both vendors and clients alike to transcend mediocrity by changing the shape of business using strategic measurement practices.

Prior to joining Web Analytics Demystified, Lovett was a senior analyst with Forrester Research, where he was responsible for analytics and optimization technologies. Currently, Lovett is the vice president on the Board of Directors for the Web Analytics Association and has pioneered efforts like the Web Analyst's Code of Ethics with the WAA Standards Committee. He is co-founder of the Analysis Exchange program that is introducing eager students to analytics by helping non-profits with mentored analysis. Lovett is anticipating the publication of his first book, Social Media Metrics Secrets (Wiley & Sons, Summer 2011). He lives in New Hampshire with his wife, yellow lab, and three boys.
