I was on a panel recently where I claimed that for 99 percent of email marketers, big data is irrelevant. I got some hefty pushback, but I stand by the claim. To clarify, let’s start by talking about what big data is and, perhaps more importantly, what it is not.
According to Wikipedia, “Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.” The buzz around big data comes from what people are aiming to do with it, using techniques such as crowdsourcing, data fusion and integration, genetic algorithms, machine learning, natural language processing, time series analysis, and visualization.
As a rule of thumb, if you’re working on it with Excel, it’s small data. If you need a SQL database system (or systems), it’s medium data. And if you need a team running a NoSQL, cloud-based cluster with MongoDB, Hadoop, Splunk, or some MapReduce framework, it’s big data.
From these definitions it should become immediately apparent that the overwhelming majority of email marketers and marketing problems simply don’t require a big data solution. And that’s a good thing. Big data for big data’s sake has the potential to be an enormous boondoggle for many organizations.
I’m not so jaded as to dismiss big data entirely. As the curly fry correlation shows, we can make some amazing predictions from entirely unexpected data. (The curly fry correlation is that liking curly fries on Facebook is extremely predictive of high intelligence – Jennifer Golbeck’s hugely enlightening TED talk explains why.) This type of analysis can be very valuable, but it’s only possible if you have both the data and the ability to properly mine it, and very few do.
This, I think, is where things have fallen down for big data. Oftentimes what isn’t big data is sold as such because buzzwords sell. Then, as is common with new buzzwords, expectations have been unrealistic. We expect the curly fry correlation to tell us something deep about curly fries or about intelligence, and when it doesn’t, we are left unsatisfied and disillusioned. The backlash against big data has reached the point where many dismiss it as snake oil. There’s even a small data movement focused on data sets small enough to enable meaningful human comprehension and visualization.
Rather than going with either the hype or the backlash, I recommend a different course entirely. I believe we should aim for pragmatic data. We’re here to do marketing. Forget the overblown claims and hype of the latest shiny object, but don’t dismiss new ideas just because they’re hyped. Remember the Pareto principle (the 80/20 rule) – roughly 80 percent of the effects come from 20 percent of the causes.
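If you want to see whether the 80/20 rule actually holds for your own program, a few lines of code on small data will do it. The following is only a minimal sketch: the `top_share` helper and the revenue figures are made up for illustration and do not come from any real campaign.

```python
def top_share(values, top_fraction=0.2):
    """Return the share of the total contributed by the largest `top_fraction` of values."""
    if not values:
        raise ValueError("values must be non-empty")
    ranked = sorted(values, reverse=True)
    # Always count at least one item so very short lists still give an answer.
    top_n = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:top_n]) / total


# Hypothetical revenue per segment (invented numbers, for illustration only).
revenue_by_segment = [5200, 2800, 500, 400, 300, 250, 200, 150, 120, 80]

share = top_share(revenue_by_segment)
print(f"Top 20% of segments drive {share:.0%} of revenue")
```

With these invented numbers the top two segments account for 80 percent of revenue. Your own data may be more or less skewed, and knowing which is the case tells you where to focus first.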
If you’re building effective segment models and personas in Excel, then more power to you. You’re likely achieving most of the value for a fraction of the cost of a larger system and doing so in far less time, which is a huge win in my book.
If you have a larger organization and you’re pulling together your customer data from disparate sources into a single database, then using tools to analyze behaviors and identify segments and offers, you don’t need to justify it by calling it big data.
Finally, if you’re actually mining petabytes of data to discern deep correlations and thereby accurately predict consumer behavior, you’re one of the few. I’m impressed, slightly scared of what you’ll learn, and suspect your name is either Larry or Jeff. No doubt you’re driving incremental value and that’s great.
In all these, be pragmatic. Email, more than most disciplines, suffers from a lack of resources. Focus on doing what works and doing so as efficiently as possible – as quickly as you can with the minimum of resources and cost. Do this and you’ll achieve far more with far less than anyone expects and then no one will care how big your data is.