The Future of Big Data – Big Data 2.0
The future of big data isn't about numeric data points but instead about asking the deeper questions and finding out why consumers make the decisions they do.
For data geeks like myself, it has been a hell of a ride. The rise of big data in marketing and media has brought sexy back. Finally, the creative directors, C-suite, and account leaders are leaning on the data scientists once again to provide deep consumer understanding and insights that are backed up and proven by actual consumers (as opposed to an eight-person panel in a Madison Avenue meeting room).
Today, clients often ask me about the future of big data and what the next step is: how can we leverage data on an even deeper level to extract meaningful consumer insights that go beyond where we are now? Most of the standard answers center on the ability to get data and insights in real time and from more devices than ever. While it is true that connected homes, wearables, and connected cars will allow us to collect a much wider set of data points, I believe that this is just an extension of the existing approach.
It’s time we move beyond structured data and into the prime time of text analytics. Here’s why.
Most of the data points collected today are numerical or binary. They tell us if somebody engaged with a site, how well, how long, and where they engaged, but the data fails to tell us why. I believe the future of big data – Big Data 2.0 (to coin a term) – is not about more binary and numeric data points, but instead about asking the deeper questions. Big Data 2.0 should be focused not on what and where but on answering why. It should be concerned with better understanding the consumer’s emotional state and decision logic, and thereby providing deeper insight into consumers’ choices. If we focus on why instead of how often, we can create more meaningful, quality connections between consumers and brands. In other words, while numbers are great indicators of performance, focusing solely on them means brands miss the element of human connection.
Take Amazon data as an example. Amazon is filled with great numerical indicators. Its data can tell us the sales ranks (how many sold relative to category), the customer engagement (how many people shared product reviews), and their satisfaction with the product (the positive and negative reviews). All of these are great indicators, but they are still very simple and only tell a small part of the story.
Let’s assume we are a consumer packaged goods company and we want to introduce a new line of diapers into the market. We decide to look at Amazon in order to better understand which products are category leaders (sales rank and number of sales) and how the consumers like the product itself (reviews). If we analyze these metrics across all diapers, we have a Big Data 1.0 picture that tells us exactly who sells the most and what the audience favorite is.
This is not enough anymore; Big Data 2.0 needs to be about the why: Why is a particular product the most sold? Why does it have an average rating of 5?
For us, the easiest way to get started with Big Data 2.0 is to focus on the unstructured data we collect every day. This can be reviews, customer support emails, community forums, even your own CRM system. The simplest way to look at this data is through a process called text analytics.
Text analytics is a fairly straightforward process that breaks out like this:
Here’s a real-world application using our example above. We are trying to understand the diaper market. In order to not turn this into a step-by-step guide, let’s assume that we have already collected all diaper reviews as well as their quantitative indicators. That means we know what sells best and what ranks best/worst. To take this to the next level, we would start to extract words and phrases from the reviews. This will tell us some of the recurring patterns and their frequencies within the reviews. I actually performed this analysis by evaluating thousands of reviews and found three very actionable insights we would have never gotten to without text analytics.
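To make the "extract words and phrases" step a little more concrete, here is a minimal Python sketch of what counting recurring phrases could look like. It is not the analysis I ran on the actual reviews: the sample reviews are invented for illustration, and the stop-word list, the function names (tokenize, phrase_counts), and the star-rating cutoffs are all assumptions. A real project would likely use a proper text analytics toolkit and far more data.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stop-word list, for illustration only.
# "no" and "not" are deliberately kept because they carry meaning in reviews.
STOP_WORDS = {"the", "a", "an", "and", "on", "my", "they", "very",
              "after", "is", "it", "of", "to", "in"}


def tokenize(text):
    """Lowercase the text and return its words, minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def phrase_counts(reviews, n=2):
    """Count every run of n consecutive tokens across the given review texts.

    Stop words are removed before pairing, so a phrase can bridge a
    dropped word (a simplification that is fine for a rough first look).
    """
    counts = Counter()
    for text in reviews:
        tokens = tokenize(text)
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


# Invented sample data: (star rating, review text). Not real Amazon reviews.
SAMPLE_REVIEWS = [
    (5, "No leaks overnight and very soft on the skin."),
    (5, "Soft, great fit, no leaks overnight."),
    (1, "Gave my baby a diaper rash and they leak."),
    (2, "Diaper rash after two days."),
]

# Assumption: 4-5 stars count as positive, 1-2 stars as negative.
positive = [text for stars, text in SAMPLE_REVIEWS if stars >= 4]
negative = [text for stars, text in SAMPLE_REVIEWS if stars <= 2]

print("Recurring phrases in positive reviews:",
      phrase_counts(positive).most_common(3))
print("Recurring phrases in negative reviews:",
      phrase_counts(negative).most_common(3))
```

Splitting the counts by rating is the part that starts to speak to the why: the phrases that keep recurring in five-star reviews hint at what buyers value, while the ones that dominate one-star reviews point to what drives them away.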
As you can see, even when analyzing the diaper category on Amazon alone, Big Data 2.0 yielded insights beyond binary performance indicators. Before our text analytics exercise, we could see the crowd favorites but did not (yet) know the “why” behind purchases or what was driving the positive and negative reviews. There are countless consumer insights to be mined from textual, unstructured data that give us the voice of the consumer, their motivations, and a deeper understanding of their purchasing behavior.
I hope the above examples and thoughts gave you some good ideas and inspiration on how to think about text analytics for your organization and projects. Start looking at your existing data, export your CRM, examine your comments on your website or product mentions in topic forums – even emails from your sales department’s inbox. It’s Big Data 2.0 time and that’s where you’ll find the gold.
If there is enough interest in this subject, I can create several articles that take a deeper look at the actual stages, processes, and tools for text analysis. Feel free to tweet me at @nxfxcom with any questions or comments.