Start tapping technology tools that can identify the key bits of data that really matter to your decisioning or messaging, and use those to create an omnichannel view of the customer.
There are many truisms around managing big data that I do think are actually true, like the one about how just because a technology solution can be used for some purpose doesn't mean that it is the best option for that purpose. I feel the same way about the business truism that says the only way to solve a big problem is to break it down into small problems and solve them in turn.
However, I'm not convinced that it's a good idea to follow advice that I often hear from several quarters, that you should "break down big data into small data" so that you can manage and understand it. Perhaps it's semantics, but the point is to keep big data big - that is where the power and opportunity lie.
I recommend tapping technology tools that can identify the key bits of data that really matter to your decisioning or messaging, and using those in context to create an omnichannel view of the customer. This is only possible with effective management of the big data. Big data is not simply the sum of the small data parts. It's a view of the customer profile, intent, and behavior that is only possible because marketers have access to and can utilize all the data to improve the offer timing and content.
At the same time, there is a lot of big data that is useless to marketers - and it often gets captured and stored anyway. A better solution is to skim just what you need out of a big data set - while keeping the context intact. This approach is not new, and is increasingly available to marketers through their data warehouse, data management, or campaign management solution(s). MapReduce is a tool that helps marketers handle the unstructured and semi-structured resources that are not easy to analyze with traditional tools. Mapreduce.org defines "MapReduce" as a programming framework that "supports distributed computing on large data sets on clusters of computers" - essentially, it simplifies data processing across massive data sets. We hear lots of talk about Hadoop too, which is an open source framework from the Apache Software Foundation and the best-known implementation of the MapReduce model.
Unstructured or semi-structured data are things like web session logs, clickstream data, web analytics and optimization streams, social data, and other types that do not fit the "rows and columns" structure that is easy to analyze with relational database tools.
MapReduce can help sort through the masses of data and pull out the important parts. Many large data streams like web logs have a lot of data in them that has no long-term value. It doesn't make sense to spend a lot of time and processing power to upload data to a persistent location (the database) when you only need it for a short time. This is true for things like sentiment analysis or when publishing an event-based word cloud - when the event is over, the data is no longer needed, but the cloud itself is worth keeping.
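To make the word cloud case concrete, here is a minimal sketch of the map and reduce steps in plain Python. It is not Hadoop code and runs on one machine; the sample posts, the stop-word list, and the function names are all assumptions made up for illustration. The idea is that only the small table of word counts is kept, while the raw posts can be discarded after the event.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in", "at"}

def map_phase(post):
    """Map step: emit a (word, 1) pair for each meaningful word in one post."""
    for raw in post.lower().split():
        word = raw.strip(".,!?#@\"'()")
        if word and word not in STOP_WORDS:
            yield word, 1

def shuffle(pairs):
    """Group emitted values by key, as a MapReduce framework does between steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce step: collapse each word's list of 1s into a single count."""
    return {word: sum(values) for word, values in grouped.items()}

def word_counts(posts):
    pairs = (pair for post in posts for pair in map_phase(post))
    return reduce_phase(shuffle(pairs))

# Made-up sample posts standing in for an event's social stream.
posts = [
    "Loved the keynote at the event!",
    "The keynote was great, and the demo was great too.",
]
counts = word_counts(posts)  # the only thing worth storing long term
```

In a real cluster the map step would run on many machines at once, each over its own slice of the stream, which is why each call here looks at a single post and nothing else.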
Another great example of useless data getting in the way is an automated browse messaging scheme. What you really want is to comb through the entire web log and find all customers who browsed but didn't buy. You don't need all the other data - the length of session, the other products viewed, the ads that were viewed, etc. - in order to trigger an email follow-up with the right product and offer based on the non-purchased item.
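The browse-but-didn't-buy filter can be sketched the same way. This is an illustrative, single-machine example in plain Python, not a production job: the log format (customer, action, product), the action names, and the sample records are assumptions, and a real web log would need parsing first. The map step throws away everything except views and purchases; the reduce step keeps only the customer and product pairs that were viewed and never purchased.

```python
from collections import defaultdict

# Hypothetical, simplified log records: (customer_id, action, product_id).
web_log = [
    ("c1", "view", "boots"),
    ("c1", "purchase", "boots"),
    ("c2", "view", "tent"),
    ("c2", "view", "stove"),
    ("c2", "purchase", "stove"),
    ("c3", "view", "kayak"),
    ("c3", "ad_impression", "kayak"),
]

def map_phase(record):
    """Map step: keep only views and purchases, keyed by (customer, product)."""
    customer, action, product = record
    if action in ("view", "purchase"):
        yield (customer, product), action

def reduce_phase(key, actions):
    """Reduce step: return the key if the product was viewed but never bought."""
    if "purchase" not in actions:
        return key
    return None

def browsed_not_bought(log):
    grouped = defaultdict(set)
    for record in log:
        for key, action in map_phase(record):
            grouped[key].add(action)
    hits = (reduce_phase(key, actions) for key, actions in grouped.items())
    return sorted(hit for hit in hits if hit is not None)

# Each result is a (customer, product) pair that could trigger a follow-up email.
followups = browsed_not_bought(web_log)
```

Only the short list of follow-up pairs would be passed on to the campaign tool; session lengths, ad impressions, and the rest of the log never need to reach the database.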
MapReduce is not a database. It has no querying power and no knowledge of what other data sets exist. It runs processes in parallel and is especially adept at pulling small sets of data out of the big data set and summarizing them so they can be used as part of a larger picture. Lots of such jobs can be run at the same time and without any connection to each other - until the results get into the main database. Please note that it usually requires a specialist to implement and optimize - many great database teams do not have this experience (yet).
Big data is just the latest generation of intimidating data sets - and tools like MapReduce can help tame big data by preprocessing it and passing important pieces on for further analysis. It lets you see and utilize small data inside the big data context. I think that is an important distinction - and opportunity.
Please comment below and let me know how your company is using various big data tools to help you manage big data insights.
Stephanie Miller is a partner with brand and marketing technology strategy firm TopRight Partners. She is a relentless customer advocate and a champion for marketers creating memorable customer experiences. A digital marketing and CRM expert, she helps sophisticated marketers balance the right mix of people, process, and technology to optimize a data-driven content marketing strategy. She speaks and writes regularly and leads several industry-wide initiatives. Feedback and column ideas most welcome, to smiller AT toprightpartners DOT com or @stephanieSAM.