MapReduce and Marketing: Are Small Bits of Big Data Meaningful?

April 1, 2013

Start tapping technology tools that provide the ability to identify the key bits of data that are really important to your decisioning or messaging, and use those to create an omnichannel view of the customer.

There are many truisms around managing big data that I do think are actually true, like the one about how just because a technology solution can be used for some purpose doesn't mean that it is the best option for that purpose. I feel the same way about the business truism that says the only way to solve big problems is to break them down into small problems and solve them in turn.

However, I'm not convinced that it's a good idea to follow advice that I often hear from several quarters, that you should "break down big data into small data" so that you can manage and understand it. Perhaps it's semantics, but the point is to keep big data big - that is where the power and opportunity lie.

I recommend tapping technology tools that provide the ability to identify the key bits of data that are really important to your decisioning or messaging, and using those in context to create an omnichannel view of the customer. This is only possible with effective management of the big data. Big data is not simply the sum of the small data parts. It's a view of the customer profile, intent, and behavior that is only possible because marketers have access to and can utilize all the data to improve the offer timing and content.

At the same time, there is a lot of big data that is useless to marketers - and it often gets captured and stored anyway. A better solution is to skim just what you need out of a big data set - while keeping the context intact. This approach is not new, and it is increasingly available to marketers through their data warehouse, data management, or campaign management solution(s). MapReduce is a tool that helps marketers handle the unstructured and semi-structured resources that are not easy to analyze with traditional tools. It is commonly defined as a programming framework that "supports distributed computing on large data sets on clusters of computers" - essentially, a way to simplify data processing across massive data sets. We hear lots of talk about Hadoop too, an open source project from the Apache Software Foundation that includes the best-known implementation of the MapReduce framework.

Unstructured or semi-structured data are things like web session logs, clickstream data, web analytics and optimization streams, social data, and other types that do not fit the "rows and columns" structure that is easy to analyze with relational database tools.

MapReduce can help sort through the masses of data and pull out the important parts. Many large data streams like web logs have a lot of data in them that has no long-term value. It doesn't make sense to spend a lot of time and processing power to upload data to a persistent location (the database) when you only need it for a short time. This is true for things like sentiment analysis or when publishing an event-based word cloud - when the event is over, the data is no longer needed, but the cloud itself is worth keeping.
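The event word cloud case can be sketched in a few lines of Python. This is only an illustration of the map, shuffle, and reduce steps on one machine, not how Hadoop or any particular product implements them; the sample posts and the function names (`map_phase`, `shuffle`, `reduce_phase`) are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample of short-lived event posts. In practice these would be
# streamed from a social feed and never loaded into the database.
posts = [
    "Great keynote at the summit",
    "The summit keynote was great",
    "Loved the keynote",
]

def map_phase(post):
    """Emit a (word, 1) pair for every word in one post."""
    return [(word.lower(), 1) for word in post.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(word, counts):
    """Collapse all counts for one word into a single total."""
    return word, sum(counts)

# Map every post, group the pairs by word, then reduce each group.
mapped = [pair for post in posts for pair in map_phase(post)]
word_counts = dict(reduce_phase(w, c) for w, c in shuffle(mapped).items())
```

Only `word_counts`, the small result that feeds the cloud, would be worth keeping once the event is over; the raw posts can be discarded.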

Another great example of useless data getting in the way is an automated browse messaging scheme. What you really want is to comb through the entire web log, and find all customers who browsed but didn't buy. All the other data - the length of session, the other products viewed, the ads that were viewed, etc. - you don't need in order to trigger an email follow-up with the right product and offer based on the non-purchased item.
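As a rough sketch of that browse-but-didn't-buy job, the Python below keeps only the customer, product, and action from each log record and ignores everything else. The record layout, field names, and sample values are assumptions for illustration, and "didn't buy" is interpreted per product, so a customer who bought one item but only viewed another still qualifies for the item they left behind.

```python
from collections import defaultdict

# Hypothetical, simplified web log records. Real logs carry many more fields
# (session length, ads viewed, and so on) that this job deliberately ignores.
web_log = [
    {"customer": "c1", "action": "view", "product": "boots", "session_secs": 310},
    {"customer": "c1", "action": "purchase", "product": "boots", "session_secs": 310},
    {"customer": "c2", "action": "view", "product": "tent", "session_secs": 95},
    {"customer": "c3", "action": "view", "product": "stove", "session_secs": 40},
    {"customer": "c3", "action": "view", "product": "tent", "session_secs": 40},
    {"customer": "c3", "action": "purchase", "product": "stove", "session_secs": 40},
]

def map_record(record):
    """Keep only the fields the trigger needs, keyed by customer and product."""
    return (record["customer"], record["product"]), record["action"]

def reduce_actions(key, actions):
    """Return the key if the product was viewed but never purchased."""
    if "view" in actions and "purchase" not in actions:
        return key
    return None

# Group actions by (customer, product), standing in for the shuffle step.
grouped = defaultdict(list)
for record in web_log:
    key, action = map_record(record)
    grouped[key].append(action)

# The small output: who to email, and about which product.
browse_abandoners = sorted(
    key for key, actions in grouped.items() if reduce_actions(key, actions)
)
```

With this sample data, `browse_abandoners` holds two (customer, product) pairs, which is all the email trigger needs from the entire log.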

MapReduce is not a database. It has no querying power and no knowledge of what other data sets exist. It runs processes in parallel and is especially adept at pulling small sets of data out of the big data set and summarizing them so they can be used as part of a larger picture. Lots of such jobs can be run at the same time and without any connection to each other - until the results get into the main database. Please note that it usually requires specialized expertise to implement and optimize - many great database teams do not have this experience (yet).
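The idea of unconnected jobs whose results only meet in the main database can be illustrated with a small Python sketch. Threads on one machine stand in here for nodes in a cluster, and the two jobs, their inputs, and the `summary` dictionary that plays the role of the database are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical raw inputs for two unrelated jobs.
clickstream = ["home", "boots", "cart", "boots", "home"]
social_posts = ["love the boots", "boots arrived late", "great tent"]

def count_page_views(pages):
    """Job 1: tally views per page."""
    counts = {}
    for page in pages:
        counts[page] = counts.get(page, 0) + 1
    return counts

def count_mentions(posts, term="boots"):
    """Job 2: count the posts that mention a term."""
    return sum(1 for post in posts if term in post.split())

# Each job sees only its own input and knows nothing about the other.
with ThreadPoolExecutor(max_workers=2) as pool:
    views_future = pool.submit(count_page_views, clickstream)
    mentions_future = pool.submit(count_mentions, social_posts)

# The small results meet only here, standing in for the main database.
summary = {
    "page_views": views_future.result(),
    "boots_mentions": mentions_future.result(),
}
```

Neither job can query the other's data; any question that spans both, such as whether page views track social mentions, has to be asked of the combined results afterward.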

Big data is just the latest generation of intimidating data sets - and tools like MapReduce can help tame big data by preprocessing it and passing important pieces on for further analysis. It lets you see and utilize small data inside the big data context. I think that is an important distinction - and opportunity.

Please comment below and let me know how your company is using various big data tools to help you manage big data insights.

Stephanie Miller

Stephanie Miller is a partner with brand and marketing technology strategy firm TopRight Partners, which helps customers use the technology they have today to do the marketing they want to do today and tomorrow. She is a relentless customer advocate and a champion for marketers creating memorable customer experiences. A digital marketing and CRM expert, she helps sophisticated marketers balance the right mix of people, process, and technology to optimize a data-driven content marketing strategy. She speaks and writes regularly and leads several industry-wide initiatives. Feedback and column ideas most welcome, to smiller AT toprightpartners DOT com or @stephanieSAM.
