Disdain Data Diving

Everybody tells you to go deep. Do a deep data dive on that massive warehouse of customer ones and zeros and you’re sure to come up with new insights and be able to target your audience better.


Data diving is the wrong analogy for the job.

If you’re looking for gold, you have to dig deep.
If you’re mining diamonds, you need to dig deep.
If you’re after sunken treasure, you have to dive deep.

But customer insight comes from casting your net wide and taking in a great deal of data rather than targeting a single fact.

Today’s Big Data heavy-lifting machines and software systems were built back in the day when millions of customers made millions of phone calls and each one had to be captured, stored, and found in a heartbeat. Banking and credit card transactions by the billions had to be put into safekeeping somewhere they could be added up, averaged, and recalled if need be.

We got very smart about cross-referencing things based on their row and column. How many people bought a TV in June? How much money is being taken out of ATMs in this city? How often did they order this with that?

These were questions that could not be answered without pulling the right cube from the Big Data system and waiting until the next morning for the answer.

New Questions Require New Technologies

Yes, Moore’s Law ensures that the hardware gets faster every day. But the software needs to keep pace with the sort of puzzles we’re dealing with today.

I recently wrote a white paper for Calpont Corp. on this very subject called “Online Marketing Analytics Managing Big Data is Not Enough.” (Registration required.) In my research conversations with Calpont Chief Technology Officer Jim Tommaney, I was struck by his words when discussing this issue in terms of banner ad targeting.

Tommaney said that we’ve solved the problem of finding the needle in the haystack. The problem for today’s advertiser is finding all the needles and classifying them by similarity of their fingerprints.

The days of reaching a massive audience and hitting them with an onslaught of frequency are over. That approach is no longer economical, nor is it tolerated. Noise is no longer the advertiser’s friend. Today, we must rely on signal…relevancy.

To find the right audience for your offerings today, you have to do better than:

Contextual: They are looking at a golf website
Demographic: They fall into the right age range and educational accomplishments
Behavioral: They searched for “golf clubs” and then “drivers” and then “Taylor Made Burner SuperFast”

That’s certainly deep, but casting your net wide means being able to ask highly complex queries across copious individual elements, cross-referencing a multitude of associated attributes that allow for very fine granularity while supporting a near-real-time, iterative investigatory process.

In other words, we have to be able to ask tough questions of huge datasets comparing lots of factors down to a gnat’s whisker, very fast.

What impact did the header and body copy font sizes have on click-throughs, engagement, and purchase with older surfers on different ad networks at different times of day? And can you adjust that for seasonality?

This is the art of looking for similar fingerprints on a wide variety of needles across multiple haystacks. I need to cross-reference header font size, body font size, age, network, and time period to see which combination got the most click-throughs, the best engagement, and resulted in the most purchases. Oh yeah, and show me those purchases ranked by profitability.
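To make that concrete, here is a minimal sketch of the cross-reference in plain Python. Everything in it is an assumption for illustration: the field names (header_pt, body_pt, age_band, network, daypart), the four sample records, and the rank_combinations helper are all invented, and a real system would run this kind of grouping inside an analytic database over billions of rows rather than over a list in memory.

```python
from collections import defaultdict

# Hypothetical ad-impression records. Field names and values are invented
# for illustration; they do not come from any real ad network.
impressions = [
    {"header_pt": 24, "body_pt": 12, "age_band": "55+", "network": "A",
     "daypart": "morning", "clicked": True, "profit": 18.0},
    {"header_pt": 24, "body_pt": 12, "age_band": "55+", "network": "A",
     "daypart": "morning", "clicked": False, "profit": 0.0},
    {"header_pt": 18, "body_pt": 12, "age_band": "55+", "network": "A",
     "daypart": "evening", "clicked": True, "profit": 5.0},
    {"header_pt": 24, "body_pt": 14, "age_band": "55+", "network": "B",
     "daypart": "morning", "clicked": True, "profit": 30.0},
]

# The five elements being cross-referenced.
DIMENSIONS = ("header_pt", "body_pt", "age_band", "network", "daypart")


def rank_combinations(rows):
    """Group rows by every combination of the dimensions, then rank by profit."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "profit": 0.0})
    for row in rows:
        key = tuple(row[d] for d in DIMENSIONS)
        bucket = totals[key]
        bucket["impressions"] += 1
        bucket["clicks"] += int(row["clicked"])
        bucket["profit"] += row["profit"]
    # Most profitable combination first.
    return sorted(totals.items(), key=lambda kv: kv[1]["profit"], reverse=True)


ranked = rank_combinations(impressions)
for combo, stats in ranked:
    ctr = stats["clicks"] / stats["impressions"]
    print(combo, f"CTR={ctr:.0%}", f"profit={stats['profit']:.2f}")
```

The point of the sketch is the shape of the question: one pass over every record, grouped by all five elements at once, rather than a lookup of a single record.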

Each of those elements has six or eight attributes, and the combinations multiply into a correlation matrix that brings Big Hardware to its knees.
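A rough back-of-the-envelope count shows how quickly this grows. The sketch below assumes the five elements named above (header font size, body font size, age, network, and time period) each take six or eight distinct values, and it treats a twelve-month seasonality adjustment as one more hypothetical dimension. The combination_count helper is invented for illustration.

```python
from math import prod


def combination_count(values_per_dimension):
    """Number of distinct cells when every dimension is crossed with every other."""
    return prod(values_per_dimension)


# Five dimensions with six values each, then with eight values each.
low = combination_count([6] * 5)
high = combination_count([8] * 5)

# Assumption: seasonality is modeled as a sixth dimension with 12 months.
low_seasonal = combination_count([6] * 5 + [12])
high_seasonal = combination_count([8] * 5 + [12])

print(low, high)                    # 7776 32768
print(low_seasonal, high_seasonal)  # 93312 393216
```

Every one of those cells has to be filled with click-throughs, engagement, and purchases, so the work grows with both the number of combinations and the number of records behind them.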

So the next time somebody tells you they love to do deep data diving, congratulate them on finding that one-in-a-million record in that haystack in the cloud. But you need to find a million similar records across billions of instances in order to optimize your marketing. Widening your analytics is the only way.
