What's the difference between a camel and a camel? One type has one hump and another type has two. So what does a camel have to do with analyzing the distribution of data? Not much, really -- other than the humps.
Web sites generate a tremendous amount of data about how audiences react and how customers behave -- so much data that it's hard to find time to analyze it and turn it into valuable information. Beyond the sheer volume, many of the data types collected are unique to the Internet, such as Web server and email logs, which can make it challenging to figure out how to analyze them. Fortunately, many long-established data analysis tools and techniques can be used to turn this data into insight.
One of the most unusual aspects of data about people and nature is its uneven distribution.
The first time many people are exposed to this phenomenon is in school, when an instructor predicts how grades will be distributed: A few people will make the highest grades, a few will receive the lowest grades, but most will have grades near the average. In other words, a graph of grades is normally shaped like a bell. The bell curve, technically called a "normal distribution," turns out to describe how so many things in life are distributed.
What's Height Got to Do With It?
Although a few men are significantly shorter and a few are substantially taller than the average for all men, the height of most adult men is near that average. The height of women also has a normal distribution, but around a slightly lower average.
So what does the height of anyone have to do with improving the targeting of online marketing? It's not the height that we're interested in. We're interested in the shape of the curve that shows how data is distributed -- especially when it doesn't follow the normal bell shape.
For instance, if you survey all visitors to your Web site and ask them their height, you might be surprised to find that the data does not follow a normal distribution. Instead of having the standard single hump that a normal distribution has, a graph of this data would have two humps, that is, a bimodal distribution.
We would expect data about height to be normally distributed. So a bimodal distribution graph of height gives us a clue that more analysis is needed to identify what's causing the unusual pattern in the data.
In this example, we would need to ask for the height and gender of each person to see the two underlying normal distributions. Without knowing the gender, we wouldn't be able to tell if a survey response was from a tall woman or a medium-height man.
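The effect described above can be sketched with a small simulation. The means and standard deviation below are illustrative assumptions, not survey results, and the gap between the two groups is deliberately exaggerated so that the two humps are easy to see. The function names are hypothetical.

```python
import random
from collections import Counter

# Illustrative assumptions, not real survey data: mean height in cm per
# gender and a shared standard deviation. The gap is exaggerated on purpose.
ASSUMED_MEAN_CM = {"F": 163.0, "M": 178.0}
ASSUMED_STD_CM = 5.0


def simulate_survey(n_per_gender=5000, seed=42):
    """Return (gender, height_cm) pairs drawn from two normal distributions."""
    rng = random.Random(seed)
    return [
        (gender, rng.gauss(mean_cm, ASSUMED_STD_CM))
        for gender, mean_cm in ASSUMED_MEAN_CM.items()
        for _ in range(n_per_gender)
    ]


def histogram(heights, bin_width=3):
    """Count heights per bin; each key is the lower edge of a bin in cm."""
    counts = Counter(int(h // bin_width) * bin_width for h in heights)
    return dict(sorted(counts.items()))


def print_histogram(title, hist, scale=25):
    """Print a rough text bar chart, one '#' per `scale` responses."""
    print(title)
    for start, count in hist.items():
        print(f"{start:>4} cm  {'#' * (count // scale)}")


if __name__ == "__main__":
    responses = simulate_survey()
    # Without gender, the pooled data shows two humps with a valley between.
    print_histogram("All visitors", histogram(h for _, h in responses))
    # Split by gender, each group shows a single hump.
    print_histogram("Women only", histogram(h for g, h in responses if g == "F"))
    print_histogram("Men only", histogram(h for g, h in responses if g == "M"))
```

With these assumed numbers, the pooled chart dips between the two group averages, while each single-gender chart has one hump. The second survey question is what makes the split possible.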
What Marketers Want to See
Although many activities are normally distributed, some activities that online marketers deal with actually have a bimodal distribution. Computer usage at many companies is usually the heaviest at midmorning and midafternoon, with a dip in usage during lunch. This means computer usage follows a bimodal distribution.
However, many Web sites receive traffic from multiple time zones, which means the dip in midday usage from one time zone occurs at the same time as the heavy midmorning usage from another time zone. When Web usage from several different time zones is added together, the peaks and dips tend to cancel out, and the combined data flattens toward a uniform distribution.
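A minimal sketch of how this flattening happens, assuming a made-up hourly usage profile and four adjacent time zones. The profile values, the offsets, and the function names are all hypothetical and are not taken from any real server log.

```python
# Hypothetical usage per local hour (0-23) for visitors in one time zone:
# a midmorning peak, a lunch dip at hour 12, and a midafternoon peak.
LOCAL_PROFILE = [0] * 8 + [2, 6, 10, 7, 3, 7, 10, 6, 2] + [0] * 7


def combine_time_zones(profile, offsets):
    """Sum one profile shifted by each offset (hours later on the server clock)."""
    hours = len(profile)
    return [
        sum(profile[(hour - offset) % hours] for offset in offsets)
        for hour in range(hours)
    ]


def peak_to_mean(values):
    """Ratio of the busiest hour to the average hour; lower means flatter."""
    return max(values) / (sum(values) / len(values))


if __name__ == "__main__":
    combined = combine_time_zones(LOCAL_PROFILE, offsets=[0, 1, 2, 3])
    print("one zone :", LOCAL_PROFILE[8:20])
    print("combined :", combined[8:20])
    print("peak/mean, one zone :", round(peak_to_mean(LOCAL_PROFILE), 2))
    print("peak/mean, combined :", round(peak_to_mean(combined), 2))
```

In this toy case the lunch dip that is obvious in a single zone no longer appears in the combined series, which is why the combined view can hide the underlying two-hump pattern.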
A system administrator might like to see a uniform distribution of server usage because it means servers are operating at a high efficiency. However, marketers should view a uniform distribution as a clue that more analysis is needed to uncover trends and patterns that are being obscured.
To spot the hidden behavior patterns in the data, in some cases it's enough to use database commands to select a subset of the data, such as the visits from a single time zone, and analyze that subset on its own. In other situations, more sophisticated analysis techniques are necessary to drill down into the data and identify meaningful results.
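As one possible illustration of selecting a subset with database commands, the sketch below uses Python's built-in sqlite3 module. The `visits` table, its columns, and the sample rows are hypothetical, invented only for this example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up visits table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits ("
        "visitor_id INTEGER, tz_offset INTEGER, local_hour INTEGER)"
    )
    # Hypothetical rows: (visitor_id, UTC offset of visitor, local hour of visit)
    rows = [
        (1, -5, 10), (2, -5, 10), (3, -5, 10), (4, -5, 12),
        (5, -5, 14), (6, -5, 14), (7, -5, 14),
        (8, -8, 10), (9, -8, 10), (10, -8, 12), (11, -8, 14), (12, -8, 14),
    ]
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    return conn


def hourly_counts_for_time_zone(conn, tz_offset):
    """Select only one time zone's visits and count them per local hour."""
    cursor = conn.execute(
        "SELECT local_hour, COUNT(*) FROM visits "
        "WHERE tz_offset = ? GROUP BY local_hour ORDER BY local_hour",
        (tz_offset,),
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    for tz in (-5, -8):
        print(tz, hourly_counts_for_time_zone(conn, tz))
```

Filtering on one segment at a time is the simplest form of drill-down. Here each time zone, viewed alone, shows the midmorning and midafternoon peaks around a lunch dip.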
Online marketing is a complex collection of activities that raise awareness, provide information, and lead customers toward making a purchase. While we're applying our marketing craft to generate sales, customers are evaluating multiple needs and values to make a purchase decision.
The complexities of consumer behavior go beyond knowing which Web pages were seen. We need to know why those Web pages were selected in the first place.
We can never know exactly why a person takes each of the actions he or she takes. However, we can use more sophisticated data analysis techniques to turn clues about behavior into a better understanding of our Web visitors and customers.
Cliff Allen is President of Coravue, a company that provides content management software and application service provider (ASP) hosting for Web and email. Allen is coauthor of three books about Internet marketing, including "One-to-One Web Marketing, Second Edition" (John Wiley & Sons, 2001).