Data Mining and Predictive Analytics, Part 1

Last time, I outlined my belief that what we call “Web analytics” is becoming a more diverse, complex field. What we’ve traditionally considered Web analytics is the analysis of site behavioral data captured, processed, and reported on by a proprietary system designed to do just that. But as the online channel evolves and becomes more complex, tools that help us understand what’s happening must also evolve and become more complex. In some areas, such as in social media, this may mean developing new tools. In other areas, it may mean applying old tools to a new channel.

One area we work in a great deal is the use of data-mining and predictive analytical techniques. I got started in this area about 15 years ago when I used these types of methodologies at ACNielsen to help clients figure out which half of their advertising money they were wasting. I have a book on my shelf published 25 years ago on the use of model-building techniques in marketing. So the techniques are hardly new, but what is relatively new is the systematic use of these techniques in the online marketing space.

There are a few reasons for this. Historically, our main concern has been managing vast volumes of data and wrestling out of the Web analytics systems a few numbers that told us how well we were doing (and which we could do something about). Also, organic growth in the channel meant we weren’t forced to scramble for market share and to fully optimize our business processes. And to some extent, we weren’t asking the right questions.

This is changing. We understand our few numbers, and we want to know more. The online world is far more competitive, and we’re beginning to ask questions that go beyond the limits of our traditional analytical tool set. Questions such as:

  • How do I understand the effects different marketing channels have on generating sales?
  • What does the purchase lifecycle look like over multiple visits, and how can I optimize it?
  • How should I segment my audience or customers to improve my marketing activity’s effectiveness?

To answer these questions, we have to organize the data in different ways, and we need to bring in different tools. First, we must integrate our data so we can see different aspects of the acquisition, conversion, and retention processes in one place. Second, we must aggregate our data so it focuses on the visitor or customer rather than the click or visit. Third, we must cut through the noise in data using more sophisticated analytical techniques to get at the key insights.

    Let me give you an example of what I mean.

    Different types of people come to our Web sites for different reasons and to do different things. If I treat everyone the same way, I’m not making an optimal decision about how I allocate marketing funds and how I manage the user experience. I need to segment my audience so I can market to these different groups more effectively. However, I can’t do that on the basis of how they behave on the Web site alone. I also have to understand their demographics, intentions, aspirations, and opinions. I must integrate my hardcore behavioral data with profiling and attitudinal data drawn from other data sources, like surveys.
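
As a minimal sketch of that integration step, assuming both data sources share a visitor ID: the field names (`age_band`, `intent`) and the sample records below are hypothetical, invented only for illustration. The point is that survey answers are joined onto behavioral records, and visitors who never answered a survey are kept rather than dropped.

```python
# Hypothetical behavioral data keyed by visitor ID (illustrative values only).
behavior = {
    "v1": {"visits": 3, "conversions": 2},
    "v2": {"visits": 1, "conversions": 0},
    "v3": {"visits": 2, "conversions": 1},
}

# Hypothetical survey responses; not every visitor answers a survey.
survey = {
    "v1": {"age_band": "25-34", "intent": "buy"},
    "v3": {"age_band": "45-54", "intent": "research"},
}


def integrate(behavior_rows, survey_rows):
    """Left-join survey answers onto behavioral records by visitor ID."""
    combined = {}
    for visitor_id, stats in behavior_rows.items():
        answers = survey_rows.get(visitor_id, {})
        combined[visitor_id] = {
            **stats,
            "age_band": answers.get("age_band"),  # None when no survey response
            "intent": answers.get("intent"),      # None when no survey response
        }
    return combined


profiles = integrate(behavior, survey)
for visitor_id, profile in sorted(profiles.items()):
    print(visitor_id, profile)
```

Keeping unsurveyed visitors with empty attitudinal fields is a design choice: it preserves the full behavioral picture, at the cost of having to handle missing values later in the analysis.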

    Next, I’m interested in visitors’ behavior over multiple visits rather than what they do in a single visit. I aggregate data so that I have a record of the behavior of different visitors over time. I probably also need to summarize the data and create additional attributes that describe aspects of that behavior over time, such as number of visits made, number of conversion events, types of conversion events, and so on.
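
The aggregation described here can be sketched roughly as follows. The visit records, visitor IDs, and event names are hypothetical; a real Web analytics export would look different. The sketch rolls visit-level rows up into one record per visitor and derives the kinds of attributes mentioned above: number of visits, number of conversion events, and types of conversion events.

```python
from collections import defaultdict

# Hypothetical visit-level rows: (visitor_id, visit_date, conversion_event or None).
visits = [
    ("v1", "2008-01-03", None),
    ("v1", "2008-01-10", "newsletter_signup"),
    ("v1", "2008-01-21", "purchase"),
    ("v2", "2008-01-05", None),
    ("v3", "2008-01-07", "purchase"),
    ("v3", "2008-01-09", None),
]


def summarize_by_visitor(visit_rows):
    """Roll visit-level rows up into one summary record per visitor."""
    summary = defaultdict(
        lambda: {"visits": 0, "conversions": 0, "conversion_types": set()}
    )
    for visitor_id, _visit_date, event in visit_rows:
        record = summary[visitor_id]
        record["visits"] += 1
        if event is not None:
            record["conversions"] += 1
            record["conversion_types"].add(event)
    return dict(summary)


visitor_summary = summarize_by_visitor(visits)
for visitor_id, record in sorted(visitor_summary.items()):
    print(visitor_id, record["visits"], record["conversions"],
          sorted(record["conversion_types"]))
```

The unit of analysis changes from the visit to the visitor, which is what makes the later segmentation possible.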

    Finally, I must analyze the data to identify interesting and meaningful visitor segments. In all likelihood, I’ll have a large, noisy dataset in which I won’t be able to see the forest for the trees. Traditional querying and reporting techniques are unlikely to be an effective method of identifying the patterns, so I need to use something that will find patterns in the data for me.

    In this case, I decide to use cluster analysis. The cluster analysis process looks for groups of visitors in the data, where the people within each group have something in common and what they share differs from group to group. I then interpret the results to understand what the visitor segments have been clustered on and decide whether these are meaningful and useful segments I can do something with. This process may yield some surprising results and enable me to think about the audience in a way I hadn’t previously. I may find patterns and relationships in the data I would never have found using traditional analysis techniques.
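
To make the idea concrete, here is a minimal sketch using k-means, which is one common clustering technique and only an assumption on my part: the text above does not say which algorithm is used. The visitor features (visits, conversion events) and their values are hypothetical, and the centroids are seeded with the first k points purely so the run is repeatable.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Assumption: seed with the first k points so the run is deterministic;
    # real tools usually use randomized or smarter seeding.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each visitor to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical visitor-level features: (visits, conversion events).
visitors = [(1, 0), (2, 0), (1, 1), (9, 4), (10, 5), (8, 4)]
centroids, labels = kmeans(visitors, k=2)
print(labels)     # cluster index assigned to each visitor
print(centroids)  # the "typical" visitor in each cluster
```

On this toy data the light visitors and the frequent converters fall into separate groups. The algorithm only proposes the groups; interpreting the centroids and deciding whether the segments are useful is still the analyst's job. With real data, features on different scales would normally be standardized first so that no single attribute dominates the distance calculation.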

    Using data-mining and predictive analytical techniques allows organizations to unlock more value from their data, but it requires a different approach to managing data, different tools, and different skills.

    In part two, I’ll look at another application of data mining and predictive analytics: to understand what the important factors are that affect someone’s propensity to buy something during the purchase lifecycle.

    Till then…
