Predictive Analytics: A Blend of Art and Science?

I’m currently reading “Super Crunchers” by Ian Ayres. It offers an interesting look at how data mining and predictive analytics are becoming more widespread and increasingly shaping our lives. Ayres cites examples where empirical approaches outperform human experts at predicting outcomes.

I particularly liked his story of an econometrician who could predict the quality of Bordeaux wine from a simple regression analysis of weather data. He needed just three variables to predict the quality of a particular vintage: the amount of winter rainfall, the average temperature during the growing season, and the amount of rainfall during the harvest. Most interesting is the resistance, and even hostility, that his predictions met from the wine establishment. The wine experts of the time felt threatened and affronted by the idea that their art and expertise could be reduced to a simple equation.
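To make the idea concrete, here is a minimal sketch of a three-variable regression of this kind. Everything in it is an assumption made for illustration: the data is synthetic, the coefficients in TRUE_COEFFICIENTS are invented, and fit_linear is a hypothetical helper. None of it reproduces the econometrician's actual equation or data.

```python
import random


def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant for the intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented coefficients: intercept, winter rain, growing-season temp, harvest rain
TRUE_COEFFICIENTS = [-10.0, 0.002, 0.6, -0.004]

random.seed(0)
rows, scores = [], []
for _ in range(40):  # 40 synthetic "vintages"
    winter_rain = random.uniform(400, 800)   # mm, made up
    growing_temp = random.uniform(15, 19)    # degrees C, made up
    harvest_rain = random.uniform(50, 300)   # mm, made up
    noise = random.gauss(0, 0.1)
    rows.append((winter_rain, growing_temp, harvest_rain))
    scores.append(
        TRUE_COEFFICIENTS[0]
        + TRUE_COEFFICIENTS[1] * winter_rain
        + TRUE_COEFFICIENTS[2] * growing_temp
        + TRUE_COEFFICIENTS[3] * harvest_rain
        + noise
    )

coef = fit_linear(rows, scores)
print("intercept, winter rain, growing temp, harvest rain:", coef)
```

The fitted coefficients should land close to the invented ones, which is the whole trick: once the relationship is quantified, a new vintage's score is a matter of plugging in three weather readings.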

Ayres provides examples from other industries where data mining and predictive analytical techniques have changed the rules of the game, from baseball scouting to social policy development to medicine. Quite often, established experts in those fields have resisted these techniques. They would not or could not accept that such empirical methods could be better than the expertise they had developed through years of training and experience. However, numerous studies cited by Ayres have shown that predictive analytics outperforms experts at predicting outcomes correctly. That doesn’t mean predictive techniques always get it right, just that they get it right more often than the experts.

In the digital marketing field, Ayres uses the example of A/B and multivariate testing. He points out that today’s data volumes and technology allow people to run repeated tests and trials to predict which versions of which page elements are most likely to drive the desired outcome. Anyone familiar with multivariate testing technologies knows that the marketing pitch for them is often that they eliminate the need for subjectivity in the design process. You just come up with some alternative versions and see which one works best. It’s the ultimate tool for overcoming the bias and subjectivity of the various stakeholders involved in site development. Who needs usability testing, right?

Ayres’ background isn’t as a statistician or an analyst but as a lawyer. You don’t immediately think of lawyers as masters of the empirical universe. Why would a lawyer be an expert in number crunching? Yet being a lawyer is in some ways similar to being an analyst. Each tries to prove or disprove a hypothesis and looks for evidence to support a theory or refute somebody else’s. Seen that way, it’s a fallacy to believe that econometrics and predictive analytics are purely scientific disciplines.

Predictive analytics is often as much about art as it is about science. To build a good model, you must have a good understanding of how the system you are trying to model works. More often than not, the model building process begins with some subjective opinion about the factors likely to influence the thing you’re trying to predict. So where do these opinions come from? They usually come from experts in that particular field. We sometimes call this domain expertise. In the wine quality example, the econometrician was also a wine buff, so he already knew which factors were likely to affect a particular vintage’s quality. His skill was in quantifying them.

Likewise, some domain expertise is needed to develop good tests. With multivariate testing, technology can help determine which page design performs best. If you test four different versions of an element, say a call to action, then you’ll get a winner. That winner may be the one you started out with, but it’s still the winner. That doesn’t mean it’s the best one possible; it’s just the best of the options you examined. There may be a much better option out there that you haven’t tested. Usability experts can potentially provide better insights into which versions are worth testing in the first place and help interpret the test results.
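As a rough sketch of what “getting a winner” looks like in practice, the snippet below picks the best of four call-to-action variants and checks it against the original with a two-proportion z-test. The variant names and the visitor and conversion counts are hypothetical, and two_proportion_z is an illustrative helper, not any particular testing tool’s API.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: name -> (conversions, visitors)
variants = {
    "original": (120, 4000),
    "variant_b": (138, 4000),
    "variant_c": (165, 4000),
    "variant_d": (110, 4000),
}

# The "winner" is simply the highest conversion rate among what was tested
winner = max(variants, key=lambda name: variants[name][0] / variants[name][1])

control_conv, control_n = variants["original"]
win_conv, win_n = variants[winner]
z, p_value = two_proportion_z(control_conv, control_n, win_conv, win_n)
print(f"{winner} vs original: z={z:.2f}, p={p_value:.4f}")
```

Note what the code cannot do: it can only rank the four variants it was given. It also compares only the top performer against the original, so with several variants in play the p-value is on the optimistic side. Deciding which variants deserve a place in the test is still a human call.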

We need experts to help us build better models. That expertise may come from years of experience or from understanding how well previous models performed. In either case, there’s room for both science and art.
