Predicting Consumer Behavior

Predicting people’s behavior is becoming a big business; the strategies and tactics of many businesses are largely defined by these predictions. Artificial intelligence, big data, algorithms, and probabilistic data structures are no longer a collection of rare buzzwords one wouldn’t ever hear outside of the NASAs, MITs, or JPLs of the world. While some of the evidence of this success is anecdotal, we regularly hear claims of “achievement through algorithmic thinking.”

Thanks to Brad Pitt, the best-known claim is still “Moneyball.” However, every once in a while one of the major retailers claims that it was able to match sales history, pricing, and demographic data to its products in order to predict the need for, and the timing of, a price markdown. Online matchmakers like eHarmony claim that by “crunching” your personal profile data, comments, and data points related to your prior experiences on their sites, they are able to fine-tune their algorithms and find you a soul mate…or, at the minimum, an online date. A number of game manufacturers use data-driven techniques to model a player’s improvement over time. These models allow them to tailor game challenges to each individual player. Last, but not least – most of you have heard that Visa can supposedly predict how likely and how soon one is to get a divorce based strictly on one’s spending history and habits. While commonly cited in many publications, this example might actually be totally bogus. Visa publicly stated that the company “does not track or monitor cardholder marital status, nor does it offer any service or product that predicts a potential divorce.” Nonetheless, most of us believe that such practices within credit card companies are both scientifically possible and very probable.

These techniques are providing shockingly good results in many cases. It is no secret that publishers and advertisers are increasingly interested in learning more about their audience. More and more big data algorithm-based (feel free to insert as many trendy buzzwords as you’d like) efficiency toolmakers are eager to get their hands on as much audience information as possible. Many of them would even offer their services free of charge. Hopefully all of you still remember an old saying, “If one isn’t paying for a service, one is the product, not the customer” – and make sure that no one but you can leverage your data.

These industries are just early adopters of a number-crunching game that is increasingly transforming many other businesses in a far less glamorous fashion. Data-driven predictions and other algorithm-powered techniques have become an everyday reality of the online publishing and advertising industries as well. How is that even possible? How can one’s Internet activities be tracked by a publisher? Most publishing websites require no registration and therefore shouldn’t have the ability to identify or track you. The answer is rather simple. The most common way for publishers to track the activities of their audience is via browser cookies: small pieces of data that a website stores in your browser and receives back on every subsequent visit.
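To make the cookie mechanism concrete, here is a minimal Python sketch of what a publisher's server could do with no registration at all. It is an illustration under stated assumptions, not any particular publisher's implementation: the cookie name `visitor_id`, the `handle_request` function, and the in-memory `visit_log` are all hypothetical.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name, chosen for illustration


def handle_request(cookie_header, page, visit_log):
    """Recognize a returning browser by its cookie, or tag a new one.

    cookie_header: raw value of the request's Cookie header ("" if absent)
    page:          path being requested, e.g. "/sports"
    visit_log:     dict mapping visitor id -> list of pages seen so far
    Returns (visitor_id, set_cookie), where set_cookie is the value for a
    Set-Cookie response header, or None if the browser was already tagged.
    """
    incoming = SimpleCookie()
    incoming.load(cookie_header)

    if COOKIE_NAME in incoming:
        # Returning browser: it sent back the id it was given earlier.
        visitor_id = incoming[COOKIE_NAME].value
        set_cookie = None
    else:
        # First visit: assign a random id and ask the browser to keep it.
        visitor_id = uuid.uuid4().hex
        outgoing = SimpleCookie()
        outgoing[COOKIE_NAME] = visitor_id
        outgoing[COOKIE_NAME]["path"] = "/"
        outgoing[COOKIE_NAME]["max-age"] = 365 * 24 * 3600  # one year
        set_cookie = outgoing[COOKIE_NAME].OutputString()

    # Every page view is filed under the same anonymous id.
    visit_log.setdefault(visitor_id, []).append(page)
    return visitor_id, set_cookie


log = {}
vid, header = handle_request("", "/home", log)            # first visit
handle_request(COOKIE_NAME + "=" + vid, "/sports", log)   # return visit
print(log[vid])
```

The publisher never learns a name, yet every page view accumulates under one stable identifier, which is all a behavioral profile requires.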

Although most Internet users know that in some cases cookies could lead to serious threats to their privacy, use of this technique isn’t truly regulated at this point. An increasing number of users are taking the matter into their own hands by blocking cookies or limiting and periodically deleting them. As much as I hate to be the bearer of bad news, blocking or limiting cookies doesn’t really accomplish much. Where there is a will, there is a way. Rather quickly, a new generation of audience recognition and tracking techniques became available – techniques that are better positioned to handle privacy-conscious people who limit cookies. Moreover, these techniques are a lot harder to fight or even recognize simply because, unlike cookies, they leave no persistent evidence of tagging on one’s computer.

These newly developed and ultra-sneaky techniques, commonly referred to as browser fingerprinting, are able to identify a consumer far more accurately than any cookie. Through the use of JavaScript, these fingerprinting techniques collect a significant number of seemingly innocent data points about your browser (browser make and model, installed plug-ins, default fonts, operating system, etc.). To most of us, these data points present little to no threat to our privacy when considered individually. Unfortunately, a large quantity of data points about your browser, combined with a powerful data-driven algorithm, leads to your very own and incredibly unique digital fingerprint. According to the Electronic Frontier Foundation’s data, good fingerprinting techniques could enable a publisher to uniquely identify you out of more than 650,000 visitors while you browse its websites. Is it even legal? Existing guidelines and regulations governing online advertising and publishing are rather ambiguous, and they vary widely among industries, geographies, and other sectors.
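The way individually harmless data points combine into an identifier can be sketched in a few lines of Python. This is a simplified illustration, not the method of any real fingerprinting product: actual collection happens in the browser via JavaScript, and the attribute names, sample values, and the `fingerprint` and `identifying_bits` helpers below are assumptions made for the example.

```python
import hashlib
import json
import math


def fingerprint(attributes):
    """Collapse a dict of browser attributes into one stable identifier.

    Keys are sorted so the same attributes always produce the same hash,
    regardless of the order in which they were collected.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def identifying_bits(population):
    """Bits of information needed to single out one visitor in a population."""
    return math.log2(population)


# Hypothetical data points of the kind listed above.
visitor = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 10)",
    "plugins": ["PDF Viewer", "Media Player"],
    "fonts": ["Arial", "Georgia", "Helvetica"],
    "os": "ExampleOS 10",
}

# The same browser with one extra font installed.
other = dict(visitor, fonts=["Arial", "Georgia", "Helvetica", "Verdana"])

print(fingerprint(visitor) == fingerprint(dict(visitor)))  # same browser, same id
print(fingerprint(visitor) == fingerprint(other))          # one font differs, new id
print(round(identifying_bits(650_000), 1))                 # about 19.3 bits
```

Singling out one visitor among 650,000 takes roughly 19.3 bits of information, so each additional attribute only needs to contribute a few bits for the combination to become unique. Nothing is written to the visitor's machine at any point, which is why deleting cookies does not help.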

The techniques may change in the future, but audience tracking and statistical guessing games are here to stay. More and more publishers will continue to rely on audience learning techniques and develop a better sense of what individuals with certain tastes or behavior patterns prefer. This knowledge will continue to allow them to predict how these individuals might react to a specific design, content, or ad campaign type. I truly believe that, one way or another, the majority of publishers will jump on this bandwagon shortly. Some will be more confused than ever by the various privacy-infringing techniques, governing legislation, or analytical complexity; others will successfully develop the necessary core competencies. Some will promptly develop their own tools; others will try to leverage third parties instead. While each of their individual journeys will be different, the audience guessing game will surely become one that all of us will play.

