A/B Testing Done Right...and Wrong

October 15, 2012

How can marketers prove the value of their display campaign to the CMO or CEO?

In an overcrowded display advertising ecosystem, all providers claim their solutions generate incremental value. But which ones actually do? Marketers are often at a loss. How can they prove the value of their display campaign to the CMO or CEO?

Using A/B testing to measure incremental lift seems like the natural solution to the problem. Indeed, marketers often demand that their providers run an A/B test on their campaigns to prove their value, and the providers are all too often happy to oblige, wishing to keep the customer satisfied. In the last couple of years, this has become almost the norm, with some providers even basing their payment model on A/B test results.

The problem with this seemingly healthy trend is that a properly controlled A/B test is both complicated and expensive to perform, while getting it wrong or producing biased results is very easy, and therefore common.

Here are some important factors you should be aware of when running A/B tests:

Volume, volume, volume. The main reason many A/B tests are not statistically valid is simple: insufficient sample size. Conversion rates in display advertising are small, and the lift being measured is only a fraction of them, so a proper A/B test requires a massive sample. For the vast majority of advertisers, performing a tailor-made A/B test is impractical because they cannot reach a proper sample size within reasonable time and budget constraints.
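To get a feel for what "massive" means here, the standard normal-approximation formula for comparing two conversion rates can be sketched as follows. The function name, the 0.1 percent baseline conversion rate, the 10 percent relative lift, and the 5 percent significance / 80 percent power settings are all illustrative assumptions, not figures from any real campaign, and the formula assumes equal-sized test and control groups.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(base_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed in EACH group for a two-sided
    two-proportion z-test (normal approximation, equal group sizes)."""
    p_control = base_rate
    p_test = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_test - p_control) ** 2)


# Hypothetical inputs: 0.1% baseline conversion rate, 10% relative lift.
users_needed = sample_size_per_group(base_rate=0.001, relative_lift=0.10)
print(f"Users needed per group: {users_needed:,}")
```

Under these assumed inputs the answer is well over a million users per group. A smaller lift or a lower baseline rate pushes the number higher still, and an uneven split (such as 90/10) requires more users in total than an even one.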

Sampling bias. When some members of the population are less likely to be included in the test group than others, a biased sample is created. It is almost impossible to avoid sampling bias completely, and it can creep into your A/B testing in any number of ways. To give one concrete example, consider cookie deletion. A user in the control group who sees a blank/charity ad may be less likely to delete her cookies than a user in the test group who sees the actual campaign. That creates a bias between the control and test groups, effectively shortening the life span of cookies in the test group. As a result, the effective size of the test group relative to the control is smaller than it seems.
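The cookie-deletion effect can be made concrete with a little arithmetic. The sketch below assumes a constant daily probability that a user deletes her cookies; the function name, the group sizes, and both deletion rates are invented purely for illustration.

```python
def surviving_cookies(initial_users, daily_deletion_rate, days):
    """Expected number of cookies still alive after `days`, assuming each
    user deletes her cookies independently with a fixed daily probability."""
    return initial_users * (1 - daily_deletion_rate) ** days


# Hypothetical 90/10 split with slightly different deletion rates.
test_alive = surviving_cookies(900_000, daily_deletion_rate=0.012, days=30)
control_alive = surviving_cookies(100_000, daily_deletion_rate=0.010, days=30)

nominal_ratio = 900_000 / 100_000
effective_ratio = test_alive / control_alive
print(f"Nominal test:control ratio   {nominal_ratio:.2f}")
print(f"Effective ratio after 30 days {effective_ratio:.2f}")
```

Even a small difference in deletion rates compounds over the length of the test, so the effective ratio drifts below the nominal one. Any analysis that assumes the nominal split will then misstate the lift unless it corrects for the drift.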

External factors. These can influence uplift and skew the results. For example, if you launch a retargeting campaign with a different provider than the one running your A/B test, that campaign can create uplift in the control group and diminish the measured effect of the test.

Cost. Always weigh the benefits of running an A/B test against its inherent costs: dedicating ~10 percent of the media ad spend to blank/charity ads, as well as losing the uplift on the control group, which never sees the campaign. If a provider offers you a cheap or cost-free A/B test, chances are high that the test is poorly designed and executed.
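The two cost components can be estimated with back-of-the-envelope arithmetic. In the sketch below, every name and figure (budget, control share, number of users shown blank/charity ads, conversion rate, lift, value per conversion) is a hypothetical placeholder to be replaced with your own numbers.

```python
def ab_test_cost(media_budget, control_share, control_users,
                 base_conversion_rate, expected_relative_lift,
                 value_per_conversion):
    """Return (spend on blank/charity ads, revenue forgone on the users
    who were shown those ads instead of the real campaign)."""
    placebo_spend = media_budget * control_share
    forgone_conversions = (control_users * base_conversion_rate
                           * expected_relative_lift)
    forgone_revenue = forgone_conversions * value_per_conversion
    return placebo_spend, forgone_revenue


# Hypothetical campaign: $100,000 budget, 10% of it spent on blank/charity ads.
placebo_spend, forgone_revenue = ab_test_cost(
    media_budget=100_000,
    control_share=0.10,
    control_users=200_000,
    base_conversion_rate=0.001,
    expected_relative_lift=0.10,
    value_per_conversion=50,
)
print(f"Spend on blank/charity ads: ${placebo_spend:,.0f}")
print(f"Revenue forgone:            ${forgone_revenue:,.0f}")
```

Neither component disappears when a provider offers the test at no charge; someone is absorbing it, or the control group is not being served properly.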

What Should I Do, Then?

If you are able to accumulate a massive number of unique users in a rather short period of time, and are willing to allocate a sizable budget for this purpose, a tailor-made A/B test might be the way to go (assuming, of course, that your provider has the knowledge and infrastructure to minimize and correct sampling bias).

If you are unable to generate the needed volume and still wish to judge your provider's incremental value, make sure that the provider is running an ongoing A/B test, aggregating results from all its advertisers to get the needed volume. Providers with a solid data-driven approach often do this anyway, since keeping a small proportion of the population as a randomized control group is the best way to get the unbiased statistics on which novel algorithmic development can be based.
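A minimal sketch of such aggregation: sum users and conversions across advertisers, then compute the pooled lift and a two-proportion z-score. The function name and the per-advertiser counts are invented, and the sketch assumes every advertiser uses the same test/control split; with differing splits, naive pooling can mislead and a stratified analysis would be needed.

```python
from math import sqrt


def pooled_lift(results):
    """results: iterable of tuples
    (test_users, test_conversions, control_users, control_conversions).
    Returns (relative lift, z-score) computed on the pooled counts."""
    test_users = sum(r[0] for r in results)
    test_conv = sum(r[1] for r in results)
    control_users = sum(r[2] for r in results)
    control_conv = sum(r[3] for r in results)

    test_rate = test_conv / test_users
    control_rate = control_conv / control_users
    lift = test_rate / control_rate - 1

    # Pooled standard error under the null hypothesis of no difference.
    pooled_rate = (test_conv + control_conv) / (test_users + control_users)
    std_err = sqrt(pooled_rate * (1 - pooled_rate)
                   * (1 / test_users + 1 / control_users))
    z_score = (test_rate - control_rate) / std_err
    return lift, z_score


# Hypothetical counts for three advertisers, each on a 90/10 split.
advertisers = [
    (900_000, 990, 100_000, 100),
    (450_000, 495, 50_000, 50),
    (1_800_000, 1_980, 200_000, 200),
]
lift, z_score = pooled_lift(advertisers)
print(f"Pooled relative lift: {lift:.1%}, z-score: {z_score:.2f}")
```

With these made-up counts, even 3.5 million pooled users leave the z-score below the conventional 1.96 threshold, which illustrates why volume from a single advertiser is rarely enough.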

Last but not least, remember that your ability to monitor your test is limited. Even if you have a PhD in statistics, you almost never have all the data. Therefore, trust is key. It's all about creating a "win-win" provider-advertiser partnership.




Tuvik Beker

Dr. Beker is head of algorithmic research at myThings.

Tuvik is one of the first graduates of Tel-Aviv University's Excellence Program, and holds a Ph.D. in Computational Neuroscience from the Hebrew University. Formerly a Visiting Scholar at Stanford University and a Research Scientist at the University of Iowa, Dr. Beker's scientific work and publications span diverse areas ranging from machine learning and optimization, through evolutionary computation to mathematical biology and genetics.

Dr. Beker was the core founder of Soligence Corp., leading it from inception through successful VC funding, until selling its technology - a scheduling optimization system for high-resolution imaging satellites - to NASA in 2008.

Since 1997, Dr. Beker has served as an algorithmic consultant to diverse companies ranging from budding startups to established corporations like Sun Microsystems and the Israeli Aircraft Industries, helping them solve tough real-world problems that pose significant computational challenges.
