That Sneaking Impression: Part 2

Last week we dove into the sordid technical details in the life of an impression. It wasn’t very pretty. This week we need to sort out some of the implications.

Are Clients Being Cheated?

Yes and no. Strangely, the fact that sites’ numbers tend to be inflated does not harm the client. Ultimately, the forces of supply and demand will pay the right sites the right fraction of the media dollars out there (assuming they all use equally corrupted numbers).

I believe clients are getting cheated out of a different kind of value: the value that we agency folks promise them when we get them into interactive in the first place. We go on and on about how accountable the medium is and how we can see real-time performance and react with changes to creative and media choices. But, in fact, our numbers are so noisy that we can’t draw reliable conclusions from much of our data.

The value lost isn’t the fictional impressions and clicks that we tell our clients they received. The value lost is our ability to put this data to good use, precisely because so much of it is fictitious.

The Interactive Promise

I first got into interactive marketing because I thought it would be really neat to have so much control. Because of all the data I’d collect, I’d be able to tell just what type of performance boost I’d get if I used this type of creative versus that type of creative, on Thursdays, with a certain product category, targeting left-handers.

But the reality was that our numbers were so polluted that we could barely tell if we were performing better or worse than last month across the whole audience. Without very fine data, we couldn’t draw very fine conclusions. And with data that deviated by plus or minus 5 percent with each additional server added to the chain, we couldn’t even draw very broad conclusions.
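To see how quickly that plus-or-minus 5 percent adds up, here is a rough sketch. It assumes, purely for illustration, that each server in the chain can shift the count by up to 5 percent and that those shifts compound multiplicatively; the function name and the 100,000-impression figure are made up for the example.

```python
from typing import Tuple


def worst_case_range(true_count: float, servers: int,
                     per_hop_error: float = 0.05) -> Tuple[float, float]:
    """Return the (low, high) counts that could be reported after the
    impression passes through `servers` servers, assuming each one can
    shift the count by up to `per_hop_error` in either direction."""
    low = true_count * (1 - per_hop_error) ** servers
    high = true_count * (1 + per_hop_error) ** servers
    return low, high


if __name__ == "__main__":
    # Hypothetical campaign that truly delivered 100,000 impressions.
    for hops in range(1, 5):
        low, high = worst_case_range(100_000, hops)
        print(f"{hops} server(s): {low:,.0f} to {high:,.0f}")
```

Under these assumptions, three servers in the chain already leave a window of roughly 86,000 to 116,000 around a true count of 100,000, which is wider than most of the differences we would hope to detect between two creatives.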

I used to think this was merely a scaling issue. Back at Leo Burnett and at J. Walter Thompson, I kept thinking that we just needed to spend more money online to begin to see these trends and patterns appear in our data. But they never did. Even when spending $40 million for Microsoft, we saw only very general truths, and we discovered then that it was due to our poor data.

Server Fights

A big issue of late has been the battle between sites and agencies to determine whose numbers should be “blessed” in the end. The battle seems to be about which numbers are superior, but to be honest, there’s a bigger issue going on here.

If a site is going to automate its internal workings, like ad trafficking, it must be serving its own ads so as to make the numbers automatically flow through its database systems. Likewise, if the agency is going to get close to profitable with its interactive operations, it requires similar automation.

Thus, both sides insist that their numbers should be employed as the gold standard at the end of a campaign. Neither wants to be stuck entering 100 faxes into an Excel spreadsheet just to determine potential discrepancies.

So far, sites have largely won this battle. Agencies are left to bear the brunt of collecting different numbers via different methods from different sites, then compiling it all for the client’s postbuy presentation, where everything is fudged together to pretend it’s all apples to apples. And just to kick some sand in the face, the site’s numbers tend to be exaggerated relative to the agency’s numbers, mostly due to the innocent fact that the site’s servers reside at the beginning of that server chain.
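The reconciliation chore itself is simple arithmetic once both sets of numbers are in one place. The sketch below is only an illustration: the placement names and counts are invented, and measuring the gap as a percentage of the agency’s count is just one reasonable convention.

```python
def discrepancy_pct(site_count: int, agency_count: int) -> float:
    """Percent by which the site's count exceeds the agency's count."""
    if agency_count <= 0:
        raise ValueError("agency_count must be positive")
    return (site_count - agency_count) / agency_count * 100.0


if __name__ == "__main__":
    # Hypothetical placements: (site-reported, agency-reported) impressions.
    placements = {
        "portal_homepage": (110_000, 100_000),
        "news_section": (52_500, 50_000),
    }
    for name, (site, agency) in placements.items():
        print(f"{name}: site is {discrepancy_pct(site, agency):+.1f}% vs. agency")
```

The hard part is not the math but getting every site’s numbers into a form where a loop like this can run without someone retyping faxes.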

What Next?

If we limit our ambitions to merely watching obvious trends, like rough correlations between our banner ads and concurrent online sales, then these banner serving systems we use might be just the ticket. But if we aspire to bring online marketing to the level of a true discipline, we must become much more critical of our data. Even if we want to deliver the value that traditional direct marketing firms give to their clients, we need some great improvement.

In the cases where you wish to conduct proper analysis, you have to directly measure customer behavior. This generally means using what is called “client-side tracking” to measure ad performance. A small Java applet or JavaScript snippet is placed on the banner itself and reports directly to the agency database. Only this type of data will give us the semblance of truth needed to make subtle interpretations.
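As a rough sketch of what such a report might look like, the snippet below builds the kind of URL a script on the banner could request from the agency’s server each time the ad is actually displayed. Everything here is an assumption made for illustration: the tracker address, the parameter names, and the random “cb” value (added so that each request is unique and is less likely to be answered by a cache sitting between the user and the agency). It is written in Python only to show the shape of the request; the code on the banner itself would be Java or JavaScript.

```python
import random
from urllib.parse import urlencode


def beacon_url(base: str, campaign_id: str, creative_id: str, event: str) -> str:
    """Build a tracking request URL for one ad event.

    All parameter names are hypothetical; `cb` is a random cache-busting
    value so that repeated requests are not identical.
    """
    params = {
        "campaign": campaign_id,
        "creative": creative_id,
        "event": event,
        "cb": random.randint(0, 10**9),
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical agency tracker endpoint and identifiers.
    print(beacon_url("https://tracker.example/hit", "spring_sale", "banner_a", "impression"))
```

Because the request goes straight from the user’s browser to the agency, the count it produces does not depend on how many servers sit in the chain.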

Ultimately, I’d like to see a reopening of the site-versus-agency debate about the definition of an impression and the standard mechanism by which one is reported. There’s no reason why we can’t all agree on a metric that reflects marketing reality. There’s even less reason why we can’t put this information in a format that can be automatically digested by agency tracking and accounting systems.

Until then, I’m going to continue to use the banner servers for the rough stuff and client-side reporting for the important stuff.
