Lies, Damn Lies, and Internet Audience Measurement

We’ve long known that the audience numbers you get in this most measurable of media depend on whom you ask. A third-party ad server gives you one figure, while a panel-based audience measurement firm provides a different number entirely. Perhaps it’s unsurprising that when the folks on the Advertising Research Foundation’s (ARF’s) Internet reach and frequency committee posed a question to all of these measurers, they got answers that… differed.

What is surprising, and a bit scary, was how dramatically the numbers varied. The group was trying to determine the audience overlap between sites. Sounds simple, right? Well, answers ranged from less than 1 percent to over 10 percent. Whoa. Something’s terribly wrong.

What to do? Well, it’s the ARF, after all, so Leslie Wood, of LWR, and Roger Baron, of FCB Chicago, suggested the group conduct research, of course.

That May 2002 meeting was the start of a study that is an important step, though a baby step, toward resolving the measurement challenges that face media buyers and planners on the Internet — or at least a step toward figuring out what differences need to be reconciled. This week, at the ARF’s annual convention, Wood and Baron discussed what they’d learned to date, after gathering a week’s worth of data on two campaigns.

“The key objective was to better understand online duplication [of audiences between sites],” said Wood. “This meant comparing the reach and frequency and duplication levels from the two different types of data suppliers: server-centric, Atlas [DMT] and DoubleClick; and user-centric, NetRatings and comScore.”

It’s the first time there’s been an attempt to create an apples-to-apples comparison between these sources — an effort that may help develop reach and frequency tools to migrate traditional media buyers online.

“The industry is crying out for measures of reach and frequency,” as Wood put it, “and this study is an attempt to better understand the possibilities and obstacles to creating a methodology for estimating reach and frequency based on combining the data available from server- and user-centric measurement services.”

One major reason for the discrepancy in duplication numbers, researchers found, was a difference in the way the players calculated site overlap. Some expressed the shared audience as a percentage of the people who saw Site A (or of those who saw Site B). Others divided the number of people in common by the number who saw either site, A or B. The latter method, say Baron and Wood, is the more generally accepted one.
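To see why the two calculation methods produce such different overlap figures, here is a minimal sketch with hypothetical audience counts (the numbers are illustrative only, not data from the study):

```python
# Hypothetical audience counts, for illustration only:
saw_a = 1000      # people who saw Site A
saw_b = 800       # people who saw Site B
saw_both = 100    # people who saw both sites

# Method 1: overlap as a share of one site's audience.
# Note the answer changes depending on which site you divide by.
overlap_of_a = saw_both / saw_a   # 10.0% of Site A's audience
overlap_of_b = saw_both / saw_b   # 12.5% of Site B's audience

# Method 2 (the more generally accepted one, per Baron and Wood):
# people in common divided by people who saw either site.
saw_either = saw_a + saw_b - saw_both   # 1,700 distinct people
overlap_either = saw_both / saw_either  # about 5.9%

print(f"{overlap_of_a:.1%}  {overlap_of_b:.1%}  {overlap_either:.1%}")
```

With the same underlying audiences, the "percentage of Site A" method reports roughly twice the overlap that the "either site" method does, which goes some way toward explaining answers that ranged from under 1 percent to over 10 percent.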

Comparing apples to apples remained a huge task. Researchers wanted to measure the reach and frequency of a particular campaign, not a Web site. This is something panel-based measurers aren’t equipped to do but that media planners may eventually want. So custom methodologies had to be developed to accomplish it.

Here’s what they found.

When viewing reach as a percentage of total impressions, numbers differed dramatically. For one campaign, a server-based measurer (identified as “Server 2” for anonymity) came up with 32 percent, while the panel-based folks said 50 percent (“User 1”) and 76 percent (“User 2”). No calculation differences here, but plenty of differences in the results.

On the duplication issue, things seemed a little more uniform — and more hopeful. When the measurers came up with figures for net reach as a percent of the sum of site reaches (a number that would indicate the amount of overlap between sites), the figures were within 6 percentage points of one another: 89 to 95 percent. So, there doesn’t appear to be much duplication between sites. Good news for publishers. Thankfully, there appears to be agreement on this issue.
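The logic behind that duplication indicator can be sketched with hypothetical figures (these are illustrative, not the study's data): if no one saw more than one site, net reach would equal the sum of the per-site reaches, so the ratio would be 100 percent; overlap pushes it below that.

```python
# Hypothetical per-site reaches (distinct people per site):
site_reaches = [500, 300, 200]

# Distinct people reached across all sites combined.
# Anyone counted on two sites is counted only once here.
net_reach = 900

# Net reach as a percent of the sum of site reaches:
ratio = net_reach / sum(site_reaches)  # 0.9, i.e. 90%

# 100% would mean zero duplication between sites; the 89 to 95
# percent range the measurers reported therefore implies
# relatively little audience overlap.
print(f"{ratio:.0%}")
```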

“This may be the result of having only one week of data for this very preliminary report,” said Baron. “It will be interesting to see if this holds true over the longer period of a campaign.”

A disturbing trend emerged when the researchers asked each measurer how many sites each campaign was served to. For one campaign, the server said the ads were served to 13 sites. The panel-data providers found activity on 37 and 53 sites, respectively. This could be related to naming conventions or other discrepancies that will be reconciled as the study continues.

Hopefully, more than that will be reconciled with this research. Although it’s very early, the mere fact such a Herculean task is being undertaken is a huge accomplishment. It won’t answer all media planners’ and buyers’ questions, but it’s definitely a start.

The researchers are seeking more advertisers willing to participate in the study, ones that are running large campaigns across a variety of Web sites. I encourage ClickZ readers to assist if you can. Contact Gabe Samuels through the ARF.

