Site Extension: Big Data in Practice

January 23, 2012

What questions can big data answer, and how can those insights be applied to digital media buying?

"Big data" promises a leap forward in online advertising sophistication, but many media buyers are struggling to bridge the gap from nice-to-know analytics to actionable insights. The claim is that massive volumes of semi-structured data coupled with highly scalable computing platforms enable advertisers to answer questions that were previously prohibitively complex, and then leverage those insights to refine media buying strategies. At a macro level, this line of logic appears sound, but putting theory into practice has many advertisers scratching their heads. What specific questions can big data answer, and how can those insights be applied to digital media buying?

One powerful application of big data analytics is the development of custom content channels to meet specific advertising goals. Imagine an advertiser who has found that impressions served on one particular motorcycle enthusiast site are very effective in meeting campaign objectives (high response rate, high brand recall). The advertiser wants to allocate additional budget to that site and would even be willing to pay a premium rate for incremental impressions. But at a certain point, the site won't be able to offer any additional scale. Where else should this advertiser buy ad space, and what inventory is most likely to perform well?

The traditional approach is to manually construct a custom channel of sites that are similar to the high-performing site by scouring the comScore 500 for sites that have motorcycle-related content. Media buyers may also use tools like Nielsen's @Plan to identify sites in related verticals like action sports and autos. These manual efforts are time-intensive and imprecise, and they inevitably fail to identify some high-quality sites. Additionally, deploying manually constructed custom channels typically requires multiple iterations of testing and refinement to achieve acceptable performance. Big data analytics can provide a much more robust solution.

"Site extension" is the big data approach to this problem - a method of precisely and efficiently reaching target audiences across online display. At the most basic level, the concept of site extension is to identify a custom network of sites whose visitor populations are very similar. If a campaign is successful on one site in the network, that performance can then be extended across the other similar sites. The solution requires two key ingredients - a massive data warehouse containing information on billions of monthly impressions and a flexible analytics platform that enables event-level reporting.

Here's how this works for the advertiser's top-performing site. The site extension process first defines a "seed group" of users who frequently visit that site. The process then iterates through every other site that the seed group visits (hundreds of thousands of them) and assigns a quality score to each, based on how effectively that site attracts the seed group. The results fall into three categories:

  • Some sites very rarely attract users from the seed group.
  • Other sites attract a large portion of the seed group, but also attract many users who aren't in the seed group.
  • A few sites emerge whose audience overlaps heavily with the seed group: they attract a large portion of the seed group, and those users represent a large percentage of the site's total visitor population. These sites belong to the network.
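
The scoring step can be sketched in a few lines of Python. This is an illustration only, not the platform's actual method: the function name `score_sites`, the `min_seed_visits` cutoff for "frequent" visitors, and the choice of quality score (the share of the seed group a site reaches multiplied by the share of the site's visitors who are in the seed group) are all assumptions made for the example.

```python
from collections import defaultdict


def score_sites(visits, seed_site, min_seed_visits=2):
    """Score every other site by how well it attracts the seed group.

    visits: iterable of (user_id, site) pairs, one pair per impression.
    min_seed_visits: assumed cutoff for a "frequent" visitor of seed_site.
    """
    # Count how often each user visits each site.
    counts = defaultdict(lambda: defaultdict(int))
    for user, site in visits:
        counts[user][site] += 1

    # Seed group: users who visit the seed site frequently.
    seed = {u for u, sites in counts.items()
            if sites.get(seed_site, 0) >= min_seed_visits}

    # Unique visitor population of every site.
    audience = defaultdict(set)
    for user, sites in counts.items():
        for site in sites:
            audience[site].add(user)

    scores = {}
    for site, users in audience.items():
        overlap = len(users & seed)
        if site == seed_site or overlap == 0:
            continue  # only score other sites that the seed group visits
        reach = overlap / len(seed)            # portion of the seed group attracted
        concentration = overlap / len(users)   # seed share of the site's visitors
        # Assumed quality score: high only when both ratios are high.
        scores[site] = {"reach": reach,
                        "concentration": concentration,
                        "quality": reach * concentration}
    return scores


# Hypothetical impression log with placeholder site names.
example_visits = [
    ("u1", "seed"), ("u1", "seed"), ("u1", "niche"), ("u1", "portal"),
    ("u2", "seed"), ("u2", "seed"), ("u2", "niche"), ("u2", "portal"),
    ("u3", "portal"), ("u4", "portal"), ("u5", "portal"), ("u6", "portal"),
    ("u1", "rare"), ("u7", "rare"), ("u8", "rare"), ("u9", "rare"),
]
scores = score_sites(example_visits, "seed")
```

In this toy log, "rare" lands in the first category (low reach), "portal" in the second (full reach but a diluted audience), and "niche" in the third (high on both measures). A production system would run the same logic over billions of event-level records rather than an in-memory list.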


In some cases, these overlapping sites are obvious and would likely have been identified by a traditional manual approach. However, many sites identified by the site extension process would likely be missed by a manual approach, either because the site is small or because it is not obviously correlated with motorcycle enthusiasts. In addition to improved completeness, the site extension approach has a precision advantage. Media buyers who take a manual approach to building content channels often struggle to understand which sites in the network perform best and which perform worst. The site extension process assigns a quality score to each site based on how effectively it attracts the seed group. The advertiser can make informed upfront decisions about which sites should be included in the network, and select only those sites with a high quality score.
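
Selecting the network from the scored sites amounts to a filter and a sort. The sketch below assumes quality scores on a 0-to-1 scale and an arbitrary cutoff of 0.5; the function name `build_network`, the site names, and the score values are hypothetical, chosen only to show the mechanics.

```python
def build_network(quality_scores, min_quality=0.5):
    """Keep sites at or above the cutoff, best first.

    quality_scores: dict mapping site name -> quality score (assumed 0 to 1).
    min_quality: assumed threshold; an advertiser would tune this to balance
    scale against audience precision.
    """
    selected = [(site, q) for site, q in quality_scores.items()
                if q >= min_quality]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for four placeholder sites.
example_scores = {"site_a": 0.82, "site_b": 0.15, "site_c": 0.64, "site_d": 0.05}
network = build_network(example_scores, min_quality=0.5)
print(network)  # [('site_a', 0.82), ('site_c', 0.64)]
```

Lowering the cutoff adds scale at the cost of a less concentrated audience, which is the trade-off the advertiser weighs upfront.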

Despite the massive computational complexity of the site extension approach, sophisticated analytics platforms can complete the entire job in minutes. The output is a rigorously defined custom channel that can be immediately applied to exchange-based campaigns. Site extension insights can also inform broader media buying decisions by prioritizing potential upfront deals and assessing appropriate pricing.

Site extension is just one of many applications of big data analytics currently being developed in the advertising space. Expect many more similar tools to emerge in the coming months. Ad tech companies are constantly fielding questions from advertisers that would have been unanswerable 12 months ago. Keep the questions coming, and we'll keep finding new ways to leverage big data to answer them.



Mael Bredeche

Mael Bredeche is a campaign strategist at Turn, where he is responsible for designing and optimizing online advertising campaigns. Mael works with leading agencies and brands to implement campaign strategies that improve scale and performance. He focuses on the development and implementation of optimization strategies and analytical techniques.

Prior to joining Turn, Mael worked in management consulting. Mael holds a bachelor of engineering degree from Columbia University.


Chris Kane

Chris Kane is an account strategist at Turn, where he is responsible for designing and managing display campaigns on behalf of online advertisers. In this role, Chris works with leading advertisers and agencies to develop custom techniques for accessing biddable display inventory. He specializes in bidding logic, audience development, and frequency management.

Prior to joining Turn, Chris worked in the media and technology practice at Oliver Wyman, a global consulting firm. At Oliver Wyman, Chris worked with content owners and distributors in the U.S. and Europe to launch new digital business models. Chris holds bachelor of engineering and master of engineering management degrees from Dartmouth College.
