Site Extension: Big Data in Practice

Mael Bredeche, Chris Kane  |  January 23, 2012

What questions can big data answer, and how can those insights be applied to digital media buying?

"Big data" promises a leap forward in online advertising sophistication, but many media buyers are struggling to bridge the gap from nice-to-know analytics to actionable insights. The claim is that massive volumes of semi-structured data coupled with highly scalable computing platforms enable advertisers to answer questions that were previously prohibitively complex, and then leverage those insights to refine media buying strategies. At a macro level, this line of logic appears sound, but putting theory into practice has many advertisers scratching their heads. What specific questions can big data answer, and how can those insights be applied to digital media buying?

One powerful application of big data analytics is the development of custom content channels to meet specific advertising goals. Imagine an advertiser who has found that impressions served on a particular motorcycle-enthusiast site are very effective in meeting campaign objectives (high response rate, high brand recall). The advertiser wants to allocate additional budget to that site and would even be willing to pay a premium rate for incremental impressions. But at a certain point, the site won't be able to offer any additional scale. Where else should this advertiser buy ad space, and what inventory is most likely to perform well?

The traditional approach is to manually construct a custom channel of sites that are similar to the original site by scouring the comScore 500 for sites that have motorcycle-related content. Media buyers may also use tools like Nielsen's @Plan to identify sites in related verticals like action sports and autos. These manual efforts are time-intensive and imprecise, and inevitably fail to identify some high-quality sites. Additionally, deploying manually constructed custom channels typically requires multiple iterations of testing and refinement to achieve acceptable performance. Big data analytics can provide a much more robust solution.

"Site extension" is the big data approach to this problem - a method of precisely and efficiently reaching target audiences across online display. At the most basic level, the concept of site extension is to identify a custom network of sites whose visitor populations are very similar. If a campaign is successful on one site in the network, that performance can then be extended across the other similar sites. The solution requires two key ingredients - a massive data warehouse containing information on billions of monthly impressions and a flexible analytics platform that enables event-level reporting.

Here's how this works in the example above. The site extension process first defines a "seed group" of users who frequently visit the original site. The process then iterates through every other site that the seed group visits (hundreds of thousands of them) and assigns a quality score to each, based on how effectively that site attracts the seed group. The results fall into three categories:

  • Some sites very rarely attract users from the seed group.
  • Other sites attract a large portion of the seed group, but also attract many users who aren't in the seed group.
  • A third set of sites attracts an audience that overlaps heavily with the seed group: they attract a large portion of the seed group, and those users represent a large percentage of the total visitor population. These sites belong to the network.
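
The scoring pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's actual method: the article does not give a scoring formula, so the sketch assumes the quality score is the harmonic mean of "reach" (the share of the seed group that visits a site) and "composition" (the share of a site's visitors who are in the seed group). The function names, the `min_visits` threshold, and the `*.example` sites and user IDs are all hypothetical.

```python
from collections import defaultdict


def build_seed_group(events, seed_site, min_visits=3):
    """Return users with at least `min_visits` impressions on `seed_site`.

    `events` is an iterable of (user_id, site) pairs, one per impression.
    The `min_visits` cutoff is an assumed definition of "frequent visitor".
    """
    counts = defaultdict(int)
    for user, site in events:
        if site == seed_site:
            counts[user] += 1
    return {user for user, n in counts.items() if n >= min_visits}


def score_sites(events, seed_site, min_visits=3):
    """Score every other site the seed group visits.

    Assumed formula: harmonic mean of reach and composition, so a site
    scores well only when both are high.
    """
    seed = build_seed_group(events, seed_site, min_visits)

    # Collect the distinct visitors of every site other than the seed site.
    visitors = defaultdict(set)
    for user, site in events:
        if site != seed_site:
            visitors[site].add(user)

    scores = {}
    for site, users in visitors.items():
        overlap = len(users & seed)
        if overlap == 0:
            continue  # only sites visited by the seed group are scored
        reach = overlap / len(seed)         # share of seed group on this site
        composition = overlap / len(users)  # share of this site's audience in seed
        quality = 2 * reach * composition / (reach + composition)
        scores[site] = {"reach": reach, "composition": composition, "quality": quality}
    return scores


# Hypothetical impression log: u1 and u2 are frequent seed-site visitors.
events = [
    ("u1", "seed.example"), ("u1", "seed.example"), ("u1", "seed.example"),
    ("u2", "seed.example"), ("u2", "seed.example"), ("u2", "seed.example"),
    ("u3", "seed.example"),  # a single visit, so not in the seed group
    ("u1", "niche.example"), ("u2", "niche.example"),
    ("u1", "portal.example"), ("u2", "portal.example"),
    ("u3", "portal.example"), ("u4", "portal.example"),
    ("u5", "portal.example"), ("u6", "portal.example"),
    ("u7", "portal.example"), ("u8", "portal.example"),
    ("u1", "rare.example"), ("u4", "rare.example"),
    ("u5", "rare.example"), ("u6", "rare.example"),
]

scores = score_sites(events, "seed.example")
for site, s in sorted(scores.items(), key=lambda kv: kv[1]["quality"], reverse=True):
    print(site, round(s["reach"], 2), round(s["composition"], 2), round(s["quality"], 2))
```

In this toy data, `niche.example` plays the role of the third category (high reach and high composition), `portal.example` the second (high reach, diluted audience), and `rare.example` the first (low reach). At the scale the article describes, the same logic would run as a distributed job over event-level data rather than in-memory Python sets.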


In some cases, these overlapping sites are obvious and would likely have been identified by a traditional manual approach. However, many sites identified by the site extension process would likely be missed by a manual approach, either because the site is small or because it is not obviously correlated with motorcycle enthusiasts. In addition to improved completeness, the site extension approach has a precision advantage. Media buyers who take a manual approach to building content channels often struggle to understand which sites in the network perform best and which perform worst. The site extension process assigns a quality score to each site based on how effectively it attracts the seed group. The advertiser can make informed upfront decisions about which sites should be included in the network, and select only those sites with a high quality score.
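
As a rough illustration of that last step, the sketch below keeps only sites whose quality score clears a cutoff and returns them ranked best first. The scores, the site names, and the 0.5 threshold are hypothetical; the article does not say what counts as a "high" score, so the cutoff would in practice be tuned per campaign.

```python
def select_network(quality_by_site, min_quality=0.5):
    """Return sites scoring at least `min_quality`, highest score first.

    `quality_by_site` maps a site name to its quality score (assumed here
    to lie between 0 and 1). The default cutoff is an arbitrary example.
    """
    ranked = sorted(quality_by_site.items(), key=lambda kv: kv[1], reverse=True)
    return [site for site, quality in ranked if quality >= min_quality]


# Hypothetical scores for three candidate sites.
candidate_scores = {
    "small-forum.example": 0.9,
    "big-portal.example": 0.2,
    "gear-reviews.example": 0.6,
}

network = select_network(candidate_scores, min_quality=0.5)
print(network)
```

Because every site carries a score, the buyer can raise the cutoff for a tighter, higher-precision channel or lower it when the campaign needs more scale.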

Despite the massive computational complexity of the site extension approach, sophisticated analytics platforms can complete the entire job in minutes. The output is a rigorously defined custom channel that can be immediately applied to exchange-based campaigns. Site extension insights can also inform broader media buying decisions by prioritizing potential upfront deals and assessing appropriate pricing.

Site extension is just one of many applications of big data analytics currently being developed in the advertising space. Expect many more similar tools to emerge in the coming months. Ad tech companies are constantly fielding questions from advertisers that would have been unanswerable 12 months ago. Keep the questions coming, and we'll keep finding new ways to leverage big data to answer them.




Mael Bredeche

Mael Bredeche is a campaign strategist at Turn where he is responsible for designing and optimizing online advertising campaigns. Mael works with leading agencies and brands in implementing campaign strategy to improve scale and performance. He focuses on the development and implementation of optimization strategies and analytical techniques.

Prior to joining Turn, Mael worked in management consulting. Mael holds a bachelor of engineering degree from Columbia University.


Chris Kane

Chris Kane is an account strategist at Turn where he is responsible for designing and managing display campaigns on behalf of online advertisers. In this role, Chris works with leading advertisers and agencies to develop custom techniques for accessing biddable display inventory. He specializes in bidding logic, audience development, and frequency management.

Prior to joining Turn, Chris worked in the media and technology practice at Oliver Wyman, a global consulting firm. At Oliver Wyman, Chris worked with content owners and distributors in the U.S. and Europe to launch new digital business models. Chris holds bachelor of engineering and master of engineering management degrees from Dartmouth College.
