Site Extension: Big Data in Practice

Mael Bredeche, Chris Kane   |  January 23, 2012

What questions can big data answer, and how can those insights be applied to digital media buying?

"Big data" promises a leap forward in online advertising sophistication, but many media buyers are struggling to bridge the gap from nice-to-know analytics to actionable insights. The claim is that massive volumes of semi-structured data coupled with highly scalable computing platforms enable advertisers to answer questions that were previously prohibitively complex, and then leverage those insights to refine media buying strategies. At a macro level, this line of logic appears sound, but putting theory into practice has many advertisers scratching their heads. What specific questions can big data answer, and how can those insights be applied to digital media buying?

One powerful application of big data analytics is the development of custom content channels to meet specific advertising goals. Imagine an advertiser who has found that impressions served on a particular motorcycle enthusiast site are very effective in meeting campaign objectives (high response rate, high brand recall). The advertiser wants to allocate additional budget to that site and would even be willing to pay a premium rate for incremental impressions. But at a certain point, the site won't be able to offer any additional scale. Where else should this advertiser buy ad space, and what inventory is most likely to perform well?

The traditional approach is to manually construct a custom channel of similar sites by scouring the comScore 500 for sites that have motorcycle-related content. Media buyers may also use tools like Nielsen's @Plan to identify sites in related verticals like action sports and autos. These manual efforts are time-intensive and imprecise, and they inevitably fail to identify some high-quality sites. Additionally, deploying manually constructed custom channels typically requires multiple iterations of testing and refinement to achieve acceptable performance. Big data analytics can provide a much more robust solution.

"Site extension" is the big data approach to this problem - a method of precisely and efficiently reaching target audiences across online display. At the most basic level, the concept of site extension is to identify a custom network of sites whose visitor populations are very similar. If a campaign is successful on one site in the network, that performance can then be extended across the other similar sites. The solution requires two key ingredients - a massive data warehouse containing information on billions of monthly impressions and a flexible analytics platform that enables event-level reporting.

Here's how this works for the motorcycle site. The site extension process first defines a "seed group" of users who frequently visit the site. The process then iterates through every other site that the seed group visits (hundreds of thousands of them) and assigns a quality score to each, based on how effectively that site attracts the seed group. The results fall into three categories:

  • Some sites very rarely attract users from the seed group.
  • Other sites attract a large portion of the seed group, but also attract many users who aren't in the seed group.
  • Some sites emerge whose audience overlaps heavily with the seed group: they attract a large portion of the seed group, and those users represent a large percentage of the total visitor population. These sites belong to the network.
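
The scoring step described above can be sketched in Python. This is only an illustration under stated assumptions: the article does not give the quality score formula, so the sketch assumes a harmonic mean of two ratios, "reach" (the share of the seed group that visits a site) and "composition" (the share of that site's visitors who are in the seed group). The function name, the `min_seed_visits` threshold, and the `.example` domains are all hypothetical.

```python
from collections import Counter, defaultdict


def site_extension_scores(events, seed_site, min_seed_visits=2):
    """Score each non-seed site by how well it attracts the seed group.

    events: iterable of (user_id, site) pairs, one per impression.
    The scoring formula is an assumption, not the authors' method.
    """
    events = list(events)

    # Seed group: users who visit the seed site "frequently" (assumed threshold).
    seed_visits = Counter(user for user, site in events if site == seed_site)
    seed_group = {u for u, n in seed_visits.items() if n >= min_seed_visits}

    # Unique visitors of every other site.
    visitors = defaultdict(set)
    for user, site in events:
        if site != seed_site:
            visitors[site].add(user)

    scores = {}
    for site, users in visitors.items():
        overlap = len(users & seed_group)
        if overlap == 0:
            continue  # only sites the seed group actually visits are scored
        reach = overlap / len(seed_group)   # portion of the seed group attracted
        composition = overlap / len(users)  # seed users as share of site audience
        quality = 2 * reach * composition / (reach + composition)
        scores[site] = {"reach": reach, "composition": composition, "quality": quality}
    return scores


# Toy impression log covering the three categories in the list above.
EVENTS = (
    [(u, "seed.example") for u in "abcd" for _ in range(2)]
    + [("e", "seed.example")]  # one visit only, so not in the seed group
    + [(u, "rare.example") for u in ["a", "x1", "x2", "x3"]]
    + [(u, "portal.example") for u in list("abcd") + [f"x{i}" for i in range(1, 13)]]
    + [(u, "niche.example") for u in ["a", "b", "c", "x1"]]
)

if __name__ == "__main__":
    results = site_extension_scores(EVENTS, "seed.example")
    for site, s in sorted(results.items(), key=lambda kv: -kv[1]["quality"]):
        print(site, round(s["reach"], 2), round(s["composition"], 2), round(s["quality"], 2))
```

In this toy data the portal-style site reaches the whole seed group but scores low because seed users are a small share of its audience, while the niche site scores highest because both ratios are high. A production system would run the same set arithmetic over billions of impressions on a distributed platform rather than in memory.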


In some cases, these overlapping sites are obvious and would likely have been identified by a traditional manual approach. However, many sites identified by the site extension process would likely be missed by a manual approach, either because the site is small or because it is not obviously correlated with motorcycle enthusiasts. In addition to improved completeness, the site extension approach has a precision advantage. Media buyers who take a manual approach to building content channels often struggle to understand which sites in the network perform best and which perform worst. The site extension process assigns a quality score to each site based on how effectively it attracts the seed group. The advertiser can make informed upfront decisions about which sites should be included in the network, and select only those sites with a high quality score.
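
Selecting only the high-scoring sites might look like the following minimal sketch. It assumes each site already has a quality score between 0 and 1. The function name, the 0.5 cutoff, the optional cap on channel size, and the sample scores are hypothetical choices for illustration, not values from the article.

```python
def build_channel(quality_by_site, min_quality=0.5, max_sites=None):
    """Return (site, score) pairs at or above min_quality, best first.

    quality_by_site: dict mapping site name to its quality score.
    max_sites: optional cap on how many sites the channel may contain.
    """
    ranked = sorted(quality_by_site.items(), key=lambda item: item[1], reverse=True)
    selected = [(site, score) for site, score in ranked if score >= min_quality]
    return selected if max_sites is None else selected[:max_sites]


# Made-up scores for illustration only.
SAMPLE_SCORES = {
    "niche.example": 0.75,
    "small-forum.example": 0.62,
    "portal.example": 0.40,
    "rare.example": 0.25,
}

if __name__ == "__main__":
    for site, score in build_channel(SAMPLE_SCORES):
        print(site, score)
```

Because every site carries a score, the cutoff becomes an explicit dial: raising it trades scale for precision, and lowering it does the reverse.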

In spite of the massive computational complexity of the site extension approach, sophisticated analytics platforms can complete the total job in minutes. The output is a rigorously defined custom channel that can be immediately applied to exchange-based campaigns. Site extension insights can also inform broader media buying decisions by prioritizing potential upfront deals and assessing appropriate pricing.

Site extension is just one of many applications of big data analytics currently being developed in the advertising space. Expect many more similar tools to emerge in the coming months. Ad tech companies are constantly fielding questions from advertisers that would have been unanswerable 12 months ago. Keep the questions coming, and we'll keep finding new ways to leverage big data to answer them.




Mael Bredeche

Mael Bredeche is a campaign strategist at Turn where he is responsible for designing and optimizing online advertising campaigns. Mael works with leading agencies and brands in implementing campaign strategy to improve scale and performance. He focuses on the development and implementation of optimization strategies and analytical techniques.

Prior to joining Turn, Mael worked in management consulting. Mael holds a bachelor of engineering degree from Columbia University.


Chris Kane

Chris Kane is an account strategist at Turn where he is responsible for designing and managing display campaigns on behalf of online advertisers. In this role, Chris works with leading advertisers and agencies to develop custom techniques for accessing biddable display inventory. He specializes in bidding logic, audience development, and frequency management.

Prior to joining Turn, Chris worked in the media and technology practice at Oliver Wyman, a global consulting firm. At Oliver Wyman, Chris worked with content owners and distributors in the U.S. and Europe to launch new digital business models. Chris holds bachelor of engineering and master of engineering management degrees from Dartmouth College.
