Avoiding a Collision Between Twitter Analytics and SEO

Congratulations. You have your social media campaign up and running, and your material is being tweeted, retweeted, and otherwise consumed using nearly limitless other disenfranchised verbs. But are you sure that your setup isn’t cannibalizing your efforts on the SEO front? In today’s column, I discuss some by-products of your social media campaign’s analytics and how they can be detrimental to your site’s SEO.

The Analytics/URL Paradox

Once in a while, we run into a situation in which, if you’re not careful, two distinct components of your online media strategy can hurt each other. Such is the case for the URLs you use to track your Twitter campaign. Depending on your implementation, this could leave you with an awful paradox: The better your analytics setup, the worse off your site’s SEO might be, or the more airtight your SEO is, the harder it may be to track your social media campaign.

This is currently a giant jigsaw puzzle, and the pieces don’t all fit together precisely yet. I’ll use Twitter as the “social media” example here, although it’s certainly not the only micro-blogging service out there.

Building upon Danny Sullivan’s recent fantastic review, I’ll use Cli.gs as an example. This service is very good for tracking tweeted URLs — as long as you’re the person who tweeted them in the first place. For example, if you have 50 followers, and you wonder what percentage of your followers click through to the links you tweet, it gives you a pretty good idea of that. Watching the stats over time, you can possibly get a glimpse into whether your tweets are being retweeted, how fast your followers get around to reading your tweets, and so on.

But if you have a URL on your site whose “Twitter popularity” you want to measure, Cli.gs (or whatever shortener you use) is only a piece of the puzzle. If someone comes to your site and tweets that URL herself, there’s no guarantee that she’ll use Cli.gs, and even if she does, it will be a Cli.gs URL from her own Cli.gs account, not yours, so you won’t have access to the stats anyway.

A Smelly Example

Let’s look at a sample implementation done by one of my favorite sites, “The Onion.” This site uses traditional Google Analytics tracking codes when the staff tweets a URL. For example, this article about Stephen Colbert has the following tracking codes:


So with some certainty, “The Onion” can follow that URL’s sojourn around the Web no matter how it gets retweeted, what URL shortener is used, and so on.

But what happens if someone simply goes to TheOnion.com in her browser and sees this URL:


and decides to tweet that URL? That’s another piece of the puzzle. If she drops that URL into her Twitter client or into the Twitter Web interface, “The Onion” is at the mercy of whatever shortener is the default for that application. Later, if someone clicks the link from (for example) Twitterrific or any other Twitter client except for the Web interface, it’s very likely to show up as a visit with no referrer.

So what is “The Onion” left with? Not only does it not know how the user came to its site, the user doesn’t land on a Twitter-based URL for measurement. That results in a big gap in reporting; the visit is lumped in with other “no referrer” visits, the black hole of analytics.
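To make the gap concrete, here is a minimal Python sketch of how a reporting tool might bucket a visit. It is an illustration only: the function name, the bucket labels, and the example.com URLs are my own assumptions, not how Google Analytics actually classifies traffic. The only real-world detail it relies on is the `utm_source` query parameter used by Google Analytics campaign tagging.

```python
from urllib.parse import parse_qs, urlsplit


def classify_visit(landing_url, referrer=""):
    """Bucket a visit using only the landing URL and the referrer header.

    Hypothetical helper for illustration; real analytics packages
    apply many more rules than this.
    """
    query = parse_qs(urlsplit(landing_url).query)
    # A tagged URL identifies its campaign even when the referrer is lost.
    if "utm_source" in query:
        return "campaign:" + query["utm_source"][0]
    # An untagged URL can still be attributed if the referrer survived.
    if referrer:
        return "referral:" + urlsplit(referrer).netloc
    # No tag and no referrer: the black hole.
    return "direct"
```

A click from a desktop Twitter client on an untagged URL arrives with an empty referrer and falls through to the last branch, while the same click on a tagged URL is still attributed to its campaign.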

One idea that might help mitigate the issue is to front-load your social media sharing buttons with tracking URLs instead of just the URL the visitor happens to be on. The easier you make it for visitors to share your preferred version of the URL, the easier it will be to track success at the granular level you want.
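As a rough sketch of that idea, the Python below builds the URL a share button could hand to Twitter. The `utm_source`, `utm_medium`, and `utm_campaign` parameter names are the standard Google Analytics campaign tags; the function name, the default values, and the example.com URL are assumptions chosen purely for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_for_sharing(url, source="twitter", medium="social",
                    campaign="share-button"):
    """Return `url` with Google Analytics utm_* parameters appended.

    Hypothetical helper; the default values are placeholders.
    """
    parts = urlsplit(url)
    # Keep existing query parameters, but drop any old utm_* tags
    # so a re-shared URL does not carry two sets of campaign codes.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    params += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path,
         urlencode(params), parts.fragment)
    )
```

The share button would then link to the output of `tag_for_sharing(...)` rather than to whatever is in the visitor's address bar, so the tagged version is the one most likely to get passed around.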

Duplication From Social Media URLs

Configuring its buttons in this way, “The Onion” avoids some of the analytics issues I discussed above. But switching gears from tracking to duplication, I again use “The Onion” for an example. An older article, this time about Detroit being sold for scrap, has the following versions indexed (some URLs at Yahoo and some at Google).





For analytics? Perfect. For SEO, however, it’s far from perfect. Is this level of duplication the price we pay for tracking all the ways we need to share content? Possibly. But let’s examine some ways to keep it from becoming a problem. In theory, you could simply disallow all non-canonical URLs with a few lines in your robots.txt file:

    User-agent: *
    Disallow: /*/print/
    Disallow: /*?utm

But I recommend against it, because those URLs are probably indexed due to getting passed around and accruing some links. So why throw away that link equity? There’s a smarter way to avoid duplication and still retain the link popularity that the non-canonical URLs are gathering.

You probably know where I’m going with this. This case study from “The Onion” is a real-world example of where the canonical link element would be a perfect fit.

The following line, added to the <head> section of each of The Onion’s URLs discussed above, would allow the site to track social media visits to its heart’s content while ensuring that all link equity is accounted for:
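One way a site might generate such a line is sketched below in Python. This is an assumption-laden illustration, not a description of how “The Onion” builds its pages: the function names and the example.com URL are hypothetical, and the sketch only strips `utm_*` tracking parameters. Mapping a print version back to its article URL is site-specific and is not handled here.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Strip utm_* tracking parameters and any fragment from `url`."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_element(url):
    """Build the canonical link element for the page's head section."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


print(canonical_link_element(
    "http://www.example.com/articles/sample-story"
    "?utm_source=twitter&utm_medium=social"
))
```

Every tracked variant of the page would emit the same element, pointing the engines at the one untagged URL while the tagged versions keep reporting to analytics.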


Engines don’t offer solutions to problems that they don’t believe are problems. That’s why duplication created by social media links is worth addressing.
