Understanding the New Canonical Link Element

Google, Yahoo, and MSN/Live announce joint support for a new link element. Learn when, why, and how you should use it.

English majors are familiar with the term “literary canon,” which describes a group of works accepted as particularly influential within a particular genre or theme. Search engineers, similarly devoted to elegant syntax in their own discipline, use “canon” to describe the “main” or “correct” URL to use when faced with multiple variants of that URL.

The big news coming out of SMX West last week was that the three major search engines (Google, Yahoo, and MSN/Live) announced joint support for a new link element that enables engines to reduce their guesswork when it comes to identifying your site’s canonical URLs. Today’s column runs through the basics of the element: what it is, and when, why, and how you should use it.

What Is the Canonical Element?

In short, the canonical element is a line of code that you add to pages that may be duplicates. In this code, you designate the “canonical,” or “proper,” URL. Engines, in turn, note this URL and apply link popularity and authority to the canonical version instead of applying them to duplicate URLs. In theory, this will consolidate the authority and link popularity into a single URL, as opposed to splintering them among several similar URLs.

The basic message you’re sending to engines, as put by Google in its support documents, is that “of all these pages with identical content, this page is the most useful. Please prioritize it in search results.”

When Should You Use the Canonical Element?

Over the last couple of years, I’ve discussed several types of duplicate content. Following is a sample list. Each of these (with the possible exception of the “pagination” usage) is a suitable candidate for the new rel="canonical" element:

www vs. non-www. Go to any URL on your site. Does it work either with or without the “www” subdomain affixed to the URL? If so, this applies to you.

Secure vs. unsecure. Similarly, when a URL works with either http or https in the address field, it’s a duplicate.

Affiliate/vendor tracking. Some sites have affiliate relationships that use tracking codes in URLs. For example, you might want your affiliates to drive traffic to your /products/ page, but if your site assigns a unique code to each affiliate, you may end up receiving traffic at the following pages, all of which show the same content:

  • http://www.yoursite.com/products/?affiliate=001
  • http://www.yoursite.com/products/?affiliate=127
  • http://www.yoursite.com/products/?affiliate=663

Load balancing. Some sites with heavy traffic balance server loads by diverting traffic to servers such as www2, www3, and so on, which often leads to heavy indexing of non-www variants.

/ vs. /default.aspx. This is similar to www and non-www. Many platforms resolve a page both at the root level (or at the folder level) and at an associated filename.

Navigation-based tracking. The default setup of some platforms can show the same URL in several different formats, based upon the internal route used to get to the page. For example, you might have a page called http://www.yoursite.com/products/, but if you navigate to that page from the side navigation bar, your content management system (CMS) might produce a URL such as http://www.yoursite.com/products/?from=sidenav.

Pagination. Usage of the canonical element on paginated articles (those broken into multiple chunks, such as /article.aspx, /article.aspx?p=2, /article.aspx?p=3, and so on) isn’t for everyone and should be carefully considered case by case. But if you prefer that users enter your articles only at the first page and you don’t particularly mind if the second or third pages of your articles don’t rank at engines, consider the canonical element for this purpose.

How Do You Use the Canonical Element?

The syntax of the new canonical element is quite simple. Between the <head> and </head> tags on your page (the same place you put your title element and meta data), place this line:

<link rel="canonical" href="http://www.yoursite.com/correct-url/" />

The URL in the line above, of course, is a placeholder for the actual canonical URL that you’ll put in your file.

As an example, let’s assume you have duplicate content issues that result in all of the following URLs resolving correctly, and that the first URL in this list is the “correct” one:

  • http://www.yoursite.com/correct-url/
  • http://www.yoursite.com/correct-url/default.aspx
  • http://yoursite.com/correct-url/
  • https://www.yoursite.com/correct-url/
  • http://www.yoursite.com/correct-url/?aff_src=127
  • http://www.yoursite.com/correct-url/?from=sidenav
  • and so on.

To apply the canonical element, just add the following line to the <head> section of each of these pages:

<link rel="canonical" href="http://www.yoursite.com/correct-url/" />
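
If your pages are generated by a script or CMS, the same mapping can be computed rather than typed by hand. What follows is only a minimal sketch in Python, not a feature of any particular platform. The tracking parameter names, the default document name, and the preferred scheme and host are assumptions taken from the example URLs above, and the function names are hypothetical.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for illustration only: adjust these to match your own site.
TRACKING_PARAMS = {"affiliate", "aff_src", "from"}  # parameters that never change content
DEFAULT_DOCUMENTS = ("default.aspx",)               # filenames equivalent to the folder URL
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.yoursite.com"


def canonical_url(url):
    """Collapse a duplicate variant of a URL onto its canonical form."""
    parts = urlsplit(url)
    path = parts.path or "/"

    # /correct-url/default.aspx becomes /correct-url/
    for name in DEFAULT_DOCUMENTS:
        if path.lower().endswith("/" + name):
            path = path[: -len(name)]

    # Drop tracking-only parameters; keep any parameter that changes content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]

    # Force the preferred scheme and host, and discard any fragment.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, urlencode(kept), ""))


def canonical_link_element(url):
    """Build the line to place in the <head> section of the page."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href
```

Under these assumptions, every variant in the list above yields the same element, so a template can call canonical_link_element() with whatever URL was requested. Note that this sketch would not suit the pagination case, where a parameter such as p does change the content and the decision is a judgment call.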

You may have these related questions:

How Do I Isolate All the Duplicate Variants of My Canonical URL in the First Place?

In theory, you shouldn’t need to know how many duplicate variants you have before implementing this element. If you automate this process, just test and confirm that your CMS is hard-coding the canonical URL into the <head> section. Every duplicate variant of that URL should then carry the same canonical URL in its <head>.
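
One way to do that testing is to read the canonical URL back out of the HTML your CMS produces. The following is a rough sketch using Python's standard html.parser module. The class and function names are hypothetical, and for simplicity it looks at every link element in the page rather than only those inside <head>.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_canonicals(html_text):
    """Return all canonical URLs declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.hrefs
```

To use it, fetch each known variant of a page with whatever tool you prefer, pass the HTML source to find_canonicals(), and confirm that every variant returns exactly one URL and that it is the same one. An empty list means the element is missing from that variant.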

How Will This Change the User Experience?

It won’t affect users at all. This element is completely invisible to the user, regardless of browser, regardless of script-enabled status, regardless of everything. There is no redirection, as far as the user is concerned. Any comparisons to a 301 redirect that you read about in relation to this element are strictly symbolic.

In addition, this element will not affect analytics. All page views, page landings, and user actions will still be counted as they normally have been.

Conclusion

Historically, when the big three engines gang up and announce a coordinated effort, it’s been wise to pay attention, because the results can benefit sites of all sizes. The Sitemaps.org project is one example. The agreement on advanced Robots Exclusion Protocol is another. Plenty of people will be watching this development very closely to see how it acts in the real world (and building tools to help implement it), so stay tuned for further information.
