Canonicalization Made Simple

February 14, 2007

The road to short and sweet search URLs.

Technically speaking, canonicalization is "the process of converting data that has more than one possible representation into a 'standardized' canonical representation."

Search engine algorithms include a mathematical equation that compares different representations for similarity, counting the number of distinct data structures, to impose a meaningful, canonical sorting order.

That makes sense... right? Maybe for software engineers, computer programmers, math majors, and the like. But let's make this a bit simpler.

Plainly speaking, search engines like Google use a canonicalization process to present users with short and sweet URLs. Think about this for a moment and consider which URL the average user would most likely click on when presented with these choices:




If you believe Google's canonical preference would be, even when all three URLs arrive at the same destination, you can proudly say you understand the fundamentals of canonicalization.
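For readers who do want to see the mechanics, here is a minimal sketch of the idea in Python. It is not how any search engine actually does it. The host `example.com`, the list of default document names, and the function name `canonicalize` are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical default document names that usually point at the same page
# as the bare directory URL.
DEFAULT_DOCS = ("index.html", "index.htm", "default.asp")


def canonicalize(url, preferred_host="www.example.com"):
    """Collapse a few common URL variants onto one preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()

    # Treat the bare domain and the www host as the same site (assumption).
    if host in ("example.com", "www.example.com"):
        host = preferred_host

    # Drop a trailing default document such as /index.html.
    path = parts.path
    for doc in DEFAULT_DOCS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
            break

    # An empty path and "/" name the same resource.
    if path == "":
        path = "/"

    # Rebuild the URL without a fragment.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


variants = [
    "http://example.com",
    "http://WWW.Example.com/index.html",
    "http://www.example.com/default.asp",
]
print({canonicalize(v) for v in variants})
```

All three variants collapse to a single short URL, which is the "short and sweet" outcome described above. A real search engine weighs far more signals than these string rules.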

Let's look more closely at the major search engines' canonical preferences to see what other factors decide which URL is presented in search query results.

For the sake of discussion, let's complete a search for "milwaukee brewers" in Google, Yahoo, and MSN to compare the results.

Google offers the following top results:

The Official Site of The Milwaukee Brewers: Homepage
Features scores, game schedules, roster, news, history and forums. - 78k - Cached - Similar pages
Schedule : 2007 Brewers Schedule -
Active Roster -
Ticket Center -
Help : Job Opportunities -
More results from »

Yahoo offers the following top result:

Milwaukee Brewers
Official site of the Milwaukee Brewers. Features up-to-date stats and results, player bios, minor league information, ticket and merchandise ordering info, player ...
Category: Major League Baseball > Milwaukee Brewers
www. - 79k - Cached - More from this site

And MSN Live Search offers the following top results:

Milwaukee Brewers : The Official Site
MLB Sites Angels Astros Athletics Blue Jays Braves Brewers Cardinals Cubs Devil Rays Diamondbacks Dodgers Giants Indians Mariners Marlins Mets Nationals Orioles Padres Phillies Pirates Rangers ...

Note that no one top result is more relevant than the other. All indexed listings resolve to by way of a temporary redirect (302).
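The difference between a temporary redirect (302) and a permanent one (301) matters here. A 302 tells the client the original address is still the one to keep, while a 301 says the move is permanent. The sketch below follows a redirect table and reports whether every hop was permanent. The table, the hostnames, and the function name `resolve` are hypothetical, and no real site's configuration is implied.

```python
def resolve(url, redirects, max_hops=10):
    """Follow redirects in a lookup table.

    `redirects` maps a URL to a (status_code, target_url) pair.
    Returns (final_url, all_permanent), where all_permanent is True
    only if every hop along the way was a 301.
    """
    all_permanent = True
    hops = 0
    while url in redirects:
        if hops >= max_hops:
            raise RuntimeError("too many redirects")
        status, target = redirects[url]
        if status != 301:
            # A 302 (or any non-301 hop) leaves the original URL in play.
            all_permanent = False
        url = target
        hops += 1
    return url, all_permanent


# Hypothetical setup: a vanity domain temporarily redirected to a subdomain.
table = {
    "http://www.team-example.com/": (302, "http://team.league-example.com/"),
}
print(resolve("http://www.team-example.com/", table))
```

With a 302 in the chain, the function reports that the move is not permanent. That is one plausible reason an engine might keep both addresses around and have to choose between them.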

Why, then, is one domain displayed in Google and MSN and another in Yahoo for the same result? Are the Milwaukee Brewers spoofing the search engines using temporary redirects and multiple domains?

Not exactly. Canonicalization processes simply level the playing field. These algorithmic elements vary from search engine to search engine.

Google knows the two domains are exactly the same and treats them as such when it comes to inbound links. Using query string commands, Google reveals it acknowledges 2,200 links to both and

A lot of SEO folks have talked about Google's preference for subdomains. This is proof of that preference because that's how the site's actually crawled and indexed. Do a query for "" and you'll get some 7,880 pages. Do the same for "," and you'll get "did not match any documents."

To provide users with its preferred results, Google relegates to its no man's land of non-indexation. Google canonically prefers to display the pretty little subdomain,, as its most relevant result for a "milwaukee brewers" search query.

MSN Live Search just isn't as bright when it comes to algorithmic adjustments. It indexes nearly 1,300 pages of "" and six pages of "". Its algorithms credit "" with nearly 14,000 inbound links and "" with over 14,000. MSN Live Search duplicates its own results by including the non-canonical URL in the results.

Getting any bright ideas about MSN Live Search, subdomains, and temporary redirects? Small wonder MSN Live Search has its filters set to "high" to stop spamming itself and present any semblance of canonicalization.

The question that remains is Yahoo's preference over its subdomain counterpart, Based on information from Yahoo Site Explorer, has 735 pages indexed and 228 inbound links. Meanwhile, has 45 pages indexed and 6,331 inbound links.

Should Webmasters redesign their sites to include subdomains if they want to make headway in Google and MSN Live Search? Absolutely not. Subdomains are not a secret weapon for improved indexation.

Subdomains do make sense, however, when each subsection of a top-level domain contains completely unique content addressing different topics, such as the collection of baseball teams at

It would be interesting to test the best way to shift canonicalization processes in the major search engines. Would submitting the top-level domain as the preferred result influence Google and MSN Live Search indexation? Could XML sitemap feeds encourage Yahoo to present the subdomain in natural search results? These are questions for another day while we see if will play ball.


P.J. Fusco

P.J. Fusco has been working in the Internet industry since 1996 when she developed her first SEM service while acting as general manager for a regional ISP. She was the SEO manager for Jupitermedia and has performed as the SEM manager for an international health and beauty dot-com corporation generating more than $1 billion a year in e-commerce sales. Today, she is director for natural search for Netconcepts, a cutting-edge SEO firm with offices in Madison, WI, and Auckland, New Zealand.
