How to Make Robots Cry With Faceted Navigation

March 23, 2011

Implemented carefully, exposing many facets to robots will create high-quality landing pages for long-tail searches.

Usability engineers have spoken at length about faceted navigation. Humans do not enjoy navigating through many levels of intermediate pages on category-based websites, where at any point they may click the wrong thing and reach a dead end. Marketers enjoy this as much as they enjoy a bad case of the flu, and a faceted navigation scheme may just be the panacea.

Unfortunately, much less has been said about the search marketing and optimization problems that arise with this form of navigation.

Faceted navigation is an almost-universally positive experience for humans. Two of the most important benefits are:

  1. Facets permit users to combine selections to zero in on results.
  2. Facets permit users to make those selections in any order.

Many of the big box stores have already implemented it. This includes well-known brands such as Amazon, Home Depot, and B&H. may also be remembered as one of the earliest to take the plunge. Just as the so-called "Web directories" have all but evaporated from the thread of the Web, category-only navigation might soon do the same. Does anyone actually use a directory in real life? And not just for the links!

It turns out that many decision-making processes are not facilitated by a rigid category-only navigation scheme. A category tree assumes an exact process in an exact order. It can only be walked in that order - for example, "Home > Televisions > Flat Screen > LCD > 50"". The user might select different options such as LED instead of LCD, but the path for each decision would be the same.

That looks like it makes sense on the surface. But perhaps size matters more than underlying technology for some users - 60"-plus or no deal. The user might not care whether the television is LCD or LED - or DLP. And here, faceted navigation rises to the task. The combination of a shallow category tree for basic decisions - with a series of facets - may be the best approximation of a talented salesman discovered yet.

Humans don't like to conform this way - we like choice. The problem is that humans exhibit intelligence and intent, so we only click the combinations relevant to us; robots do not. A robot has nothing better to do, and often no other choice, than to visit almost everything. That's a problem when a robot can combine selections and do so in any order. Do the math. Features that are often great for users, such as selecting multiple values within a single facet, only make the numbers far worse.
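To make "do the math" concrete, here is a minimal sketch in Python. The counting model is an assumption, not something taken from a particular site: each facet is described only by how many values it offers, a single-value facet is either unset or set to one value, and a multi-value facet allows any subset of its values. The function names are hypothetical.

```python
from math import factorial, prod


def single_value_states(facet_sizes):
    """Distinct selection states when each facet is unset or set to one value."""
    return prod(n + 1 for n in facet_sizes)


def multi_value_states(facet_sizes):
    """Distinct states when any subset of a facet's values may be selected."""
    return prod(2 ** n for n in facet_sizes)


def ordered_urls(facet_sizes):
    """Distinct URLs if the order of single-value selections shows up in the URL.

    coeffs[m] counts the ways to pick values in exactly m facets; each such
    state can be reached (and linked) in m! different orders.
    """
    coeffs = [1]
    for n in facet_sizes:
        nxt = coeffs + [0]
        for m, count in enumerate(coeffs):
            nxt[m + 1] += count * n
        coeffs = nxt
    return sum(count * factorial(m) for m, count in enumerate(coeffs))


if __name__ == "__main__":
    sizes = [10, 8, 6, 5, 4]  # hypothetical facets on one category page
    print("single-value states:", single_value_states(sizes))
    print("order-sensitive URLs:", ordered_urls(sizes))
    print("multi-value states:", multi_value_states(sizes))
```

Under these assumptions, five facets with 10, 8, 6, 5, and 4 values give 20,790 single-value states on one category page, and 2^33 (about 8.6 billion) once multiple values per facet are allowed. Multiply by the number of categories and the trap is obvious.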

The result: a massive spider trap on any website with over a few thousand products.

Even a seasoned search marketer might immediately begin by throwing new technologies such as rel="canonical" at the problem, thinking they're the latest and greatest. Or she may surrender and conclude that the only option is to exclude the entire trap. Neither is necessary. As is so often the case, throwing technology at a problem - or making drastic decisions - doesn't lead to the best answer.

One option is a creative application of traditional robots.txt-based exclusion. Exclusion makes sense because the content generated by facet pages is only similar, not duplicate. We have many thousands of unique but undesirable pages.

One might generate undesirable URLs with a prefix such as "/noindex/," which robots.txt disallows, but not do so for the desirable ones. For example: would not be spidered, but: would be. This works because the robot knows a priori - based on the URL - which pages are useful and which are not. The method to determine which links should be built which way is not the concern of the robot, but one reasonably simple way is not to allow more than a few selections. This greatly reduces the size of the spider trap.
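The prefix approach might be sketched as follows in Python. Everything specific here is an assumption made for illustration: the URL layout, the `facet_url` and `robots_allow` helpers, the two-selection threshold, and the sorting of selections (so that picking the same facets in a different order yields the same URL). Only the "/noindex/" prefix comes from the text above. The check uses the standard library's `urllib.robotparser` to confirm how a compliant robot would read the rule.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that excludes everything under the /noindex/ prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /noindex/
"""

# Assumed threshold: pages with more selections than this are not for robots.
MAX_CRAWLABLE_SELECTIONS = 2


def facet_url(category, selections):
    """Build the link for a category page with the given facet selections.

    `selections` maps facet name to chosen value. Sorting is an added
    assumption so selection order never produces a second URL.
    """
    parts = [f"{name}-{value}" for name, value in sorted(selections.items())]
    path = "/".join([category, *parts])
    prefix = "/noindex/" if len(selections) > MAX_CRAWLABLE_SELECTIONS else "/"
    return prefix + path


def robots_allow(url_path):
    """Return True if a compliant robot may fetch `url_path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("*", url_path)


if __name__ == "__main__":
    shallow = facet_url("televisions", {"size": "60"})
    deep = facet_url("televisions", {"type": "led", "size": "60", "brand": "acme"})
    for url in (shallow, deep):
        print(url, "crawlable" if robots_allow(url) else "excluded")
```

The decision is made when the link is built, so the robot never has to request a deep combination to learn that it is unwanted.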

Note that meta-exclusion, though similar in purpose, is almost certainly the wrong decision, and canonicalization falls short for the same reason: both are on-page, so a bot must still crawl that same spider trap of offending pages to learn about the exclusions. In fact, any on-page methodology is well-suited only for small-to-medium quantities of content to be excluded (or canonicalized).

Implemented with these concerns in mind, a website with a faceted navigation scheme can help to create a set of legitimate landing pages that may complement subcategory pages - for users as well as search engines and PPC campaigns. In fact, one could probably debate whether some facets are subcategories or vice versa. If this is true, it stands to reason that excluding everything wholesale cannot be optimal.

Implemented carefully, exposing many facets to robots will create high-quality landing pages for long-tail searches. Without care, however, faceted navigation will weave a giant spider trap with seemingly-infinite URL permutations of the same products. In fact, some faceted search implementations - B&H included - do "address" this concern by excluding their faceted navigation pages from search engines entirely.

Unfortunately, this is not a solution, and anything but optimal - as one might expect after observing the overlapping functionality and purpose of subcategories and facets. There may be no one universal solution, but the above approach may keep a robot from shedding blood, sweat, and tears while spidering. A best solution is elusive, and I'm open to new ideas and suggestions.

This column was originally published in SES Magazine, March 2011.




Jaimie Sirovich

Jaimie Sirovich is a search marketing consultant. Officially he is a computer programmer, but he claims to enjoy marketing much more. At present, Jaimie is focused on helping clients sell everywhere, and achieve multi-channel integration with major websites such as eBay, Amazon, and even Craigslist. He is the author of Search Engine Optimization With PHP.
