How to Make Robots Cry With Faceted Navigation

Implemented carefully, exposing many facets to robots will create high-quality landing pages for long-tail searches.

Usability engineers have spoken at length about faceted navigation. Humans do not enjoy navigating through many levels of intermediate pages on category-based websites, where at any point they may click the wrong thing and reach a dead end. Marketers enjoy those dead ends as much as they enjoy a bad case of the flu, and a faceted navigation scheme may be just the cure.

Unfortunately, much less has been said about the search marketing and optimization problems that arise with this form of navigation.

Faceted navigation is an almost universally positive experience for humans. Two of its most important benefits are:

  1. Facets permit users to combine selections to zero in on results.
  2. Facets permit users to make those selections in any order.

Many of the big-box stores have already implemented it, including well-known brands such as Amazon, Home Depot, and B&H. Newegg.com may also be remembered as one of the earliest to take the plunge. Just as the so-called “Web directories” have all but evaporated from the fabric of the Web, category-only navigation might soon do the same. Does anyone actually use a directory to find anything these days, rather than just to acquire links?

It turns out that many decision-making processes are not facilitated by a rigid category-only navigation scheme. A category tree assumes an exact process in an exact order. It can only be walked in that order – for example, “Home > Televisions > Flat Screen > LCD > 50″.” The user might select different options, such as LED instead of LCD, but the sequence of decisions would be the same.

That looks like it makes sense on the surface. But perhaps size matters more than underlying technology for some users – 60″-plus or no deal. The user might not care whether the television is LCD or LED – or DLP. And here, faceted navigation rises to the task. The combination of a shallow category tree for basic decisions – with a series of facets – may be the best approximation of a talented salesman discovered yet.

Humans don’t like to conform this way; we like choice. The problem is that humans exhibit intelligence and intent, so we click only the combinations relevant to us; robots do not. A robot has nothing better to do, and often no other choice, than to visit almost everything. That’s a problem when a robot can combine selections and do so in any order. Do the math. Features that are often great for users, such as selecting multiple values within a single facet, only make the numbers far worse.
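To make "do the math" concrete, here is a rough sketch that counts the pages a robot could reach. The facet names and value counts are invented for illustration and do not come from any real store. It counts three cases: one value per facet, the same selections where every click order produces its own URL, and multiple values per facet.

```python
from itertools import combinations
from math import factorial, prod

# Hypothetical facets for one television category: name -> number of values.
FACETS = {"brand": 10, "size": 8, "technology": 3, "price": 6, "resolution": 4}


def single_select_pages(facets):
    """Each facet is either unset or set to exactly one value."""
    return prod(n + 1 for n in facets.values())


def ordered_urls(facets):
    """Same selections, but each click order yields a distinct URL path."""
    counts = list(facets.values())
    total = 0
    for k in range(len(counts) + 1):
        # Pick which k facets are set, then count value choices and orderings.
        for chosen in combinations(counts, k):
            total += factorial(k) * prod(chosen)
    return total


def multi_select_pages(facets):
    """Each facet accepts any subset of its values, including none."""
    return prod(2 ** n for n in facets.values())


if __name__ == "__main__":
    print("single-select pages:", single_select_pages(FACETS))
    print("order-sensitive URLs:", ordered_urls(FACETS))
    print("multi-select pages:", multi_select_pages(FACETS))
```

With these assumed counts, five modest facets yield 13,860 single-select pages, 839,256 order-sensitive URLs, and more than two billion multi-select pages, all for a single category.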

The result: a massive spider trap on any website with over a few thousand products.

Even a seasoned search marketer might begin by throwing newer technologies such as rel="canonical" at the problem, thinking they’re the latest and greatest. Or she may surrender and conclude that the only option is to exclude the entire trap. Neither conclusion holds. As is true much of the time, throwing technology at a problem, or making drastic decisions, often doesn’t lead to the best answer.

One option is a creative application of traditional robots.txt-based exclusion. Exclusion makes sense because the content generated by facet pages is only similar, not duplicate. We have many thousands of combinations of unique but undesirable pages.

One might generate undesirable URLs with a prefix such as “/noindex/” that is disallowed in robots.txt, and omit the prefix for the desirable ones. For example, http://www.example.com/noindex/home/televisions/flat-screen/sony/52/LED/ would not be spidered, but http://www.example.com/home/televisions/flat-screen/sony/ would be. This works because the robot knows a priori, based on the URL alone, which pages are useful and which are not. How the site decides which links to build which way is not the robot’s concern, but one reasonably simple rule is not to allow more than a few selections on a crawlable URL. This greatly reduces the size of the spider trap.
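As a minimal sketch of that rule, the link builder below adds the “/noindex” prefix whenever the number of facet selections exceeds a threshold. The function name, the threshold of one selection, and the path layout are all assumptions chosen to reproduce the two example URLs above; a real site would pick its own limits.

```python
# Assumed values for illustration only; tune them per site.
NOINDEX_PREFIX = "/noindex"
MAX_CRAWLABLE_SELECTIONS = 1


def facet_url(category_path, selections, max_crawlable=MAX_CRAWLABLE_SELECTIONS):
    """Build a site-relative URL path for a category plus facet selections.

    Paths with more than `max_crawlable` selections get the excluded prefix,
    so a single robots.txt rule can keep robots out of them.
    """
    chosen = [s.strip("/") for s in selections if s.strip("/")]
    parts = [p for p in (category_path.strip("/"), *chosen) if p]
    path = "/" + "/".join(parts) + "/"
    if len(chosen) > max_crawlable:
        return NOINDEX_PREFIX + path
    return path


if __name__ == "__main__":
    print(facet_url("home/televisions/flat-screen", ["sony"]))
    print(facet_url("home/televisions/flat-screen", ["sony", "52", "LED"]))
```

The decision lives entirely in the code that renders links, so users can still reach every combination while robots see only the shallow ones.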

Note that meta-exclusion (a robots meta tag on the page itself), though similar in purpose, is almost certainly the wrong decision, and for the same reason canonicalization does not work well either. Because both are on-page, a bot must crawl that same spider trap of offending pages to learn about the exclusions. In fact, any on-page methodology is well suited only to small-to-medium quantities of content to be excluded (or canonicalized).
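The difference can be sketched with Python's standard urllib.robotparser module: a robots.txt rule is evaluated against the URL string alone, so no request to the excluded page is ever needed. The robots.txt content and the "ExampleBot" user agent below are assumptions for illustration, reusing the “/noindex/” prefix from the example above.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for the example site: one rule covers every excluded URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /noindex/
"""


def is_crawlable(url, user_agent="ExampleBot"):
    """Decide from the URL alone whether a polite robot may fetch it."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/home/televisions/flat-screen/sony/",
        "http://www.example.com/noindex/home/televisions/flat-screen/sony/52/LED/",
    ):
        print(url, is_crawlable(url))
```

A meta tag or canonical link, by contrast, can only be read after the page has been downloaded, which means the robot has already walked into the trap.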

Implemented with these concerns in mind, a website with a faceted navigation scheme can help to create a set of legitimate landing pages that may complement subcategory pages – for users as well as search engines and PPC campaigns. In fact, one could probably debate whether some facets are subcategories or vice versa. If this is true, it stands to reason that excluding everything wholesale cannot be optimal.

Implemented carefully, exposing many facets to robots will create high-quality landing pages for long-tail searches. Without care, however, faceted navigation will weave a giant spider trap with seemingly infinite URL permutations of the same products. In fact, some faceted search implementations, B&H included, do “address” this concern by excluding their faceted navigation pages from search engines entirely.

Unfortunately, wholesale exclusion is not a real solution, and it is anything but optimal, as one might expect after observing the overlapping functionality and purpose of subcategories and facets. There may be no one universal solution, but the above approach may keep a robot from shedding blood, sweat, and tears while spidering. A best solution is elusive, and I’m open to new ideas and suggestions.

This column was originally published in SES Magazine, March 2011.
