How to Make Robots Cry With Faceted Navigation

Implemented carefully, exposing many facets to robots will create high-quality landing pages for long-tail searches.

Author: Jaimie Sirovich

Date published: March 23, 2011

Usability engineers have spoken at length about faceted navigation. Humans do not enjoy navigating through many levels of intermediate pages on category-based websites – where, at any point, they may click the wrong thing and reach a dead end. Marketers enjoy this as much as they enjoy a bad case of the flu, and a faceted navigation scheme may just be the panacea.

Unfortunately, much less has been said about the search marketing and optimization problems that arise with this form of navigation.

Faceted navigation is an almost-universally positive experience for humans. Two of the most important benefits are:

Facets permit users to combine selections to zero in on results.
Facets permit users to make those selections in any order.

Many of the big box stores have already implemented it. This includes well-known brands such as Amazon, Home Depot, and B&H. Newegg.com may also be remembered as one of the earliest to take the plunge. Just as the so-called “Web directories” have all but evaporated from the thread of the Web, category-only navigation might soon do the same. Does anyone actually use a directory in real life? And not just for the links!

It turns out that many decision-making processes are not facilitated by a rigid category-only navigation scheme. A category tree assumes an exact process in an exact order. It can only be walked in that order – for example, “Home > Televisions > Flat Screen > LCD > 50.” The user might select different options such as LED instead of LCD, but the path for each decision would be the same.

That looks like it makes sense on the surface. But perhaps size matters more than underlying technology for some users – 60″-plus or no deal. The user might not care whether the television is LCD or LED – or DLP. And here, faceted navigation rises to the task. The combination of a shallow category tree for basic decisions – with a series of facets – may be the best approximation of a talented salesman discovered yet.

Humans don’t like to conform this way – we like choice. The problem is that humans exhibit intelligence and intent, so we click only the combinations relevant to us; robots do not. A robot has nothing better to do, and often no other choice, than to visit almost everything. That’s a problem when a robot can combine selections and do so in any order. Do the math. Features that are often great for users, such as selecting multiple values within a single facet, only make the numbers far worse.
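To make "do the math" concrete, here is a rough sketch in Python. The facet names and value counts are invented for illustration and do not come from any real store; the three functions are simply one way to count the pages a robot could reach under different assumptions about how selections map to URLs.

```python
from itertools import combinations
from math import factorial, prod

# Hypothetical facets for one television category: name -> number of values.
# These counts are assumptions for illustration only.
facets = {"brand": 8, "size": 6, "technology": 3, "price": 5}
counts = list(facets.values())


def unordered_pages(counts):
    """Pages when each facet is either unset or set to exactly one value."""
    return prod(k + 1 for k in counts)


def ordered_urls(counts):
    """URLs when the path also records the order in which facets were clicked."""
    total = 0
    for r in range(len(counts) + 1):
        for subset in combinations(counts, r):
            # r! orderings of the chosen facets, times the value choices.
            total += factorial(r) * prod(subset)
    return total


def multi_select_pages(counts):
    """Pages when any subset of values may be selected within each facet."""
    return prod(2 ** k for k in counts)


print(unordered_pages(counts))
print(ordered_urls(counts))
print(multi_select_pages(counts))
```

With these made-up numbers, four modest facets yield 1,512 pages for a single category, 21,217 URLs if selection order leaks into the path, and 4,194,304 once multiple values per facet are allowed. Multiply by the number of categories and the trap is obvious.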

The result: a massive spider trap on any website with over a few thousand products.

Even a seasoned search marketer might immediately begin by throwing new technologies such as rel="canonical" at the problem, thinking they’re the latest and greatest. Or she may surrender and conclude there is simply no way except to exclude the entire trap. However, this is not the case. As is so often true, throwing technology at a problem – or making drastic decisions – doesn’t lead to the best answer.

One option is a creative application of traditional robots.txt-based exclusion. Exclusion makes sense because the content generated by facet pages is only similar, not duplicate. We have many thousands of combinations of unique, but undesirable pages.

One might generate undesirable URLs with a prefix such as “/noindex/,” but not do so for the desirable ones. For example: http://www.example.com/noindex/home/televisions/flat-screen/sony/52/LED/ would not be spidered, but: http://www.example.com/home/televisions/flat-screen/sony/ would be. This works because the robot knows a priori – based on the URL alone – which pages are useful and which are not. How the site decides which links to build which way is not the robot’s concern, but one reasonably simple rule is not to allow more than a few selections on a crawlable URL. This greatly reduces the size of the spider trap.
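The prefix scheme above could be sketched as follows. This is only an illustration under stated assumptions: the `facet_url` helper, the `MAX_CRAWLABLE_SELECTIONS` threshold of one, and the two-line robots.txt are hypothetical choices, not a prescribed implementation. Python's standard `urllib.robotparser` stands in for a well-behaved robot.

```python
from urllib.robotparser import RobotFileParser

BASE = "http://www.example.com"

# Assumption: URLs with more than this many facet selections get the prefix.
MAX_CRAWLABLE_SELECTIONS = 1

# Assumed robots.txt: a single rule excludes everything under /noindex/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /noindex/
"""


def facet_url(category_path, selections):
    """Build a facet URL, prefixing it with /noindex when it is too deep."""
    segments = list(category_path) + list(selections)
    prefix = "/noindex" if len(selections) > MAX_CRAWLABLE_SELECTIONS else ""
    return f"{BASE}{prefix}/" + "/".join(segments) + "/"


robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

category = ["home", "televisions", "flat-screen"]
shallow = facet_url(category, ["sony"])
deep = facet_url(category, ["sony", "52", "LED"])

for url in (shallow, deep):
    print(url, "crawlable" if robots.can_fetch("*", url) else "excluded")
```

The decision is made once, when the link is generated, so the robot never has to fetch a page to learn that it should not have. The threshold is the knob: raising it exposes more long-tail landing pages at the cost of a larger crawl.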

Note that meta-exclusion, though similar in purpose, is almost certainly the wrong decision – and for this same reason, canonicalization cannot work as well. Since it is on-page, a bot must crawl that same spider trap of offending pages to know about the exclusions. In fact, any on-page methodology is well-suited only for small-to-medium quantities of content to be excluded (or canonicalized).

Implemented with these concerns in mind, a website with a faceted navigation scheme can help to create a set of legitimate landing pages that may complement subcategory pages – for users as well as search engines and PPC campaigns. In fact, one could probably debate whether some facets are subcategories or vice versa. If this is true, it stands to reason that excluding everything wholesale cannot be optimal.

Implemented carefully, exposing many facets to robots will create high-quality landing pages for long-tail searches. Without care, however, faceted navigation will weave a giant spider trap with seemingly-infinite URL permutations of the same products. In fact, some faceted search implementations – B&H included – do “address” this concern by excluding their faceted navigation pages from search engines entirely.

Unfortunately, this is not a solution, and it is anything but optimal – as one might expect after observing the overlapping functionality and purpose of subcategories and facets. There may be no one universal solution, but the above approach may keep a robot from shedding blood, sweat, and tears while spidering. A best solution is elusive, and I’m open to new ideas and suggestions.

This column was originally published in SES Magazine, March 2011.
