Site Architecture and SEO

If we were to break SEO down into its simplest form, we’d discuss site architecture and navigation, temporary and permanent content, and inbound links, in that order. It makes sense to put first things first, so we start with the way the site is built.

Before we can help create keyword-targeted content that provides users and search engine spiders with relevant themes, we must usually make additional content manageable. This means the CMS must readily allow input of original permanent and temporary content on the page, as well as the entry and editing of unique title tags and meta description tags behind the page.
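One way to check whether a CMS is really producing unique title tags and meta descriptions is to audit a sample of rendered pages. The following is a minimal sketch using only Python's standard library; the function name `audit_head_tags` and the `pages` mapping of URL to HTML are illustrative assumptions, not part of any particular CMS.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_head_tags(pages):
    """pages maps URL -> HTML (hypothetical input).

    Returns (duplicate_titles, missing_descriptions).
    """
    by_title = defaultdict(list)
    missing = []
    for url, html in pages.items():
        parser = HeadTagParser()
        parser.feed(html)
        parser.close()
        by_title[parser.title.strip()].append(url)
        if not parser.description:
            missing.append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return duplicates, missing
```

Any title shared by more than one URL, or any page without a description, points to a template or CMS field that editors cannot yet control per page.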

Before we can direct the search engine spiders toward prominent content, we must have the ability to create a natural hierarchy for page content. This means the site design, particularly the style sheets, must allow for optimal use of header tags.
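To see whether a page template actually yields a natural hierarchy, we can extract its header tags in document order and flag jumps. This is a rough sketch; the helper name `heading_problems` is hypothetical, and the two rules it applies (exactly one h1, no skipped levels) are common conventions assumed here for illustration rather than requirements stated by any search engine.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Records (level, text) for each heading, in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_problems(html):
    """Returns a list of human-readable hierarchy problems (assumed rules)."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    problems = []
    h1_count = sum(1 for level, _ in parser.outline if level == 1)
    if h1_count != 1:
        problems.append(f"expected one h1, found {h1_count}")
    previous = 0
    for level, text in parser.outline:
        # A heading more than one level deeper than its predecessor is a jump.
        if level > previous + 1:
            problems.append(f"h{level} '{text}' jumps from level {previous}")
        previous = level
    return problems
```

A page whose style sheet only styles, say, h3 will tend to show up here with a missing h1 and a jump straight to level 3.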

And before we can efficiently direct links into the site, we must efficiently funnel internal linkage throughout it. Since the internal linking structure must be readily crawlable, the site design must be fully reviewed and assessed.
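Part of that assessment can be expressed as a simple graph question: starting from the home page and following only crawlable links, which pages can be reached, and in how many clicks? The sketch below assumes we already have a `links` dictionary mapping each page to the pages it links to; how that dictionary is gathered, and the function names, are assumptions for illustration.

```python
from collections import deque


def crawl_depths(links, start):
    """Breadth-first walk of the link graph; returns page -> click depth."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def unreachable_pages(links, start):
    """Pages that appear in the graph but cannot be reached from start."""
    known = set(links) | {t for targets in links.values() for t in targets}
    return sorted(known - set(crawl_depths(links, start)))


# Hypothetical five-page site: the old promo page links out,
# but nothing links to it.
links = {
    "/": ["/products", "/about"],
    "/products": ["/products/widgets"],
    "/about": [],
    "/products/widgets": ["/"],
    "/landing/old-promo": ["/"],
}
print(crawl_depths(links, "/"))
print(unreachable_pages(links, "/"))
```

Pages that never appear in the depth map receive no internal linkage at all, and pages buried many clicks deep are candidates for better funneling.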

Inhibitors and Disrupters

Site structure frequently inhibits or disrupts the natural flow of internal linkage through the site. Different types of coding and programming require different remedies to overcome navigational flaws.

No amount of alt text on images will overcome the fact that spiders can’t read images, for example. If site navigation is completely image-based, we must render the images as text-based links via CSS. This is a simple way to maintain the site’s appearance while making the navigation search optimal.
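Before reworking the style sheets, it helps to know which navigation links are affected. The following sketch lists links whose only content is an image, with no anchor text for a spider to read; the function name `find_image_only_links` is hypothetical, and the check deliberately ignores alt text, in keeping with the point above.

```python
from html.parser import HTMLParser


class LinkTextAudit(HTMLParser):
    """Finds <a href> elements that contain an <img> but no visible text."""

    def __init__(self):
        super().__init__()
        self.image_only_links = []
        self._href = None
        self._text = []
        self._has_image = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._text = []
            self._has_image = False
        elif tag == "img" and self._href is not None:
            self._has_image = True

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self._has_image and not "".join(self._text).strip():
                self.image_only_links.append(self._href)
            self._href = None


def find_image_only_links(html):
    parser = LinkTextAudit()
    parser.feed(html)
    parser.close()
    return parser.image_only_links
```

Each URL returned is a candidate for conversion to a text link that CSS then styles to match the original image.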

When Flash navigation is used, further steps must be taken to swap out the invisible structure of unembedded links with a crawlable structure. When a site is designed entirely in Flash, we can either build a low-resolution alternative for spiders to crawl and screen readers to read, or build some contextual strength into the site with the Macromedia Flash SDK. Either way, we still need to link different site elements together in a manner that can be readily crawled and indexed by search engine spiders.

When AJAX is used to design a site, the problem of URL creation must first be overcome. Though the user experience is no doubt exceptional on an AJAX-intensive site, the crawling experience is surely disrupted. Creating static pages on the fly is a good place to start, but these pages must be linked to each other to form a crawlable site structure. Footers and sitemaps can complement site crawling, but they are by no means an optimal solution for building greater site relevancy.

Although the major search engines can handle some dynamic parameters in URLs without choking on them, session IDs in URLs remain a spider killer. To overcome disruptive use of session IDs, URL rewrites are usually in order. But URL rewrites must commonly be accompanied by a database of permanent redirects and efficient use of the robots.txt file to keep the spiders crawling a site on a regular basis.
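As an illustration of what such a rewrite has to accomplish, the sketch below strips session IDs from URLs and builds a table of old-to-new URLs that could seed the permanent redirects. The parameter names in `SESSION_PARAMS` and the function names are assumptions; a real site would substitute whatever its own platform appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to the platform in use.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_id(url):
    """Returns the URL without session IDs in the query string or path."""
    parts = urlsplit(url)
    # Some Java platforms append ";jsessionid=..." to the path itself.
    path = parts.path.split(";jsessionid=", 1)[0]
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(kept), parts.fragment)
    )


def build_redirect_map(crawled_urls):
    """Maps each session-laden URL to its clean form (for 301 redirects)."""
    redirects = {}
    for url in crawled_urls:
        clean = strip_session_id(url)
        if clean != url:
            redirects[url] = clean
    return redirects
```

The resulting map is only a starting point; the redirects themselves still have to be configured on the web server or in the application.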

URLs with appended tracking parameters create additional disruption to the flow of a site’s navigation, as well as duplicate content that further inhibits building relevant themes. Rewrites and redirects can help here, too, but we might want to rethink how we track users through a site if the system inhibits a search optimal presentation of the site.
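To gauge how much duplicate content tracking parameters create, we can group crawled URLs by what they look like once the tracking is removed. This is a sketch only: the names in `TRACKING_PARAMS` are placeholders, not the parameters of any specific analytics product, and `duplicate_groups` is a hypothetical helper.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Placeholder tracking parameter names (assumptions for illustration).
TRACKING_PARAMS = {"ref", "source", "campaign"}


def canonical_url(url):
    """Drops tracking parameters and sorts the rest for stable comparison."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def duplicate_groups(urls):
    """Returns canonical URL -> list of variants, for groups of 2 or more."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return {canon: variants for canon, variants in groups.items() if len(variants) > 1}
```

Every group with more than one variant is a single page that spiders may be seeing as several.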

Tools to Use

Spider simulators:

For indexing audits, use “site:” command strings in Google and MSN, or Site Explorer in Yahoo.

Remember, numbers provided by the search engines are an estimated number of pages indexed. It still takes a drill-down to determine if your site suffers from any of the crawling inhibitors or disruptors discussed here today.

In Summary

What appear to be additional layers of complexity within a site structure actually make the site simpler for the spiders to crawl. Since crawling is our first hurdle toward successful indexing, it only makes sense to be certain our sites can be efficiently crawled by search engine spiders.

By paying close attention to how well search engine spiders crawl our sites, we can take the first steps toward improving the indexing of our sites. Once we understand how well the site is indexed, we can move toward improving site relevancy on a theme-by-theme, page-by-page basis.
