The Anatomy of a Crawler Friendly Web Page
Mike Grehan | April 6, 2009
A reference guide to optimizing your Web pages for search engines.
On-page optimization may not have the rank-boosting power it had in the '90s, but it's still the bedrock of a solid SEO campaign.
Many companies use a content management system (CMS) to deliver content to their Web site, and many of these systems are inherently unfriendly to search engine crawlers. So I asked the team of experts I work with to help me create a reference guide to the anatomy of a crawler-friendly Web page.
What Should Go in the Head of the Document?
- The title is the most important tag. Use it. Make it unique to each page.
- Meta description is the second most important head tag. Use it. Make it unique to each page. A unique title and description alone could keep pages from being considered duplicate content. The meta description probably won't improve your rankings, but it could improve click-through rates (CTRs).
- Specify the document type (first line in the page). You probably want the following, but your developer will know for sure: .
- Drop in a content type tag. Put it before the title tag. You probably want .
- Every other head tag is of little or no value to search engines. Author, category, and other such tags inserted by some content management systems are ignored.
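The head checklist above can be spot-checked with a short script. What follows is a minimal sketch in Python using only the standard library's `html.parser`. The class and function names (`HeadAudit`, `audit_head`) are made up for illustration, and the script only reports what it finds. It does not know which document type or content type value is right for your pages.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Records the head elements named in the checklist (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.title = None
        self.description = None
        self.content_type_before_title = False
        self._in_title = False

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE ...>
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
            self.title = ""
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            http_equiv = (attrs.get("http-equiv") or "").lower()
            if name == "description":
                self.description = attrs.get("content") or ""
            # Only counts if no <title> has been seen yet
            if http_equiv == "content-type" and self.title is None:
                self.content_type_before_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_head(html):
    """Return a dict summarizing what was found in the page's head."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    return {
        "has_doctype": parser.has_doctype,
        "title": parser.title.strip() if parser.title is not None else None,
        "description": parser.description,
        "content_type_before_title": parser.content_type_before_title,
    }
```

Running `audit_head` over every page a CMS generates and comparing the `title` and `description` values across pages is one way to find templates that emit the same title or description everywhere.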
What Should Go in the Body of the Document?
- Present text as text. Reserve Flash and AJAX for non-text or interactive elements, and only when CSS won't suffice.
- Use images that relate to the text on the page, not just filler. Alt attributes (often called alt tags), descriptive file names, and a caption will improve the chances of being included in image results. Use a high-resolution file or link to one. In the alt (alternative) attribute, place a text description of the image so that visually impaired visitors using screen readers know what the content is. Search engines can also pick up this text and the keywords in it.
- Avoid nested tables that can result in content "appearing" differently to search engines than what you expect. Getting rid of tables makes for a lighter page, too.
- Include related links to other articles on your site. Having them in the body will help keep them from being classified as just navigation and devalued.
- Use images for lengthy boilerplate or disclaimers, so the repeated text isn't indexed as page content.
What Order Should HTML Tags Be In?
- Start your content with a heading, H1.
- Use just one H1, and then H2 or H3 as needed. Maintain a proper hierarchy.
- Don't use heading tags in your masthead or navigation. Keep them for the main content.
- All content should be between the body tags. Anything outside them might be ignored.
- Watch out for tags that aren't closed. An unclosed tag could inadvertently hide content from search engines.
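The heading rules above (start with an H1, use only one, keep the hierarchy in order) lend themselves to an automated check. Here is a minimal Python sketch using the standard library's `html.parser`. The names `HeadingOutline` and `heading_problems` are hypothetical, and the rule that a heading should not skip more than one level down is one reading of "proper hierarchy," not something search engines are known to enforce.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects heading levels (1 to 6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Matches h1..h6 but not other two-letter tags such as hr
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html):
    """Return a list of human-readable problems with the heading order."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    levels = parser.levels
    problems = []
    if not levels or levels[0] != 1:
        problems.append("content does not start with an H1")
    if levels.count(1) > 1:
        problems.append("more than one H1")
    # Flag jumps such as H1 straight to H3
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(f"H{prev} is followed by H{cur}")
    return problems
```

An empty list means the page passed all three checks. Note that this sketch looks at the whole page, so headings in a masthead or navigation block will also show up in the results.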
Should You Use Country/Language Tags?
- Country and language tags aren't necessary. Search engines do a decent job of working these out on their own, language in particular.
- Want to give Google some more info? Set the geo-targeting option in Webmaster Tools.
Is a Canonical Tag Necessary?
- Make a canonical tag part of the template so it's set automatically. (A canonical tag tells search engines which version of a page is the primary one, for example the version with www in front as opposed to the version without it. This helps avoid duplicate content in the index.) Then you'll have less to worry about when other teams add tracking parameters to banners and other ads and, in doing so, create duplicate pages.
- This is particularly useful if you've got an affiliate program. It helps consolidate all of the inbound links.
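To make the template idea concrete, here is a minimal Python sketch of how a template helper might build the canonical tag from whatever URL was requested. Everything specific in it is an assumption for illustration: the preferred host `www.example.com`, the list of tracking parameter names, and the function names `canonical_url` and `canonical_tag`. Your own site will have its own host and its own tracking or affiliate parameters.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values: replace with your own preferred host and parameters
PREFERRED_HOST = "www.example.com"
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "affid"}


def canonical_url(url):
    """Return the primary version of a URL: preferred host, no tracking
    parameters, no fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path or "/", urlencode(kept), "")
    )


def canonical_tag(url):
    """Return the link element to place in the head of the page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

With a helper like this in the template, a banner link such as `http://example.com/widgets?utm_source=banner&color=blue` and the plain page URL both produce the same canonical tag, so inbound links to either version are consolidated.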
What About Page Weight/Size?
- Page size for users and page size for search engines are different. Search engines focus on code and content. They'll eventually grab everything, as is evidenced by large PDFs being indexed.
- A slow-loading movie likely won't impact search engines, but it may cause users to bounce if it's too slow. Don't send the wrong signal to search engines by having your users bounce and do another search.
- Search engines will crawl more than 100 links on a page, but at that point you've probably got a usability issue. Categorize the links and create new pages. Help your users zero in on what they're interested in.
- Stuffing your footer with links isn't as effective as it once was.
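As a rough way to find pages that have drifted past the 100-link mark mentioned above, a short Python sketch using the standard library's `html.parser` can count the links on a page. The names `LinkCounter` and `count_links` are hypothetical, and 100 is the article's usability rule of thumb, not a crawler limit.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collects the href of every anchor that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def count_links(html):
    """Return the number of anchors with a non-empty href."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return len(parser.hrefs)


def has_too_many_links(html, limit=100):
    """True when the page has more links than the rule-of-thumb limit."""
    return count_links(html) > limit
```

Pages flagged this way are candidates for splitting their links into categories on separate pages.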
How Much Text?
- A good target is 250 to 300 words. More is fine, especially from a user's perspective.
- Split articles if there are distinct topic areas, so that each page can target its own keywords. Be aware that clicking through multiple pages annoys users, so find a good balance.
- Use distinct URLs for each page within a series. For instance, don't show and hide different pages using CSS.
- Link to each page in a series using keyword-rich links, not "Page 1, 2, 3" or "next" and "previous."
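To see how a page measures up against the 250-to-300-word target above, a small Python sketch can count the words inside the body while skipping script and style blocks. The names `BodyText` and `body_word_count` are hypothetical. The count is approximate, since it includes navigation and footer text, and it assumes the page has an explicit body tag.

```python
from html.parser import HTMLParser


class BodyText(HTMLParser):
    """Counts whitespace-separated words inside <body>, skipping code blocks."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.words = 0
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_body and not self._skip_depth:
            self.words += len(data.split())


def body_word_count(html):
    """Return the approximate number of visible words in the page body."""
    parser = BodyText()
    parser.feed(html)
    parser.close()
    return parser.words
```

A page that comes in well under the target may be worth merging with a related page. One that runs far over and covers distinct topics may be a candidate for splitting into a series.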
What About CSS Formatting?
- CSS allows you to position text and images on a page for the visitor to see in any order you wish. So, even if the code -- which the visitor can't see -- has an item at the bottom of the page, it can still appear at the top on the visible page.
- If you can separate CSS from the HTML markup, you'll have an easier time with maintenance.
- Avoid tricks that involve negative positioning (placing content off-screen), even if you're not trying to trick the search engines.
- Be aware that if you "hide" content such as a navigation menu, search engines will still see the content and follow the links.
Of course, this isn't an exhaustive list. You may have tracking code for your analytics package in the body and other additions. But if you're new to the game, it should help you get off the ground.