Examining the "Diagnostics" reports, from crawl stats to HTML suggestions. Last in a series.
This is the final article in a series covering Google Webmaster Tools (GWT) reports and how to interpret them.
Crawl stats reporting shows you the issues encountered by Googlebot as it tries to make its way through your site. It contains the following sub-reports:
Pages crawled per day: This report shows, day by day over the last three months, how many pages Googlebot has requested. There's no right number in this chart, but zero is certainly the wrong number. Because Google's crawl frequency depends in large part on the PageRank of the pages on your site, you'll likely see the frequency increase as your pages gain PageRank. Large spikes in this report are often due to the introduction of many new pages and/or a large influx of inbound links, possibly driven by offline promotions or some other traffic generator.
Kilobytes crawled per day: This graph won't match the "Pages crawled per day" graph exactly, but it should exhibit similarities, such as the same rough peaks and valleys.
Time spent downloading a page (in milliseconds): This graph shows you the average time it takes Googlebot to pull down a URL from your server. Typically, the peaks and valleys of this graph have no relation to the two preceding graphs. In fact, a peak on this graph often indicates a server problem, because unless your pages are monstrously large, Googlebot should not take very long to download any of them.
PageRank of your pages in Google: This bar chart takes all of your indexed pages and gives you the proportions that fall into one of the following four PageRank categories: high, medium, low, and not yet assigned.
Of all the GWT reports to take with a grain of salt, this one ranks near the top. Pages need a PageRank of at least 7 or 8 to fall into the "High" category, so don't expect too many of yours to be there. For the vast majority of sites, most web pages will fall into the "Low" or "Not yet assigned" categories, and this by itself is usually nothing to worry or even care about, because if you're in the game for the PageRank, you're missing some critical strategy.
About the only practical use for this report is to spot large amounts of content that are still sitting in the "Not yet assigned" bucket six or more months after you added them. This could indicate some crawling or indexing obstacles or the need for additional links pointing into deep content.
Your page with the highest PageRank: First of all, this report has had a history of buggy behavior. Its goal, as the name suggests, is to tell you which of the pages on your site has the highest PageRank. On 99 percent of sites, that will be the home page, as you might expect. If you see an odd page listed here (and you're concerned enough to check it out), double-check it against the PageRank listed in the Google Toolbar. Above all, as I implied earlier, don't let this report be your sole indicator that something's wrong.
The HTML suggestions report is a valuable look at your title and meta description data.
Meta description: This report indicates meta descriptions that are duplicated, too long, or too short.
Don't be thrown by the numbers. If GWT says you have 600 pages with duplicated descriptions, it might mean that there are 300 different instances of two pages having duplicate data, not necessarily 600 pages that have the exact same description.
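To see how the flagged-page count and the number of distinct duplicates can differ, here is a minimal Python sketch that groups pages by description. The URLs, descriptions, and the function name are made up for illustration, and the whitespace and case normalization is an assumption; GWT does not document how it decides that two descriptions match.

```python
from collections import defaultdict


def duplicate_description_groups(pages):
    """Group URLs by meta description; keep descriptions shared by 2+ URLs."""
    groups = defaultdict(list)
    for url, description in pages.items():
        # Assumption: collapse whitespace and ignore case when comparing.
        key = " ".join(description.split()).lower()
        groups[key].append(url)
    return {desc: urls for desc, urls in groups.items() if len(urls) > 1}


# Hypothetical sample data: four flagged pages that form two separate pairs.
pages = {
    "/widgets/red": "Red widgets for sale.",
    "/widgets/red?sort=price": "Red widgets for sale.",
    "/about": "About our company.",
    "/about/index.html": "About our company.",
    "/contact": "How to reach us.",
}

groups = duplicate_description_groups(pages)
flagged_pages = sum(len(urls) for urls in groups.values())
print(f"{flagged_pages} flagged pages in {len(groups)} duplicate groups")
```

In this sample, four pages are flagged but there are only two descriptions to rewrite, which is the same relationship as 600 flagged pages and 300 pairs.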
It's not clear what threshold indicates a description that is too long or too short. For example, I have a client site with a specific page whose description is seven words long, but I'm satisfied that its brevity is warranted and its description is adequate for the goal of the page. GWT lists it under the "short meta descriptions" report, but I have no plans to change it. On the other hand, if I saw descriptions that were labeled "too long," I would be more likely to edit them.
Title tag: This section has indicators for missing, duplicate, long, and short title tags, all of which are fairly self-explanatory even though, as with meta descriptions, it's hard to tell exactly what character or word count merits inclusion. As with meta descriptions, I would be more likely to fix titles that are considered too long than those that are too short, but consider each of them on a page-by-page basis.
I must plead ignorance to the category of titles known as "non-informative title tags." I can assume only that Google will inform you if you have words in your title that are overwhelmingly repetitive or do nothing to help describe the content on your page. In looking at many different site profiles, I couldn't find an example of such a title in any of them.
If GWT reports that your site has significant amounts of non-indexable content, take a close look at the URLs it reports. If the URLs represent data that you do in fact want to have indexed, consider a data format better suited for crawling. If the URLs don't represent such data, consider excluding them via robots.txt.
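Before deploying a robots.txt exclusion, it can help to check which URLs the rules would actually block. The sketch below uses Python's standard urllib.robotparser module; the Disallow paths are hypothetical examples, and Python's parser is not Googlebot's, so treat the result as a sanity check rather than a guarantee of how Google will behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of print views and search results.
robots_txt = """\
User-agent: *
Disallow: /print/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_crawlable(path, user_agent="Googlebot"):
    """Return True if these rules allow user_agent to fetch path."""
    return parser.can_fetch(user_agent, path)


for path in ["/products/widget", "/print/page1", "/search?q=widgets"]:
    status = "allowed" if is_crawlable(path) else "blocked"
    print(f"{path}: {status}")
```

Running a list of the URLs that GWT reported through a check like this shows whether the rules cover the content you want excluded without also blocking pages you want indexed.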
I hope you've learned some useful information in this series of articles on Google Webmaster Tools. Part 1 discussed the "Site configuration" section. Part 2 discussed the reports on "Your site on the web." Part 3 began the discussion of the "Diagnostics" section by talking about the "Crawl errors" reports.
Erik Dafforn is the executive vice president of Intrapromote LLC, an SEO firm headquartered in Cleveland, Ohio. Erik manages SEO campaigns for clients ranging from tiny to enormous and edits Intrapromote's blog, SEO Speedwagon. Prior to joining Intrapromote in 1999, Erik worked as a freelance writer and editor. He also worked in-house as a development editor for Macmillan and IDG Books. Erik has a Bachelor's degree in English from Wabash College. Follow Erik and Intrapromote on Twitter.