Demystifying Google Webmaster Tools Reports, Part 4

September 2, 2009

Examining the "Diagnostics" reports, from crawl stats to HTML suggestions. Last in a series.

This is the final article in a series covering Google Webmaster Tools (GWT) reports and how to interpret them.

Crawl Stats

Crawl stats reporting shows you the issues encountered by Googlebot as it tries to make its way through your site. It contains the following sub-reports:

Pages crawled per day: This report shows you, over the last three months, how many pages Googlebot has requested day by day. There's no right number in this chart, but zero is certainly the wrong number. Because Google's crawl frequency is due in large part to the PageRank of the pages on your site, you'll likely see the frequency increase as your pages gain PageRank. Large spikes in this report are often due to the introduction of many new pages and/or a large influx of inbound links, possibly due to offline promotions or some other traffic generator.

Kilobytes crawled per day: This graph won't match the "Pages crawled per day" graph exactly, but it should exhibit similarities, such as the same rough peaks and valleys.

Time spent downloading a page (in milliseconds): This graph shows you the average time it takes for Googlebot to pull down a specific URL from your server. Typically, the peaks and valleys of this graph have no relation to those in the two preceding graphs. In fact, a peak on this graph often indicates a server problem, because unless your pages are monstrously large, Googlebot should not take very long to download any of them.

PageRank of your pages in Google: This bar chart takes all of your indexed pages and gives you the proportions that fall into one of the following four PageRank categories: high, medium, low, and not yet assigned.

Of all the GWT reports to take with a grain of salt, this one is near the top of the list. Pages need a PageRank of at least 7 or 8 to fall into the "High" category, so don't expect too many of yours to be there. For the vast majority of sites, most web pages will fall into the "Low" or "Not yet assigned" categories, and this by itself is usually nothing to worry or even care about. If you're in the game just for the PageRank, you're missing some critical strategy.

About the only practical use for this report is to discover that large amounts of content are still sitting in the "Not yet assigned" bucket six or more months after you added them. This could indicate some crawling or indexing obstacles or the need for additional links pointing into deep content.

Your page with the highest PageRank: First of all, this report has had a history of buggy behavior. Its goal, as the name suggests, is to tell you which of the pages on your site has the highest PageRank. On 99 percent of sites, that will be the home page, as you might expect. If you see an odd page listed here (and you're concerned enough to check it out), double-check it against the PageRank listed in the Google Toolbar. Above all, as I implied earlier, don't let this report be your sole indicator that something's wrong.
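
If you want a second opinion on the first two sub-reports above (pages and kilobytes crawled per day), you can approximate them from your own server logs. The following is a minimal Python sketch, not a reproduction of how GWT computes its graphs. It assumes your server writes the common Apache/Nginx "combined" log format and that a user agent containing "Googlebot" really is Googlebot (user agents can be spoofed). The function name and sample log lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: the Apache/Nginx "combined" log format. Adjust for your server.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
    r' "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_daily_stats(lines):
    """Return {day: (requests, kilobytes)} for requests whose user agent mentions Googlebot."""
    totals = defaultdict(lambda: [0, 0])  # day -> [request count, bytes sent]
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        size = match.group("size")
        day_totals = totals[match.group("day")]
        day_totals[0] += 1
        day_totals[1] += 0 if size == "-" else int(size)  # "-" means no body was sent
    return {day: (count, size_bytes / 1024) for day, (count, size_bytes) in totals.items()}

# Illustrative log lines, not real traffic.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [01/Sep/2009:10:00:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [01/Sep/2009:10:05:00 +0000] "GET /about HTTP/1.1" 200 1024 "-" "{GOOGLEBOT_UA}"',
    '10.0.0.5 - - [01/Sep/2009:10:06:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows; U)"',
    f'66.249.66.1 - - [02/Sep/2009:09:00:00 +0000] "GET /contact HTTP/1.1" 304 - "-" "{GOOGLEBOT_UA}"',
]

for day, (requests, kilobytes) in googlebot_daily_stats(sample_log).items():
    print(day, requests, round(kilobytes, 1))
```

Don't expect these numbers to match the GWT graphs exactly. The point is only to see whether the rough peaks and valleys line up.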

HTML Suggestions

The HTML suggestions report is a valuable look at your title and meta description data.

Meta description: This report indicates meta descriptions that are duplicated, too long, or too short.

Don't be thrown by the numbers. If GWT says you have 600 pages with duplicated descriptions, it might mean that there are 300 different instances of two pages having duplicate data, not necessarily 600 pages that have the exact same description.

It's not clear what threshold indicates a description that is too long or too short. For example, I have a client site with a specific page whose description is seven words long, but I'm satisfied that its brevity is warranted and its description is adequate for the goal of the page. GWT lists it under the "short meta descriptions" report, but I have no plans to change it. On the other hand, if I saw descriptions that were labeled "too long," I would be more likely to edit them.
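
To see why the duplicate count can mislead, it helps to group pages by description and count the groups separately from the pages. Here is a minimal Python sketch of that arithmetic. The URLs and descriptions are invented, and the normalization (ignoring case and extra spaces) is my assumption, not a documented GWT rule.

```python
from collections import defaultdict

def duplicate_description_groups(descriptions_by_url):
    """Group URLs that share a meta description, ignoring case and extra whitespace."""
    groups = defaultdict(list)
    for url, description in descriptions_by_url.items():
        normalized = " ".join(description.lower().split())
        groups[normalized].append(url)
    # Keep only descriptions used by more than one URL.
    return {text: urls for text, urls in groups.items() if len(urls) > 1}

# Hypothetical pages: two separate pairs of duplicates plus one unique page.
pages = {
    "/red-widgets": "Buy widgets online.",
    "/blue-widgets": "Buy widgets online.",
    "/about": "About our company.",
    "/team": "About our company.",
    "/contact": "How to reach us.",
}

groups = duplicate_description_groups(pages)
affected_pages = sum(len(urls) for urls in groups.values())
print(f"{affected_pages} pages with duplicated descriptions in {len(groups)} groups")
```

In this sample, four pages are flagged, but they form two unrelated pairs. Scale that up and you get 600 flagged pages that are really 300 small problems rather than one big one.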

Title tag: This section has indicators for missing, duplicate, long, and short title tags, all of which are fairly self-explanatory even though, as with meta descriptions, it's hard to tell exactly what character or word count merits inclusion. As with meta descriptions, I would be more likely to fix titles that are considered too long than those that are too short, but consider each of them on a page-by-page basis.

I must plead ignorance to the category of titles known as "non-informative title tags." I can assume only that Google will inform you if you have words in your title that are overwhelmingly repetitive or do nothing to help describe the content on your page. In looking at many different site profiles, I couldn't find an example of such a title in any of them.
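
If you'd rather run your own title audit than guess at GWT's thresholds, a short script can apply whatever limits you choose. The Python sketch below flags missing, duplicate, short, and long titles. The two character cutoffs are placeholders I picked for illustration, not GWT's actual values, and the URLs and titles are invented.

```python
from collections import Counter

# Hypothetical cutoffs: GWT's real thresholds are not known.
MIN_TITLE_CHARS = 10
MAX_TITLE_CHARS = 70

def audit_titles(titles_by_url):
    """Return {url: [issues]} for titles that are missing, duplicate, short, or long."""
    counts = Counter(t.strip() for t in titles_by_url.values() if t and t.strip())
    report = {}
    for url, title in titles_by_url.items():
        title = (title or "").strip()
        issues = []
        if not title:
            issues.append("missing")
        else:
            if counts[title] > 1:
                issues.append("duplicate")
            if len(title) < MIN_TITLE_CHARS:
                issues.append("short")
            elif len(title) > MAX_TITLE_CHARS:
                issues.append("long")
        if issues:
            report[url] = issues
    return report

# Illustrative data only.
titles = {
    "/": "Acme Widgets: Hand-Built Widgets Since 1999",
    "/red": "Widgets",
    "/blue": "Widgets",
    "/faq": "",
    "/history": "The Complete and Unabridged History of Acme Widgets and Everyone Who Ever Worked Here",
}

for url, issues in audit_titles(titles).items():
    print(url, ", ".join(issues))
```

Treat the output as a list of pages to review one by one, not a list of pages to change automatically.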

Non-indexable content: As with the "Non-informative title tag," I had trouble finding examples of URLs that Google will actually report in this category. Finally I found a site that had several hundred entries here, and Google had labeled them as "images," when in fact they were partial page-tracking URLs that I assume had been discovered as Googlebot crawled through JavaScript code.

If GWT reports that your site has significant amounts of non-indexable content, take a close look at the URLs it reports. If the URLs represent data that you do in fact want to have indexed, consider a data format better suited for crawling. If the URLs don't represent such data, consider excluding them via robots.txt.
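
Before you publish a robots.txt exclusion, it's worth confirming that the rule blocks what you intend and nothing else. The Python sketch below uses the standard library's urllib.robotparser to test a rule offline. The /track/ path and the sample URLs are hypothetical stand-ins for whatever GWT reports on your site, and Python's parser may not interpret every rule exactly as Googlebot does.

```python
from urllib import robotparser

# Hypothetical rule: keep crawlers out of a page-tracking path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /track/
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_blocked(path, user_agent="Googlebot"):
    """Return True if the parsed robots.txt rules disallow this path for the user agent."""
    return not parser.can_fetch(user_agent, path)

# The tracking URL should be blocked; a normal content URL should not be.
for path in ["/track/pixel?id=123", "/products/widget"]:
    print(path, "blocked" if is_blocked(path) else "allowed")
```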

Conclusion

I hope you've learned some useful information in this series of articles on Google Webmaster Tools. Part 1 discussed the "Site configuration" section. Part 2 discussed the reports on "Your site on the web." Part 3 began the discussion of the "Diagnostics" section by talking about the "Crawl errors" reports.

ABOUT THE AUTHOR

Erik Dafforn

Erik Dafforn is the executive vice president of Intrapromote LLC, an SEO firm headquartered in Cleveland, Ohio. Erik manages SEO campaigns for clients ranging from tiny to enormous and edits Intrapromote's blog, SEO Speedwagon. Prior to joining Intrapromote in 1999, Erik worked as a freelance writer and editor. He also worked in-house as a development editor for Macmillan and IDG Books. Erik has a Bachelor's degree in English from Wabash College. Follow Erik and Intrapromote on Twitter.
