Demystifying Google Webmaster Tools Reports, Part 1

December 23, 2009

A look at GWT site configuration reports and why they are important. First in a three-part series.

Google Webmaster Tools contains a trove of analytical and diagnostic data for your sites. But is there too much information there? For many marketers, there's so much data that it's often hard to know which reports and features are "need to have" and which are "nice to have."

Google Webmaster Tools (GWT) reports are broken into three categories: "Site configuration," "Your site on the web," and "Diagnostics." (This naming is unfortunate, since all three sections have at least some diagnostic data.) In today's column, I'll walk through the reports in the "Site configuration" category and explain what the reports are, what they tell you, and how (and when) to act on them.

Before getting into the subcategories, however, let's look at the top right of your GWT dashboard. There you'll see an envelope icon with a number next to it. This is your "Message" area and it's a record of the conversations between you and the GWT team. Check this area regularly, as the GWT team may send a message to you here if there's a problem with any of your sites. This location also hosts any site reconsideration requests you've submitted in the past.

Sitemaps

"Sitemaps" is the first subcategory within the site configuration category. From this area, you can submit an XML site map that you've already uploaded to your site. (Note: "submit" in this context means "inform Google about," not "upload." To submit a site map using this form, the file must already exist on your server.) The interface was recently updated so that you no longer need to tell Google what type of site map it is. Instead, Google now detects the type.
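For reference, a minimal XML site map is just a `urlset` root element containing one `url` entry per page (the domain and dates below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2009-12-01</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  <url>
    <loc>http://www.example.com/about.html</loc>
  </url>
</urlset>
```

Only `loc` is required; `lastmod` and `changefreq` are optional hints.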

Don't be alarmed if the number of reported "Indexed URLs" is smaller than the number of "URLs Submitted." I have access to dozens of sites' data, and not a single site has a 100 percent indexing rate. Historically, this report has been a bit buggy. Also remember that Google will, at its discretion, decide whether to index URLs that are too similar to others.

Two other important columns are "Downloaded" and "Status." The most recent "Downloaded" date may range from a week or more ago to only a few hours ago. If you've uploaded a new site map since the last download date, be sure to check the box next to the site map and click the "Resubmit" button. In addition, many tools exist to automatically update your site map and ping Google each time you add content to your site.
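The "ping" those tools send is just an HTTP GET against Google's sitemap-ping endpoint with your site map's URL encoded as a parameter. A minimal sketch of building that request (the `/ping` endpoint shown is the one Google documented for sitemap pings; treat the exact URL as an assumption to verify):

```python
from urllib.parse import quote

def sitemap_ping_url(sitemap_url):
    """Build the URL to GET in order to notify Google of a sitemap update.

    The /ping?sitemap= endpoint is Google's documented sitemap-ping
    convention; confirm it against current documentation before relying on it.
    """
    return "https://www.google.com/ping?sitemap=" + quote(sitemap_url, safe="")

print(sitemap_ping_url("http://www.example.com/sitemap.xml"))
```

Fetching that URL after each content update keeps the "Downloaded" date fresh without manual resubmission.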

The "Status" column will contain either a green checkmark graphic or a red "x" graphic. This tells you whether your file is valid (green checkmark) or invalid or missing (red "x"). Remember that a green checkmark does not necessarily mean that all your URLs are correct or indexed. It means only that the site map file you submitted contains valid XML.

Crawler Access

The "Crawler access" area is your robots.txt file command center. In the "Test robots.txt" tab, you can see the status of your file (when it was last downloaded and whether it's valid). You can also test specific URLs against your existing robots.txt file to see whether they'll be excluded under your existing file's commands. And nicest of all, you can use this area as a laboratory to tweak your robots.txt commands and test them against any URLs you want until you have your file just right.

The "Generate robots.txt" tab enables you to create a robots.txt file that includes and excludes any specific files or directories you want, from any robot you want. You can create exclusion rules with robots other than Google's own herd. But remember that if it's a non-Google robot, you'll need to know the name of it, because non-Google crawlers are not included in the dropdown. Here's a good list of major crawlers and their user-agents.
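As an illustration, a robots.txt that applies different rules to Googlebot and to a named non-Google crawler (the directories here are placeholders) might look like this:

```
User-agent: Googlebot
Disallow: /private/

User-agent: Slurp
Disallow: /

User-agent: *
Disallow: /cgi-bin/

Sitemap: http://www.example.com/sitemap.xml
```

Each `User-agent` group applies only to the named robot ("Slurp" is Yahoo's crawler); `*` is the catch-all, and the `Sitemap` line is a standalone directive.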

The "Remove URL" tab lets you request that a specific file be removed entirely from Google's index. After you submit a request, this area shows its status. Remember that this process supplements, rather than replaces, removing the file yourself: Google's requirements for removing a URL state that any URL you ask to have removed must first return a 404 and/or be blocked by your robots.txt file before the request will be accepted.
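That precondition can be sketched as a simple check. The helper below is hypothetical (not part of any Google API); it just encodes the rule that a removal request should only be filed once the URL is gone or blocked:

```python
def removal_precondition_met(status_code, blocked_by_robots):
    """Hypothetical helper encoding Google's stated precondition for URL
    removal: the page must already return 404/410, or be blocked in
    robots.txt, before a removal request will be accepted."""
    return blocked_by_robots or status_code in (404, 410)

# A live, unblocked page would have its removal request rejected:
print(removal_precondition_met(200, False))  # False
```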

Sitelinks

If you have sitelinks, you're probably pretty happy with them. If you don't, then this section won't have any options for you.

But if Google shows a sitelink for your site that you'd prefer it not show (such as an internal login link), you can use this feature to tell Google to no longer show that link in SERPs. Simply click the "block" button next to the appropriate URL, and the link will no longer appear in sitelinks for your queries.

If you choose to block a certain URL from the list of sitelinks, Google may either replace it with another URL of its choosing, or it may simply leave that slot blank and not replace it.

Change of Address

Moving to a new office or residence? Don't bring that information here, because that's not what it's for. Instead, this is a way to tell Google when your site has undergone a full domain migration from an old domain to a new one.

You still need to perform the necessary 301 redirects on your server to switch domains, and this feature is available only if you're verified in GWT for both the old and new domains. But in Google's own words, it "lets you notify Google when you are moving from one domain to another, enabling us to update our index faster and hopefully creating a smoother transition for your users." The bottom line: this feature isn't a replacement for old-school domain migration. It supplements it and ideally makes it more efficient.
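On Apache, the server-side half of the migration is typically a site-wide 301. A minimal mod_rewrite sketch, assuming an Apache server and placeholder domain names:

```apache
# .htaccess on the OLD domain: 301-redirect every request to the new
# domain, preserving the requested path (and, by default, the query string).
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?old-example\.com$ [NC]
RewriteRule ^(.*)$ http://www.new-example.com/$1 [R=301,L]
```

The `R=301` flag makes the redirect permanent, which is the signal search engines use to transfer a URL's standing to its new address.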

Settings

The "Settings" subcategory lets you control three separate aspects of Google's crawling, indexing, and ranking of your pages. The "Geographic target" section allows you to focus your site's traffic on a single country. If you're geo-targeting, and want traffic only from a single nation, I recommend using this tool and watching this video.

The "Preferred domain" section enables you to dictate how your URLs appear in Google search results, either with or without the "www" prefix. This is not the same as a canonical redirect. Instead, it's only a suggestion and affects the search results only cosmetically. If you have www-based canonical issues, I still strongly urge you to fix them.
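If you do need a true canonical redirect, rather than the cosmetic "Preferred domain" setting, the usual fix is a server-level 301. A sketch assuming Apache and a www-preferred site (domain is a placeholder):

```apache
# .htaccess: 301-redirect bare-domain requests to the www hostname.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With this in place, every non-www request resolves permanently to the www version, which fixes the duplication at the source instead of merely adjusting how results display.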

The "Crawl rate" section lets you tell Google how much restraint Googlebot should use while crawling your site. Leave this option alone unless you're having significant problems on one end, with Googlebot crashing your server because it's crawling too many pages too quickly, or on the other end, with so many pages (and spare processing power) that you want Google to turn up the speed.

Conclusion

In subsequent columns, I'll get into the grittier aspects of diagnosing specific problems like crawl errors, reclaiming links lost to 404 errors, and other fun tasks. In the meantime, if you have specific questions about the reports covered today, please leave a note in the comments section.

Note: Also see parts two, three, and four of this series.

Erik is off today. This column was originally published July 22, 2009 on ClickZ.

ABOUT THE AUTHOR

Erik Dafforn

Erik Dafforn is the executive vice president of Intrapromote LLC, an SEO firm headquartered in Cleveland, Ohio. Erik manages SEO campaigns for clients ranging from tiny to enormous and edits Intrapromote's blog, SEO Speedwagon. Prior to joining Intrapromote in 1999, Erik worked as a freelance writer and editor. He also worked in-house as a development editor for Macmillan and IDG Books. Erik has a Bachelor's degree in English from Wabash College. Follow Erik and Intrapromote on Twitter.
