Ways to find out whether engines see the same site you do.
Webmaster tools and third-party applications are generally a great way to get accurate search engine diagnostics, but sometimes simpler is better. Today's column offers some quick diagnostic advice to help you efficiently pinpoint issues that may affect your organic search performance. You don't need any expensive tools, and you don't need to download any software. All you need is an Internet connection and a browser.
The Versatile "Site:" Operator
Indexing is one of the first big search engine obstacles to overcome. After all, your indexed pages may or may not rank and drive search traffic, but your unindexed pages absolutely won't rank. Most people know that to see how many of your site's pages an engine has indexed, you enter this query in the search box: site:www.yourdomain.com, such as site:www.clickz.com.
But here's an important caveat about this powerful operator. If your site contains 5,500 pages and a site: query shows that Google or Yahoo has indexed 5,500 pages, you can be happy, but only for a few seconds. The 5,500 pages on your site and the 5,500 pages indexed might not be the same 5,500 pages.
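To make the caveat concrete, here is a minimal Python sketch that compares two URL lists instead of two counts. It assumes you can export a list of your site's URLs (from a sitemap, say) and that you have copied a list of indexed URLs out of the engine's results. The function name compare_page_sets and the sample URLs are hypothetical, made up for illustration.

```python
def compare_page_sets(site_urls, indexed_urls):
    """Return (missing, unexpected).

    missing:    URLs on the site that do not appear in the indexed list.
    unexpected: indexed URLs that are not in the site's own list.
    """
    site = set(site_urls)
    indexed = set(indexed_urls)
    return sorted(site - indexed), sorted(indexed - site)


# Hypothetical sample data: both lists contain three URLs.
site_urls = [
    "http://www.yourdomain.com/",
    "http://www.yourdomain.com/services/",
    "http://www.yourdomain.com/about/",
]
indexed_urls = [
    "http://www.yourdomain.com/",
    "http://yourdomain.com/",  # non-www duplicate of the home page
    "http://www.yourdomain.com/services/",
]

missing, unexpected = compare_page_sets(site_urls, indexed_urls)
print(len(site_urls) == len(indexed_urls))  # the counts match
print(missing)     # pages the engine has not indexed
print(unexpected)  # indexed URLs you did not expect
```

In this made-up sample the counts agree, yet one real page is missing from the index and one duplicate has taken its place, which is exactly the situation a matching total can hide.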
We recently helped a site clean up some canonical issues in which four different versions of the home page had been crawled and indexed. This won't break a site, but it's a housekeeping issue that tops my list of items that can nickel-and-dime your site's authority away.
It's a good idea to use the site: operator both with and without the "www" prefix in the query string. Typically, this will tell you the extent to which engines have indexed non-www pages, which can tip you off to canonical issues on your site. If site:www.yourdomain.com and site:yourdomain.com show significantly different numbers, links to non-www URLs are probably in circulation, and engines are crawling and indexing those non-www pages.
This query: site:yourdomain.com -inurl:www will show you the subset of indexed pages on your site that don't have "www" in their URLs. If you have multiple subdomains on your site, this becomes slightly trickier to diagnose. For example, if you have subdomains called "www," "blog," and "clients," you'll need to add those subdomains to the preceding query to find canonical issues: site:yourdomain.com -inurl:www -inurl:blog -inurl:clients.
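If you track several subdomains, typing the exclusions by hand gets tedious. The following Python sketch simply assembles the query string described above. The helper name build_canonical_query is hypothetical, and the code only builds text for you to paste into the search box; it does not send anything to an engine.

```python
from urllib.parse import quote_plus


def build_canonical_query(domain, subdomains):
    """Build a site: query that excludes each known subdomain via -inurl:."""
    parts = ["site:" + domain]
    parts.extend("-inurl:" + name for name in subdomains)
    return " ".join(parts)


query = build_canonical_query("yourdomain.com", ["www", "blog", "clients"])
print(query)
# site:yourdomain.com -inurl:www -inurl:blog -inurl:clients

# URL-encoded form, in case you want to bookmark the search.
print(quote_plus(query))
```

Whatever remains in the results after every legitimate subdomain is excluded is a candidate canonical problem.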
Currently, both the site: and inurl: operators work in Google, Yahoo, and Live Search. Depending on your exact query at Yahoo, the engine might redirect you to Yahoo Site Explorer, but it will still show you the answer to your query.
Narrowing the Scope of the Site: Operator
If a site: query reports that 9 million of your site's pages are indexed, Google will give you that figure but will list only 1,000 specific URLs. To see deeper into specific parts of your site, you need to tell engines exactly which part of the site you're trying to examine. To do this, you have a few options.
If the pages are all within a specific folder on the site, you can simply add the folder name to the site: operator. For example, if you have a section on your site called /services/ with a series of pages within that folder, this query will show how many of those pages are indexed: site:www.yourdomain.com/services/.
Using this method is more reliable than a query like site:www.yourdomain.com inurl:services, because the latter will show URLs in which "services" appears anywhere in the URL. Contrast the results of these two queries to see the major difference: site:www.amazon.com/review and site:www.amazon.com inurl:review.
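The difference between the two approaches can be approximated in a few lines of Python. This is only a rough model: the engines don't publish exactly how they match inurl: terms, so the substring test below is an assumption, and the function names and sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def matches_folder(url, host, folder):
    """Rough model of site:host/folder/ (the path must start with the folder)."""
    parts = urlsplit(url)
    return parts.netloc == host and parts.path.startswith(folder)


def matches_inurl(url, host, word):
    """Rough model of site:host inurl:word (the word may appear anywhere)."""
    parts = urlsplit(url)
    return parts.netloc == host and word in url


# Hypothetical URLs for illustration.
urls = [
    "http://www.yourdomain.com/services/consulting.html",
    "http://www.yourdomain.com/blog/new-services-launched.html",
    "http://www.yourdomain.com/about/",
]

folder_hits = [u for u in urls if matches_folder(u, "www.yourdomain.com", "/services/")]
inurl_hits = [u for u in urls if matches_inurl(u, "www.yourdomain.com", "services")]

print(folder_hits)  # only the page inside /services/
print(inurl_hits)   # also the blog post with "services" in its file name
```

The folder version returns only pages that actually live under /services/, while the inurl: version also picks up the blog post whose file name happens to contain the word.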
A reader recently contacted me, saying that her ability to use the cache: operator at Google was frequently thwarted by a 403 error, which stated that her query looked either automated or spy/malware-related.
If this happens to you, the block should lift on its own within a few hours, typically no more than 12. But if it persists, someone on your IP address (i.e., at your company) might be running a tool that sends automated queries to Google. If this is the case, get them to stop (or outsource the reports), since they're hindering your ability to get actual, helpful data.
The site: operator is one I use dozens of times each day. In part two, I'll explain how to use additional powerful operators to diagnose search engine issues, including variations of the cache: operator, which can show you your pages exactly the way engines view them.
Today's column originally ran on July 23, 2008.
Erik Dafforn is the executive vice president of Intrapromote LLC, an SEO firm headquartered in Cleveland, Ohio. Erik manages SEO campaigns for clients ranging from tiny to enormous and edits Intrapromote's blog, SEO Speedwagon. Prior to joining Intrapromote in 1999, Erik worked as a freelance writer and editor. He also worked in-house as a development editor for Macmillan and IDG Books. Erik has a Bachelor's degree in English from Wabash College. Follow Erik and Intrapromote on Twitter.