Defining Search Technology

January 30, 2002

In this second installment of Paul's series on the major search engines, he gives us the dirt on the ever-popular Google.

Most people like Google because it's easy to use, it's fast, it has a huge database, and -- most important -- it works. Remember when Google hit the scene? It was 1998. Stanford computer science Ph.D. candidates Sergey Brin and Larry Page were working on a class project to identify meaningful patterns in Web link structure. They became fascinated with analyzing "backlinks" (links pointing to a site from other pages) and realized these backlinks could help build a better mousetrap.

What's in a Name?

"Google" is a play on the word "googol," which was coined by Milton Sirotta, nephew of American mathematician Edward Kasner, to refer to the number represented by one followed by 100 zeros. Google's use of the term reflects the company's mission to organize the immense amount of information available on the Web. When Brin and Page presented their idea to the first angel investor, this investor wrote the check to "Google Inc." After thinking about it for weeks, they figured they'd better open an account in the name of Google Inc. to be able to cash the check. So the legend goes...

The PageRank Phenom

It's been said that Google changed the face of Internet search with an algorithm known as PageRank. The PageRank algorithm was definitely a technological breakthrough, as most major search engines now use link popularity as part of their relevancy algorithm. So how does it work?

"Google's PageRank search technology works by first identifying the link structure of the entire Web, then ranking individual pages based on the number and importance of pages linked to them," said Google software engineer Matt Cutts. My perception when talking to Cutts was that importance (the popularity and relevance of the backlink) counts more than the number of backlinks.
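To make the "number and importance" idea concrete, here is a minimal sketch of link-based ranking in Python. It follows the simplified, publicly described form of PageRank and is not Google's production code. The 0.85 damping factor, the iteration count, the handling of pages with no outgoing links, and the tiny example graph (pages a through e, hub, x, and y) are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "x" has one backlink, from a well-linked hub;
# "y" has two backlinks, from pages nobody links to.
example_links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["x"],
    "d": ["y"],
    "e": ["y"],
}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph, x ends up ranked above y even though y has more backlinks, because x's single backlink comes from a page that is itself heavily linked. That matches the impression from Cutts that importance counts for more than raw backlink count.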

Is There a Weak Spot?

If there is one, it is that Google works better on searches for specific information (such as "rainfall in Hong Kong") than on searches for general information (such as "Bible"). Search results aren't categorized, which makes them a bit unwieldy for broad search terms. The Google directory helps, as the directory results appear above all search results.

Newer search tools, such as AllTheWeb, Teoma, and WiseNut, classify their results by category. For instance, Teoma divides search results for "Bible" into the following folders: Bible Study; King James Version; Holy Bible; Virginia Textbook; Bible Prophecy; Versions, Search; Biblical Resource; and First Letter. Most searchers, as a rule, don't narrow down their queries properly because they're not used to conducting research.

Can you get greater relevance by categorizing results, and, if so, will Google follow the trend toward categorization? "Google is in its second generation of experimenting with category-based results," explained Cutts. "Users apparently do not like having too many category options, but presenting clear and concise categories is important to users."

The Road to Success

Google achieved its success and profitability through two sources of revenue: advertising and search services. The AdWords program is targeted and effective, currently yielding up to five times the average click-through rate (CTR) for traditional banner ads. Cutts reiterated the Google mantra: We do not offer paid inclusion.

For additional revenue, Google provides search services to major Web portals and corporate Web sites. It has over 130 customers in more than 30 countries. These customers include Yahoo and its international properties, Sony and its global affiliates, AOL/Netscape, Cisco Systems, and others. These partners pay Google an upfront search service fee, plus a fee per 1,000 results delivered, to power search on their respective portals or corporate Web sites. In other words, Google earns money on every search conducted on partner sites.

The Enhanced Google Toolbar

Since Google released a beta version a few months ago, several million people have downloaded the Google Toolbar. The toolbar allows users to vote on site popularity. This could give Google a reading on site popularity based on opinion rather than link structure alone. However, selection bias is a problem: the people who choose to vote are not necessarily representative of all searchers.

You can download the beta version, which allows you to rank search results with a voting button. When asked about incorporating this info into the algorithm, Cutts said, "Rather than using the votes to tinker with the specific rankings of particular pages or sites, the feature would most likely be used to bolster the relevance of overall results." Cutts indicated that data collected so far is promising, but it would take months before the preliminary data could be of conclusive value.

How Does Google Rank Web Sites?

Basically, it ranks sites by the words listed on each page and the key phrases used in the page's title and description. The spider looks at about 25 factors, including the keyword and description meta tags. It also ranks the page's popularity, which is determined by the number and importance of sites linked to the page.

When asked how to gain high rankings, Cutts replied, "The guidelines are pretty simple: Stay away from hidden text, hidden links, cloaking, sneaky redirects, lots of duplicate content on different domains, and doorway pages. Webmasters should also stay away from programs that send automatic queries to Google. The worst thing you can do is try to cheat: Shortcuts to boost PageRank or rankings usually do more harm than good. Even if an SEO [search engine optimizer] does think he's found a shortcut, about two-thirds of the time it may be a sting operation. Don't bother with link exchanges, signing guest books, or other tricks -- the best use of a Webmaster's time is building good content -- and honestly promoting their [sic] site. When Google punishes spam like cloaking, we sometimes take out not only the cloaked domain but the SEO's client as well."

A Look Into the Future

Google is working toward providing a deeper, fresher, and more personalized index. "The future will be [less] about features and more about the overall usefulness of an engine," said Cutts. "We believe users want relevancy, but they also want quick, clean results with proven integrity," he added. When asked about XML, Cutts replied, "Not any time soon. The main benefit of HTML is that anyone can write it. That's part of why the Web had such meteoric growth. XML is great for machine-to-machine communication, but it's much more difficult for a person to produce by hand."

During the coming year, Google hopes to increase its lead across the board. "We'll be introducing new ways to search. We don't want to give away any secrets, but Google will provide many helpful surprises in 2002," volunteered Cutts. I understand the company's focus will be on search and the user experience.

How Deep?

Google does its share of indexing the deep Web by rolling out support for hundreds of file formats found there: PDF, RTF, PostScript, Word, Excel, PowerPoint, and more. It crawls millions of dynamic pages. Google indexes 3 billion Web documents every 28 days and conducts a fresh crawl of more than 3 million important Web pages each day. Google's news crawl provides up-to-the-minute headlines for news queries, and a subset of its fresh news content is available here.


ABOUT THE AUTHOR

Paul J. Bruemmer

Paul J. Bruemmer is CEO of Web-Ignite Corporation, a search engine optimization (SEO) and positioning provider. Founded in 1995, Web-Ignite has helped promote over 15,000 Web sites and was recognized by ICONOCAST as one of the top 10 most reputable SEO firms. Services include optimization, submission, registration, positioning, monitoring, maintenance, paid-inclusion, and paid-placement management for fixed monthly fees. Recent client testimonials report search engine traffic increased from 150 to 500 percent.
