Defining Search Technology

January 30, 2002

In this second installment of Paul's series on the major search engines, he gives us the dirt on the ever-popular Google.

Most people like Google because it's easy to use, it's fast, it has a huge database, and -- most important -- it works. Remember when Google hit the scene? It was 1998. Stanford computer science Ph.D. candidates Sergey Brin and Larry Page were working on a class project to identify meaningful patterns in Web link structure. They became fascinated with analyzing "backlinks" (pages that link back to a site) and realized these backlinks could help build a better mousetrap.

What's in a Name?

"Google" is a play on the word "googol," which was coined by Milton Sirotta, nephew of American mathematician Edward Kasner, to refer to the number represented by one followed by 100 zeros. Google's use of the term reflects the company's mission to organize the immense amount of information available on the Web. When Brin and Page presented their idea to the first angel investor, this investor wrote the check to "Google Inc." After thinking about it for weeks, they figured they'd better open an account in the name of Google Inc. to be able to cash the check. So the legend goes...

The PageRank Phenom

It's been said that Google changed the face of Internet search with an algorithm known as PageRank. The PageRank algorithm was definitely a technological breakthrough: Most major search engines now use link popularity as part of their relevancy algorithms. So how does it work?

"Google's PageRank search technology works by first identifying the link structure of the entire Web, then ranking individual pages based on the number and importance of pages linked to them," said Google software engineer Matt Cutts. My perception when talking to Cutts was that importance (the popularity and relevance of the backlink) counts more than the number of backlinks.
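To make the idea of "number and importance" concrete, here is a minimal sketch of the simplified, textbook form of PageRank. It is not Google's production code. The function name `pagerank`, the damping value of 0.85, the iteration count, and the four-page `toy_web` are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" has three backlinks, "d" has none.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, page "a" has only one backlink while "c" has three, yet "a" ends up ranked above "b" because its single backlink comes from the highest-ranked page. That matches the impression Cutts gave: The importance of a backlink can count for more than the raw number of backlinks.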

Is There a Weak Spot?

If there is one, it is that Google works better on searches for specific information (such as "rainfall in Hong Kong") than on searches for general information (such as "Bible"). Its search results aren't categorized, which makes them a bit unwieldy for broad search terms. The Google directory helps, as the directory results appear above all search results.

Newer search tools, such as AllTheWeb, Teoma, and WiseNut, classify their results by category. For instance, Teoma divides search results for "Bible" into the following folders: Bible Study; King James Version; Holy Bible; Virginia Textbook; Bible Prophecy; Versions, Search; Biblical Resource; and First Letter. Most searchers, as a rule, don't narrow down their queries properly because they're not used to conducting research.

Can you get greater relevance by categorizing results, and, if so, will Google follow the trend toward categorization? "Google is in its second generation of experimenting with category-based results," explained Cutts. "Users apparently do not like having too many category options, but presenting clear and concise categories is important to users."

The Road to Success

Google achieved its success and profitability through two sources of revenue: advertising and search services. The AdWords program is targeted and effective, currently yielding up to five times the average click-through rate (CTR) for traditional banner ads. Cutts reiterated the Google mantra: We do not offer paid inclusion.

For additional revenue, Google provides search services to major Web portals and corporate Web sites. It has over 130 customers in more than 30 countries. These customers include Yahoo and its international properties, Sony and its global affiliates, AOL/Netscape, Cisco Systems, and others. To power search on their portals or corporate Web sites, these partners pay Google an upfront search service fee plus a fee per 1,000 results delivered. In effect, Google earns money on every search conducted on partner sites.

The Enhanced Google Toolbar

Since Google released a beta version a few months ago, several million people have downloaded the Google Toolbar. The toolbar allows users to vote on site popularity. This could give Google a reading on site popularity based on opinion rather than link structure alone. However, selection bias is a problem, because the votes come only from people who chose to install the toolbar.

You can download the beta version, which allows you to rank search results with a voting button. When asked about incorporating this info into the algorithm, Cutts said, "Rather than using the votes to tinker with the specific rankings of particular pages or sites, the feature would most likely be used to bolster the relevance of overall results." Cutts indicated that data collected so far is promising, but it would take months before the preliminary data could be of conclusive value.

How Does Google Rank Web Sites?

Basically, it ranks sites by the words listed on each page and the key phrases used in the page's title and description. The spider looks at about 25 factors, including the keyword and description meta tags. It also weighs the page's popularity, which is determined by the number and importance of sites linked to the page.
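As a rough illustration of how on-page words and link popularity might be combined into one score, here is a toy sketch. Google has not published its factors or weights, so everything here is hypothetical: the function name `score_page`, the four weights, the sample pages, and the popularity values (which stand in for a link-based score between 0 and 1).

```python
def score_page(query, title, description, body, popularity,
               title_weight=3.0, description_weight=2.0,
               body_weight=1.0, popularity_weight=5.0):
    """Toy relevance score: weighted query-term counts plus link popularity.

    All weights are made-up illustrative values, not Google's.
    """
    terms = query.lower().split()

    def count_matches(text):
        # Count how many times any query term appears as a whole word.
        words = text.lower().split()
        return sum(words.count(term) for term in terms)

    text_score = (title_weight * count_matches(title)
                  + description_weight * count_matches(description)
                  + body_weight * count_matches(body))
    return text_score + popularity_weight * popularity


# Two hypothetical pages with identical text but different link popularity.
query = "hong kong rainfall"
well_linked = score_page(query,
                         title="Rainfall in Hong Kong",
                         description="Monthly rainfall totals for Hong Kong",
                         body="Hong Kong rainfall peaks in summer",
                         popularity=0.4)
poorly_linked = score_page(query,
                           title="Rainfall in Hong Kong",
                           description="Monthly rainfall totals for Hong Kong",
                           body="Hong Kong rainfall peaks in summer",
                           popularity=0.1)
off_topic = score_page(query,
                       title="Bible Study",
                       description="King James Version",
                       body="Versions and resources",
                       popularity=0.4)
print(well_linked > poorly_linked > off_topic)
```

The point of the sketch is only the shape of the calculation: Matching words in the title and description count toward the score, and popularity breaks the tie between pages whose text is otherwise identical.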

When asked how to gain high rankings, Cutts replied, "The guidelines are pretty simple: Stay away from hidden text, hidden links, cloaking, sneaky redirects, lots of duplicate content on different domains, and doorway pages. Webmasters should also stay away from programs that send automatic queries to Google. The worst thing you can do is try to cheat: Shortcuts to boost PageRank or rankings usually do more harm than good. Even if an SEO [search engine optimizer] does think he's found a shortcut, about two-thirds of the time it may be a sting operation. Don't bother with link exchanges, signing guest books, or other tricks -- the best use of a Webmaster's time is building good content -- and honestly promoting their [sic] site. When Google punishes spam like cloaking, we sometimes take out not only the cloaked domain but the SEO's client as well."

A Look Into the Future

Google is working toward providing a deeper, fresher, and more personalized index. "The future will be less about features and more about the overall usefulness of an engine," said Cutts. "We believe users want relevancy, but they also want quick, clean results with proven integrity," he added. When asked about XML, Cutts replied, "Not any time soon. The main benefit of HTML is that anyone can write it. That's part of why the Web had such meteoric growth. XML is great for machine-to-machine communication, but it's much more difficult for a person to produce by hand."

During the coming year, Google hopes to increase its lead across the board. "We'll be introducing new ways to search. We don't want to give away any secrets, but Google will provide many helpful surprises in 2002," volunteered Cutts. I understand the company's focus will be on search and the user experience.

How Deep?

Google does its share of indexing the deep Web by rolling out support for hundreds of file formats found there: PDF, RTF, PostScript, Word, Excel, PowerPoint, and more. It crawls millions of dynamic pages. Google indexes 3 billion Web documents every 28 days and conducts a fresh crawl of more than 3 million important Web pages each day. Google's news crawl provides up-to-the-minute headlines for news queries, and a subset of its fresh news content is available here.


Paul J. Bruemmer

Paul J. Bruemmer is CEO of Web-Ignite Corporation, a search engine optimization (SEO) and positioning provider. Founded in 1995, Web-Ignite has helped promote over 15,000 Web sites and was recognized by ICONOCAST as one of the top 10 most reputable SEO firms. Services include optimization, submission, registration, positioning, monitoring, maintenance, paid-inclusion, and paid-placement management for fixed monthly fees. Recent client testimonials report search engine traffic increases of 150 to 500 percent.
