An Eye on Third-Generation Search
AltaVista's philosophy going forward.
Flash back to the spring of 1995. That’s when Digital Equipment’s Palo Alto Research Lab scientists developed the first searchable full-text database of the World Wide Web. It took the Web by storm later that year as AltaVista, search engine extraordinaire. “AltaVista” means “a view from above,” signifying “big ideas and a fascination with keeping track of information,” according to the site’s “About” page. Since those heady days, a lot has changed. AltaVista was acquired first by Compaq Computer (along with Digital Equipment) in 1998, then by CMGI in 1999.
AltaVista showed an early interest in serving the international community. “We provided the first multilingual search capabilities on the Web in 1997,” said AltaVista Product Marketing VP Chris Kermoian. “Right after that, we introduced Babel Fish, a translation service that includes Chinese, Japanese, and Korean translations.”
Starting in 1999, AltaVista launched local international sites in Germany, Sweden, the U.K., France, Holland, Italy, and Denmark, to name a few. Today, it maintains over 20 country-specific indices, including its main index. More than half of AltaVista’s Web traffic originates outside the United States, making it an excellent vehicle for international visibility. It performs 50 million search queries per day in over 25 languages.
“We were also one of the first to develop multimedia search technology,” said Kermoian. “AltaVista introduced new image, audio, and video search centers in early 2000, creating the most extensive multimedia library on the Web at the time.” Today, over 20 file types are indexed in various AltaVista collections (most are the various multimedia types: MP3s, JPGs, GIFs, Real Audio or Video, Microsoft Windows Media, Apple QuickTime, etc.).
Diversified Revenue Model
The key to AltaVista’s survival through last year’s downturn was its focus on its core capability of providing Internet search services and enterprise search software. AltaVista shifted from a full-service portal back to its pure search roots, further diversifying its business-to-business (B2B) revenue model by entering the highly promising market of enterprise search software.
“Like many others in the industry,” said Kermoian, “AltaVista has cut costs, streamlining its operations to right-size the business.”
Business units include the Web-based consumer search engine, Internet search services (Web-wide search capabilities for Internet sites), paid-inclusion programs (Trusted Feed, Express Inclusion, Listing Enhancements, etc.), various advertising programs, and its highly successful software division.
The software division provides enterprise-grade search software to clients such as Amazon.com, Borders, the FBI, and NASA, helping these organizations turn information into power through cataloging and rapid data access across organizational networks (intranets, extranets, databases, and customer-facing Web sites). “The growth of AltaVista’s search software division has been a significant revenue source over the past two years,” said Kermoian. “We have licensed information-access and data-retrieval software to more than 1,200 leading companies.”
AltaVista’s paid-inclusion programs also help boost revenues because they must be renewed every six months. Express Inclusion lets you add up to 500 pages at $39 for the first URL (less for subsequent pages). It includes weekly updates for freshness. Trusted Feed is for sites of more than 500 pages and is based on cost-per-click pricing. It also includes weekly updates, and you can submit custom titles, keywords, and abstracts. Both of these programs index dynamic content, giving ad customers deep visibility and users deep Web content.
What Makes AltaVista Tick?
“In the past year, we have doubled the size of our index,” said Kermoian. “We’ve improved our news search and launched a comparison shopping search.” AltaVista has also expanded search results to include additional digital resources and data such as white and yellow pages listings, blended stock quotes, images, maps, and multimedia files.
“Freshness is a major priority,” said Kermoian. “In the last three weeks alone, we’ve added or refreshed over 270 million pages in our index, not including those URLs from our paid-inclusion programs, which are refreshed weekly. In addition, free submissions of new URLs are added on a continual basis, multimedia updates come in daily, and we make 15-minute updates to our news index,” he added.
AltaVista crawled and evaluated 4 billion pages to build its current full-page index of 700 million pages. Below is a breakdown of update frequency by content type:
| Content Type | Update Frequency |
|---|---|
| News feeds | 15 minutes |
| Basic submit URLs | Ongoing |
| Multimedia index | Weekly partial update |
| Index build | Weekly partial update |
How does AltaVista plan to improve the user experience? “We continue to refine our relevancy algorithm, increase the size of our database, and improve our user interface,” said Kermoian. “Going forward, we envision truly user-centric Internet search and enterprise-grade information-access and data-retrieval technologies, or third-generation search.”
What Is Third-Generation Search?
“AltaVista’s goal is to deliver best-of-breed third-generation search,” explained Kermoian. “First-generation search basically ranked sites based on page content. Second-generation search focused on link analysis, taking the structure of the Web into account. Third-generation search will go beyond the Web, taking additional factors into account, such as time of day, geography, previous searches, stated preferences, and so forth.”
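The three generations Kermoian describes can be pictured as three signals blended into one score. The Python sketch below is purely illustrative and is not AltaVista's algorithm: the function names, the weights, the sample pages, and the choice of geography as the single contextual factor are all assumptions made for the example.

```python
def content_score(query, page_text):
    """First generation: fraction of query terms found in the page text."""
    terms = query.lower().split()
    words = set(page_text.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def link_score(inbound_links, max_links):
    """Second generation: a crude stand-in for link analysis
    (inbound link count, normalized to the range 0..1)."""
    return inbound_links / max_links if max_links else 0.0


def context_score(page_region, user_region):
    """Third generation: one contextual factor, here the user's geography."""
    return 1.0 if page_region == user_region else 0.0


def rank(query, pages, user_region, weights=(0.5, 0.3, 0.2)):
    """Return page URLs ordered by a weighted blend of the three signals.

    The weights are arbitrary illustrative values, not tuned numbers.
    """
    max_links = max((p["inbound_links"] for p in pages), default=0)
    w_content, w_link, w_context = weights
    scored = []
    for p in pages:
        score = (
            w_content * content_score(query, p["text"])
            + w_link * link_score(p["inbound_links"], max_links)
            + w_context * context_score(p["region"], user_region)
        )
        scored.append((score, p["url"]))
    return [url for _, url in sorted(scored, reverse=True)]


# Hypothetical pages: identical on content, slightly different on links.
PAGES = [
    {"url": "us.example.com/golf", "text": "golf lessons in california",
     "inbound_links": 10, "region": "US"},
    {"url": "uk.example.com/golf", "text": "golf lessons in london",
     "inbound_links": 8, "region": "UK"},
]

uk_results = rank("golf lessons", PAGES, user_region="UK")
us_results = rank("golf lessons", PAGES, user_region="US")
```

In this toy setup both pages match the query equally well on content, so a purely first-generation ranking could not separate them. Link counts favor the U.S. page, while the geographic signal lifts the U.K. page to the top for a searcher in the U.K.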
Want High Rankings?
What does it take to get top listings on AltaVista? “Your best bet is to ensure that your content is valuable and properly presented,” said Kermoian. “Developing proper meta-tags is important.” AltaVista uses titles, descriptions, and keywords in its ranking algorithm. “Choosing keywords that target specific queries related to page content is effective,” continued Kermoian. “For instance, if a client’s site is about golf, the meta-tags and site content should reflect this.”
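For site owners who want to check what a crawler could read from these fields, the sketch below pulls the title, description, and keywords out of a page using Python's standard `html.parser` module. The sample golf page, the `MetaTagExtractor` class, and the `extract_meta` helper are hypothetical, and nothing here reflects how AltaVista itself parses or weighs the tags.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects the <title> text and the description/keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.fields[name] = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title>...</title> is kept.
        if self._in_title:
            self.fields["title"] += data


def extract_meta(html):
    """Return a dict with the page's title, description, and keywords."""
    parser = MetaTagExtractor()
    parser.feed(html)
    parser.close()
    fields = dict(parser.fields)
    fields["title"] = fields["title"].strip()
    return fields


# Hypothetical page for a golf site, echoing the example in the interview.
SAMPLE_PAGE = """<html><head>
<title>Golf Lessons and Equipment Reviews</title>
<meta name="description" content="Golf lessons, club reviews, and course guides.">
<meta name="keywords" content="golf, golf lessons, golf clubs">
</head><body><h1>Golf</h1></body></html>"""

meta = extract_meta(SAMPLE_PAGE)
for field, value in meta.items():
    print(f"{field}: {value}")
```

A quick check like this makes it easy to spot a page whose title or description is empty, or whose keywords have drifted away from what the page is actually about.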
“Avoid spamming or your site may be removed from our index,” he warned. “Sites created specifically for search engine robots should be avoided. AltaVista will index doorway pages only if used as a navigational aid to the rest of the site (e.g., BMW.com) and will not condone doorway spam pages.” He cited a good rule of thumb: “If a site has one doorway page that helps direct users to quality content within, that’s acceptable. But creating multiple doorway pages that send the user to the exact same site is clearly spamming.”
The Role of XML
Like others we’ve asked, AltaVista does not think XML will take over in the near term (four to five years) because of the prevalence of HTML. “If XML becomes pervasive, AltaVista is prepared,” asserted Kermoian. “We currently use XML extensively in our Internet Search Services program (providing search capabilities for other Internet sites) and the Trusted Feed program (receiving representations of Web pages from paid-inclusion customers), as well as many data interchanges with partners.”
Will AltaVista regain its glitter? Only time will tell. It doesn’t have the traffic it used to command, but it’s still a top choice for many serious researchers and is often cited as a source by librarians. Stay tuned — AltaVista plans to make an announcement on February 19.