Yahoo!: Birth of a New Machine

Danny Sullivan’s SEW Award Winners series will continue next week. –The Editors

Yahoo’s rolled out a brand new search engine. It has its own index and ranking mechanisms, casting aside its long-standing use of Google-powered search results. The move is bound to roil the industry. It sets in motion a new race for the claim of Web search champion.

Ever since Yahoo’s acquisition of Inktomi nearly a year ago, speculation has focused on when the company would replace its Google-powered search results with results from Inktomi’s index.

Yahoo isn’t replacing Google with Inktomi. Rather, the company developed a brand-new search engine, drawing on lessons learned from what the company calls the “critical mass” of search engineering talent it assembled through hiring and acquisitions, as well as investment in infrastructure and product quality.

“High-quality, talented search engineers are in very short supply these days,” said Jeff Weiner, Yahoo senior VP of search and marketplace. “Regardless of how good your planning process is, at the end of the day it comes down to people and chemistry.”

Weiner said Yahoo waited until now to switch from Google to be certain users would have the best experience possible after the transition. “It was absolutely essential to us that we had a road map in place that not only let us sustain our quality but build on it.”

Although a change to self-powered search results is radical, Yahoo has steadily made incremental improvements to its search capabilities for over a year. In October 2002, the company made the most significant change to its operation since its inception, replacing its human-compiled directory listings with Google search results.

Then, in April of last year, the company rolled out the new Yahoo Search, introducing a streamlined search page. It also added new tabs to search result pages offering access to its directory listings, news, images, and yellow pages.

Last week’s launch begins a progressive rollout that takes place over the next few weeks. It’s the start of numerous planned enhancements focusing on Web search, personalization, and vertical search.

Note the new search engine is for Web results only. Image search remains powered by Google. News search is still a combination of Yahoo’s own editorial and technological resources.

How does Yahoo’s new search engine differ from Google? Results presentation is very similar. Yahoo wisely opted to keep things looking mostly the same, with a few exceptions. There’s a link to the cached copy of each indexed page, now served from Yahoo rather than Google. Just about everything else on search result pages looks the same.

Actual results returned by Google and Yahoo depend on the query. For popular or common queries, there seems to be very little difference between the two engines in the top few results. Once past those, results tend to diverge dramatically. For less common or unpopular queries, Yahoo results look quite different from Google’s.

Although Yahoo and Google likely use similar algorithms, one reason for the differences is that Yahoo’s email and search teams share what they’ve learned about spam. Yahoo Mail processes billions of email messages, so this knowledge likely gives Yahoo a much deeper understanding of spam characteristics and helps keep nasty stuff out of the Web page index.

Bottom line: I’m impressed with the quality of the results Yahoo delivers. It’s a very viable alternative to Google and the other “last engine standing,” Ask Jeeves/Teoma.

What’s Indexed?

The Yahoo Search index captures the full text of Web pages, up to a 500K limit. That’s greater than the 101K maximum indexed by Google. A broad range of file types, including HTML, PDF, and Microsoft Office documents, is included in the mix.

How big is Yahoo’s index? The portal isn’t saying, despite Google’s recent announcement it’s expanded its index to nearly 4.3 billion documents (6 billion, counting images and newsgroup postings, which Google does).

In almost all of my tests with random queries, Yahoo reported more results found than Google. Does this mean Yahoo’s index is bigger? Perhaps. But reported results are estimates, not exact counts. They can include factors other than keyword matches, making them notoriously unreliable measures of overall index size. Suffice it to say Yahoo’s index is comparable to Google’s for most queries.

“We’re very confident in the quality and size of our index, and we think the results speak for themselves,” said Weiner.

What About AltaVista and AlltheWeb?

Last year, before Yahoo acquired Overture, Overture was busy acquiring AltaVista and AlltheWeb. Speculation was Overture would kill off AltaVista’s technology and power both search sites with AlltheWeb’s index.

To the contrary, both search engines kept their independent indexes. In July 2003, Yahoo bought Overture. Less than a month later, Danny Sullivan and I visited AltaVista and AlltheWeb. We learned the plan was to unify the two search engines, keeping the strongest technologies from both.

Exciting news. Then, nothing seemed to change. Both AltaVista and AlltheWeb continue to maintain separate indexes. Yahoo isn’t saying publicly whether this will change with the introduction of the new Yahoo Search Technology index.

What’s Next?

In addition to working to improve the quality of its Web search results, Yahoo plans particular emphasis in coming months on personalization and vertical search. The company’s My Yahoo portal already offers extensive content customization options.

The newly released SmartSort option in Yahoo Shopping is one example. It provides very specific product advice for digital cameras, MP3 players, computers, and other electronic devices based on criteria the user enters. The ability to add RSS feeds to the My Yahoo page is another.

“Ultimately we want to understand the intention of the user, and I think we’re going to get closer to that through personalization,” said Weiner.

In the vertical search arena, Yahoo plans to focus on local, travel, personals, and its HotJobs search portal.

These moves are clearly only the beginning of many more to come at Yahoo. “Over time, you’re going to see Yahoo extend our search technology, and ultimately into our media properties,” said Weiner. “To a large extent that will help drive our growth.”

All this gives Google, Ask Jeeves, and Microsoft’s fledgling Web search initiative good reason to be even more attentive to the quality of their search results. It promises to be a very good year for searchers.

This column was adapted from ClickZ’s A longer, more detailed version is available to paid Search Engine Watch members.
