Search’s Real-Time Paradigm

There’s no doubt that Google’s Eric Schmidt has made some quotable statements over the years. “Speaking as a computer scientist, I view all of these as sort of poor man’s e-mail systems,” is what he said about Twitter earlier this year in an interview with Reuters. He dismissed the idea of Google acquiring Twitter. In fact, he said, “We’re unlikely to buy anything in the short term partly because I think prices are still high. And it’s unfortunate I think we’re in the middle of a cycle. Google is generating a lot of cash. And so we keep that cash in extremely secure banks.” Then last week Google went out and acquired reCAPTCHA.

Is it just me, or are Eric Schmidt and Google generally sidestepping the buzz about Twitter’s real-time search potential? The rest of the industry is talking about little else, and a host of new services and innovations are launching that all profess to cut through the noise of real-time search.

One of the latest innovative services cutting through the global drone of Twitter’s nonstop chattering is the U.K.’s WorkDigital with its new TwitterJobSearch. (Disclosure: WorkDigital is a U.K.-based company partially owned by Incisive Media, publisher of ClickZ.)

On a recent trip to the U.K., I spoke with William Fischer, director of WorkDigital, about the challenges of real-time search. He explained that the signal-to-noise problem presents a huge challenge to search engines. WorkDigital’s technology combines several methods but relies primarily on natural language processing, so it does not depend purely on keywords.

“Other search engines rely upon ‘keywords’ or hash tags to index tweets. One hundred forty characters just doesn’t provide enough data unless you can make sense of the words in context. ‘San Francisco job construction job losses continue in Q4’ could very well appear to be an offer of employment or an employment news item if one doesn’t use natural language processing to evaluate the words in context,” Fischer said.

He added: “We also go beyond the tweet. Our product spiders the pages that are linked to in order to help establish relevancy and to grab meta-data that we can then associate back with the tweet. ‘Marketing associate needed for hot pr firm’ could be a useful tweet, provided one knows what city/country/skills/pay, etc., are associated with it. Even though it is a clear employment offer, there is simply not enough data for it to be useful on its own.”

The system then checks other tweets by the same individual, uses crowdsourcing to weed out inappropriately indexed items, and builds a relevancy algorithm to create a kind of “TweetRank.”
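
To make the idea concrete, here is a minimal Python sketch of how signals like these might be combined. It is purely illustrative: the cue lists, the `Tweet` fields, the equal weights, the sample page metadata, and the `tweet_rank` function are my own assumptions, not WorkDigital’s actual code, and simple word matching stands in for real natural language processing.

```python
from dataclasses import dataclass, field

# Hypothetical cue lists, invented for this illustration only.
OFFER_CUES = {"needed", "hiring", "wanted", "apply", "vacancy"}
NEWS_CUES = {"losses", "cuts", "layoffs", "report", "continue"}


@dataclass
class Tweet:
    text: str
    # Metadata a spider might recover from the linked page (assumed fields).
    linked_page: dict = field(default_factory=dict)
    # Number of crowd flags marking the tweet as wrongly indexed (assumed signal).
    flags: int = 0


def keyword_match(tweet):
    """Naive approach: any tweet containing the word 'job' counts as a job ad."""
    return "job" in tweet.text.lower().split()


def tweet_rank(tweet):
    """Toy relevancy score: context cues, plus page metadata, minus crowd flags."""
    words = set(tweet.text.lower().split())
    # Words in context: offer-like cues raise the score, news-like cues lower it.
    score = len(words & OFFER_CUES) - len(words & NEWS_CUES)
    # Going beyond the tweet: reward details found on the linked page.
    score += sum(1 for key in ("city", "skills", "pay") if key in tweet.linked_page)
    # Crowdsourcing: each flag counts against the tweet.
    score -= tweet.flags
    return score


# The two tweets quoted above; the page metadata for the second is made up.
news = Tweet("San Francisco job construction job losses continue in Q4")
offer = Tweet(
    "Marketing associate needed for hot pr firm",
    linked_page={"city": "London", "pay": "negotiable"},
)

for t in (news, offer):
    print(keyword_match(t), tweet_rank(t), t.text)
```

In this toy version the keyword test gets both tweets wrong: it accepts the news item because it contains “job” and rejects the genuine offer because it doesn’t. The context-and-metadata score ranks the offer well above the news item.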

Of course, the service has a way to go, but as it matures and grows its rapidly changing index, it is likely to be a forerunner to many similar verticals that will emerge in the social search sector.

Another fascinating approach is the IBM/BBC venture, Sound Index. Having spent some time in the music industry in a previous life, I’ve always found the music charts to be flawed and frequently manipulated (a bit like search engine rankings, I guess!). For the past 50 years, music charts around the world have been compiled using a combination of retail sales and radio-listener statistics. However, as music retailing has had a seismic shift to downloads as the favored method for a younger audience, chart creators are keen to incorporate the “wisdom of crowds” to generate current music business intelligence. In turn, music retailers are keen to incorporate these observations into their marketing strategies to stay relevant and generate more sales.

Sound Index takes a very different approach to real-time search, tapping into analytics technology for broken-English text, for integrating information from different modalities, and for ranking. Sound Index claims to be the first industrial-strength implementation of the complex idea of combining “dirty” multimodal data, using what it refers to as “unstructured information management architecture and data mining.” I’m writing quite a lot about this type of approach and how these new approaches to search will affect online marketing strategies, and search optimization in particular. There’s far too much to discuss in one column, but the gist is that there’s a paradigm shift in search that is as exciting now as Google’s introduction of PageRank was back in 1998.

There was a lot of chatter in the blogosphere last week, including this post, about being able to get Google results in near real time (albeit by a method that is extraordinarily complicated for the average end user). However, even if you view Google results from the past second, what you see are results from its most recent crawl, not real-time information. Remember, Google has only a fraction of the Web to work with, and it can refresh that fraction only as fast as its crawling and indexing processes allow.

One quick and dirty way to get an idea of the difference between Twitter’s real-time results and Google’s is to install this Greasemonkey script. You’ll see the top five Twitter results right above the Google results in your browser. It’s not rocket science, but it is a very interesting way to get a feel for how SERPs (search engine results pages) may change in the future.
