Lies, Lies, and LSI

Should SEOs lose sleep over latent semantic indexing?

Author

Mike Grehan

Date published October 2, 2006

It’s been five years since I first referenced latent semantic indexing (LSI) and the work of Microsoft super scientist Susan Dumais in the first edition of my best-practice guide to search marketing (or search engine positioning, as it was known then).

At the time, there was a whole lot of confusion and some very bad information floating around about the vector space model (developed by Dr. Gerard Salton) and exactly what term vectors are. A research paper titled “The Term Vector Database: Fast Access to Indexing Terms for Web Pages” only seemed to add more fuel to the fire. People in the rapidly developing SEO industry openly speculated about how this new technology would challenge and affect optimization efforts.

As I often pointed out in forums and newsletters back then, term vector theory wasn’t new at all (it predates the Web by some considerable time). I also frequently referenced an interview I did with Brian Pinkerton, developer of WebCrawler, arguably the Web’s first full-text retrieval search engine. Pinkerton explained to me that he had applied the vector space model to WebCrawler from the very beginning. And that was back in 1994.
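For readers who have never seen a term vector, here is a minimal Python sketch of the general idea behind the vector space model: each text becomes a vector of term counts, and a query is matched against documents by the cosine of the angle between the vectors. The function names and the whitespace tokenizer are made up for illustration; this is not WebCrawler's code or any search engine's actual implementation.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse term vector: {term: raw count}."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs, most similar to the query first."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank("web pages", documents)` on a small list of strings puts the documents that share the most terms with the query at the top, and a document with no terms in common scores zero.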

Latent semantic indexing has also been around for a very long time. One of the first papers I read on the subject dates back to 1990.

Recently, I received a spam message which declared:

SEMANTIC WEB VERSION II

Google is coming up with Semantic web. Are you ranking well with this latest algorithm of search engines and will you continue to rank well?

Is you website LSI compliant?

Search Engines like Google (who are pallbearers for technology) are already reaching out for it by adopting LSI in their ranking algorithms.

We will check your website for its LSI algorithm readiness.

What a complete crock of you-know-what.

I read a newsletter promoting LSI tools and technology for your Web site. It even referred to the term vector database (which I doubt ever worked anyway!). Most of these so-called LSI tools and technology are nothing more than parlor tricks. Anyone can knock together a tool that takes a query and runs a thesaurus look-up on it.

Should you lose any sleep over LSI?

I asked my buddy and SEO expert Rand Fishkin of the popular SEOmoz resource for his thoughts. I referenced Dr. Edel Garcia’s recent tutorials on LSI and SVD (which he was already aware of) and basically asked:

Should SEOs care about LSI anyway, should we lose sleep over it?

If we should care about it, how would we go about optimizing for it?

In the first case he said:

“Care about it, absolutely. Lose sleep over it, almost certainly no. LSI is a method for determining semantic relationships and in all honesty, while I do believe it’s critical for an SEO to be informed enough to explain the concept to a client, I don’t see a lot of practical use. With the advancement in search engine algorithms over the last 2-3 years (particularly at Google & Yahoo!), SEO has shifted away from manipulating language use and placement to building a savvy marketing campaign.”

And to the second question, he said:

“I believe that one of Dr. Garcia’s primary points when examining the math behind LSI is that without access to accurate data about the search engines’ indices and the use of language therein, we’re shooting in the dark to a certain degree. He’s laid out a process in his articles on the subject that will allow for rough calculations to uncover potentially more valuable combinations of words and phrases for optimizing text for search engines. However, as Dr. Garcia notes:

‘These days we know that most current LSI models are not based on mere local weights, but on models that incorporate local, global and document normalization weights. Others incorporate entropy weights and link weights.’

I’m inclined to believe that “local” weight calculations for terms in a document provide only the most minimal value to SEOs.

However, this could be very useful to spammers writing programs to auto-generate text designed to pull in long tail searches and serve contextual ads – even a slight improvement in 50 million documents could turn to big $$ for that crowd.”
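To make the terminology in Garcia's remark concrete, here is a rough Python sketch of a weighting scheme with all three ingredients: a local weight (raw term frequency within one document), a global weight (inverse document frequency across the collection), and a normalization weight (scaling each document vector to unit length). The choice of tf-idf with cosine normalization is an assumption made for illustration. It is one textbook scheme among many, not what Garcia's tutorials prescribe and not what any particular search engine uses, and the function name is hypothetical.

```python
import math
from collections import Counter


def weighted_vectors(documents):
    """Weight each term as local * global, then length-normalize each document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Global weight: inverse document frequency over the whole collection.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # local weight: raw term frequency
        raw = {term: tf * idf[term] for term, tf in counts.items()}
        length = math.sqrt(sum(w * w for w in raw.values()))
        # Normalization weight: scale the document vector to unit length.
        vectors.append(
            {term: (w / length if length else 0.0) for term, w in raw.items()}
        )
    return vectors
```

In a two-document collection such as `["seo seo seo links", "seo content"]`, the term "seo" appears in every document, so its global weight is zero and repeating it three times changes nothing. That is one way to see why local counts alone say very little.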

I also asked Dr. Garcia for his own thoughts:

“Many SEOs are misquoting old papers and the focus of that old research. Many of these SEO “experts” don’t even know how to do basic SVD decomposition, nor do they understand the how-to steps involved in computing LSI scores. In the process they have stretched such research findings and added a few of their own myths in order to market better whatever they sell. For instance, today one can see some suggesting that to have documents “LSI friendly” one needs to stuff content with synonyms or related terms. This perception is incorrect.”

So if your SEO vendor is throwing terms such as LSI at you, you should really get them to qualify what they actually know about the subject.

Take a look at Dr. Garcia’s fast-track paper (available as a PDF) yourself. Even if you don’t grasp any of the math and have only half a clue what it’s all about, don’t worry. You may never understand what LSI is or what it does, but by reading the paper you will at least learn what it isn’t, which Garcia certainly emphasizes. And that little bit of knowledge will certainly help you to dispel any BS thrown at you by snake oil SEOs.
