Adversarial Information Retrieval — That’s Spam to You and Me

As I’ve said before, one man’s spam can often be another’s cutting-edge SEO tactic. No, I don’t condone spam in the true black sense of redirecting an innocent end-user query to an adult-related site or something like that. But you have to admit, in the highly competitive field of SEM/SEO, some tactical ingenuity gets tarred and feathered, when perhaps it should be applauded.

I have a lot of friends and acquaintances in the SEO arena who are candid in their disregard for search engine guidelines. After all, these are guidelines, not laws. Many SEO firms loudly blow their trumpet about doing ethical SEO. But who can really say that going beyond search engine guidelines is unethical?

Being ethical relates to moral conduct. If someone busts a search engine guideline here and there, is that immoral? Let’s face it, if an innocent end user keys in a search term at any search engine, and that search engine returns a totally relevant result, regardless of how it got there, what could be unethical about that? As long as the end user experience is good, it seems like a pretty happy-all-round scenario to me.

Bear with me for a moment. Having worked in conventional media/marketing before the Web arrived, I recall tactics we used in print and broadcast media. Where we had more marketing muscle (dollars) than our clients’ competitors, we’d frequently blanket-buy media to keep them darn near invisible. Ethical? Moral? Or just tactical?

In my opinion, the search engine spam debate is similar to the e-mail spam debate. I don’t know about you, but I get two kinds of e-mail spam: good spam and bad spam. True, most spam I get is rubbish I have no interest in. But every now and again, I get the odd unsolicited piece of e-mail in my inbox and think, “Now that’s interesting” (and, no, I’m not talking e-mail of the appendage-enlargement type).

It turns out that, according to a Pew Internet study from earlier this year, although people are getting more spam in their inboxes than ever before, they’re also less bothered by it. However, overall, people have less trust in e-mail because of spam.

Does the same thing apply on the Web? One could argue that the presence of spam in search engine results could have a similar effect, with people losing trust in search. But that could only happen if they actually knew it was spam. What if we go back to my earlier scenario about totally relevant spam? How could the end user ever be aware that, to a search engine, this is a bogus page, if, in fact, to the end user it’s exactly what she was expecting to see?

A Microsoft research paper last year classified 13.8 percent of the English-language Web pages in its study as spam. (As the person who coined the black-hat/white-hat terminology, I had to smile when I saw it used in a scientific research paper.)

To the average end user, search engine spam is a little like Google’s PageRank: something the industry talks about constantly but she has never heard of. Ask the average end user if she knows what PageRank is, and she won’t have a clue. Ask her if she’s ever seen a spammy result at a search engine, and you’d probably get the same head-scratching response.

It’s probably safe to say that just about anyone who has ever used a search engine has been served up some irrelevant results. But a lot of that depends on the searcher’s skills. The interesting thing is, most people I know who bend the search engines’ guidelines are serving up totally relevant pages. They’re crafted to get the end user into some sort of transaction. And that’s likely to happen only if the end user lands on a quality page.

I did a little searching in the online gambling space for a very competitive keyword. The number-one result showed a page size (weight) of 17K. (Most search engines show the weight of a page below the snippet.) I pulled the page down and checked it myself, and it was actually 53K. Cloaked? You bet (no pun intended). But would the page served up satisfy the end user? It surely would.
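The size comparison above can be sketched in a few lines of Python. This is only a rough illustration of the check I did by hand, not how any search engine detects cloaking. The function names and the 50 percent threshold are assumptions made up for the example, and a mismatch is a hint rather than proof, since a page can legitimately change between the crawl and your own visit.

```python
def size_mismatch_ratio(indexed_kb: float, fetched_kb: float) -> float:
    """Relative difference between the size shown in the results
    and the size of the page actually downloaded (0.0 = identical)."""
    if indexed_kb <= 0 or fetched_kb <= 0:
        raise ValueError("sizes must be positive")
    larger = max(indexed_kb, fetched_kb)
    return abs(fetched_kb - indexed_kb) / larger


def looks_cloaked(indexed_kb: float, fetched_kb: float,
                  threshold: float = 0.5) -> bool:
    """Flag a page when the two sizes differ by at least `threshold`.

    The default threshold is an arbitrary, hypothetical cutoff.
    """
    return size_mismatch_ratio(indexed_kb, fetched_kb) >= threshold


# The figures from the gambling example: 17K listed, 53K downloaded.
suspicious = looks_cloaked(17, 53)
# A one-kilobyte drift between crawl and visit should not raise a flag.
harmless = looks_cloaked(17, 18)
```

With the numbers from my search, the 17K listing against the 53K download is flagged, while a page that has drifted by a kilobyte is not.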

I’ve been in this business for a very long time, so I can spot these things a mile away. But how would a happy-go-lucky online gambler ever know the difference? And if it comes to that, if the content is exactly what he was looking for, would he even care?

Keep in mind, I’m only thinking aloud. Don’t head off to the nearest spam software site because you think I gave it the OK. I certainly didn’t. And I certainly don’t want Matt Cutts poking me in the eye with his pencil at the next conference.

Instead, the topic simply came to mind after reading the essay, “Adversarial Information Retrieval: The Manipulation of Web Content.” That’s spam to you and me.

Mike is off this week. Today’s column ran earlier on ClickZ.
