Large Sites, Small Headaches

Don't let traffic spikes and spammers impede search engine optimization.

Date published: September 19, 2007

An agency-side SEO professional’s dream is to remove the thorn in a site’s paw: that single, hidden problem that, once fixed, opens the floodgates of search traffic. Sadly, this doesn’t happen often. Instead, it’s usually a slow climb up the graph, built on constant content production and spot-checking of architecture and results pages.

I recently examined the damage caused when site owners fail to carefully watch stray domains. This column investigates large-site maintenance issues that may not earn a passing thought amid the day-to-day demands on most search engine marketers but that, left unattended, can do slow, often indiscernible damage.

Balancing the Load Correctly

Depending on the type of site you run, traffic demands may be wildly unpredictable. A big story picked up by a popular social media site or spread virally can spike site traffic, leaving you unprepared. To address this possibility, many sites purchase load-balancing solutions that charge for overflow traffic only when used, instead of paying 100 percent of a bandwidth cost when they need it only 5 percent of the time.

Load balancers often take excess traffic and spill it over to servers called www2, www3, and so on. While this is a great solution for human visitors, search engines often end up on these servers too, and if the site uses relative URLs for navigation, every link they follow keeps them on the overflow subdomain, so they crawl it indefinitely. A quick search for “www2” shows the unintended consequences of load balancing: thousands of URLs indexed on load-balancing subdomains that would be better left invisible to search engines.

To determine if this is your site’s problem, find out whether the site uses load-balancing measures for excessive traffic periods. If it does, investigate what the servers are called. Then perform “site:” queries for those subdomains to see whether pages are indexed on your load-balancing servers and, if so, how many.
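As a small convenience, the “site:” queries can be generated rather than typed by hand. The sketch below assumes the overflow servers follow the www2, www3, ... naming pattern mentioned earlier; `example.com` and the function name are placeholders, and your balancer may use different hostnames.

```python
def site_queries(domain: str, max_index: int = 5) -> list:
    """Build "site:" queries for www2..wwwN under the given domain.

    Assumes overflow hosts are named www2, www3, and so on; adjust the
    pattern if your load balancer uses other names.
    """
    return [f"site:www{i}.{domain}" for i in range(2, max_index + 1)]


if __name__ == "__main__":
    # Paste each query into a search engine and note the result counts.
    for query in site_queries("example.com", max_index=4):
        print(query)
```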

Just as engines recommend stripping session IDs from URLs served to robots, rampant indexing of load-balanced content often calls for similar measures. If necessary, use user-agent detection to ensure robots always get content on the main (“www”) server, and that only humans get sent to www2 and beyond.
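The routing decision described above might look roughly like this. It is only a sketch: the crawler tokens, hostnames, and function names are illustrative assumptions, real balancers are usually configured rather than scripted this way, and user-agent strings can be spoofed.

```python
import re
from typing import Optional

MAIN_HOST = "www.example.com"  # hypothetical main server

# Assumed, non-exhaustive list of crawler user-agent tokens.
BOT_PATTERN = re.compile(r"googlebot|slurp|msnbot|bingbot", re.IGNORECASE)


def is_robot(user_agent: str) -> bool:
    """Return True when the user-agent string matches a known crawler token."""
    return bool(BOT_PATTERN.search(user_agent or ""))


def pick_host(user_agent: str, overflow_host: Optional[str]) -> str:
    """Choose the host that should serve a request.

    overflow_host is the balancer's pick (for example "www2.example.com"),
    or None when the main server still has capacity.
    """
    if overflow_host is None or is_robot(user_agent):
        return MAIN_HOST  # robots always stay on the main server
    return overflow_host  # only humans spill over


if __name__ == "__main__":
    print(pick_host("Mozilla/5.0 (compatible; Googlebot/2.1)", "www2.example.com"))
    print(pick_host("Mozilla/5.0 (Windows NT 10.0)", "www2.example.com"))
```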

While 301 redirects might sound enticing at first, you should probably resist the temptation unless you really know what you’re doing. If you redirect a user (or a bot) from www2 to www, but server demand is still high enough to trigger load-balancing, you might cause an endless loop that convinces visitors, both human and electronic, to quickly abandon the site.
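To see why the loop arises, here is a toy simulation. It assumes a deliberately simplified setup with only two hosts, where the balancer sends www traffic to www2 under load and a naive 301 sends www2 traffic back to www; the function names and hop limit are made up for illustration.

```python
from typing import List, Optional


def next_hop(host: str, overloaded: bool) -> Optional[str]:
    """Return where the request is sent next, or None if it is served here."""
    if host == "www" and overloaded:
        return "www2"  # load balancer spills the request over
    if host == "www2":
        return "www"   # naive 301 redirect back to the main host
    return None        # request is served on this host


def trace(start: str, overloaded: bool, max_hops: int = 6) -> List[str]:
    """Follow hops from start, giving up once max_hops hops have been made."""
    path = [start]
    while len(path) <= max_hops:
        hop = next_hop(path[-1], overloaded)
        if hop is None:
            return path
        path.append(hop)
    return path  # hop limit reached: the request never settled


if __name__ == "__main__":
    print(trace("www2", overloaded=False))  # settles on www
    print(trace("www2", overloaded=True))   # bounces until the hop limit
```

Browsers and crawlers apply their own redirect limits, so in practice the visitor sees an error rather than an infinite wait.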

GET to Your POST

Of all discussions about the importance of links, probably 98 percent focus on links coming to your Web site, as opposed to the links that point from your site to someone else’s. Did you know that under a certain set of conditions spammers can exploit your on-site search function to build links from your site to theirs?

About a year ago, we looked at a client whose Yahoo index counts began to skyrocket, far surpassing the number of pages that legitimately existed on the site. As we pored over Yahoo Site Explorer to see what was going on, we saw thousands of junk pages masquerading as search results from the site’s internal search feature. We soon learned the site’s search function used the GET method of form handling but failed to strip HTML out of queries.

The problem is that GET forms typically include the query string in the resulting URL, such as “/results.asp?query='free+ringtones'”. POST forms, on the other hand, send the query in the request body, so they typically result in a single URL, such as “/results.asp”.
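The difference is easy to demonstrate with Python’s standard library. This is a sketch of how the two form methods encode the same field, not a description of the client’s actual code; the `query` field name and `/results.asp` path simply mirror the example above.

```python
from urllib.parse import urlencode


def get_request(action: str, fields: dict) -> str:
    """A GET form appends the encoded fields to the URL itself."""
    return f"{action}?{urlencode(fields)}"


def post_request(action: str, fields: dict) -> tuple:
    """A POST form keeps the URL fixed and carries the fields in the body."""
    return action, urlencode(fields)


if __name__ == "__main__":
    fields = {"query": "free ringtones"}
    print(get_request("/results.asp", fields))   # a distinct URL per query
    print(post_request("/results.asp", fields))  # same URL for every query
```

Because every distinct query produces a distinct GET URL, anyone who links to such a URL hands search engines a new page to crawl.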

Spammers had attacked our client’s site and input search queries that contained the HTML required to link back to their sites. For example, one indexed search results page began, “Following are the results of your search for ‘free ringtones,’” with “free ringtones” linking out to the spammer’s site. Then, the spammers had other sites in their network link to the search results page on the client’s site, which resulted in the page getting crawled and indexed by Yahoo. So Yahoo saw thousands of pages on the client site linking to spammy sites. Needless to say, these sites didn’t fit the client’s criteria for link-worthy sites.

Several safeguards exist for this problem. First, exclude search results pages from spider crawls through your robots.txt file. If you don’t want to do that, strip HTML from the text accepted in search fields. And finally, explore converting your form method from GET to POST to avoid distinct URLs that engines can index.
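The first two safeguards can be sketched with Python’s standard library. Everything here is illustrative: the robots.txt rule assumes the results page lives at `/results.asp`, the function names are invented, and a production site would do the stripping in whatever language its search page is written in.

```python
from html import escape
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _TextOnly(HTMLParser):
    """Collects text content and silently drops every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(query: str) -> str:
    """Remove markup from a search query and collapse whitespace."""
    parser = _TextOnly()
    parser.feed(query)
    parser.close()
    return " ".join("".join(parser.parts).split())


def results_heading(query: str) -> str:
    """Echo the query back to the page with tags stripped and the rest escaped."""
    return f"Following are the results of your search for '{escape(strip_html(query))}'"


# Assumed robots.txt rule that keeps crawlers away from search results pages.
ROBOTS_TXT = ["User-agent: *", "Disallow: /results.asp"]


def crawl_allowed(path: str, agent: str = "*") -> bool:
    """Check a path against the rule above the way a well-behaved crawler would."""
    rules = RobotFileParser()
    rules.parse(ROBOTS_TXT)
    return rules.can_fetch(agent, path)


if __name__ == "__main__":
    spam = '<a href="http://spam.example/">free ringtones</a>'
    print(results_heading(spam))
    print(crawl_allowed("/results.asp?query=free+ringtones"))
    print(crawl_allowed("/index.asp"))
```

Escaping the output in addition to stripping tags is a belt-and-braces choice: if the stripping ever misses something, the leftover characters still render as plain text rather than as a link.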

Conclusion

Persistence is the hardest SEO technique to master. But it enables you to finely tune a site over time and avoid some of the problems that plague less experienced Webmasters.
