Scanning forum posts, keyword research, and queries leading to visits to our own site, I still see a lot of confusion about the proper protocol to use when significantly restructuring a Web site. It’s easy to find basic advice, but complex redesigns are more and more common and require information much more intricate than “redirect it properly.” My goal here is to assume you know the basics and provide some additional tips for redesign that might otherwise be overlooked.
If your redesign amounts to a platform change, such as moving from HTML to PHP, with your existing URLs all remaining intact except for their file extensions, congratulations. Your redirection issues will be relatively minor, and you can simply use a universal instruction to redirect /filename.oldextension to /filename.newextension. The 301 status code tells engines the new URL has permanently replaced the old one, so the new URL should replace the old one in the engine’s index as well.
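As a rough illustration of that universal instruction, here is a minimal Python sketch of the rewrite logic. The extensions and the function name are assumptions made for the example. In practice a rule like this usually lives in your web server’s configuration, not in application code.

```python
import posixpath

# Hypothetical extensions for illustration; substitute your own.
OLD_EXT = ".html"
NEW_EXT = ".php"

def redirect_for(path):
    """Return (301, new_path) if path ends in the old extension, else None.

    Assumes `path` is the URL path only, with no query string attached.
    """
    root, ext = posixpath.splitext(path)
    if ext.lower() == OLD_EXT:
        return 301, root + NEW_EXT
    return None

print(redirect_for("/products/widget.html"))  # (301, '/products/widget.php')
print(redirect_for("/about.php"))             # None
```

Paths that already use the new extension fall through untouched, so the rule cannot redirect a page to itself.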
When you toss in a global revision of content and site structure, things get much more complicated. The conventional redirection advice, that you use a 301 (permanent) redirect from old URLs to new ones, is sound. I’ve recommended this many times and am always satisfied with the results. But it’s not an adequate answer to many subsequent questions.
Suppose your old site had a page devoted to reviews of Product X. In the new site, you decided to build out the content of the main Product X page and incorporate reviews on that same page. Because the Product X reviews page has no direct counterpart in the new architecture scheme, it’s reasonable to redirect the old Product X reviews page to the new main Product X page.
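When old pages have no one-to-one counterpart, a pattern rule no longer works and you need an explicit map. The sketch below shows one way to hold that map in Python. Every path in it is hypothetical, as are the function names.

```python
from collections import defaultdict

# Hypothetical old-to-new mapping; the paths are illustrative only.
REDIRECT_MAP = {
    "/product-x/index.html": "/products/product-x/",
    "/product-x/reviews.html": "/products/product-x/",
    "/contact.html": "/about/contact/",
}

def resolve(old_path):
    """Return (301, new_path) for a mapped old URL, or None if unmapped."""
    new_path = REDIRECT_MAP.get(old_path)
    return (301, new_path) if new_path else None

def merged_pages():
    """List targets that absorb more than one old URL, for manual review."""
    by_target = defaultdict(list)
    for old, new in REDIRECT_MAP.items():
        by_target[new].append(old)
    return {target: olds for target, olds in by_target.items() if len(olds) > 1}
```

Calling `merged_pages()` before launch gives you a list of many-to-one merges, like the Product X reviews page, so you can confirm each target really carries the content its old URLs used to hold.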
Engines process directives like this much faster than they used to. Assuming your redirects are set up correctly, you’ll frequently see new URLs show up in search engine results pages (SERPs) within a couple of days of your new site launch.
One thing to remember about 301 redirects is that while the index transfer from the old URL to the new one is permanent, the associated rankings may not be. Initially, after you implement a 301 redirect, the new URL will likely rank for the same things the old one did. But over time, the new URL has to earn those rankings on its own: its content and incoming links must be comparable to, or better than, what the old URL had.
XML Sitemap Feeds
With a large shift from one group of URLs to another, be zealous about finding ways to get the new URLs indexed as quickly as possible. During a redesign in which significant architectural changes occur and URL structure changes, it’s a good idea to populate your XML sitemap feed with both old and new URL lists for at least a month or two. Theoretically, you could keep your old URLs in the sitemap feed forever, but once you see the new URLs consistently showing up in all major engines for a few months, you’re better off pulling the old URLs out of the XML feed. That saves bandwidth and server resources, and it lets the spiders spend their limited time more productively, on your actual pages.
Note that dual-listing URLs in your XML feed is a supplement to a solid 301 strategy, not a replacement for one. It simply provides robots with one more way to notice the redirects you’ve placed on old URLs as well as a direct route to the new URLs on your site.
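A dual-listed feed can be generated with a few lines of Python. This is only a sketch: the function name and the choice to list new URLs first are assumptions, and it emits just the required `loc` element for each URL.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(old_urls, new_urls):
    """Return sitemap XML listing new URLs, then old ones, without duplicates."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in list(new_urls) + list(old_urls):
        if url in seen:
            continue  # skip a URL that appears in both lists
        seen.add(url)
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    # Returns the XML as a string, without an XML declaration line.
    return ET.tostring(urlset, encoding="unicode")
```

Once the new URLs are reliably indexed, passing an empty list for `old_urls` produces the slimmed-down feed.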
Another caveat: this technique is likely invalid if you’re redirecting content from one subdomain to another. A given sitemap feed must include only URLs on the host where the feed itself resides. In other words, if you’re shifting URLs from www.domain.com to www.newsub.domain.com, the sitemap file that sits on www.domain.com cannot contain URLs from www.newsub.domain.com. In that case, the best route is to quickly generate a new feed exclusively for www.newsub.domain.com.
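One way to respect the one-host-per-feed rule is to split your URL list by host before building any feed. The helper below is a minimal sketch with a hypothetical name. It only groups the URLs and leaves writing each feed to you.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def group_by_host(urls):
    """Group URLs by host so each host gets its own sitemap feed."""
    groups = defaultdict(list)
    for url in urls:
        host = urlsplit(url).netloc.lower()
        groups[host].append(url)
    return dict(groups)

feeds = group_by_host([
    "http://www.domain.com/old-page.html",
    "http://www.newsub.domain.com/new-page/",
])
print(sorted(feeds))  # ['www.domain.com', 'www.newsub.domain.com']
```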
A way to bolster the effects of your XML sitemap feed is to list its location in your robots.txt file. It’s a very simple operation, and requires only one line: Sitemap: http://www.domain.com/sitemap.xml. Here “domain.com” stands for your own domain, and “sitemap.xml” for the actual name of your sitemap file. Remember, there’s no right or wrong location within the robots.txt file to place this line.
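If you want to confirm the line is readable by a parser, Python’s standard library can check it for you. This sketch assumes Python 3.8 or later, which added `site_maps()`, and the robots.txt content shown is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the domain is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/

Sitemap: http://www.domain.com/sitemap.xml
"""

def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []

print(declared_sitemaps(ROBOTS_TXT))  # ['http://www.domain.com/sitemap.xml']
```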
A final but important note about the role of robots.txt in a redesign: be sure to include both old and new exclusions during a rollover period. Suppose you want to exclude your site’s search results pages from being indexed. If your old search results pages were excluded with the line Disallow: /search/, but your new search results pages live under /cgi/results/, then beginning at relaunch you should have both Disallow: /search/ and Disallow: /cgi/results/ in your robots.txt file.
Do this because during the transition time from one set of URLs to another, you must be sure both old and new URLs are excluded until the old ones no longer exist in the index.
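You can sanity-check a rollover robots.txt before launch with Python’s standard-library parser. The sketch below uses the two search paths from the example above. The helper name and the test paths are assumptions, and a real engine’s parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Rollover rules: the old and new search paths are both excluded.
ROLLOVER_RULES = """\
User-agent: *
Disallow: /search/
Disallow: /cgi/results/
"""

def is_blocked(robots_txt, path, agent="*"):
    """Return True if the given robots.txt text disallows `path` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)

print(is_blocked(ROLLOVER_RULES, "/search/widgets"))       # True
print(is_blocked(ROLLOVER_RULES, "/cgi/results/widgets"))  # True
print(is_blocked(ROLLOVER_RULES, "/products/product-x/"))  # False
```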
The Bottom Line
Relaunching a site is a difficult practice often mired in tedious details. But each hour spent in preparation often saves a day or even a week in retrofitting or fixing errors. It’s true you should build sites primarily for users, not for engines. But during a redesign, you really need to think about the bots.