Ever wonder how much traffic you’re losing simply because people are clicking on outdated URLs to your site? Well, there’s an easy way to find out if your hosting service gives you access to the directory where your raw log data is stored.
There’s usually a file in there called the Error Log. Every time your web server encounters an error, it writes what the error was in this log. Now the thing you need to be looking for is a statement that says: “File does not exist:” followed by the path to the file that doesn’t exist.
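If you would rather not read the Error Log by eye, here is a minimal Python sketch that tallies those “File does not exist:” entries. The sample lines and the paths in them are made up for illustration, and the exact layout of your own error log may differ, so treat the pattern as an assumption to adjust.

```python
import re
from collections import Counter

# Looks for the phrase described above and captures the path that follows it.
# Stops at whitespace or a comma, in case the server appends extra details.
MISSING = re.compile(r"File does not exist: ([^\s,]+)")

def missing_files(lines):
    """Count how often each missing file path appears in error-log lines."""
    counts = Counter()
    for line in lines:
        match = MISSING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines; a real error log will differ in detail.
sample = [
    "[Sat Jun 10 19:05:34 2000] [error] [client 10.0.0.1] "
    "File does not exist: /home/site/www/somefile.html",
    "[Sat Jun 10 19:06:02 2000] [error] [client 10.0.0.2] "
    "File does not exist: /home/site/www/somefile.html",
    "[Sat Jun 10 19:07:10 2000] [notice] caught SIGTERM, shutting down",
]

# Print the most frequently missing files first.
for path, hits in missing_files(sample).most_common():
    print(hits, path)
```

To run it against a real log, you could pass an open file instead of the sample list, since the function only needs something it can loop over line by line.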
These errors show up in your main log file, too. They are called 404 errors, and they look like this:
cache-rc09.proxy.aol.com - - [10/Jun/2000:19:05:34 +0000] "GET /somefile.html HTTP/1.0" 404 8733 "-" "Mozilla/4.0 (compatible; MSIE 5.0; MSNIA; AOL 5.0; Windows 98; DigExt)"
The 404 that follows the quoted request is the HTTP status code for “file not found.” If your hosting service gives you traffic reports, the number of 404 errors is usually included in that report.
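If your host doesn’t give you a report, you can get a rough count yourself. The Python sketch below assumes log lines shaped like the example above (a quoted request followed by the status code); the second sample line is invented for contrast, and the function name is just my own label.

```python
import re
from collections import Counter

# Captures the requested path and the three-digit status code that follows
# the quoted request, e.g. "GET /somefile.html HTTP/1.0" 404
REQUEST = re.compile(r'"[A-Z]+ (\S+) [^"]*" (\d{3}) ')

def not_found_paths(lines):
    """Count requests per path for lines whose status code is 404."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group(2) == "404":
            counts[match.group(1)] += 1
    return counts

sample = [
    # The log line from the article.
    'cache-rc09.proxy.aol.com - - [10/Jun/2000:19:05:34 +0000] '
    '"GET /somefile.html HTTP/1.0" 404 8733 "-" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; MSNIA; AOL 5.0; Windows 98; DigExt)"',
    # A hypothetical successful request, which should be ignored.
    'example.host - - [10/Jun/2000:19:06:00 +0000] '
    '"GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
]

# Print the paths that produced the most 404 errors first.
for path, hits in not_found_paths(sample).most_common():
    print(hits, path)
```

The paths at the top of that list are the outdated URLs costing you the most visitors, so they are the ones worth fixing first.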
OK, so you’re getting a lot of 404 errors. How can you fix the situation? Well, you could try to find all those bad links out there on other people’s sites and then write those people a nice email begging them to correct the URL. But the chance that they’ll do it, or that you’ll even catch all of the bad links, is about the same as the chance of Hell freezing over. So it is best to take matters into your own hands and correct the problem on your end. And you do that, in geek parlance, by editing your “dot htaccess” file.
Your “dot htaccess” file (written as .htaccess; note the period in front of the “h”) is a text file used by Apache web servers. It lives in your root directory (the directory where your index.html file is). If it isn’t there, you can easily make one. Your .htaccess file basically tells the server: “Hey, server. If a page request comes in that looks like this, do this instead.”
OK, this next part is very important, so read it twice: Do not edit your .htaccess file with anything other than a text editor. Let me write that again: DO NOT edit your .htaccess file with anything other than a text editor.
The reason is that each “do this if the page request looks like this” command needs to be on a single line. The downside of word processors is that they like to put line breaks in lines of text they think are too long. Text editors don’t. If you upload a .htaccess file that has line breaks within a command, the server won’t be able to read it, and it will typically answer every request to your site with a 500 Internal Server Error until you fix the file.
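If you want a quick sanity check before uploading, something like this Python sketch can catch a command that got split in two. It only knows about the two commands covered in this article, and the minimum word counts are my own assumption, so a file that uses other commands would need the table extended.

```python
# Commands this sketch recognizes, with the fewest words each line should have.
# Assumption: only the two commands discussed in this article are in use.
KNOWN_COMMANDS = {"ErrorDocument": 3, "Redirect": 3}

def suspicious_lines(text):
    """Return (line number, line) pairs that look like broken commands."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and comment lines are harmless.
        if not stripped or stripped.startswith("#"):
            continue
        words = stripped.split()
        minimum = KNOWN_COMMANDS.get(words[0])
        # Flag lines that start with something unknown or are cut short.
        if minimum is None or len(words) < minimum:
            flagged.append((number, stripped))
    return flagged

# A file where a word processor wrapped the second command onto two lines.
broken = (
    "ErrorDocument 404 http://www.booklocker.com/notfound.html\n"
    "Redirect /authors/jmclain2.html\n"
    "http://www.booklocker.com/jmclain.html\n"
)

for number, line in suspicious_lines(broken):
    print("Check line", number, ":", line)
```

Here both halves of the wrapped command get flagged: the first because it is too short, the second because it doesn’t start with a command at all.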
For the Windows folks, a good built-in text editor is Notepad, though I prefer to use a program called UltraEdit. For the Mac folks, BBEdit is a solid text editor.
OK, so let’s get to the commands.
If you want to redirect people to a standard message when they try to access a web page that is no longer on your server, put this in:
ErrorDocument 404 http://THE.FILE.YOU.WANT.TO.SEND.THEM.TO
Here’s an example from Booklocker.com:
ErrorDocument 404 http://www.booklocker.com/notfound.html
This tells our server: “Hey, if you get a 404 error, send people to http://www.booklocker.com/notfound.html.”
On that page I have a message explaining that the link they clicked on doesn’t work anymore and suggesting some ways to find what they were looking for. I include links to our category page, search engine, and main page, and an email address they can write to for help.
The above solution covers bad links you don’t know about. But what if you do know the bad URL and have an updated URL you want to send people to? Well, put in this command:
Redirect /PATH/TO/OLD/FILE http://URL.OF.THE.NEW.FILE
Example:
Redirect /authors/jmclain2.html http://www.booklocker.com/jmclain.html
This tells the server: “If you get a request for jmclain2.html in the /authors/ directory, send the visitor to http://www.booklocker.com/jmclain.html instead.”
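If you have a whole batch of old URLs to fix, typing each command by hand invites mistakes. Here is a small Python sketch that builds the Redirect lines from a table of old paths and new URLs. The function name is hypothetical, and it assumes Apache’s rule that the old path starts with a slash, adding one if it is missing.

```python
def redirect_lines(mapping):
    """Build one Redirect command per old path, each on a single line."""
    lines = []
    for old_path, new_url in mapping.items():
        # The old path is relative to the site root and should begin with "/".
        if not old_path.startswith("/"):
            old_path = "/" + old_path
        # Whitespace inside either part would split or break the command.
        if any(ch.isspace() for ch in old_path + new_url):
            raise ValueError("no spaces or line breaks allowed: " + old_path)
        lines.append("Redirect " + old_path + " " + new_url)
    return lines

# The example from the article, written without the leading slash on purpose.
moved = {
    "authors/jmclain2.html": "http://www.booklocker.com/jmclain.html",
}

for line in redirect_lines(moved):
    print(line)
```

You can paste the printed lines straight into your .htaccess file with a text editor.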
Remember, make sure individual commands you write to your .htaccess file are each on one line with no line breaks within them. And put your .htaccess file in your root directory.
Follow those two rules, and you’ll never lose a web site visitor to a bad link again.