18 Jan 2012

The Ultimate How-To Guide on 301 Redirects

Many of our clients respond to our site review reports with questions about subjects like canonicalization, site migration, architecture changes, or page updates. In particular, when we advise implementing 301 redirects to preserve the value of a specific URL in search engines, they inevitably reply with the following question: “How do you implement a 301 redirect?”

As long as you have access to the root directory of your web site on your web server, it’s not hard to do. We’re going to assume your site is running an Apache web server (most of the world is), so let’s have a little tech talk on implementing 301s in Apache.


First, a little background

A redirect, at least in terms of websites, is a way of automatically transferring the end user from one URL to another. While there are several technical methods of implementing redirects, for search engine optimization (SEO) purposes, we advise using an HTTP 301 permanent redirect.

Unlike its counterpart, the default HTTP 302 temporary redirect, the 301 indicates the old, linked URL is no longer in use (whereas the 302 indicates the old URL is merely offline temporarily, but is expected to return in the future).

The differentiation between the 301 and the 302 is most important for search engine crawlers. When the crawler encounters a link to your site whose URL is configured with a permanent redirect, your web server (Apache) responds with a 301 status code and the new URL, and the crawler then requests that new URL. The search engine not only accepts the redirect to the new URL, but then also begins the process of transferring any existing page rank value from the old URL to the new one in the redirect (it’s that second part of the process that’s so important for SEO, and that’s not done with a 302).

The transferal of page rank value between URLs in the search engine index is not an overnight process, but rest assured, if you’ve done your 301 homework correctly, your site updates will soon be reflected in the search engine results pages (SERPs) without a hitch.

Comparing 301 redirects to rel="canonical" tags

Some folks know that all of the major search engines announced in early 2009 their planned support for a new way of using the HTML <link> tag (found in the <head> section of the page code). They promised support for the new value “canonical” in the tag’s existing “rel” attribute, used along with the standard attribute “href” and a URL for its associated value. For example:

<link rel="canonical" href="http://www.xyz.com/page.html" />

The intention of the rel=canonical tag is to inform the search engine crawlers of the single primary (aka canonical) URL for the page content. This is very helpful when a site uses dynamic parameters in its URLs. Dynamic URLs can lead to the indexing of multiple versions of a URL for any given page, resulting in index content duplication (which search engines hate!).
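To check which canonical URL a page actually declares, a short script can pull it out of the HTML. The following is a minimal sketch using only the Python standard library; the class and function names (CanonicalFinder, find_canonical_urls) and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, so split before comparing.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonical_urls.append(attributes["href"])


def find_canonical_urls(html_text):
    """Return the canonical URLs declared in html_text, in document order."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical_urls


# Hypothetical page source, reusing the placeholder domain from the example above.
sample_page = """
<html><head>
<title>Example</title>
<link rel="canonical" href="http://www.xyz.com/page.html" />
</head><body></body></html>
"""
print(find_canonical_urls(sample_page))  # prints ['http://www.xyz.com/page.html']
```

A page that returns more than one URL here is sending the search engines mixed signals and is worth a closer look.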

While the use of the <link> tag with the rel=canonical attribute is helpful for a page, it’s not a reliable substitute for 301 redirects. Here are my reasons for this:

  1. For starters, in early 2009 and for a long time afterward, Google was the only search engine fully supporting the use of the rel=canonical tag. This has been improving slowly over time, but there still may be questions about how strongly each search engine accepts and uses the tag’s data. On the other hand, all of the search engines fully support 301 redirects.
  2. Secondly, rel=canonical tags are still regarded by the search engines as hints or suggestions, not firm directives. On the other hand, 301 redirects are firm instructions.
  3. Lastly, unlike 301 redirects, which can be configured in one file for the whole site, each webpage needs its own edit to add the new line of custom code for the rel=canonical tag.

That all said, rel=canonical tags can be the better choice for blog pages (assuming you have access and the technical skill to edit the blog’s source code for a given page’s <head> tag module). In this edge case, using a 301 can mean the loss of the original page’s associated user comments and social shares, which all have their own SEO value.

We recommend that clients consider using rel=canonical tags to help minimize indexed content duplication. Regardless, for standard matters of reassigning the function and search index values of old URLs to new ones, we advise clients to stick with tried-and-true 301 redirects.


How do you implement a 301 in Apache?

An Apache web server can implement 301 redirects through directives added to one of two text-based configuration files: either .htaccess (a per-directory file, typically placed in the root directory of a site) or httpd.conf (the main configuration file for the whole server). Typically the .htaccess configuration method is used, so we’ll focus on that here.

There are a number of different circumstances for which implementing a 301 redirect is recommended. The specific code you need to create differs by situation, so let’s cover how to implement a 301 for each possible scenario.

Start your rewrite engines

The first thing you need to do is open the text-based file called .htaccess found in the root directory of your site on the Apache web server. Be sure you only do so with a plain text editor application, such as Notepad on Windows-based computers.

Once open, before you add any scenario-specific custom code, you need to do two things:

  1. Turn on the FollowSymLinks option, which the mod_rewrite module needs in order to work from an .htaccess file.
  2. Enable the RewriteEngine in the mod_rewrite module.

Add the following two lines of code to do this:

Options +FollowSymLinks
RewriteEngine on

Note that this pair of lines is only needed once in the .htaccess file. Once they are added, you are ready to add the custom 301 redirect code for your scenario.

The following redirect scenarios use placeholder data as file, directory, and domain names in the sample code. Of course, be sure to substitute your own site data! These samples are not meant to be copied verbatim!

301 a single page URL to another

To create a 301 redirect from one URL to another URL, add the following line of code:

Redirect 301 /retiredpage.html http://www.xyz.com/newpage.html

You can add as many of these redirect lines as necessary to the .htaccess file.
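If you have a long list of retired pages, typing each line by hand invites typos. As a rough sketch, the lines can be generated from a mapping of old paths to new URLs; the function name build_redirect_lines and the second mapping entry are hypothetical, and the output still needs to be reviewed before it goes into .htaccess.

```python
def build_redirect_lines(url_map, status=301):
    """Return one Redirect line per (old path, new URL) pair."""
    lines = []
    for old_path, new_url in url_map.items():
        if not old_path.startswith("/"):
            raise ValueError(f"old path must start with '/': {old_path!r}")
        if any(ch.isspace() for ch in old_path + new_url):
            # Paths with spaces would need quoting, which this sketch does not handle.
            raise ValueError(f"whitespace is not supported: {old_path!r} -> {new_url!r}")
        lines.append(f"Redirect {status} {old_path} {new_url}")
    return lines


# Placeholder data; the second entry is invented for illustration.
retired_pages = {
    "/retiredpage.html": "http://www.xyz.com/newpage.html",
    "/oldcontact.html": "http://www.xyz.com/contact.html",
}
print("\n".join(build_redirect_lines(retired_pages)))
```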

301 a directory URL and all of its contents to another

If you have redesigned your site architecture and renamed a directory, you need to create a 301 for the entire directory. Here’s how:

RedirectMatch 301 ^/oldname/(.*)$ http://www.xyz.com/newname/$1
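It helps to see where a RedirectMatch pattern sends a given request. The sketch below is only a Python approximation of that behavior (Python’s regex engine is not the one Apache uses, and simulate_redirect_match is a made-up helper). It shows the difference a capture group makes: without one, every matching request lands on the same target URL; with (.*) and $1, the rest of the path carries over.

```python
import re


def simulate_redirect_match(pattern, target, url_path):
    """Return the redirect target for url_path, or None if the pattern does not match.

    Approximation: the regex is searched against the URL path, and $0..$9 in the
    target are replaced with the corresponding captured groups.
    """
    match = re.search(pattern, url_path)
    if match is None:
        return None

    def fill_group(reference):
        index = int(reference.group(1))
        # In this sketch, a reference to a group the pattern lacks expands to nothing.
        if index > match.re.groups:
            return ""
        return match.group(index) or ""

    return re.sub(r"\$(\d)", fill_group, target)


# No capture group: every page under /oldname/ lands on the directory itself.
print(simulate_redirect_match(r"^/oldname/", "http://www.xyz.com/newname/", "/oldname/page.html"))
# prints http://www.xyz.com/newname/

# With a capture group and $1, the page name is preserved.
print(simulate_redirect_match(r"^/oldname/(.*)$", "http://www.xyz.com/newname/$1", "/oldname/page.html"))
# prints http://www.xyz.com/newname/page.html
```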

301 a domain name URL to another

If you just bought an aged domain name whose traffic (and search page rank value) you want to use to augment that of your existing site’s domain, you can set up a 301 to transfer all traffic and ranking from the purchased domain name to your current site. Use the following code as an example:

RedirectMatch 301 ^(.*)$ http://www.xyz.com

Be sure you set up this redirect code in the .htaccess file of the source site you want redirected, not the redirect target site!

301 domain name URL variants for canonicalization

Since search engines index URLs, having multiple URLs in the index that point to the same content page divides the available page rank credit for that page among those URLs. This is definitely a “not optimized for search” state of affairs! To learn more about the details of canonicalization, take a look at the Search Engine Land post Why Canonicalization Matters From A Linking Perspective. The bottom line is you want to consolidate the page rank to one (canonical) URL to optimize the search value of that content.

Once you understand canonicalization best practices, you’ll want to implement them on your site. That means you must account for redirecting all possible alternative URL variations to the canonical URL. Use the following code sample for your site’s home page:

RewriteCond %{HTTP_HOST} ^xyz\.com [NC]
RewriteRule ^(.*)$ http://www.xyz.com/$1 [L,R=301]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(default|index)\.(html|php|htm)\ HTTP/ [NC]
RewriteRule ^(([^/]+/)*)(default|index)\.(html|php|htm)$ http://www.xyz.com/$1 [L,R=301]

The first two-line block of code redirects URLs that have omitted the “www.” prefix to the full “www.xyz.com” home page URL. That means the home page URL http://xyz.com will not resolve on its own, but instead will redirect to http://www.xyz.com/.

The second code block redirects URLs that name a default page explicitly to the URL that omits the default page name. This code ensures that any home page URL that includes one of several explicit page names, such as default.htm or index.html, will be redirected to the canonical home page URL, http://www.xyz.com/.
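To sanity-check what those two rule blocks are meant to produce, it can help to write the intended mapping as a small function and compare it against what your server actually returns. This is a sketch of the intent only, not a reimplementation of mod_rewrite; canonical_url and the constants are hypothetical names, and xyz.com is the same placeholder domain used above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.xyz.com"
# Matches a trailing default page name such as index.html or default.htm.
DEFAULT_PAGE = re.compile(r"(^|/)(default|index)\.(html|php|htm)$", re.IGNORECASE)


def canonical_url(url):
    """Return the URL the two rule blocks are meant to end up at."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "xyz.com":
        # First block: add the missing "www." prefix.
        host = CANONICAL_HOST
    # Second block: drop a trailing default page name but keep its directory.
    path = DEFAULT_PAGE.sub(r"\1", parts.path) or "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


for variant in (
    "http://xyz.com",
    "http://www.xyz.com/index.html",
    "http://XYZ.com/default.htm",
):
    print(variant, "->", canonical_url(variant))
# Each of the three variants maps to http://www.xyz.com/
```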


Documenting and testing your work

We strongly recommend that you add documentation lines to your coding work in the .htaccess file. To do so, simply start a line in the left margin using the # character, such as in the following code example:

# Redirect this entire domain, abc.com, to the domain xyz.com
RedirectMatch 301 ^(.*)$ http://www.xyz.com

Good documentation will always help you (and those who will have to maintain the site after you leave!) understand what the code underneath was intended to do. This information will help with troubleshooting when things don’t work as expected or changes have to be made.

Of course, you should test your changes as well. Once you’ve used FTP to upload the revised .htaccess file to the root directory of your site, it is ready to go. Use your browser to attempt to access the URL of the page for which you’ve set up the redirect. It should immediately redirect to the new URL as expected. You may also want to test the redirect with online tools such as Redirect Check SEO Tool or, perhaps better yet, the Internet Marketing Ninjas Header Checker tool, which not only follows all of the redirects for a given URL and reports the HTTP status codes returned, but also provides HTTP status code definitions. Sweet!
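If you prefer to check redirects from a script, the sketch below follows a URL hop by hop and records the status code at each step, using only the Python standard library. The names fetch_head and trace_redirects are made up for this example, and it assumes your server answers HEAD requests the same way it answers regular page requests, which is usually but not always the case.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_head(url):
    """Send one HEAD request and return (status code, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        connection.request("HEAD", target)
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def trace_redirects(url, fetch=fetch_head, max_hops=10):
    """Follow redirects one hop at a time and return a list of (url, status) pairs."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops


# Example call (placeholder URL, not run here because it needs a live site):
#   for hop_url, hop_status in trace_redirects("http://www.xyz.com/retiredpage.html"):
#       print(hop_status, hop_url)
```

A healthy permanent redirect shows a single 301 hop followed by a 200 at the new URL. A 302 in the list, or several hops in a row, is a sign the .htaccess code needs another look.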


Troubleshooting problems

If the redirects are not working as expected, it’s time for troubleshooting.

First off, if the redirect coding in your .htaccess file is valid but it’s not working, check the installation status of the mod_rewrite module in Apache. This module is typically installed by default, but if it’s not there, the RewriteCond and RewriteRule examples provided above won’t work (the Redirect and RedirectMatch lines are handled by a different module, mod_alias). Also ensure you added the two setup lines shown earlier, Options +FollowSymLinks and RewriteEngine on.

Also, note that the [NC] flag in RewriteCond lines makes the comparison case-insensitive. If you omit it, host names or URLs typed in all capital or mixed case letters may not match as expected. Additionally, the L in the code [L,R=301] tells mod_rewrite that when this rule matches, it is the last rule applied in that pass, so later rewrite rules are skipped. If you have conflicting statements in your .htaccess code, use the L flag in the line that should take precedence.
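The effect of [NC] is easy to see with an ordinary regular expression. The following Python sketch is an analogy only (Apache uses its own regex engine, and host_matches is a made-up helper); it applies the host pattern from the canonicalization example with and without case-insensitive matching.

```python
import re

# Same pattern as in the RewriteCond line for the placeholder domain.
HOST_PATTERN = r"^xyz\.com"


def host_matches(host, no_case):
    """Return True if host matches HOST_PATTERN; no_case mimics the [NC] flag."""
    flags = re.IGNORECASE if no_case else 0
    return re.search(HOST_PATTERN, host, flags) is not None


print(host_matches("XYZ.com", no_case=False))  # prints False
print(host_matches("XYZ.com", no_case=True))   # prints True
```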

Lastly, note that stacking up 301s over time is not a great strategy for website health. 301s are very useful, but you’ll want to update your site’s internal links to the correct URLs. Ignoring your old intra-site links in favor of simply using 301 after 301 after 301 slows down your webpage load time, which is bad enough for SEO. And if the redirects are stacked deep enough, search crawlers may not follow them all. If that happens, it’ll damage your site’s ability to stay in the search index. When you are updating your site links, you should also ensure your sitemap.xml file is updated with the newest URLs.
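One way to keep redirects from stacking up is to make every old URL point straight at its final destination. As a rough sketch, given a mapping of old paths to new ones (the paths below are invented for illustration, and flatten_redirects is a hypothetical helper), the chains can be collapsed and accidental loops caught before anything is uploaded:

```python
def flatten_redirects(redirects):
    """Map every old URL straight to its final destination.

    redirects is a dict of {old_url: new_url}. Raises ValueError on a loop.
    """
    flattened = {}
    for start in redirects:
        seen = {start}
        current = redirects[start]
        # Keep following the chain until we reach a URL that is not redirected.
        while current in redirects:
            if current in seen:
                raise ValueError(f"redirect loop involving {current}")
            seen.add(current)
            current = redirects[current]
        flattened[start] = current
    return flattened


# Hypothetical history: a page that has been renamed twice.
stacked = {
    "/v1.html": "/v2.html",
    "/v2.html": "/v3.html",
}
print(flatten_redirects(stacked))
# prints {'/v1.html': '/v3.html', '/v2.html': '/v3.html'}
```

The flattened mapping can then be used both to rewrite the redirect lines and as a checklist for updating internal links and sitemap.xml.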

Taking control of your site’s URLs with permanent 301 redirects is a standard best practice in white hat SEO. Make sure you let the search engines know how to get to your pages while ensuring any earned page rank value is invested in the active pages on your site.