Friday 24th of March 2017 11:02:30 AM

Apache .htaccess Examples :: Rewriting and Redirecting with mod_rewrite (and mod_alias)

PURPOSE AND BACKGROUND

I rely heavily on Apache's mod_rewrite module in my site's .htaccess file to do a number of helpful things. To learn enough about mod_rewrite, I studied a number of examples on the web. Some of those examples were very hard to find. As the official mod_rewrite documentation will tell you, the use of these modules can be a black magic art. I'm hoping that these examples will help someone else acquire a little bit of that black magic.

First, be sure to see the official Apache documentation, which is mirrored all over the web. In particular, review these sites:

  • Apache 1.3 URL Rewriting Guide
    • There are a LOT of good examples here. Always start here: it has textbook examples, including some complicated but useful rewriting code.

  • Module mod_rewrite
    • This is an extremely important document. There are a lot of nuances built into this document that are often quickly overlooked because the author decided to only spend a quick sentence on something very important. Pay attention to every detail of this document.

  • Module mod_alias
    • mod_alias is not nearly as complex as mod_rewrite, but it's equally important. Much of my .htaccess file could be written much more simply using mod_alias. The only reason I lean so much on mod_rewrite is that mod_alias directives carry down into every subdirectory of mine, which includes the directories that host my subdomains. Thus, if I use mod_alias directives, redirections I want on my main site show up on all of my subdomains as well. This is not desirable. I solve this problem by rewriting all of my mod_alias statements as mod_rewrite directives; mod_rewrite directives do not carry down into subdirectories (and subdomains) that have their own rewrite rules. If it weren't for my subdomains, I'd use mod_alias much more. Everyone should have a solid understanding of this module.

  • Module mod_asis
    • This is an honorable mention. My .htaccess sends the 403 Forbidden for a number of different very specific reasons. I should have used mod_asis to send those custom 403 error messages. Combining mod_asis with mod_alias and/or mod_rewrite gives the ability to build CONDITIONAL ERROR DOCUMENTS. (I leave this as an exercise)
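
For readers who want a starting point for that exercise, here is a minimal sketch of a conditional error document. It is not taken from my .htaccess file: the referrer "badsite.example", the /errors/ directory, and the file name hotlink.asis are all hypothetical, and the sketch assumes that mod_asis is loaded and that the server allows these directives in .htaccess files.

```apache
# Files ending in .asis are sent as-is, using the headers they contain.
AddHandler send-as-is asis

RewriteEngine On

# Hypothetical condition: the request was referred from badsite.example.
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?badsite\.example(/|$) [NC]
# Leave requests that already point at the error documents alone.
RewriteCond %{REQUEST_URI} !^/errors/
RewriteRule .* /errors/hotlink.asis [L]

# The hypothetical file /errors/hotlink.asis would contain something like:
#
#   Status: 403 Forbidden
#   Content-Type: text/html
#
#   <html><body><h1>Hotlinking is not allowed</h1></body></html>
```

Each distinct reason for refusing a request could get its own RewriteCond block and its own .asis file, which is what makes the error documents conditional.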

Finally, note that this web page only scratches the surface. With mod_rewrite features like chaining, pass-through, and skipping, mod_rewrite can turn an .htaccess configuration file into a powerful scripting language.



Some Examples Similar to Lines in My .htaccess File

Order Matters to mod_rewrite Directives

  • The relative order of mod_rewrite directives matters.

  • For example, if you are having a problem with a redirect rule that keeps putting information about the real filesystem location in the target URL, try moving that redirect rule earlier in the file.

  • In most cases, if there is no other easy way to determine ordering, it is best to order redirect rules to URLs with explicit hostnames FIRST. This sort of ordering is reflected in the examples given below.

  • The examples below are meant to be taken in order. If I were to put these into an .htaccess file, I would leave them in the same order as they appear on this page.
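
As a rough illustration of the ordering advice above, the sketch below puts a redirect with an explicit hostname ahead of an internal rewrite. The first rule is the servo.php redirect used elsewhere on this page; the file names old_page.html and new_page.html are hypothetical and only serve to show a rule without a hostname in its target.

```apache
RewriteEngine On

# 1. Redirects to URLs with explicit hostnames come FIRST.
RewriteRule ^servo\.php$ http://www.tedpavlic.com/post_servo.php [R=permanent,L]

# 2. Internal rewrites (no hostname in the target) follow.
#    old_page.html and new_page.html are hypothetical names.
RewriteRule ^old_page\.html$ new_page.html [L]
```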

Spelling of Referrer is REFERER

  • Remember that it's HTTP_REFERER. This is NOT the correct spelling of the word referrer, but it IS the correct spelling of the server variable.
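
A short sketch of the variable in use, assuming a hypothetical site "badsite.example" whose pages should not be allowed to embed images from this site:

```apache
RewriteEngine On

# Note the single R in the middle: HTTP_REFERER, not HTTP_REFERRER.
# badsite.example is a placeholder for whatever referrer is unwanted.
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?badsite\.example(/|$) [NC]
# Answer matching image requests with 403 Forbidden.
RewriteRule \.(gif|jpe?g|png)$ - [F]
```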

Difference Between Redirecting with mod_rewrite and mod_alias

  • These next two blocks may appear to be equivalent, but they have at least one major difference.
RewriteRule ^servo\.php$ http://www.tedpavlic.com/post_servo.php [R=permanent,L]
RewriteRule ^images($|/.*$) http://links.tedpavlic.com/images$1 [R=permanent,L]

Redirect permanent /servo.php http://www.tedpavlic.com/post_servo.php
RedirectMatch permanent ^/images($|/.*$) http://links.tedpavlic.com/images$1
  • The first block is implemented with mod_rewrite directives.

    Thus, the first block is NOT inherited by other .htaccess files that live in child directories underneath the main directory.

  • The second block is implemented with mod_alias directives.

    Thus, the second block IS INHERITED by other .htaccess files that live in child directories underneath the main directory.

  • In other words, suppose links.tedpavlic.com is a subdomain that is hosted out of a links folder that resides within the main www.tedpavlic.com document root. Suppose that links folder contains its own .htaccess file that makes no mention of either servo.php or images.

    When accessing http://links.tedpavlic.com/servo.php, the SECOND block will redirect this request back to http://www.tedpavlic.com/post_servo.php. However, the FIRST block will return a 404 File Not Found.

    When accessing http://links.tedpavlic.com/images, the SECOND block will redirect this request back to http://links.tedpavlic.com/images, which results in a redirect loop. However, the FIRST block will return a 404 File Not Found.

  • mod_alias rules ride along on top of the directory structure, regardless of the public structure of the web site and its subdomains. mod_rewrite rules are completely forgotten when a new .htaccess is found in a subdirectory.

    For my site, because of my subdomains, that means that mod_rewrite was the better choice for me. This may not be the case with your site.
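
To make the inheritance difference concrete, here is a sketch of what the .htaccess inside the links folder might look like. It is an illustration, not a copy of my file. The commented-out RewriteOptions line is an addition that the discussion above does not use: it is the mod_rewrite directive for asking a directory to inherit its parent's rules, shown here only to mark where the default behavior could be changed.

```apache
# Hypothetical links/.htaccess (the subdomain's own configuration).
RewriteEngine On

# Because this file has its own rewrite configuration, the mod_rewrite
# rules in the parent .htaccess are not applied to requests handled here.
# Uncommenting the next line would ask mod_rewrite to inherit them.
# RewriteOptions Inherit

# Rules specific to the subdomain go here.
RewriteCond %{HTTP_HOST} !^links\.tedpavlic\.com$ [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{REQUEST_URI} ^/links($|/.*$)
RewriteRule ^.* http://links.tedpavlic.com%1 [R=permanent,L]
```

mod_alias has no comparable switch in this setup, so Redirect and RedirectMatch lines in the parent file keep applying to the subdomain's directory.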

Important Options

Options -Indexes +Includes +FollowSymLinks
  • -Indexes: I include this here to remind you that you are in control of your web site. If you don't like the way the webserver displays your very important content then change it. Rewrite it. Change how the webserver interprets requests. -Indexes to me is a symbol of control.

  • +Includes: This is more of a reminder to use .shtml files for your error documents (if you don't want to use error scripts). This helps you give your users useful information that they can report back to you in case they find a bug in your rules.

  • +FollowSymLinks: This is the important one. When using mod_rewrite from an .htaccess file, this is required.

Turn the Engine On

RewriteEngine On
  • This is just a simple reminder that mod_rewrite needs to be turned on.
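
The two settings above are often grouped together at the top of an .htaccess file. A minimal sketch follows, assuming the server's AllowOverride setting permits Options and FileInfo overrides. The <IfModule> wrapper is an optional addition that is not part of my file.

```apache
Options -Indexes +Includes +FollowSymLinks

# Optional guard: skip the rewrite block if mod_rewrite is not loaded.
# Without the guard, a missing module causes a server error instead.
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Rewrite rules go here, in the order given on this page.
</IfModule>
```

Whether silently skipping the rules is better than a visible error is a judgment call; if the site depends on the rules, failing loudly may be preferable.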

Redirect to Most Desirable Hostname or Subdomain

RewriteCond %{HTTP_HOST} !^www\.tedpavlic\.com$ [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{REQUEST_URI} ^($|/.*$)
RewriteRule ^.* http://www.tedpavlic.com%1 [R=permanent,L]
  • My websites often have many aliases. These aliases are provided so I have some flexibility when I want to develop new content. These aliases are also provided so users have an easy way to remember my sites. However, regardless of user preference, I really want them to end up at one particular site. I also want search engines to only index ONE of those sites.

  • Note the use of %1 rather than the typical $1. A %1 refers to a group captured by the last matched RewriteCond pattern, whereas $1 refers to a group captured by the RewriteRule pattern. In this case, I'm picking off the whole REQUEST_URI so I can resubmit it to the subdomain. Note that I could have gotten rid of that third RewriteCond, done the match entirely in the RewriteRule line, and used $1 instead. However, to keep consistency with my subdomains, I show it like this. This also avoids confusion with how the match works when the actual domain is found in the target. See the mod_rewrite documentation and further information below for more details.

  • Notice the R=permanent. Not only does this rule rewrite the URL, but it issues a 301 permanent redirection. This should convince webbots to update their records to point to the central site.

  • Notice the L rewrite flag, which marks this as the last rule to be processed on this pass. The server sends the redirect, the browser follows it, and processing then starts over on the NEW URL. This simplifies the rewriting rules that come later, and it is the reason why I have this rule so early in my .htaccess file!

  • Notice that the second line of this rule makes sure it does NOT apply when the HTTP_HOST variable is empty. Clients using older versions of the HTTP protocol may leave HTTP_HOST empty, because HTTP/1.0 does not require a Host header. Let these users through without the redirect. Otherwise, you will put them in a deadly redirect loop. That's bad.

  • Note that when the explicit site hostname is given in the target URL, the RewriteRule is interpreted differently and matches against a slightly different string. See the mod_rewrite documentation for more information about this. This distinction is not important in this rule because I chose to match on REQUEST_URI instead. I only chose to do this because it is necessary for me to do this within subdirectories that host my subdomains. (See below.)

  • The following is the very similar RewriteRule block I use on each of my subdomains that lie inside subdirectories of my main site. Depending on what sort of redirect you are trying to do, this may be a better choice for you.
RewriteCond %{HTTP_HOST} !^links\.tedpavlic\.com$ [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{REQUEST_URI} ^/links($|/.*$)
RewriteRule ^.* http://links.tedpavlic.com%1 [R=permanent,L]
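
As noted earlier in this section, the third RewriteCond of the main-site block can be dropped in favor of a capture in the RewriteRule itself. A minimal sketch of that variant, assuming the rule lives in the .htaccess at the document root of the main site. In that per-directory context the leading slash is stripped from the path before it is matched, so the slash has to be put back in the target.

```apache
RewriteEngine On

# Same two host checks as in the main-site block.
RewriteCond %{HTTP_HOST} !^www\.tedpavlic\.com$ [NC]
RewriteCond %{HTTP_HOST} !^$
# $1 is the requested path as captured by the RewriteRule pattern,
# without its leading slash, so the slash is restored in the target.
RewriteRule ^(.*)$ http://www.tedpavlic.com/$1 [R=permanent,L]
```

This sketch only covers the main site. For subdomains hosted in subdirectories, the REQUEST_URI form shown above is the one this page describes.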