Apache2 Rewritemap

Filed Under (Sysadmin) by Amandine on 16-02-2011

I love Apache mod_rewrite. I often fight with it, but with enough persistence you can make it do whatever you want. This time I wanted to find a way to call a file named after the requested subdomain. I have a website called www.yellow-sub.net, and there are some subdomains like the-beatles.yellow-sub.net, john-lennon.yellow-sub.net and so on for each part of the website (that was an SEO consideration in the beginning). I have only one virtual host for that website because I find it easier to manage that way, and I didn’t want to put the logic in the files for various reasons (and because working with rewrite rules is fun).

This is what I want :

  • you call -> you get
  • www.yellow-sub.net -> index.php?page=somepage
  • the-beatles.yellow-sub.net -> index.php?page=somepage-the-beatles
  • blahblah.yellow-sub.net -> index.php?page=somepage-blahblah

The page parameter will in fact load a different page named “somepage-blahblah.html”, and I wanted it to work for every single subdomain (if the page doesn’t exist, the request is redirected to the www home page anyway).
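To make the target mapping concrete, here is a minimal shell sketch of it. The function name page_for_host is hypothetical and only restates the list above; it is not part of the Apache configuration.

```shell
#!/bin/sh
# page_for_host (hypothetical helper): print the internal URL that the
# given hostname should end up on, following the list above.
page_for_host() {
  host=$1
  sub=${host%%.*}   # keep everything before the first dot
  if [ "$sub" = "www" ]; then
    echo "index.php?page=somepage"
  else
    echo "index.php?page=somepage-$sub"
  fi
}

page_for_host www.yellow-sub.net
page_for_host the-beatles.yellow-sub.net
page_for_host blahblah.yellow-sub.net
```

The interesting part is the extraction of the first label of the hostname, which is exactly what mod_rewrite cannot do from %{HTTP_HOST} alone in a RewriteRule substitution.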

The closest I got was to use %{HTTP_HOST} in my RewriteRule :

RewriteCond %{HTTP_HOST} !^www\.yellow-sub\.net$ [NC]
RewriteRule ^/$ /index.php?page=somepage-%{HTTP_HOST} [L]

But this way I had to name my files somepage-the-beatles.yellow-sub.net.html instead of somepage-the-beatles.html, which is not so bad but quite annoying when I want to copy them from beta (different domain name) to live or from live to beta.

Then I found the RewriteMap directive in the Apache documentation. It’s an amazing feature I didn’t know about : it lets you do whatever you want with your rewrite rules! I’ll let you read the manual page; it’s well written and I won’t explain something that is already well explained : See that RewriteMap doc

I used the last MapType : External Rewriting Program that allows you to write your own script in whatever programming language you fancy, taking stdin and writing to stdout. Mine is very simple, in bash :

#!/bin/bash
# Read one lookup key (a hostname) per line and print its first label.
while read -r line; do
  echo "$line" | cut -d. -f1
done

I guess it’s good practice to Keep It Simple, Stupid, as advised in the manual, because the script is started once and keeps running as a child of Apache instead of being called only when needed. You can test it by giving it something to process :

$ echo the-beatles.yellow-sub.net |/etc/apache2/scripts/subdomain.sh
the-beatles
$ /etc/apache2/scripts/subdomain.sh
blabla.yellow-sub.net # this is what I wrote
blabla # this is what it answered
hgzsfver.erg.erg.reg.ezefg
hgzsfver
^C
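As a variation, the same lookup can be done without spawning cut for every line. This is only a sketch: the function names first_label and map_loop are my own, and it relies on plain POSIX parameter expansion.

```shell
#!/bin/sh
# first_label: print everything before the first dot of $1.
first_label() {
  printf '%s\n' "${1%%.*}"
}

# map_loop: write one answer line on stdout for every key read on stdin,
# which is what an external rewriting program is expected to do.
map_loop() {
  while IFS= read -r line; do
    first_label "$line"
  done
}

# Demonstration with hostnames from the test session above.
printf '%s\n' the-beatles.yellow-sub.net blabla.yellow-sub.net | map_loop
```

To use it as the map program, the script would end with a bare call to map_loop instead of the demonstration line.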

To use it in your VirtualHost section :

RewriteMap subdomain prg:/etc/apache2/scripts/subdomain.sh
RewriteCond %{HTTP_HOST} !^www\.yellow-sub\.net$ [NC]
RewriteRule ^/$ /index.php?page=somepage-${subdomain:%{HTTP_HOST}} [L]

Be careful with the interesting part, ${subdomain:%{HTTP_HOST}} : I stupidly wrote a % instead of the $ and spent half an hour figuring it out! (No errors anywhere, the script was launched, it just didn’t work at all.)

Reload Apache and you can see the script running as one of its child processes :

 xxxx ?        Ss     0:00 /usr/sbin/apache2 -k start
xxxxx ?        S      0:00  \_ /usr/sbin/apache2 -k start
xxxxx ?        S      0:00  \_ /usr/sbin/apache2 -k start
xxxxx ?        S      0:08  |   \_ /usr/bin/php5-cgi
xxxxx ?        Ss     0:00  |   \_ /usr/bin/php5-cgi
xxxxx ?        S      0:02  |   |   \_ /usr/bin/php5-cgi
xxxxx ?        S      0:00  \_ /bin/bash /etc/apache2/scripts/subdomain.sh
xxxxx ?        Sl     0:46  \_ /usr/sbin/apache2 -k start
xxxxx ?        Sl     0:48  \_ /usr/sbin/apache2 -k start

You can watch it in action using the RewriteLog :

xxx.xxx.xxx.xxx - - [17/Feb/2011:08:39:50 +0100] [the-beatles.yellow-sub.net/sid#c85408][rid#1017528/initial] (5) map lookup OK: map=subdomain key=the-beatles.yellow-sub.net -> val=the-beatles
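The rewrite log gets large quickly, so it can help to keep only the map lookups. Here is a small sketch, assuming the log lines have the same shape as the sample line above; the function name map_lookups is hypothetical.

```shell
#!/bin/sh
# map_lookups: read rewrite log lines on stdin and print
# "map: key -> value" for every successful map lookup.
map_lookups() {
  sed -n 's/.*map lookup OK: map=\([^ ]*\) key=\([^ ]*\) -> val=\(.*\)$/\1: \2 -> \3/p'
}

# Demonstration with the sample line above.
sample='xxx.xxx.xxx.xxx - - [17/Feb/2011:08:39:50 +0100] [the-beatles.yellow-sub.net/sid#c85408][rid#1017528/initial] (5) map lookup OK: map=subdomain key=the-beatles.yellow-sub.net -> val=the-beatles'
printf '%s\n' "$sample" | map_lookups
```

On a real server you would feed it the log file instead of the sample variable.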

And now my files are just called the way I wanted to :)
I have to say, I love this feature more because I can imagine a lot of new functions in my rewriterules than because it solved my problem, for which I had other options anyway (not as sexy ones, but solutions though.)

The question now is : is it safe? What are the risks of using such a script within Apache? If you have anything to say about that, please tell me :)
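One precaution I can think of, since the Host header is chosen by the client, is to only let through labels made of expected characters. The sketch below is an assumption-laden variant, not a complete answer to the safety question: safe_label is a hypothetical name, and answering NULL for "no value" follows my reading of the RewriteMap documentation for external programs.

```shell
#!/bin/sh
# safe_label: print the first label of a hostname, or NULL when the
# label is empty or contains anything other than a-z, 0-9 and "-".
safe_label() {
  host=$(printf '%s' "$1" | tr 'A-Z' 'a-z')   # hostnames are case-insensitive
  host=${host%%:*}                            # drop an optional :port
  label=${host%%.*}                           # keep the first label only
  case $label in
    ''|*[!a-z0-9-]*) echo NULL ;;
    *) printf '%s\n' "$label" ;;
  esac
}

safe_label 'The-Beatles.yellow-sub.net:8080'
safe_label '../etc.yellow-sub.net'
```

With this variant the file name built from the page parameter can only ever contain letters, digits and dashes coming from the hostname.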

Have fun with mod_rewrite!

Apache2 mod_rewrite and %{REQUEST_FILENAME}

Filed Under (Sysadmin, Tips) by Amandine on 23-02-2010

I’m developing a new website to improve my object-oriented PHP skills. For this new website, I want every request for a URL that doesn’t match an actual file on the disk to be redirected to index.php (to handle parameters, in fact). Easy with Apache2 rewrite rules :

         RewriteCond %{REQUEST_FILENAME} !-f
         RewriteCond %{REQUEST_FILENAME} !-d
         RewriteCond %{REQUEST_FILENAME} !-l
         RewriteRule ^/(.*)$         /index.php?rt=$1 [L,QSA]

This means : if the requested file is not a real file, and isn’t a directory, and isn’t a symlink, then redirect to index.php.
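The three conditions read much like shell file tests. As a rough analogy only (this is shell, not what mod_rewrite runs internally, and the function name decide is hypothetical):

```shell
#!/bin/sh
# decide: print "rewrite" when the path is neither a regular file,
# nor a directory, nor a symlink; print "pass through" otherwise.
decide() {
  if [ ! -f "$1" ] && [ ! -d "$1" ] && [ ! -L "$1" ]; then
    echo "rewrite"
  else
    echo "pass through"
  fi
}

# Demonstration in a throwaway directory.
tmp=$(mktemp -d)
touch "$tmp/real.htm"
decide "$tmp/real.htm"     # an existing file
decide "$tmp/missing.htm"  # nothing on disk
decide "$tmp"              # an existing directory
rm -rf "$tmp"
```

Everything depends on the path that is handed to the tests, which is where the trouble described below comes from.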

I was really surprised to discover that it doesn’t work, even though everybody seems to use this syntax! I checked my Apache version : Apache/2.2.9 (Debian), nothing special about this one I guess.
To understand what Apache was doing with my rewrites, I activated the rewrite log :

         RewriteLog /var/log/apache2/rewrite.log
         RewriteLogLevel 5

Here’s what I got (the interesting part, cause I got a looot more !) :

[blah blah blah] (2) init rewrite engine with requested uri /toto.htm
[blah blah blah] (3) applying pattern '^/(.*)$' to uri '/toto.htm'
[blah blah blah] (4) RewriteCond: input='/toto.htm' pattern='!-f' => matched
[blah blah blah] (4) RewriteCond: input='/toto.htm' pattern='!-d' => matched
[blah blah blah] (4) RewriteCond: input='/toto.htm' pattern='!-l' => matched
[blah blah blah] (2) rewrite '/toto.htm' -> '/index.php?rt=toto.htm'

So Apache checks only ‘/toto.htm’ and not the whole path for “%{REQUEST_FILENAME}”? I thought it was the whole path… let’s verify in the doc.
From http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html, by habit (because until now I have used Apache 2.0 a lot more than Apache 2.2) :

REQUEST_FILENAME : The full local filesystem path to the file or script matching the request.

Hmm. But I use apache version 2.2, so what do they say here http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html :

REQUEST_FILENAME : The full local filesystem path to the file or script matching the request, if this has already been determined by the server at the time REQUEST_FILENAME is referenced. Otherwise, such as when used in virtual host context, the same value as REQUEST_URI.

Ow.

REQUEST_URI : The resource requested in the HTTP request line. (In the example above, this would be “/index.html”.)

Ok, I understand, I use virtual hosts (like everybody, uh?), so the real syntax for my needs is :

         RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
         RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
         RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-l
         RewriteRule ^/(.*)$         /index.php?rt=$1 [L,QSA]

This works even though it doubles the “/” between the two variables (one / at the end of DOCUMENT_ROOT, and another at the beginning of REQUEST_FILENAME).
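The difference between the two attempts, and the harmless doubled slash, can be reproduced outside Apache. A small sketch, where a temporary directory stands in for the document root (an assumption for the demonstration) and check is a hypothetical helper:

```shell
#!/bin/sh
# Temporary stand-in for DOCUMENT_ROOT, with a trailing slash.
docroot="$(mktemp -d)/"
uri="/toto.htm"
touch "${docroot}toto.htm"

# check: report whether a path is a regular file, like the -f test.
check() {
  if [ -f "$1" ]; then echo "exists"; else echo "missing"; fi
}

# First attempt: the bare URI is tested as if it were a filesystem path,
# so this looks for toto.htm at the root of the filesystem.
check "$uri"

# Second attempt: document root + URI. The path contains "//",
# which the filesystem treats like a single "/".
check "${docroot}${uri}"

# rm -rf "$docroot"   # clean up when done
```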

Here’s the rewrite log showing that it works :

[blah blah blah] (2) init rewrite engine with requested uri /toto.htm
[blah blah blah] (3) applying pattern '^/(.*)$' to uri '/toto.htm'
[blah blah blah] (4) RewriteCond: input='/path/to/documentroot//toto.htm' pattern='!-f' => not-matched
[blah blah blah] (1) pass through /toto.htm

Now I can disable this log to save space on my disk.

I must admit I read the description of REQUEST_FILENAME for Apache 2.2 several times before noticing that it was exactly the answer… I’m too used to reading too fast! Thanks to this old post that made me re-read it more slowly! ;)