Running Drupal with Clean URL on Nginx or Lighttpd

First of all, my previous post on Nginx vs. Lighttpd for a small VPS seems to be a “hit”. Thanks to Glenn for submitting it to programming.reddit.com, where it got picked up by a few bloggers. I got around 1,000 unique visitors on the day the post went live, which I guess is pretty good for this one-post-a-week site of mine.

Since I briefly touched on the “URL rewrite-ability” of Nginx and Lighttpd in my previous post, I think it might be useful to have some examples showing how rewrite rules are written on these web servers to support clean URLs. I will take the open source CMS Drupal as an example, as it is what Hosting Fu runs on. Btw, Drupal 5.0 has just been released and it rocks.

Prerequisite

These are the things that I assume you would know before reading this article.

  • Why clean URL. Pick your reason. SEO or general dislike of query string.
  • Setting up PHP. I won’t be talking about how to set up PHP/FastCGI on Nginx or Lighttpd. Here’s one for Nginx and one for Lighttpd.
  • Installing Drupal. Check related section in Drupal handbook.

I am only going to discuss the rewriting rules needed to enable clean URL in Drupal on either Nginx or Lighttpd.

Apache’s Mod_Rewrite

Like most open source PHP applications, Drupal comes with a .htaccess file that assumes Apache is serving the pages. We will use it as the reference for how the rewrite rules can be written for the other two web servers.

Here’s the bit in .htaccess that does rewrites:

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]

What it does is:

  • If requested file exists, serve it.
  • If requested directory exists, serve it depending on how index option is configured.
  • Otherwise, send all requests to index.php, setting parameter ‘q’ as the path of the original request, and then append the rest of the query string.
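To make those three bullet points concrete, here is a rough sketch of the same decision in Python. It is only an illustration of the logic, not how Apache implements it; the function name `rewrite_like_htaccess` and the `exists` parameter are made up for this example.

```python
import os


def rewrite_like_htaccess(path, query="", docroot=".", exists=os.path.exists):
    """Return the URL that would be served for `path` under Drupal's rules.

    `exists` stands in for the two RewriteCond checks (!-f and !-d).
    """
    # In .htaccess context the leading slash is not part of the match.
    relative = path.lstrip("/")

    # Existing files and directories are served as they are.
    if exists(os.path.join(docroot, relative)):
        return path + ("?" + query if query else "")

    # RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
    rewritten = "/index.php?q=" + relative
    if query:
        # QSA: append the original query string after the new one.
        rewritten += "&" + query
    return rewritten
```

With a document root that contains only robots.txt, a request for /robots.txt is left alone, while /node/1?page=2 becomes /index.php?q=node/1&page=2.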

Simple. Now let’s see how you can do the same with the other two web servers.

Lighttpd

This site runs on Lighttpd 1.4.13, and there seem to be a lot of different ways to do clean URL with Drupal on Lighttpd. Although it comes with its own “mod_rewrite”, I found the functionality very limited. The biggest problem is the lack of conditional rules to check whether a file or directory already exists. In the end you have to either (1) make lots of exceptions in Lighttpd’s rewrite rules, or (2) modify Drupal so it works better with Lighttpd.

I went with the second option.
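For comparison, option (1) usually ends up as a list of regular expressions that guess whether a URL is a Drupal path or a real file. The sketch below simulates that idea with Python's `re` module. The three patterns are an assumption on my part, modelled on rules commonly passed around for Drupal, and `lighty_rewrite` is a made-up name; it is not Lighttpd's own code.

```python
import re

# (pattern, replacement) pairs, tried in order; the first match wins.
RULES = [
    (r"^/system/test/(.*)$", r"/index.php?q=system/test/\1"),
    # No dot in the path, followed by a query string.
    (r"^/([^.?]*)\?(.*)$", r"/index.php?q=\1&\2"),
    # No dot in the path and no query string.
    (r"^/([^.?]*)$", r"/index.php?q=\1"),
]


def lighty_rewrite(uri):
    """Rewrite `uri` (path plus query string) using the first matching rule."""
    for pattern, replacement in RULES:
        match = re.match(pattern, uri)
        if match:
            return match.expand(replacement)
    # Anything containing a dot is assumed to be a real file and left alone.
    return uri
```

Here /node/1?page=2 is rewritten to /index.php?q=node/1&page=2, while /misc/drupal.js passes through untouched. The catch is that "contains a dot" is only a guess at "is a real file", which is where the exceptions start piling up.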

Well, here are the steps.

  1. After Drupal has been installed and tested with clean URL disabled, add the following rule to lighty’s configuration file:

    server.error-handler-404 = "/index.php";
    

    It basically makes index.php the 404 error handler, so any request that is not handled by a local file or directory will be sent to Drupal.

  2. Add the following PHP code to Drupal. I just appended it to the end of sites/default/settings.php.

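    // Only apply this workaround when running behind Lighttpd.
    if (strpos($_SERVER['SERVER_SOFTWARE'], 'lighttpd') !== false) {
        // Split the original request into its path and query parts.
        $_lighty_url = $base_url.$_SERVER['REQUEST_URI'];
        $_lighty_url = @parse_url($_lighty_url);
    
        // Any path other than / or /index.php means we were invoked as the 404 handler.
        if ($_lighty_url['path'] != '/index.php' && $_lighty_url['path'] != '/') {
            // Lighttpd leaves QUERY_STRING unset here, so rebuild it and the request arrays.
            $_lighty_qs = isset($_lighty_url['query']) ? $_lighty_url['query'] : '';
            $_SERVER['QUERY_STRING'] = $_lighty_qs;
            parse_str($_lighty_qs, $_lighty_query);
            foreach ($_lighty_query as $key => $val)
                $_GET[$key] = $_REQUEST[$key] = $val;
            // Drupal expects the requested path in q, without the leading slash.
            $_GET['q'] = $_REQUEST['q'] = substr($_lighty_url['path'], 1);
        }
    }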
    

    Let me explain what it does:

    • If we are behind lighty, turn on this hack (I use Nginx for my development box so this code does not apply).
    • Try to parse the REQUEST_URI. If invoked as 404 error handler, we will parse the QUERY_STRING ourselves and copy the values to PHP’s $_GET and $_REQUEST variables.
    • Also set the path bit of REQUEST_URI as query argument q.

    We have to parse QUERY_STRING ourselves because Lighttpd deliberately does not set it when FastCGI is invoked as the 404 error handler.

  3. Restart lighty, go to Drupal to enable clean URL and see whether it works.

Well, it has been working fine for me, but YMMV.
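If the settings.php snippet above is hard to follow, the transformation it performs can be sketched in a few lines of Python. This is only a model of the logic for trying out inputs, not something to deploy; `rebuild_get` is a made-up name, and it assumes Drupal lives at the root of the site.

```python
from urllib.parse import parse_qsl, urlsplit


def rebuild_get(request_uri):
    """Derive Drupal's GET parameters from REQUEST_URI, as the hack does.

    Returns None for normal requests, where nothing needs to be rebuilt.
    """
    parts = urlsplit(request_uri)

    # Direct hits on / or /index.php did not go through the 404 handler.
    if parts.path in ("/", "/index.php"):
        return None

    # Parse the query string ourselves, then add the path as q
    # without its leading slash.
    params = dict(parse_qsl(parts.query))
    params["q"] = parts.path[1:]
    return params
```

For example, /node/5?page=2 yields page=2 and q=node/5, which is exactly what Drupal would have received from Apache's rewrite rule.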

Nginx

Nginx comes with conditional code for rewrite rules so it is much easier. I basically have the following code in my Nginx configuration file to emulate Apache’s behaviour.

if (!-e $request_filename) {
    rewrite ^/(.*)$ /index.php?q=$1 last;
}

That’s it! Restart Nginx, and you can now turn on clean URL in Drupal.

However, Nginx is not perfect, and so far I have found one small issue with its rewrite engine. When you use a regular expression in Nginx’s rewrite rules, it will try to encode the matches in the replacement URL. So far I have seen it break Drupal’s search module. For example, if you search for “Hosting Fu”, Drupal will use the following URL:

GET /search/node/Hosting+Fu HTTP/1.1

However, Nginx will rewrite that to:

GET /index.php?q=search/node/Hosting%2BFu HTTP/1.1

Note that it encoded ‘+’ as ‘%2B’, which confuses Drupal into thinking that you are actually searching for the phrase ‘Hosting+Fu’. With Apache, ‘+’ passes through the rewrite rules untouched.
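The difference is easy to reproduce outside of Drupal. The sketch below uses Python's `urllib.parse.parse_qs`, which decodes a query string by the usual form-encoding rules (a literal ‘+’ means a space, ‘%2B’ means a plus sign). As far as I know PHP fills `$_GET` by the same rules, but treat that parallel as an assumption; `drupal_q` is a made-up helper name.

```python
from urllib.parse import parse_qs


def drupal_q(query_string):
    """Return the decoded value of the q parameter."""
    return parse_qs(query_string)["q"][0]


# What Apache hands over: '+' is decoded to a space.
print(drupal_q("q=search/node/Hosting+Fu"))    # search/node/Hosting Fu

# What Nginx hands over: '%2B' is decoded to a literal plus sign.
print(drupal_q("q=search/node/Hosting%2BFu"))  # search/node/Hosting+Fu
```

So the search module ends up looking for one word containing a plus sign instead of two words.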

Conclusion

Many open source PHP applications that I have experience with assume the existence of Apache, and provide clean URL only to Apache users. On the other hand, developers of other frameworks like Ruby on Rails, Django, Webpy, etc. take clean URL for granted because it is something handled right inside the framework. It makes the life of the web server guy much easier: what rewrite? Just proxy or pass through the whole damn thing!

I am hoping more and more PHP applications will use simplified rewrite rules. Let applications themselves take care of parsing the REQUEST_URI, instead of generating a million lines of Apache mod_rewrite rules and dumping them into .htaccess files.

Meanwhile, Nginx users will have a much easier time porting those rules than the Lighttpd users.

Comments

Gravatar

Thanks for a very useful article. Have you tried this on Drupal 5? Also have you tried increasing performance even further with third party caching software or do Drupal’s built in performance options do the job? Thanks!

Gravatar

Will — I have not upgraded my sites to Drupal 5 yet, but from the .htaccess file, the rules should be the same for 4.7 and 5.

I think Drupal’s built-in caching is good enough for anonymous users. I haven’t had a busy community site to test out how well it scales with multiple user sessions, i.e. bypassing the cache.

Gravatar

Well when I eventually get lighty installed on my test rig I’ll be able to confirm it working with Drupal 5 hopefully. I’m planning a community site which needs to be able to scale up quickly so I’m looking for the best PHP optimiser for use with Drupal to help with load. That and also to help me get away with a cheaper VPS initially ;)

Gravatar

I can’t seem to enable clean URLs in Drupal 5 + nginx.

I have set up everything as per your documentation. When I open the Drupal clean URL settings page and click on the clean URL test, it works, but the option to enable clean URL is still disabled. So I can’t enable clean URLs even though they work.

Any clues ?

Gravatar

The configuration example was only partial. First of all, you need to put the rewrite rules under server { } section of the configuration file. For example,

http {
  ...
  ...
  server {
    server_name www.my-drupal-site.com;  
    if (!-e $request_filename) {
      rewrite ^/(.*)$ /index.php?q=$1 last;
    }
  }
}

I am running Drupal 5 on Nginx on my devel box and it has no issue doing clean URL.

Gravatar

Thanks, this worked out great for me with lighttpd 1.4.11 and drupal 5.1. It’s clever in the simplicity of the work around (not that I ever would have thought of it.)

Gravatar

The example for lighttpd worked great for me on lighty 1.4.13 and PHP 5.2.1-pl3-gentoo. Just wanted to say thanks!

Gravatar

Dude, you’re a genius - thanks!

Gravatar

nginx version: nginx/0.5.17 drupal 5.1

When attempting to run the clean url test the Enable/Disable box never gets enabled. What am I missing?

    location / {
            root /var/www;
            index index.php;

            if (!-e $request_filename) {
                    rewrite ^(.*)$ /index.php?q=$1 last;
                    break;
            }

    }

If I recall, I fixed this last time by going into the database and manually changing the clean URL field, but now I cannot seem to locate it.

Thnx

Gravatar

Hi,

Lxadmin (lighty) now comes integrated with permalink configuration for different apps. Currently we support wordpress and drupal, and we will be adding all the apps depending on the requests we get in our forum.

A user can go to domain home -> permalink, choose the application and the path, and lxadmin will add the corresponding lighty rewrite rule/404 to the lighty config. Currently only one app per domain is possible though.

Thanks.

Gravatar

"^/system/test/(.*)$" => "/index.php?q=system/test/$1",
"^/([^.?]*)\?(.*)$" => "/index.php?q=$1&$2",
"^/([^.?]*)$" => "/index.php?q=$1"

I used the re-write rules above to get lighttpd to give me clean drupal URLs.

Gravatar

"^/system/test/(.*)$" => "/index.php?q=system/test/$1", "^/([^.?]*)\?(.*)$" => "/index.php?q=$1&$2", "^/([^.?]*)$" => "/index.php?q=$1"

Ooops, I messed up the code block.

Gravatar

Great script.

Just something I noticed, the Drupal rule for lighttpd seems to work great except it prevents running update.php for some reason. I had to comment it out in order to get update.php to work. It would just return to the initial update.php screen instead of passing the form, and running the updates.

Gravatar

This code has been modified to allow update.php to be run.

if (strpos($_SERVER['SERVER_SOFTWARE'], 'lighttpd') !== false) {
    $_lighty_url = $base_url.$_SERVER['REQUEST_URI'];
    $_lighty_url = @parse_url($_lighty_url);

    if ($_lighty_url['path'] != '/index.php' && $_lighty_url['path'] != '/update.php' && $_lighty_url['path'] != '/') {
        $_SERVER['QUERY_STRING'] = $_lighty_url['query'];
        parse_str($_lighty_url['query'], $_lighty_query);
        foreach ($_lighty_query as $key => $val)
            $_GET[$key] = $_REQUEST[$key] = $val;
        $_GET['q'] = $_REQUEST['q'] = substr($_lighty_url['path'], 1);
    }
}

Gravatar

Where do I locate .htaccess file in Drupal?

Gravatar

thank you.

Gravatar

Hello,

I successfully enabled clean urls in my portal. Thanks :)

Gravatar

For the lighttpd code, taxonomy autocomplete feature doesn’t work. I have done some investigations and it seems to fail because of encoding of the comma “,” character.

After doing some tests, the following code seems to work right using the urldecode() function to convert from %XX to special characters:

if (strpos($_SERVER['SERVER_SOFTWARE'], 'lighttpd') !== false) {
    $_lighty_url = $base_url.$_SERVER['REQUEST_URI'];
    $_lighty_url = @parse_url($_lighty_url);

    if ($_lighty_url['path'] != '/index.php' && $_lighty_url['path'] != '/'
            && $_lighty_url['path'] != '/update.php') {
        $_SERVER['QUERY_STRING'] = $_lighty_url['query'];
        parse_str($_lighty_url['query'], $_lighty_query);
        foreach ($_lighty_query as $key => $val)
            $_GET[$key] = $_REQUEST[$key] = urldecode($val);
        $_GET['q'] = $_REQUEST['q'] = urldecode(substr($_lighty_url['path'], 1));
    }       
}
Gravatar

Regarding Nginx and the problem with encoding the ‘+’ as in the search module, this is no longer a problem in the current ‘stable’ release of Nginx which is 0.6.x.
