Tom Cannaerts

Using RewriteMaps for better performance

Most PHP developers know Apache’s mod_rewrite, and have likely written at least a few custom RewriteRule directives in .htaccess. In this post, I want to introduce you to RewriteMap, a directive that lets you use key-value lookups in your rewrite rules. Maps are easy to use, fast and very maintainable. Do keep in mind that you will need to be able to modify the Apache config / vhost, as the RewriteMap directive itself is not allowed in .htaccess. You can use a map in .htaccess, you just can’t define it there.

Getting started

The easiest way to explain how RewriteMap works is to use an example. In the example I’ll be using, we’re running a webshop at http://shop.tom.be. A product URL might look like http://shop.tom.be/products/53846, where 53846 is the id of the product record in our database.

As business flourishes, a migration to a new eCommerce solution is needed. With this new eCommerce solution comes a new URL structure. Our beloved product 53846 suddenly becomes http://www.tom.be/shop/hoodies/toms-adventures-hoodie-black. As we don’t want to lose too much of our SEO ranking by showing loads of 404 page-not-found errors, we need to redirect the old URLs to the new ones.

Using the classic method of RewriteRule, we could easily come up with this:

RewriteRule ^products/53846$ http://www.tom.be/shop/hoodies/toms-adventures-hoodie-black [R=301]

This might just work for a small number of URLs, but if we have to do this for hundreds or even thousands of URLs, it becomes very hard to maintain, and performance would probably suffer because every rule has to be checked in turn until one matches. To overcome both problems, you can use a RewriteMap. The RewriteMap directive looks like this:

RewriteMap <name> <type>:<source>

<name> is the name you will be using in your RewriteRules to refer to the map. <type> defines how the map will be accessed, and <source> defines where the map can be found.

Let’s start simple

RewriteMap productmap txt:/path/to/products.txt

# /path/to/products.txt
53845 t-shirts/developers-idiots-vs-universe
53846 hoodies/toms-adventures-hoodie-black
53847 hoodies/toms-adventures-hoodie-white

No rocket science there. We define a map named productmap, which is a plain text map. Each line of the file contains a key, followed by whitespace, followed by the value we want to return. Our RewriteRule (yes, just one) would look like this:

RewriteRule ^products/([0-9]+)$ http://www.tom.be/shop/${productmap:$1} [R=301]

The server will now look up the id captured in $1 in the map, and use the value returned by productmap to construct the URL. The server will also cache the lookup, so that future requests are processed faster. Lookups remain cached until either the map file is changed or Apache is restarted.
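One thing the rule above does not cover is an id that is missing from the map. In that case the lookup returns an empty string, and the visitor would be redirected to http://www.tom.be/shop/ with nothing after it. A minimal sketch of two ways to deal with that, assuming the rule lives in .htaccess like the one above (the all-products path is a made-up example, not part of the shop described here):

```apache
# Only redirect when the map returns a non-empty value for the captured id.
# %1 refers to the group captured by the RewriteCond, i.e. the map's value.
RewriteCond ${productmap:$1} ^(.+)$
RewriteRule ^products/([0-9]+)$ http://www.tom.be/shop/%1 [R=301,L]

# Alternative: put a default after a pipe, so unknown ids land on a page
# of your choice ("all-products" is a hypothetical path).
# RewriteRule ^products/([0-9]+)$ http://www.tom.be/shop/${productmap:$1|all-products} [R=301,L]
```

With the first variant, unknown ids simply fall through to whatever rules come next, which is probably what you want if the old shop should still answer with a proper 404.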

While txt files are very easy to use and read, performance is not optimal, as the server needs to scan the file to find a match. To speed things up, you can convert your txt file to a DBM database file, which is indexed. To convert a txt file to a dbm file, you can use the httxt2dbm program that comes with Apache.

httxt2dbm -i products.txt -o products.dbm

Our RewriteMap has to be changed to the dbm type and point to the .dbm file as follows:

RewriteMap productmap dbm:/path/to/products.dbm

Whenever you change the .txt file, you will need to re-run the httxt2dbm program to generate the .dbm file again. There’s no need to restart Apache as it will notice when the .dbm file has changed.
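To tie this back to the remark at the start about where a map can be defined, here is a rough sketch of how the pieces might be split between the vhost and .htaccess. The vhost details (port, ServerName) are assumptions for illustration, and I’m assuming .htaccess overrides for mod_rewrite are already enabled for the site.

```apache
# Server config, e.g. the vhost for the old shop (details are assumptions)
<VirtualHost *:80>
    ServerName shop.tom.be

    RewriteEngine On
    # The map has to be defined here; it is not allowed in .htaccess.
    # Regenerate the file after each change to products.txt:
    #   httxt2dbm -i products.txt -o products.dbm
    RewriteMap productmap dbm:/path/to/products.dbm
</VirtualHost>
```

```apache
# .htaccess in the document root of the old shop
RewriteEngine On
# Using the map is fine here, only defining it is not.
RewriteRule ^products/([0-9]+)$ http://www.tom.be/shop/${productmap:$1} [R=301,L]
```

If you would rather keep the RewriteRule in the vhost as well, remember that in that context the pattern is matched against the full URL path, so it would have to start with ^/products/ instead of ^products/.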

That’s it?

Glad you asked. Actually, there’s even more. The rnd type allows you to work with random values, and the fastdbd type even allows you to use SQL SELECT queries to do the lookup. The prg type allows you to pass data to an external program to make the match.
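To give an idea of what the rnd type looks like in practice, here is a small sketch. The map file, the static key, the hostnames and the images path are all made up for the example. Each line in an rnd map holds a key followed by alternatives separated by a pipe, and one of them is picked at random on every lookup.

```apache
# /path/to/servers.txt (hypothetical file)
# static static1.tom.be|static2.tom.be

# In the server config / vhost:
RewriteMap servers rnd:/path/to/servers.txt

# Send image requests to one of the listed hosts, picked at random.
RewriteRule ^images/(.+)$ http://${servers:static}/images/$1 [R=302,L]
```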

However… the main thing to ask yourself is whether these are things you would want to be doing using mod_rewrite. When running a query or calling a program, you will probably want more control over the values that are passed in, as they might pose a security risk when tampered with. If your use case goes beyond simple key-value lookups, you are probably better off writing a CGI/PHP script to handle these requests.
