Stupid Apache Tricks

First off, let me briefly apologize to anyone who clicked through to this article expecting to see pictures of Native Americans performing bizarre stunts such as swallowing goldfish, fighting the noble polar bear or drinking entire cans of Lysol and surviving to tell the tale while really, really high. On Lysol.

This article, rather, is about the Apache webserver, an open-source workhorse that powers many of the sites you enjoy on the intarwebs each and every day. Like that site with the adorable kittens with really poor grammar. Or more importantly, the BBC website. Or most importantly, this website.

Fellow nerds who, like your humble narrator, are engaged in the practice of developing websites for the continued awesomeness of these great internets of ours may find great value in the content to follow. This tutorial may be of particular interest to practitioners of something I shall refer to as “Rails development,” which (believe it or not!) has everything to do with websites and nothing to do with trains.

For you see, it has been my experience that while there have been several blog posts, articles, wiki pages, mailing list threads, mash notes and carrier pigeon-borne field manuals written on the topic of setting up a local webserver for the purposes of developing websites, there is currently no writing on the subject of how my local Apache environment is set up. And as I find my webserver setup to be elegant, useful and thoughtfully organized, I think it is well past time I get up on my high horse and tell you people how it’s done.

In short: if you are not a web developer you will want to get out while you still can. For serious nerdery follows after the jump.

Rolling Yer Own Apache

First off, let me say that building Apache 2.2.x is one of my least favorite things in the universe, but I find it on balance to be way simpler and easier than the alternative, which is teaching Mac OS X Leopard’s bundled Apache install to do what I want it to do. I’m generally the kind of anal-retentive malcontent who installs things from source, so if that’s not your particular cup of tea, you can skip down to the section on configuration provided you’ve already got a usable Apache install, or a non-source way to install one (such as apt-get or MacPorts).

Leopard does ship with Apache 2.2.6 preinstalled, but with an odd, incomplete configuration that’s optimized for some of OS X’s built-in Web Sharing magic, which personally I can do without. (Though if you happen to use Web Sharing for anything important, please note that rolling your own Apache will require that you disable Web Sharing, at least while your custom-built server is running.) So let’s build our own Apache from scratch.

Just so you know, we’re gonna be installing our new stuff in /usr/local. You can read Dan Benjamin’s blog post about why this is a good idea, or you can just trust me that this is a good idea.

Also, you will need to install Apple’s Xcode developer tools before going any further. (Obviously, if you’ve already installed Xcode you’re ready to rock.) If you (like me) are a TextMate user, I also strongly recommend installing the mate command line tool so you can invoke TextMate’s ninja-like editing power from the Terminal.

Building Apache

First, let’s create a folder in our home directory and cd into that new folder. This will be our workspace for all our downloading and compiling needs. (You can delete this folder after we’re done, if you want.)

mkdir -p ~/src
cd ~/src

Next, we need to download and unzip a current copy of the Apache source code:

curl -O http://mirrors.ibiblio.org/pub/mirrors/apache/httpd/httpd-2.2.8.tar.gz
tar xzvf httpd-2.2.8.tar.gz
cd httpd-2.2.8

Then we’ll need to configure our Apache install. We’re going to tell it to install into /usr/local/apache2, and to include a fairly typical set of modules, including mod_proxy, mod_rewrite and SSL support. If you need any modules other than the ones listed here, feel free to season this to taste. I find that pretty much any module I need (apart from the ones specifically listed) is covered by the --enable-modules=most option. (Mind the spelling: Apache 2.2’s configure script takes --enable-modules and --enable-mods-shared, not the --enable-module and --enable-shared flags from the Apache 1.3 days.)

./configure \
--prefix=/usr/local/apache2 \
--enable-modules=most \
--enable-mods-shared=most \
--with-mpm=prefork --enable-ssl \
--enable-proxy --enable-proxy-http \
--enable-rewrite \
--enable-headers

Next we compile…(this part may take a while)

make

And when that finishes, we install:

sudo make install

Clearing a PATH

And when that finishes, we’re done, right?

Well, pretty much. We’ve installed Apache 2.2.8, and it should be ready to start up. But our Apache is, as I said, in /usr/local/apache2/bin, and unless you want to type that whole path to the Apache command line tools every time you start or stop your server, you’ll want to add Apache to your PATH.

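To do this, edit your user’s .bash_login file by typing the following command (if you don’t have TextMate, you can use vi, emacs or whatever other editor you prefer). One caveat: if you already have a ~/.bash_profile, bash reads that file and ignores .bash_login, so make the change there instead.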

mate ~/.bash_login

This will open the file if it already exists, or open a blank file if it doesn’t. Add the following line at the very end of the file:

export PATH="/usr/local/bin:/usr/local/sbin:/usr/local/apache2/bin:$PATH"

Now save and close the file. Then, to make sure the changes are picked up by the shell, execute the file:

. ~/.bash_login

At this point, when you run the command which apachectl (i.e., asking where the apachectl program is, if installed), it should return /usr/local/apache2/bin/apachectl.

If it says anything other than that, you did something wrong. Likewise, if you ran into any errors during the process, you probably did something wrong. Like, you know, not installing Xcode like I told you to.
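If you’d rather have the shell do the comparing for you, here’s a minimal sketch of that same check. The function name check_apachectl is something I made up for this example (it is not part of Apache), and the default path assumes the /usr/local/apache2 prefix used above.

```shell
# check_apachectl: hypothetical helper, not part of Apache.
# Compares the apachectl found on PATH with the one we expect to be using.
# Usage: check_apachectl [EXPECTED_PATH]
check_apachectl() {
  expected="${1:-/usr/local/apache2/bin/apachectl}"
  # command -v prints the path the shell would run, or nothing if not found.
  found="$(command -v apachectl 2>/dev/null || true)"
  if [ "$found" = "$expected" ]; then
    echo "ok: $found"
  else
    echo "mismatch: found '${found:-nothing}', expected '$expected'"
    return 1
  fi
}
```

Run check_apachectl with no arguments after reloading your shell config. A "mismatch" line usually means the PATH line didn’t get picked up, or that another apachectl sits earlier in your PATH.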

But I know you, Gentle Reader, and I know you wouldn’t make such foolish, careless mistakes. So I will from this point on assume that you are now the proud installer and maintainer of your own custom-built Apache 2.2.8.

So onward to configuration! 

I’m In Ur httpd.confs, Confing Ur Httpds

[Image: David's Apache conf directory structure]

First off, why don’t you take a peek at my httpd.conf file?

Now let’s talk about structure, specifically that of your newly created /usr/local/apache2/conf directory which contains all of your various .conf files. I’m pretty sure the original and extra folders are created as part of the default install. If I sound unsure it’s because I never use those.

You can see an image of my conf directory layout here on the right. I always create three subdirectories in there:

  • includes: This is for bits of reusable/repetitive configuration, like enabling SSL or specifying rewrite rules for a given Rails application. Basically these .confs are pre-defined behaviors, which I can drop into a VirtualHost or the main server config with a simple include statement, such as

    Include conf/includes/generic_ssl.conf

  • ssl: This stores all my local SSL certificates (which are all self-signed). We’ll talk about this folder more in just a moment.

  • vhosts: If I may direct your attention to the last few lines of my httpd.conf, you’ll note that there are a couple of NameVirtualHost directives (for my local IP, on ports 80 and 443) and a wildcarded Include directive which basically slurps in every file in the conf/vhosts folder. Rather than having one enormous config file in which I define every single virtual host and configure every single server feature, I prefer to split my vhosts into separate config files by project. For example, t26.conf contains the two VirtualHost blocks (one secured, one not) for my local dev environment for T-26.
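To make that layout concrete, here’s a rough sketch in shell. Both function names are invented for this example, and 127.0.0.1 is my stand-in for "my local IP"; the directory names and the wildcarded Include are the ones described in the list above.

```shell
# Hypothetical helpers illustrating the conf layout described above.

# Create the three subdirectories inside an Apache conf directory.
# Usage: make_conf_dirs [CONF_DIR]
make_conf_dirs() {
  conf_dir="${1:-/usr/local/apache2/conf}"
  mkdir -p "$conf_dir/includes" "$conf_dir/ssl" "$conf_dir/vhosts"
}

# Print the sort of lines that close out httpd.conf: name-based virtual
# hosting on ports 80 and 443 (127.0.0.1 is an assumed address), followed
# by one Include that pulls in every per-project vhost file.
print_vhost_includes() {
  cat <<'EOF'
NameVirtualHost 127.0.0.1:80
NameVirtualHost 127.0.0.1:443
Include conf/vhosts/*.conf
EOF
}
```

Since /usr/local/apache2/conf is owned by root, you’d call make_conf_dirs from a root shell (or just run the equivalent mkdir -p with sudo). The Include path is relative to the server root, which is why it starts with conf/ rather than a full path.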

I find that 90% of the configuration business I do with my local Apache revolves around vhosts, so let’s talk a bit more about this VirtualHost business, shall we?

Virtual Insanity

If you take a look at the vhost conf file for a current project of mine, you’ll notice right off the bat that even though this is a local Apache install, I’m using a real domain name — mim.local.practical.cc.

I host and self-manage the DNS for practical.cc at DNS Made Easy, and to hook this up I created an A record for *.local.practical.cc, pointing to 127.0.0.1. It’s a wildcard, enabling me to create an unlimited number of local hostnames for all my many projects.

Not only is this convenient and easy to set up, it allows me to maintain a naming convention for all my various project sites and resources. For example, I have a similar wildcard set up for *.beta.practical.cc, which points to a similar Apache/Mongrel hosting environment on my demo server. There’s also *.dev.practical.cc, which maps to the Warehouse SVN browser for a given project. The wildcard for each of these URLs is the project’s “short name” (“mim”), which is also the name of the Rails app folder (both locally and online), the prefix for its database names (e.g. mim_production), the name of its SVN repository and — of course — the name of its .conf file in Apache.
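Spelled out as a few lines of shell, the convention looks something like this. The project_names function is purely illustrative (I don’t actually run anything like it); it just prints the names that fall out of a given short name.

```shell
# project_names: hypothetical helper that spells out the naming convention.
# Given a project's short name, print the names derived from it.
project_names() {
  short="$1"
  echo "local_host=${short}.local.practical.cc"
  echo "beta_host=${short}.beta.practical.cc"
  echo "svn_browser=${short}.dev.practical.cc"
  echo "production_db=${short}_production"
  echo "vhost_conf=conf/vhosts/${short}.conf"
}

project_names mim
```

The point of the exercise: once you know the short name, you never have to remember (or look up) any of the others.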

I use virtual hosts pretty much only for Rails apps, and I have a fairly standard configuration template I use for proxying Apache requests over to Mongrel processes. You can check out the rewrite rules for the mim project here. With the obvious exceptions of the port number and the location of the log file, the rewrite rules are the same for every project.

The rewrite rules live in their own project-specific .conf file, which lives in conf/includes and is slurped into the app’s virtual host config via the Include directive. I happen to like the shorter, cleaner configuration this approach yields, but the real reason for this is way more important: it keeps me from having to write the same rewrite rules twice when I set up the SSL version of a given app.
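Here’s a minimal sketch of what such a per-project include might contain, written as a shell function that prints the file. To be clear about the guesswork: the function name is invented, the ErrorLog directive is my assumption (all I said above is that the log location varies), and the rules are a common pattern for proxying to a single Mongrel, not a copy of my actual file.

```shell
# print_rewrite_include: hypothetical generator for a per-project include.
# Usage: print_rewrite_include PORT LOGFILE
print_rewrite_include() {
  port="$1"
  logfile="$2"
  cat <<EOF
# Assumed log directive; only the log's location differs per project.
ErrorLog "$logfile"

RewriteEngine On
# If the request is not a static file under DocumentRoot, proxy it to
# the Mongrel process listening on port ${port}.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^/(.*)\$ http://127.0.0.1:${port}%{REQUEST_URI} [P,L]
EOF
}
```

Something like print_rewrite_include 3000 logs/mim_error.log > mim_rewrite.conf would produce a file to drop into conf/includes (the port, log path and file name are all just examples). The [P] flag is what hands the request to mod_proxy, which is why we built Apache with --enable-proxy and --enable-proxy-http.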

Warm, Secure and DRY

Ah, SSL — we can’t live with you, yet we can’t live without you. A full, detailed discussion of all the ins and outs of server security is way, way beyond the scope of this writing. My interest here is rather just to show you how I address my primary SSL need on my local Apache install, namely setting up and testing Rails’s ssl_requirement plugin in my clients’ apps.

Let me tell you a couple things about self-signed certificates. A self-signed certificate is just like the expiring, $20+ per year “real” certificates you would buy from a “Certificate Authority” such as Verisign or GoDaddy in all but two respects:

  • You, while lovely and brilliant to be sure, are not a Certificate Authority. So your self-signed certificates will cause browsers to throw up a nasty warning dialog at least once per session. Once you’ve accepted the certificate, the browser will behave normally.

  • It’s free. Which is a good damn thing, since what we’re going to want here is a wildcard certificate (to go with our wildcard domain), which usually retails for $200 or more at even a bargain-basement CA like GoDaddy.

Rather than bore you with another long series of Terminal commands, I’ll just direct you to this short, sweet tutorial on creating self-signed certificates. The only things I did differently from these instructions were:

  • To make the wildcard URLs work, I entered *.local.practical.cc as my domain name.

  • Instead of server.* for all the filenames, I used localhost.*. (I may at some point need or want to use a separate certificate, perhaps even a real certificate from a CA, on my local machine, so the nomenclature helps specify which certs are for which hosts.)

  • On Step 4, “Generating a Self-Signed Certificate,” openssl on my Mac threw some kind of random number error. But it did generate a usable .crt file, so I ignored it and everything works fine.
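If you just want something to start from, a self-signed wildcard certificate can also be produced with a single openssl command. This is a condensed alternative, not the linked tutorial’s exact steps; the function name, the 2048-bit key size and the 365-day lifetime are my assumptions for the sketch.

```shell
# make_wildcard_cert: hypothetical helper; a one-command alternative to the
# multi-step tutorial, not a copy of it.
# Usage: make_wildcard_cert OUTPUT_DIR [DOMAIN]
make_wildcard_cert() {
  out_dir="$1"
  domain="${2:-*.local.practical.cc}"
  mkdir -p "$out_dir"
  # -x509 emits a self-signed certificate instead of a signing request;
  # -nodes leaves the key unencrypted so Apache can start without a prompt.
  openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -keyout "$out_dir/localhost.key" \
    -out "$out_dir/localhost.crt" \
    -subj "/CN=$domain"
}
```

Pointing it at a scratch directory first (say, make_wildcard_cert ~/src/ssl) and then copying localhost.crt and localhost.key into conf/ssl keeps you from needing root until the very last step.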

There are a couple lines of SSL configuration back in httpd.conf, but most of the magic happens in my generic SSL configuration include, a 14-line file which enables SSL for the current domain/vhost, specifies my wildcarded certificate and private key files, and (importantly, for Rails apps) sets the X_FORWARDED_PROTO header Rails needs to distinguish proxied requests as having come from an https request rather than plain old http.

I can then include this file into any of my secure VirtualHosts (the ones listening on 127.0.0.1:443) that have ServerNames set to one of my *.local.practical.cc domains, thus enabling local SSL in one line of code.
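Here’s roughly how those pieces fit together, again as shell functions that print config. Treat it as a sketch under assumptions: the function names are invented, the SSL include is a shortened stand-in for my real 14-line file, and the ${short}_rewrite.conf file name and the document root argument are hypothetical.

```shell
# Hypothetical generators; file names and paths are assumptions.

# The shared SSL include (a shortened stand-in for the real file).
print_generic_ssl() {
  cat <<'EOF'
SSLEngine on
SSLCertificateFile conf/ssl/localhost.crt
SSLCertificateKeyFile conf/ssl/localhost.key
# Tell the Rails app behind the proxy that the original request was https.
RequestHeader set X_FORWARDED_PROTO "https"
EOF
}

# One project's vhost file: a plain host and a secure host that share the
# same rewrite include, so the proxy rules are written only once.
# Usage: print_vhost_pair SHORT_NAME DOCUMENT_ROOT
print_vhost_pair() {
  short="$1"
  docroot="$2"
  cat <<EOF
<VirtualHost 127.0.0.1:80>
  ServerName ${short}.local.practical.cc
  DocumentRoot "$docroot"
  Include conf/includes/${short}_rewrite.conf
</VirtualHost>

<VirtualHost 127.0.0.1:443>
  ServerName ${short}.local.practical.cc
  DocumentRoot "$docroot"
  Include conf/includes/generic_ssl.conf
  Include conf/includes/${short}_rewrite.conf
</VirtualHost>
EOF
}
```

The only difference between the two blocks is the port and that one extra Include line, which is the whole DRY payoff. RequestHeader comes from mod_headers, hence the --enable-headers flag back at configure time.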

More Stupid Apache Tricks

I think I mentioned at least once that real security matters and production server setup were beyond the scope of this article, and they are. But there are a couple of little moves I’ve picked up recently that have been super-helpful.

Anyone who’s fumbled around with Apache virtual hosting for a while probably knows that once NameVirtualHost is enabled, Apache basically starts ignoring its default configuration and instead relies on your various VirtualHost blocks to determine which content to serve for a given request. Generally, for a current project, this will behave as expected — you’ll have a vhost for the project, and Apache will serve up that site.

Occasionally, however, you’ll (that’s to say, I’ll) have someone — like the Googlebot — try to navigate to a disused project URL, which will annoy the visitor at best or throw up a mess of crazy Rails exceptions at worst.

So after I moved the Practical Creative & Code site to my Blue Box VPS late last week, I decided to add a couple of rules to gracefully handle outdated requests.

Here’s the VirtualHost block for practical.cc (which, I should mention, is a simple static site, not a Rails app):

<VirtualHost *:80>
  DocumentRoot /home/practical/websites/practical.cc/public
  ServerName practical.cc
  ServerAlias www.practical.cc
  ServerAlias beta.practical.cc
  ServerAlias beta.practicalmadness.com
  DirectoryIndex index.php index.html

  RewriteEngine On
  RewriteCond %{SERVER_PORT} ^80$
  RewriteCond %{HTTP_HOST} ^.+\.practical\.cc$
  RewriteRule (.*) http://practical.cc$1 [R=Permanent,L]
</VirtualHost>

Unlike most of my vhosts, this one is defined in httpd.conf ahead of my handy wildcarded conf/vhosts/*.conf include, so it’s always the first available vhost, making it the default handler for requests not being handled by one of the other vhosts.

The really useful magic is in those rewrite rules at the end. In a nutshell, they redirect any request for a subdomain of practical.cc (that is, anything in front of the bare domain) to my company site. This not only gracefully bounces an outdated project URL like http://segurajobs.beta.practical.cc/ to my homepage, it also enforces my preferred domain name format for the company site: practical.cc, as opposed to www.practical.cc or somethingelse.practical.cc.
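If you want to see exactly which hostnames that RewriteCond catches, you can try the same pattern from the Terminal. This is only a sketch: would_redirect is a made-up helper, and grep -E stands in for Apache’s regex engine, which behaves the same way for a pattern this simple.

```shell
# would_redirect: hypothetical helper that applies the same host pattern
# as the RewriteCond above, using grep instead of Apache.
would_redirect() {
  printf '%s\n' "$1" | grep -Eq '^.+\.practical\.cc$'
}

for host in practical.cc www.practical.cc segurajobs.beta.practical.cc \
            beta.practicalmadness.com; do
  if would_redirect "$host"; then
    echo "$host -> redirected to practical.cc"
  else
    echo "$host -> served as-is"
  fi
done
```

Note that the bare practical.cc does not match (the pattern demands at least one character and a dot in front of it), which is what keeps the redirect from looping back on itself. Hosts on other domains, like the practicalmadness.com alias, are also left alone by this particular condition.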

This kind of trick comes in really handy on SSL-secured sites like T-26. T-26’s security certificate is registered to www.t26.com, but until recently the site was also accessible at just t26.com. The problem here was that since the SSL certificate specified the www. as part of the hostname, certain browsers would throw up a confusing error message whenever users tried to navigate to a secure area of the site. Such as the page where they pay T-26 some money. You know, the Checkout page.

The error message made customers wary of giving T-26 money because they weren’t sure if their billing information would be transmitted securely. It would have been, but it’s my experience that most non-power users tend to turn their brains off at the first sign of an error message. It’s also my experience that browser SSL error messages are some of the worst-written, most confusing and most needlessly terrifying error messages you’re likely to find, so it’s not surprising that people would not just trust me when I told them that the “…identity of this website could not be verified” error was harmless.

But I digress. Suffice to say people were not wanting to give us their money. And we like their money, so obviously something had to be done. So I used a similar rewrite trick to force all requests to use the SSL-friendly www.t26.com domain:

# Plain HTTP (port 80): send bare t26.com to www.t26.com
RewriteCond %{SERVER_PORT} ^80$
RewriteCond %{HTTP_HOST} ^t26\.com$
RewriteRule (.*) http://www.t26.com$1 [R=Permanent,L]

# HTTPS (port 443): same redirect, but stay on https
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{HTTP_HOST} ^t26\.com$
RewriteRule (.*) https://www.t26.com$1 [R=Permanent,L]

The regexy difference here is that the T-26 rewrite looks for the absence of a subdomain, whereas Practical’s looks for the presence of one. But the approach is the same even if the result is opposite.
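For the skeptics, here’s the T-26 decision boiled down to a few lines of shell. It’s a sketch that mirrors the two rule blocks rather than anything Apache runs; t26_redirect is an invented name and /checkout is just an example path.

```shell
# t26_redirect: hypothetical helper mirroring the two rule blocks above.
# Prints the redirect target, or nothing when no rule applies.
# Usage: t26_redirect PORT HOST URI
t26_redirect() {
  port="$1"
  host="$2"
  uri="$3"
  # Both rule blocks only fire for the bare domain.
  [ "$host" = "t26.com" ] || return 0
  case "$port" in
    80)  echo "http://www.t26.com$uri" ;;
    443) echo "https://www.t26.com$uri" ;;
  esac
}

t26_redirect 80  t26.com     /checkout   # prints http://www.t26.com/checkout
t26_redirect 443 t26.com     /checkout   # prints https://www.t26.com/checkout
t26_redirect 443 www.t26.com /checkout   # prints nothing: already on www
```

Keeping one rule per port is what lets the redirect preserve the scheme, so nobody who started on https gets bounced down to plain http.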

Coming soon: an updated version of this tutorial featuring info on how to install Phusion Passenger—aka mod_rails—for the best Rails development environment ever.