Apache: The Definitive Guide, 3rd Edition

Chapter 17. mod_perl

Contents:

How mod_perl Works
mod_perl Documentation
Installing mod_perl — The Simple Way
Modifying Your Scripts to Run Under mod_perl
Global Variables
Strict Pragma
Loading Changes
Opening and Closing Files
Configuring Apache to Use mod_perl

Perl does some very useful things and provides such huge resources in the CPAN library (http://cpan.org) that it will clearly be with us for a long time yet as a way of writing scripts to run behind Apache. While Perl is powerful, CGI is not a particularly efficient means of connecting Perl to Apache. CGI's big disadvantage is that each time a script is invoked, Apache has to start a new process, load the Perl interpreter, and then load and compile the script. This is a heavy and pointless overhead on a busy site, and it would obviously be much more efficient if Perl stayed loaded in memory, together with the scripts, to be invoked each time they were needed. This is what mod_perl does by modifying Apache so that the Perl interpreter is built into it.

This modification is definitely popular: according to Netcraft surveys in mid-2000, mod_perl was the third most popular add-on to Apache (after FrontPage and PHP), serving more than a million URLs on over 120,000 different IP addresses (http://perl.apache.org/outstanding/stats/netcraft.html).

The reason that this chapter is more than a couple of pages long is that Perl does not sit easily in a web server. It was originally designed as a better shell script to run standalone under Unix, and it developed, over time, into a full-blown programming language. However, because the original Perl was not designed for this kind of work, scripts need some adjustment before they will run reliably inside a long-lived server process. To illustrate the adjustments, we will start with a simple Perl script that runs under Apache's mod_cgi and then modify it to run under mod_perl. (We assume that the reader is familiar enough with Perl to write a simple script and understands the ideas of Perl modules, use, require, and the BEGIN and END blocks.)

On site.mod_perl we have two subdirectories: mod_cgi and mod_perl. In mod_cgi we present a simple script-driven site that runs a home page that has a link to another page.

The Config file is as follows:

User webuser
Group webuser
ServerName www.butterthlies.com

DocumentRoot /usr/www/APACHE3/APACHE3/site.mod_perl/mod_cgi/htdocs
TransferLog /usr/www/APACHE3/APACHE3/site.mod_perl/mod_cgi/logs/access_log
LogLevel debug

ScriptAlias /bin /usr/www/APACHE3/APACHE3/site.mod_perl/cgi-bin
ScriptAliasMatch /AA(.*) /usr/www/APACHE3/APACHE3/site.mod_perl/cgi-bin/AA$1

DirectoryIndex /bin/home.pl

When you go to http://www.butterthlies.com, you see the results of running the Perl script home.pl:

#! /usr/local/bin/perl -w
use strict;

print qq(content-type: text/html\n\n
<HTML><HEAD><TITLE>Demo CGI Home Page</TITLE></HEAD>
<BODY>Hi: I'm a demo home page
<A HREF="/AA_next">Click here to run my mate</A>
</BODY></HTML>);

On the browser, this simply says:

Hi: I'm a demo home page. Click here to run my mate

And when you do, you get:

Hi: I'm a demo next page

This is printed by the script AA_next:

#! /usr/local/bin/perl -w
use strict;

print qq(content-type: text/html\n\n
<HTML><HEAD><TITLE>NEXT Page</TITLE></HEAD>
<BODY>Hi: I'm a demo next page
</BODY></HTML>);

Naturally, this is a web site that will run and run and make everyone concerned into e-billionaires. In the process of serving the millions of visitors it will attract, Perl will get loaded and unloaded millions of times, which helps to explain why they are running out of electricity in Silicon Valley. We have to stop this reckless waste of the world's resources, so we install mod_perl.

17.1. How mod_perl Works

The principle of mod_perl is simple enough: Perl is loaded into Apache when it starts up — which makes for very big Apache child processes. This saves the time that would be spent loading and unloading the Perl interpreter but calls for a lot more RAM.

If you use Apache::PerlRun, you get a half-way environment where Perl is kept in memory but scripts are loaded each time they are run. Most CGI scripts will work right away in this environment.

If you go whole hog and use Apache::Registry, your scripts will be loaded at startup too, thus saving the overhead of loading and unloading them. If your scripts use a database manager, you can also keep an open connection to the DBM, and so save time there as well (see later). Good as this is for execution speed, there is a drawback, in that your scripts now all run as subroutines below a hidden main program. The problem with this, and it can be a killer if you get it wrong, is that global variables are initialized only when Apache starts up. More on this follows.
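
The effect on global variables can be imitated outside Apache. What follows is a minimal sketch, not Apache::Registry itself: the subroutine handle_request and the variable $hits are hypothetical names, standing in for a script body that is loaded once and then called as a subroutine for every request.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this file stands in for one long-lived Apache child.
# $hits is set once, when the file is first run, and never again.
our $hits = 0;

# Hypothetical stand-in for a script body that is called as a
# subroutine for each request.
sub handle_request {
    my $local_hits = 0;    # lexical set inside the sub: reset on every call
    $hits++;               # global: keeps its value between calls
    $local_hits++;
    return "global=$hits local=$local_hits";
}

# Three simulated requests served by the same process.
print handle_request(), "\n" for 1 .. 3;
```

The global climbs from one call to the next while the lexical starts afresh each time. Under plain CGI every request would begin with a new process and a freshly initialized global, which is why a script that relies on that behavior can misbehave once it stays resident.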

The problems of mod_perl — which are not that serious — almost all stem from the fact that all your separate scripts now run as a single script in a rather odd environment.

However, because Apache and Perl are now rather intimately blended, there is a corresponding fuzziness about the interface between them. Rather surprisingly, we can now include Perl scripts in the Apache Config file, though we will not go to such extreme lengths here.

Since things are more complicated, there are more things to go wrong and greater need for careful testing. The error_log is going to be your best friend. Make sure that correct line numbers are enabled when you compile mod_perl, and you may want to use Carp at runtime to get fuller error messages.
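
As an illustration of the fuller messages Carp can give, here is a small sketch that runs standalone. The helper price_for and its price list are made up for the example; the only real API used is Carp's cluck and confess, which behave like warn and die but add a stack backtrace. Anything written to STDERR by a script running behind Apache would normally end up in the error_log.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Carp qw(cluck confess);

# Hypothetical helper with made-up data, used only to trigger errors.
sub price_for {
    my ($item) = @_;
    # confess dies like die, but appends a stack backtrace.
    confess "price_for: no item given" unless defined $item;
    my %price = ( card => 2.50, envelope => 0.10 );
    # cluck warns like warn, again with a backtrace.
    cluck "price_for: unknown item '$item'" unless exists $price{$item};
    return $price{$item};
}

# Trap the fatal error so the sketch can carry on and show the message.
my $ok = eval { price_for(); 1 };
print STDERR "caught: $@" unless $ok;

printf "card costs %.2f\n", price_for('card');
```

The trapped message lists each call that led to the failure, which is useful when the faulty call is buried several subroutines deep in a script that now runs inside the server.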




Copyright © 2003 O'Reilly & Associates. All rights reserved.