Once you have the source code and have unpacked it (I won't go into that here.... if you don't know how to download and install a source rpm or a tarball, you probably shouldn't be trying to do it this way anyway), it will be necessary to configure and compile it. Do not mistake this configuration process for the apache run-time configuration process. They're apples and oranges. Configuring the compile-time options determines such defaults as where apache will look for its configuration files, where it looks for web pages to serve, and several other important features.
Since you're one of the adventurous souls compiling your own apache, you get to make some decisions. Do you want to follow RedHat's structure, or maybe SuSE's, or maybe you want to stay "True to GNU" and follow the guidelines laid out in 'make-stds', or maybe you're one of the FHS puritans and demand that non boot-essential config files not reside in /etc, but rather in /etc/opt/<package>. The choice is yours. Many of these important choices are gathered into one location for you, the layout file config.layout. Edit this file by either altering one of the existing entries, or creating a new one that matches your desired configuration.
| Option | Meaning |
|---|---|
| prefix | This is the basic top level directory of most things apache. This option is almost purely at your discretion |
| exec_prefix | The prefix for binary files. Typically /usr |
| bindir | Where apache places binary files. Typically $exec_prefix/bin |
| sbindir | Where apache places system binaries. Typically $exec_prefix/sbin |
| libexecdir | Where apache looks for what can best be described as apache-specific helper files. These include such things as dynamic modules, which we'll discuss later. Typically /usr/lib/apache |
| mandir | Where apache will install the manpage(s). Typically /usr/man |
| sysconfdir | Where apache looks by default for the runtime config files. Typically /etc/httpd/conf |
| datadir | Used to mark the top level of the directory tree containing the data to be served. Typically the same as $prefix |
| iconsdir | Where apache looks for icons representing various mime-types when generating directory listings as web-pages. Typically $datadir/icons |
| htdocsdir | This is the "document root", where your main "index.html" lives. Typically $datadir/htdocs |
| cgidir | The default location for cgi executables. Typically $datadir/cgi-bin |
| includedir | Location for include files for compiling apache add-ons. Typically $prefix/include/apache |
| localstatedir | Where apache stores state files. Typically /var |
| runtimedir | Where apache will store runtime state files. Typically $localstatedir/run |
| logfiledir | Default location of apache log files. Typically $localstatedir/log/apache |
| proxycachedir | Where apache will store cached files if you've included the proxy module as part of the configuration. Typically $localstatedir/cache/apache |
Now, when I say "typically", this does not necessarily mean that's where you'll find these files on your system. RedHat, SuSE, GNU, the FHS standard, Debian, Storm, and Mandrake all have different things to say about it. These decisions are yours to make, and you should make them based on your analysis of current and future requirements, disk space available, backup criticality, corporate policy, and your own personal preferences. The only recommendation that I will strongly make is that you make every effort to make your paths relative to the $prefix, $datadir, and $localstatedir directives rather than completely static. This makes changing the config much easier, and when we start talking about security, that can become important.
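To make the "keep everything relative" advice concrete, here is a rough sketch that prints a layout stanza in which every path hangs off a single prefix. The layout name MyLayoutName, the /opt/apache prefix, and the individual subdirectory choices are all placeholders of mine, and the key spellings (exec_prefix, htdocsdir) are what I believe the shipped config.layout uses, so compare the output against the existing entries in your copy before pasting it in.

```shell
#!/bin/sh
# Print a config.layout stanza whose paths are all relative to one prefix.
# The name and prefix passed in are placeholders, not recommendations.
print_layout() {
    name=$1
    prefix=$2
    # \$ keeps the layout variables literal; only $name and $prefix expand.
    cat <<EOF
<Layout $name>
    prefix:        $prefix
    exec_prefix:   \$prefix
    bindir:        \$exec_prefix/bin
    sbindir:       \$exec_prefix/sbin
    libexecdir:    \$exec_prefix/libexec
    mandir:        \$prefix/man
    sysconfdir:    \$prefix/conf
    datadir:       \$prefix
    iconsdir:      \$datadir/icons
    htdocsdir:     \$datadir/htdocs
    cgidir:        \$datadir/cgi-bin
    includedir:    \$prefix/include
    localstatedir: \$prefix/var
    runtimedir:    \$localstatedir/run
    logfiledir:    \$localstatedir/log
    proxycachedir: \$localstatedir/proxy
</Layout>
EOF
}

# Review the stanza on screen, then add it to config.layout by hand.
print_layout MyLayoutName /opt/apache
```

Moving the whole installation later then means changing one line, the prefix, rather than sixteen.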
Now that you've decided your file layout structure, you need to consider what capabilities you want your web server to have. Several things that we take for granted about web servers may not be default behaviour. In general, the apache team included the most useful modules as part of the source distribution's default configuration, but you should probably take a good look at src/Configuration.tmpl. Most modules can either be included statically in the binary or can be loaded dynamically by the server as needed (DSO -- Dynamic Shared Object).
Both methods have their pros and cons, and in general the normal guidelines for static vs. dynamic apply. The static method is the easiest, and makes for faster servers. The downside is that your webserver can suffer "Microsoft Syndrome" and can begin to take on swiss army knife features at the expense of memory efficiency and executable size. Using dynamic shared modules makes your overall executable size smaller, meaning fewer resources are required to handle multiple instances (apache uses the pre-forking server model, for those C coders keeping score at home) and children spawn faster. The downsides are that there is a measurable latency to loading/linking the module into apache on the fly, and DSOs don't execute quite as fast as static modules. Since benchmarking these tradeoffs is highly traffic-pattern dependent, and patterns tend to change over time, it's a real tough call at design time. In general, just make your best guess and forget about it.
| Mod Name | Description |
|---|---|
| actions | Executing CGI script based on media type or request method |
| autoindex | Automatic directory listings |
| cgi | Invoking CGI scripts |
| env | Altering the environment passed to CGI and SSI pages |
| imap | Improved support for server side image maps |
| include | Support for Server Side Includes |
| log_config | Configurable logging support. |
| mime_magic | Support for media types based on file contents (type). |
| mime | Support for media types based on the common but braindead method of file extensions. |
| negotiation | Support for content negotiation. |
| proxy | Provides for HTTP 1.0 caching proxy support. |
| rewrite | URI to filename rewriting on the fly. |
| setenvif | Allows setting env variables based on request attributes. This is useful to deal with buggy browsers, or to deny cool features to MSIE users just for the fun of it. |
| so | Supports loading shared modules at runtime. |
| status | Provides information on server status and performance. |
| userdir | Supports user-specific directories (member home pages). |
| vhost_alias | Dynamically configured mass virtual hosting. |
Once you have decided your layout, and made your
decisions about modules, you're ready to configure the source code for
compilation. This important step sets up the makefiles to be compatible
with Linux and also sets up the proper linking options for your modules.
Go to the root of the apache source tree, and enter the command
./configure --with-layout=MyLayoutName \
    --enable-module=module_name \
    --enable-module=module_name2 \
    --enable-shared=shared_module_name \
    --enable-shared=shared_module_name2 \
    --disable-module=unwanted_module_name
Since you may have to configure the source tree several times to get exactly the features you want, I recommend that you create a small shell script for your configure command. It will save a lot of typing in the long run. Once your source tree has been configured, all you need to do is build and install the program.
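A small wrapper along these lines is one way to do it. This is only a sketch: the layout name and the module lists are placeholders, and passing both --enable-module and --enable-shared for each shared module is a cautious assumption on my part rather than something the configure script is known to require.

```shell
#!/bin/sh
# Build the ./configure command line from three lists so it is easy to
# re-run while experimenting. All layout and module names are placeholders.
LAYOUT="MyLayoutName"
STATIC_MODULES="rewrite status"
SHARED_MODULES="proxy vhost_alias"
DISABLED_MODULES="userdir"

build_configure_cmd() {
    cmd="./configure --with-layout=$LAYOUT"
    for m in $STATIC_MODULES; do
        cmd="$cmd --enable-module=$m"
    done
    for m in $SHARED_MODULES; do
        # Assumption: enable the module and mark it shared in one go.
        cmd="$cmd --enable-module=$m --enable-shared=$m"
    done
    for m in $DISABLED_MODULES; do
        cmd="$cmd --disable-module=$m"
    done
    printf '%s\n' "$cmd"
}

# Print the command for review. To actually run it from the top of the
# source tree, use:  eval "$(build_configure_cmd)"
build_configure_cmd
```

Printing first and running second makes it harder to kick off a long build with a typo in a module name.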
Building apache once the source is configured is a snap. (Note that, as with most installations, you will probably need root permissions for the install step, although the build itself can run as an ordinary user.) Just enter the command
make
and you're on your way. If your Linux box has a proper development environment set up (and it should, or you probably would have already skipped ahead to the configuration section) everything should go smoothly. Once the build has completed, installing apache is just a matter of typing
make install
We now need a way to start and stop apache on our system. Most distros have a fairly good SYSV init template to copy somewhere in the /etc/rc directories, but apache provides a program called apachectl to start and stop the server if you want to use it. Now.... just because you've compiled and installed apache doesn't mean it's ready to run. You still need to configure the runtime environment. The fun is just starting.
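Since you'll be editing the runtime configuration a lot from here on, a small helper that refuses to restart on a broken config can save some embarrassment. This sketch assumes your apachectl understands the configtest and graceful subcommands (as far as I know the stock script does) and that it is on your PATH; the function name safe_restart and the APACHECTL override variable are my own inventions.

```shell
#!/bin/sh
# Path to apachectl; override by setting APACHECTL before calling.
APACHECTL=${APACHECTL:-apachectl}

# Restart only if the configuration parses cleanly.
safe_restart() {
    if "$APACHECTL" configtest; then
        "$APACHECTL" graceful
    else
        echo "configuration has errors; leaving the running server alone" >&2
        return 1
    fi
}

# Usage, once the runtime configuration below is in place:
#   safe_restart
```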
| Directive | Hints |
|---|---|
| ServerRoot | If you configured sysconfdir to be /etc/httpd/conf then make this "/etc/httpd" |
| LockFile | This is the lock file apache uses to serialize access among its child processes when they accept new connections. It should live on a local disk. If the path does not start with a leading /, apache will assume the path is relative to the ServerRoot defined above. (RedHat /var/lock/httpd.lock) |
| PidFile | This file is where apache stores the process id of the server. If the path does not start with a leading "/", apache will assume the path is relative to the ServerRoot defined above. (RedHat /var/run/httpd.pid) |
| ScoreBoardFile | This file stores internal server information, but is not needed on most Linux configurations. Just to be safe, create a place for it. (RedHat /var/run/httpd.scoreboard) |
| TimeOut | This is the number of seconds before net traffic times out. The default on this is 300, which is 5 minutes. It can be set much lower, but values below 30 tend to cause problems. |
| KeepAlive | Allows persistent connections. Unless you have a good reason to not want them, set this to "on". |
| MaxKeepAliveRequests | This determines the maximum number of requests allowed on a persistent channel before it closes. 100 is a reasonable number. |
| KeepAliveTimeout | Determines how long a KeepAlive channel will remain open if idle. 15 is a good number. |
| MinSpareServers | Sets the desired number of servers that are idle, awaiting requests. If there are ever fewer than this many idle child processes, apache will start spawning more until this number is reached. Too many wastes resources. Too few and spikes in server hits could degrade performance. 2 is a good number for home or SOHO, 3 - 5 for a business or small university. |
| MaxSpareServers | Sets the maximum desired number of idle servers. If there are more idle servers than desired, apache will begin to kill off children, reclaiming their resources. 10 is the default, while for the hobbyist or SOHO user, a value of 5 can be used to save resources. |
| StartServers | The number of children to spawn at startup. The default is 5. Busy sites should set this higher, but not too high or you'll spend your first minute and a half spawning children and not serving requests. Apache will dynamically adjust the number of processes later, so setting this value very high is almost never useful. |
| MaxClients | This sets a ceiling on the number of child processes that can be spawned. It can be set up to 256 without modifying source code. |
| MaxRequestsPerChild | This sets the maximum number of requests that a child process will handle before dying. It is mainly useful on IRIX and SunOS where there are noticeable memory leaks in the libraries. A value of 0 will allow unlimited requests per child, and is claimed to be safe on Linux. I recommend a value of 1000, or 10000 for heavily loaded sites. |
| Listen | Determines the address and port number that apache will bind. This can be used to limit apache to a specific address. For instance, you can use Listen 127.0.0.1:80 to cause apache to respond only to requests from the localhost. The usual value is 80, which tells apache to listen on the HTTP port of all interfaces. Multiple Listen directives can be used. |
| BindAddress | Determines which IP addresses apache will respond to. This is used on machines with multiple IP addresses (either through multiplexing or using multiple interfaces). The normal value is *, which causes apache to listen on all addresses. |
| ExtendedStatus | This is only useful if you have loaded mod_status, and tells apache to keep track of extended information on a per request basis. It cannot be used on a virtualhost by virtualhost basis. Set this value to "on" if you've decided to compile mod_status as a built-in module (recommended). |
| ClearModuleList | Apache has a list of modules that should be active. This directive clears that list. It is assumed that you will then turn on what you want using the AddModule directive. |
| AddModule | Modules are sort of complicated. When you compile apache, it gets a list of included modules, not all of which are "turned on". This directive is used to activate a built-in module. It can be used even if you haven't used the ClearModuleList directive. |
| LoadModule | This directive is used to load a dynamically loaded module (as opposed to a built-in module). Order of execution can be important, so pay close attention to the example configuration and the documentation for any alternative modules you load. |
| <IfDefine></IfDefine> | This is used to conditionally execute directives based on whether or not a specific value is defined, usually by means of a command line switch (-D foo). One use for this is for a startup script to check for the existence of a module, and load/configure it if it exists (RedHat's startup script does this, for example). |
| Directive | Notes |
|---|---|
| Port | Here for historical reasons, and for setting the SERVER_PORT environment variable for CGI and SSI. Set this to whatever your HTTP port will be (usually 80). Note: This does NOT apply to virtualhosts. |
| User | Sets the user that apache will handle requests as. For security reasons, apache changes its effective UID before handling requests, so all of your documents must be accessible to this user. For this reason, it is useful to create a user called www or apache to use with your webserver. Running as the user nobody or as UID -1 does not work on all systems or with all libraries. |
| Group | Just as apache changes its UID, it also changes its GID. This is the group to change to. Once again, nobody can cause you some difficult to track-down problems, so it's probably a good idea to create a group. |
| ServerAdmin | Set this to the e-mail address that should receive all error notifications. |
| ServerName | Set this to the fully qualified domain name of the server. Also used when setting up name-based virtual hosts. If you don't set this, you will likely encounter problems on startup. |
| DocumentRoot | Set this to the directory to search for the main index file for this server. Apache will search for a file that matches your DirectoryIndex in this directory to display when no other page is requested (as when you request http://www.example.com) |
| UserDir | When using the mod_userdir module, this allows you to map requests to users' home directories instead of to the document root tree. Set this to "www" to map requests for http://example.org/~foo to ~foo/www on the example.org server, for example. For security reasons, if you use this, also use UserDir disabled root. |
| DirectoryIndex | Used with mod_dir, this option sets the search order for files when a user requests a directory listing by specifying a "/" at the end of a directory name or for the document root. Normally this will just return "index.html", but you could specify DirectoryIndex index.html index.php index.pl index.cgi to have apache search for each of these files, returning the first one it found. |
| HostNameLookups | Generally set to "off" to save the latency time of the DNS lookup, you can set this to either "on" or "double". "On" is useful to pass the hostname as REMOTE_HOST to CGI/SSI's and "Double" is the ultra-paranoid setting to detect spoofed requests. On heavily loaded sites this can cause some real slowdown, and most people don't need it. |
| ErrorLog | Sets the name of the file to use for error logging. As of version 1.3, you can also direct errors to the syslog facility. |
| LogLevel | Sets the level of information that apache will send to the error log. Defaults to "error". Possible options are "emerg", "alert", "crit", "error", "warn", "notice", "info", and "debug". These options follow the general content guidelines for syslog(3). |
| LogFormat | When using mod_log_config (recommended), this directive allows you to customize the format of the log file. The options are many and various. Read the documentation. The most commonly used is LogFormat "%h %l %u %t \"%r\" %>s %b" for main host, and LogFormat "%v %l %u %t \"%r\" %>s %b" for virtual hosts. |
| Alias | Allows for transparent redirection of requests. Typically used for icon, library image, and cgi directory redirection on a wholesale basis. Aliases are processed after <Location> stanzas and before <Directory> stanzas. |
| ScriptAlias | Has the same result as Alias, but also marks the directory as containing cgi scripts, so apache will process them as such. |
| AddHandler | If using mod_mime (recommended) this directive maps file extensions to handlers. An example of this is using AddHandler cgi-script .cgi to cause any file with the extension .cgi to be treated as a cgi file. This overrides any previous mappings. |
| AddType | If using mod_mime (recommended) this directive maps file extensions to MIME types. One particularly forward looking use for this directive is mapping the ".xhtml" extension to text/html. An example of this is using AddType text/html .xhtml to cause any file with the extension .xhtml to be treated as html by the client. Converting your html to xhtml will generally only have small impacts on presentation, which can almost always be mediated with proper adjustments to CSS. While it isn't fully desirable to treat xhtml as html, no major browser is fully XHTML aware as of yet, so waddayagonnadoo? |
| ErrorDocument | Allows you to set custom pages or scripts to handle HTTP exceptions and errors. This lets you get away from the canned error messages and allows for a more friendly and effective way to handle things like broken links and access denial. Example: ErrorDocument 404 errordocs/404.cgi would invoke a custom error script when a file is not found on the server (bad typing or broken/obsolete link). |
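If the LogFormat string for the main host looks like line noise, it may help to see it matched up against an actual log line. The sketch below labels each field of one line written with "%h %l %u %t \"%r\" %>s %b". The sample line is invented for illustration, and the function name explain_log_line is mine.

```shell
#!/bin/sh
# Label the fields of one access-log line written with
#   LogFormat "%h %l %u %t \"%r\" %>s %b"
# The sample line below is made up.
sample='192.0.2.10 - joe [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'

explain_log_line() {
    printf '%s\n' "$1" | awk '{
        # Splitting on the double quote isolates the request line (%r).
        split($0, q, "\"")
        split(q[1], head, " ")   # %h %l %u and the two halves of %t
        split(q[3], tail, " ")   # %>s %b
        print "host    (%h):  " head[1]
        print "ident   (%l):  " head[2]
        print "user    (%u):  " head[3]
        print "time    (%t):  " head[4] " " head[5]
        print "request (%r):  " q[2]
        print "status  (%>s): " tail[1]
        print "bytes   (%b):  " tail[2]
    }'
}

explain_log_line "$sample"
```

The virtual host variant from the table differs only in the first field, where %v (the server name) takes the place of %h.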
The third section of the apache configuration file deals
with virtual servers. Virtual servers are defined in a <VirtualHost>
stanza. Stanzas are almost like HTML tags.... they start with a
<keyword> in angle braces, and end with </keyword>.
Other common examples of stanzas are <Location>,
<Directory>, and <IfDefine>. Directives inside
stanzas only apply within the scope defined by that stanza. For instance, if
you added
<Directory /home/foouser/public_html/*>
Order Allow,Deny
Allow from all
Deny from joe.example.com
</Directory>
then any client connecting from the host joe.example.com would have no access to files located under /home/foouser/public_html, but its access would remain unaffected for all other areas of your server. (Allow and Deny match hosts and addresses rather than user names, and with Order Allow,Deny the Deny line is evaluated last, so it wins.) Sorry about the short digression.... stanzas are important, and I'm running short of free time before the KPLUG meeting to integrate them better into the tutorial.
Let's give an example of setting up a name based virtual
host. We will assume that www.example.com and
www.foo.org point to the same IP address. In your
httpd.conf file you would add the following:
NameVirtualHost *
<VirtualHost *>
ServerAdmin webmaster@example.com
DocumentRoot /www/docs/example.com
ServerName www.example.com
ErrorLog logs/example.com_error
</VirtualHost>
<VirtualHost *>
ServerAdmin webmaster@foo.org
DocumentRoot /www/docs/foo.org
ServerName www.foo.org
ErrorLog logs/foo.org_error
</VirtualHost>
This is about all you need to get started. Of course, you may want to enable or disable certain features for each virtual host, like disabling cgi or enabling paranoid DNS lookups for logging purposes. Simply place the appropriate directives in the virtual host's stanza, and you're done.
But what if you want to host hundreds of virtual hosts? Your httpd.conf would quickly grow huge, be slow to load, and consume a lot of resources. The answer comes from dynamically configured mass virtual hosting provided by mod_vhost_alias. If you enable this module, either as a dynamic module or built-in, you can use something like this:
# Turn off Canonical Names so CGI/SSI works properly
UseCanonicalName off
# Set the logging format for all virtual hosts
LogFormat "%V %h %l %u %t \"%r\" %s %b" vcommon
CustomLog logs/access_log vcommon
# Dynamically include server names in file requests
VirtualDocumentRoot /www/vhosts/%0/htdocs
VirtualScriptAlias /www/vhosts/%0/cgi-bin
With this setup, a request to
http://www.virtualhost.com/foo/bar.html would map to a request for the
file /www/vhosts/www.virtualhost.com/htdocs/foo/bar.html. You can
still use <Directory> and other stanzas to control things on a
directory by directory basis.
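The mapping is mechanical enough to mimic in a few lines of shell, which can be handy for checking where a given request ought to land before you go hunting through the logs. This is only an imitation of the substitution, not how the module is implemented. The function names are mine, and the part-numbering behaviour shown for vhost_part (%1, %2 and so on picking dot-separated pieces of the host name) is my understanding of mod_vhost_alias, so check its documentation before relying on it.

```shell
#!/bin/sh
# Mimic  VirtualDocumentRoot /www/vhosts/%0/htdocs :
# %0 is replaced by the full host name from the request.
vhost_path() {
    host=$1
    uri=$2
    printf '/www/vhosts/%s/htdocs%s\n' "$host" "$uri"
}

# Pick the Nth dot-separated part of a host name, the way %1, %2, ...
# are understood to work (an assumption, see above).
vhost_part() {
    printf '%s\n' "$1" | cut -d. -f"$2"
}

vhost_path www.virtualhost.com /foo/bar.html
# prints /www/vhosts/www.virtualhost.com/htdocs/foo/bar.html

vhost_part www.virtualhost.com 2
# prints virtualhost
```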
One interesting thing you can do with virtual hosts is make your own web server behave differently depending on how you access it. For instance, on my web server at home, I have my DNS set up with several aliases to the web server, like "docs", "weather", "mirror", "daily", "rfc", and "howto". I then access my webserver by different names, like "http://rfc" or "http://weather", to get to the right sets of pages.
A four part series of tutorials on apache logging is also available for you to view online.
Apache's negotiation rules can be quite complex, so it's a real good idea to
read the documentation if you
really want to fine-tune your website, but basic negotiation is actually
quite easy. First, ensure that
mod_negotiation is enabled for your server
(since it is compiled in by default, unless you changed that, you're OK).
Second, add a handler for type-map, usually by including the configuration
directive
AddHandler type-map .var
and third by setting up the type-map files themselves. Then instead of
hyper-linking to an image file or web-page, you hyperlink to the .var file,
and let Apache sort out what should get served. An example file that would
serve a page in a preferred language might be helpful here. If you create a
file called foo.var, and create a hyperlink to it, and fill in the contents
like this:
URI: foo.english.html
Content-type: text/html
Content-language: en

URI: foo.french.html
Content-type: text/html
Content-language: fr

URI: foo.german.html
Content-type: text/html
Content-language: de
Now when the user clicks on the link, Apache looks at which language the
browser says it prefers (the Accept-language header), and will return
the right file. You can do the same thing with images. If you had a link
like <IMG SRC=./foo.var> and the foo.var file contained
URI: foo.jpeg
Content-type: image/jpeg; qs=0.8

URI: foo.gif
Content-type: image/gif; qs=0.5

URI: foo.png
Content-type: image/png; qs=0.3
then apache would look at the Accept header in the request,
and return the type of image that was 1) in the list of acceptable media types,
and 2) had the highest qs value (these range from 0.000 to 1.000).
Now let's say you have a case where none of the options in your .var file are acceptable to the browser. Apache will return error 406 (NOT ACCEPTABLE), and a hyperlinked list of the possible options. This can be a cool feature with translated pages, but tends not to work too well with images, as you can probably imagine.
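To get a feel for how the image example plays out, here is a deliberately simplified imitation of the selection: keep the variants whose media type the browser says it accepts, take the one with the highest qs, and fall back to a 406 when nothing is left. Real negotiation also weighs the browser's own q-values and wildcards like image/*, which this sketch ignores. The function name pick_variant and the flat "file type qs" list are my own shorthand, not the type-map format.

```shell
#!/bin/sh
# One variant per line: file name, media type, qs value
# (same numbers as the foo.var image example).
VARIANTS='foo.jpeg image/jpeg 0.8
foo.gif image/gif 0.5
foo.png image/png 0.3'

# $1 is a space-separated list of media types the browser accepts.
pick_variant() {
    accept=$1
    printf '%s\n' "$VARIANTS" | awk -v accept="$accept" '
        BEGIN {
            n = split(accept, a, " ")
            for (i = 1; i <= n; i++) ok[a[i]] = 1
        }
        # Keep the acceptable variant with the highest qs seen so far.
        ($2 in ok) && ($3 + 0 > best) { best = $3 + 0; file = $1 }
        END {
            if (file != "") print file
            else print "406 Not Acceptable"
        }'
}

pick_variant "image/gif image/png"
# prints foo.gif

pick_variant "image/tiff"
# prints 406 Not Acceptable
```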
Of course, for the self-flagellating code-head types, you could "simply" use mod_actions to re-write documents into the desired format on the fly using CGI scripts, but you'd want a really fast server, lots of time on your hands to write the translators, a box of Chees-Its, and a case of Mountain Dew just to get started on such a project.
| Modules | Configuration Directives |
|---|---|
| mod_alias, mod_cgi, mod_mime | AddHandler, Options, ScriptAlias |
AddModule mod_mime.c
AddModule mod_cgi.c
AddModule mod_alias.c
ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
AddHandler cgi-script .cgi
ScriptAlias maps requests for
http://www.example.com/cgi-bin/foo to the script
/home/httpd/cgi-bin/foo, and tells Apache that every file in the
cgi-bin directory should be treated as a CGI script.
The AddHandler directive tells apache that
files that end with .cgi should be treated as CGI programs; that is,
if the file exists and is executable, Apache should run it. This example will
work anywhere in the document tree, not just the cgi-bin directory. You only
need this line if you wish to allow execution of CGI's outside the
ScriptAlias'ed directory. You could drop this directive into
<VirtualHost> or <Directory> stanzas to limit its scope.
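When you're fiddling with these directives, it helps to have the dumbest possible CGI program on hand to test with, so you know any failure is the configuration and not the script. Here is a minimal one in shell: a header, a blank line, and a body. You might save it as something like /home/httpd/cgi-bin/test.cgi (a hypothetical name) and make it executable with chmod +x before requesting it through the server.

```shell
#!/bin/sh
# Minimal CGI program for checking the server configuration.
emit_cgi_response() {
    # Header block, terminated by an empty line.
    printf 'Content-type: text/plain\n\n'
    # Body.
    printf 'CGI is working\n'
}

emit_cgi_response
```

If the browser shows the script's source instead of "CGI is working", the request is not being handled as CGI at all, which points at ScriptAlias or AddHandler rather than at permissions.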
No matter how you choose to configure your CGI
access, you may want to consider security along every step of the way.
Options -ExecCGI
<Directory /foo/bar/>
Options +ExecCGI
</Directory>
<Directory /home/httpd/*/www/cgi-bin/>
Options +ExecCGI
</Directory>
This disables CGI execution globally, but allows it for the /foo/bar
directory and any directory with a name that matches /home/httpd/*/www/cgi-bin.
This might be useful to allow execution of CGI's from users' home directories.
Interaction between the ScriptAlias, Options, and AddHandler directives
can be tricky (ScriptAlias and ScriptAliasMatch override Options, for
example, while Options and the handler work hand in hand),
so it will require some experimentation on your part until you are
comfortable with the way things work.
Since this is strictly an Apache tutorial, we're not going to cover how to write CGI scripts, but if there is enough interest, KPLUG will do a CGI HOWTO in the future.
(Just as a side note, sites heavy in CGI should consider looking at the FastCGI module available from the Apache Module Registry to speed up CGI responsiveness.)
tar zvxf apache_1.3.x.tar.gz
tar zvxf php-x.x.x.tar.gz
cd apache_1.3.x
./configure with_whatever_options
cd ../php-x.x.x
./configure --with-apache=../apache_1.3.x --with-whatever-options
make
make install
cd ../apache_1.3.x
./configure --activate-module=src/modules/php4/libphp4.a
make
make install
To compile PHP as a DSO (you'll need the apxs utility and DSO support already compiled into apache for this), simply do:
tar zxvf php-x.x.x.tar.gz
cd php-x.x.x
./configure --with-whatever-options --with-apxs
make
make install
Some people might prefer to just copy the binary of apache over the old apache binary, thus avoiding any possible overwrites of existing configuration files.
# Use the next line if PHP is a DSO, omit it otherwise
LoadModule php4_module /path/to/php3/module/libphp4.so
# These lines need to go in for both DSO and static
AddModule mod_php4.c
AddType application/x-httpd-php4 .php4 .php
That's about it. Pretty simple. Again, this is an Apache tutorial, so we won't go into writing PHP programs, but if there is enough interest, KPLUG will whip up a tutorial.
tar zvxf mod_perl-1.x.tar.gz
cd mod_perl-1.x
perl Makefile.PL \
    USE_APXS=1 \
    WITH_APXS=/path/to/apxs \
    EVERYTHING=1 \
    [... more options if desired ]
make
make test
make install
Please note that perl 5.003 requires patching (included in the mod_perl source tree in the INSTALL file) if you build mod_perl as a DSO. Apply the patch and recompile perl BEFORE building mod_perl.
# Include the next line if mod_perl is a DSO. The module has to be
# loaded before any of the mod_perl directives below are used.
LoadModule perl_module /path/to/apache/modules/libperl.so
AddModule mod_perl.c
# for Apache::Registry Mode
Alias /perl/ "/home/httpd/cgi-bin/"
# for Apache::PerlRun Mode
Alias /cgi-perl/ "/home/httpd/cgi-bin/"
# For /perl/* as apache modules written in perl
<Location /perl>
PerlRequire /path/to/apache/modules/perl/startup.perl
PerlModule Apache::Registry
SetHandler perl-script
PerlHandler Apache::Registry
Options ExecCGI
PerlSendHeader On
</Location>
# For /cgi-perl/* handling as embedded perl
<Location /cgi-perl>
SetHandler perl-script
PerlHandler Apache::PerlRun
Options ExecCGI
PerlSendHeader On
</Location>
# For mod_perl status information
<Location /perl-status>
SetHandler perl-script
PerlHandler Apache::Status
order deny,allow
deny from all
allow from localhost
</Location>
While this, of course, just scratches the surface, there
is plenty of additional information available both in the pod files that
come with mod_perl, the apache
module help file, and on the mod_perl
home page. As with CGI and PHP, this isn't a Perl tutorial, but
if enough interest develops, KPLUG will eventually cover it in more depth.
# Use this to allow SSI in files. This can go in stanzas, too.
Options +Includes
# Or you can have SSI but disable executing scripts via SSI with
Options +IncludesNOEXEC
# Use this if mod_include is a DSO
LoadModule includes_module /path/to/apache/modules/mod_include.so
AddModule mod_include.c
AddType text/html .shtml
AddHandler server-parsed .shtml
# Optionally, you could run *all* html files through the SSI parser.
# This does no harm to non SSI html files, but slows you down a bit
AddHandler server-parsed .html
echo 80 > /proc/sys/net/tux/serverport
echo 8080 > /proc/sys/net/tux/clientport
echo '/home/httpd/html' > /proc/sys/net/tux/documentroot
Then just start /usr/sbin/tux, and you're on your way.
# Change httpd.conf
# From: Port 80 to: Port 8080
# From: BindAddress * to: BindAddress 127.0.0.1
With a little experimentation, you can get the webserver of your dreams up and running in less than a day.