Securing your Web server, Part 2

How to fine tune access to each directory on your server

Abstract

Server-wide security can only go so far. To provide local control over individual document directories, you'll need to fine tune the access rules for each directory on your server. (1,800 words)

Last month we started a three-part series on Web server security, beginning with the basics of Web access control. The miraculous Web lets you go back and read that entire column, of course, but I'll recap the major points:

Web access control is based upon the domain name and/or IP address of the client requesting a document from the server.
Most Unix-based servers use a file named access.conf to define a set of access rules that determine which browsers are allowed to access the server.
Using a directory-based syntax in this file, a separate set of rules can be defined for each document directory on your server.

Pros & cons
There are several advantages to the single access.conf file. Most importantly, it centralizes all of your access rules, making it easy to control the entire site. It is also easy to secure the file itself, since it need only be readable by the server daemon and you.

Unfortunately, these benefits can also be headaches. Many servers are actually shared environments, with different document directories managed by different authors. Each author wants to control his or her own access rules, but doesn't want other authors stepping on his or her portion of the access.conf file.

In addition, each change to the access.conf file necessitates either starting or restarting the server daemon, which usually requires root access on a properly configured server. Needless to say, you don't want all your authors running about with the root password.

To eliminate these concerns, the httpd-based servers support a separate access control file that can be located in each document directory on your server. These local files let individual authors control the access rules for their directories. Even better, the file is read dynamically by the server, eliminating the need for a server restart after each rule change.

Enabling local control
By default, these local access control files are disabled until you explicitly enable them in your access.conf file. To do that, you'll need to add an extra directive, AllowOverride, to each <directory> tag in the file.

As its name implies, AllowOverride allows the access control file in the specified directory to override the default access control rules (and other options) specified in the access.conf file. The directive can be given one or more values, corresponding to the type of override allowed. While some of these values address the way the directory contents are presented to browsers, the one that concerns us is limit. Adding this value to the AllowOverride directive enables local limits to be placed on who can access documents in the directory.

An example will make this clear. Suppose you want to allow local overrides for your top-level document directory, /usr/local/http/docs. You would place

   <directory /usr/local/http/docs>
    AllowOverride limit
   </directory>

in your access.conf file. Start or restart the server and you're ready to create local access control rules.

The .htaccess file
Those local limits go into a special file named .htaccess in the document directory. The syntax of the entries in this file is exactly like the contents of the <directory> directive, except that the starting <directory> and trailing </directory> tags are omitted.

To create access control rules, just place a <limit> directive in the .htaccess file. To ensure that the whole world can access the directory, you would use

   <limit>
    order allow,deny
    allow from all
   </limit>

To restrict access to just your company's domain, use something like

   <limit>
    order deny,allow
    deny from all
    allow from .mycompany.com
   </limit>

I covered the complete syntax of the <limit> tag in last month's column; jump there now to learn more.
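
If the interplay of order, allow, and deny seems murky, it can help to see the decision written out as code. The following is only a rough sketch, not the server's actual logic: the function names host_matches and is_allowed are my own inventions, and I am assuming the commonly documented behavior for httpd-style servers, in which the list named second in the order directive wins any conflict and also decides the fate of hosts that match neither list.

```python
# Hypothetical sketch of host-based rule evaluation; names and matching
# rules are illustrative assumptions, not code from any real server.

def host_matches(host, pattern):
    """Return True if `host` matches an allow/deny pattern.

    "all" matches every host; a pattern starting with "." is treated as
    a domain suffix, so ".mycompany.com" matches "www.mycompany.com".
    """
    host = host.lower()
    pattern = pattern.lower()
    if pattern == "all":
        return True
    if pattern.startswith("."):
        return host.endswith(pattern)
    return host == pattern


def is_allowed(host, order, allow, deny):
    """Decide access for `host` under an "allow,deny" or "deny,allow" order."""
    allow_hit = any(host_matches(host, p) for p in allow)
    deny_hit = any(host_matches(host, p) for p in deny)

    if order == "deny,allow":
        # Deny rules are applied first, then allow rules override them.
        # Assumption: a host matching neither list is allowed.
        if allow_hit:
            return True
        return not deny_hit

    if order == "allow,deny":
        # Allow rules are applied first, then deny rules override them.
        # Assumption: a host matching neither list is denied.
        if deny_hit:
            return False
        return allow_hit

    raise ValueError("unsupported order: %r" % order)
```

Under these assumptions, the company-only example above reduces to is_allowed(host, "deny,allow", [".mycompany.com"], ["all"]), which admits www.mycompany.com and turns away everyone else.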

Nagging details
There are many benefits to these localized control files:

Individual authors can manage access rules without needing special system privileges
Changes take effect immediately, without having to start or restart the server
Access rules travel with their associated documents if you move or rename the directory

There are also a few issues of which you should be aware:

Since local control is often enabled to give authors more control over their document sets, the chance that one of these authors will create an erroneous rule always looms. A bad rule may restrict people who should normally be granted access or, worse, may allow access to those who should be kept out.
In that same vein, the permissions on the .htaccess file must be maintained so that only authorized people can edit the rules. A world-writable .htaccess file is a security hole awaiting exploitation.
Anyone who can read the documents in the directory can also read the .htaccess file. While most people don't think of it, a URL like

   http://some.place.com/secure-stuff/.htaccess

can be used to request the access rules for that directory. As an experiment, you might want to try retrieving .htaccess from various servers and see what you come up with.
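
The permissions issue in the list above is easy to check mechanically. Here is a minimal sketch, assuming a Unix-style file system; the function name find_world_writable_htaccess is hypothetical, and /usr/local/http/docs is simply the example document root used earlier in this column.

```python
import os
import stat


def find_world_writable_htaccess(doc_root):
    """Return the paths of .htaccess files under doc_root that any user can write."""
    risky = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        if ".htaccess" in filenames:
            path = os.path.join(dirpath, ".htaccess")
            mode = os.stat(path).st_mode
            # S_IWOTH is the "other users may write" permission bit.
            if mode & stat.S_IWOTH:
                risky.append(path)
    return sorted(risky)


if __name__ == "__main__":
    # Example document root from this column; substitute your own.
    for path in find_world_writable_htaccess("/usr/local/http/docs"):
        print("world-writable:", path)
```

The sketch only reports problems. Tightening the permissions is left to you, since the right owner and group depend on how your authors share the server.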

For many Web sites, the advantages of local access control outweigh these problems, and I encourage Webmasters to implement local access control. It's fast and easy and makes management of large servers much easier, particularly when multiple authors are managing different parts of the same server. Take the time to understand the potential pitfalls and make sure you avoid them.

While we're on the subject
While you are fretting about the remote possibility of someone swiping a .htaccess file from one of your directories, it's appropriate to mention a far more common security hole that exists on thousands of servers, big and small, around the world. This hole, like the visible .htaccess file problem, is rooted in the way servers deal with Web directory access.

Every httpd-based Web server will allow access to a URL that points to a directory name instead of a document. When confronted with a directory, the daemon looks for a top-level index file, usually named index.html, and returns that document to the browser. This is why so many URLs reference index.html: it's the default index for an entire directory of documents.

Potential problems arise if this file does not exist. In this case, the server will create a generic index for the directory that looks something like the output of ls, possibly with some fancier icons attached. The idea behind this feature is that you can make directories of files available with little effort, mimicking an FTP-like interface via the Web.
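
The decision just described is simple enough to sketch in a few lines. This is only an illustration of the logic, not code from any real server: resolve_directory is a hypothetical name, and an actual daemon may hide some file names or use a different index name.

```python
import os


def resolve_directory(dir_path, index_name="index.html"):
    """Mimic what a server does with a URL that names a directory.

    Returns ("index", path) when the index file exists; otherwise returns
    ("listing", names), the file names a generated index would expose.
    """
    index_path = os.path.join(dir_path, index_name)
    if os.path.isfile(index_path):
        return ("index", index_path)
    # No index file: fall back to listing everything in the directory.
    return ("listing", sorted(os.listdir(dir_path)))
```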

This is a great feature if, in fact, you want every file in that directory to be visible to the outside world. Usually, the opposite is true: some of the files in a directory are "presentable" and the rest are "works in progress," not suitable for public display.

For your top-level document directory, you probably have an index.html document, so there is no chance that someone could get the raw listing for that directory and view files that you don't want displayed. Many Webmasters, however, create subdirectories below that top-level directory to better organize their files, but never create index.html files for those subdirectories. It's common to see Webmasters creating directories to hold just their images, and many servers have separate directories for all the CGI scripts needed by their documents.

If you reference any of these directories by name, you'll see all the files in the directory. One more click and you can view any of these files, whether or not the author has other links to the files. In the case of the CGI directory, you can find all the server-side programs available on that server. Since some of these programs are known security holes, this opens your server to potential penetration by a hacker.

You would be amazed at the number of high-profile sites whose directories are open to this kind of snooping. I won't give the names of any sites suffering from this problem, but I was able to find and download the shell scripts that implement the site search capability for a major U.S. university after about five minutes of searching. I'm sure you could find similar holes in other sites in short order.

The fix for this problem is simple: put an index.html file in every directory on your server. Even an empty file will close the security hole; a small snippet of HTML lets intruders know you know what they're up to:

   <html>
   <head>
   <title>Illegal Access</title>
   </head>
   <body>
   <h3>
   Direct access to this directory is not allowed. Please visit our
   <a href="/">home page</a> instead.
   </h3>
   </body>
   </html>

Don't suffer the embarrassment of having private pages made public. Take a moment to install index.html files on your server.
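
On a large server, finding every directory that lacks an index file by hand is tedious. A short script can do the looking for you. This is a sketch under my own assumptions: dirs_missing_index is a hypothetical name, the index file is assumed to be called index.html, and the script only reports directories rather than creating anything.

```python
import os


def dirs_missing_index(doc_root, index_name="index.html"):
    """Return every directory under doc_root (inclusive) with no index file."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        if index_name not in filenames:
            missing.append(dirpath)
    return sorted(missing)


if __name__ == "__main__":
    # Example document root from this column; substitute your own.
    for path in dirs_missing_index("/usr/local/http/docs"):
        print("no index.html in:", path)
```

Each directory it prints is a candidate for the "Illegal Access" page shown above.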

Next time
Next month, we'll conclude our security series with a tutorial on password protection. In September, we'll switch gears and explore how easy it is to make money on the Web.

About the author
Chuck Musciano has been running Melmac and the HTML Guru Home Page for two years, serving up HTML tips and tricks to hundreds of thousands of visitors each month. He's been a beta-tester and contributor to the NCSA httpd project and speaks regularly on the Internet, World Wide Web, and related topics. His book, HTML: The Definitive Guide, is currently available from O'Reilly and Associates.
