08 - Understanding Basic User Authentication

by Paul Doyle

One reason why the Web has grown so quickly is that it is a very open environment. Services are set up for anyone who cares to use them, quite often with the aim of attracting as many users as possible.

Sometimes, though, services need to be restricted so that only designated people can use them. Restrictions can apply to files or directories, with different levels of access for different users. With such restrictions in place, authenticating the identity of all users who attach to the server is a priority.

This chapter explains how user authentication works and tells you how to set up and administer user accounts on an Apache Web server. This chapter is not about Perl per se, but is intended primarily to serve as a foundation for the rest of the chapters in Part III, "Authentication and Site Administration."

Basic User Authentication

A read-only Web service that is open to everyone presents no particular security problems. The files are made available in read-only mode, and the Web server process presents the files to any users who request them. As long as basic precautions are taken with regard to access rights to the files and the parent directory, the service is secure.

A service in which the user has write access to one or more files is a little more complex. The Guestbook example in Chapter 2, "Introduction to CGI," is an example of this type of service. Actually, the user does not really write to any files; the Web server process does that on the user's behalf. Making such a service secure means giving the Web server process the appropriate access rights (read-only access to all files except the ones that it needs to write to) and making sure that your CGI program does not offer the end user any security loopholes, such as executing arbitrary strings that include user input from an HTML form.

The type of service with which this chapter is concerned has access restrictions that involve one or more of the following complexities:

Restrictions such as these are more relevant when you are using CGI programs than when plain old HTML files are involved. To understand why, you need to look at the way in which a Web server works.

Processes and User IDs

A Web server consists of a networked computer running special software under a special user ID. The fact that all three elements (hardware, software, and process) are referred to as Web servers in different contexts isn't very helpful. For clarity, I'll use the following definitions:

So the httpd runs as the httpd process under the httpd user ID on the Web server.

CGI Program Execution

The central issue here is that all file accesses on the Web server are performed by an httpd process. For this process to be capable of serving up files from many different directories, it must run under a user ID with relatively liberal access rights. If the process is to run CGI programs, too, it probably will require generous write access throughout the same directories.

This access is a security risk, however, because the process executes CGI programs under the httpd user ID on behalf of users from elsewhere on the Internet. In effect, you are allowing complete strangers to run programs on your system with a privileged user ID. A CGI program potentially can do anything on the server machine that a user who logged in under the httpd user ID could do, including reading, writing, creating, and deleting files. So if you're not careful, you can open your server to attack from anywhere on the Internet.

By default, httpd runs under the user ID of whoever starts it. This user ID should not be root! Create a special user ID with limited privileges, and start httpd under that user ID. Alternatively, if you are using the Apache httpd, use the User and Group configuration directives (described in "Apache Server Configuration Directives" later in this chapter) to get httpd to switch user IDs at startup time.
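
A minimal sketch of the second approach follows. It assumes that httpd is started as root and that an unprivileged account and group already exist on the system; the names www and wwwgroup are hypothetical and are not Apache defaults.

```apache
# Fragment of conf/httpd.conf.
# httpd must be started as root to be able to switch user IDs.
# "www" and "wwwgroup" are hypothetical names; substitute an
# unprivileged account and group from your own system.
User www
Group wwwgroup
```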

Access Control

The server's user ID is, of course, subject to access restrictions in the same way that any other user ID on the system is. But because you need to give the server read and write access to so many places, and because you also allow it to perform tasks on behalf of complete strangers, you need to introduce an extra layer of access control.

It is important to note here that this extra layer involves user IDs and passwords that belong to the server, not to the host system. In other words, the Web server has a password file that is separate from the /etc/passwd file on UNIX systems. Having a logon account on a system does not guarantee that a user can use a restricted Web page on the server, even if the user has read access to that page when logged on interactively. The opposite also is true: a user who doesn't have a logon ID can access restricted Web pages on the server if that user has the appropriate Web server user ID and password.

The extra security layer takes a different form for each HTTP server package, but in essence, this layer has two strands:

The "User Authentication on the Apache Server" section later in this chapter describes how access restrictions work on the Apache server and how to implement restrictions that are appropriate for your site. First, however, you need to consider the issues that are involved in verifying a user's identity.

User Identification

The effectiveness of your server's access restrictions depends on whether you can authenticate the identity of users who attempt to attach to your server. If you can't confirm that a user is who he says he is, you may as well make everything read-only and remove all sensitive information that you don't want to publish to the world.

Fortunately, you can check the identity of users in many ways. This section describes the most useful methods, starting with a simple unencrypted password check and moving to secure HTTP using public key cryptography.

User ID and Password

The most basic kind of authentication is a simple user ID and password check against a list of user IDs and passwords in a file. Users initially connect to a CGI script on the server that challenges them for a user ID and password. If a user enters a valid combination, the script displays another page or sends an HTTP redirect header to the browser to force it to load the other page.

This rather facile approach to authentication is weak in three respects: privacy, verification, and manageability.

The following sections examine these weaknesses in detail.

Privacy
Given the openness of the Internet, you should assume that all transmissions (messages, Web pages, or, in this case, parts of an HTTP request) can be intercepted. If a transmission is sent in plain text and is intercepted, its contents become known to the person who intercepted it. If the transmission is encrypted in some way and is intercepted, the original content will not be known to the interceptor without substantial extra effort.

Encrypting the content of transmissions on the Internet is, therefore, a means of ensuring privacy. If a user ID and password are sent in plain-text mode, they might be used subsequently in a so-called impostor attack. This type of attack occurs when somebody other than the owner of the user ID-password pair attempts to use the pair to access the server. Encrypting the user ID and password before they are sent to the server protects them from the bad guys and helps ensure user authentication. Encryption by itself, however, does not provide authentication.

Verification
Assume that the user ID and password have been discovered by a person who should not have access to a service. This person may have made this discovery in any of several ways: intercepting a user ID-password pair sent as plain text; intercepting and successfully decrypting an encrypted user ID-password pair, although this event is unlikely; or watching the real owner of the pair type them, which is a more likely event. If no other checks are in place, this miscreant can then access the service from any Internet location.

You can encrypt transmissions in such a way that:

These methods are described in detail in "Public Key Cryptography" later in this chapter.

Manageability
The simple user ID-password methodology outlined in the preceding section is simple only if it is used to control access to a single location. If this method is extended to several CGI programs that form a single system, coordinating the activities of all the CGI programs can be difficult. Users may have authenticated themselves upon accessing one CGI program, but they will have to authenticate themselves again if they access the same program a second time (or access a different CGI program that forms part of the same system).

The basic idea outlined here, however, can be developed to the stage at which a single CGI script manages access to a set of other files, so that users need to validate themselves only one time. This type of system is examined in detail in Chapter 9, "Understanding CGI Security."

User ID and Password Summary
A simple user ID-password mechanism has flaws. The basic principle is sound, though: Users must say who they are (user ID) and then prove it (password). The mechanism is adequate as it stands for services in which security is a low priority, but it needs to be developed a little to allow for really secure transactions. The following section, "Public Key Cryptography," outlines the current best technology to achieve secure user authentication and describes some of the products that use it.

Public Key Cryptography

The main problem with the simple user ID-password scheme outlined in the preceding sections is the fact that the origin of the user ID-password pair cannot be verified. The problem of adequate verification extends to many other areas on the Internet, including the contents of messages themselves. But in this case, you're concerned only with ensuring adequate verification of a user's identity before allowing that user to access your CGI scripts or Web pages.

Public key cryptography is a method of transmitting data from a sender to a recipient in such a way that nobody other than the recipient can read the data and the recipient can be certain of the identity of the sender.

The basic idea underlying public key cryptography is the use of a pair of keys:

The public and private keys are derived simultaneously, using a special algorithm in such a way that messages encrypted with a person's public key can be decrypted only with that person's private key. Likewise, messages encrypted with a person's private key can be decrypted only with that person's public key.

Key Pairs
Suppose that I want to send a message to you, using public key cryptography. You have a public and a private key. You tell me your public key, perhaps by including it in your e-mail signature, but you keep your private key to yourself. This is what happens:

  1. I encrypt the original message text, using your public key, and then send the encrypted message to you.

  2. You receive the message in encrypted form and then decrypt it, using your private key.


The details of how messages are encrypted and decrypted with particular key strings are beyond the scope of this book.

This transmission is secure in the sense that anybody else who receives the message while it travels across the network in encrypted form will be unable to determine the original content of the message without knowing the value of your private key. In theory, someone could crack the code and read the message, but the amount of effort required runs into so many thousands of hours on a powerful computer that the possibility is a concern only if you are, say, a major world power. Even then, cracking a single transmission is of no use for cracking other transmissions if you change your key pair on a regular basis.

Certificate Authorities
The problem with the scheme described in the preceding section is the fact that your public key is unverified. How do I know that the public key really is your public key and not a public key generated by some impostor who wants to intercept messages to you? When public key cryptography is used to protect mail messages, this problem is not too serious: the impostor would have to establish e-mail communication with me for long enough to convince me that he is you, before he transmits the fake public key. This situation is possible, though.

The problem is much more acute when public key cryptography is used to automatically verify transmissions between two Internet hosts, such as a Web server and a client running a Web browser. The reason is that the public key is transmitted during the same dialogue as the transmission of the secure message that is encrypted with that key value. No opportunity exists to develop trust through a person-to-person dialogue, such as an exchange of e-mail messages.

That's where certificate authorities come into the picture. Certificate authorities are companies that are trusted to issue certificates to Internet users and to verify the contents of those certificates at a later stage.

A certificate contains the following information:

Certificates cost money and are issued at the request of the person or organization to which they refer. The certificates are used as a trusted point of referral by the recipient of an encrypted message, to verify that the sender really is who he or she claims to be.

In terms of the earlier example, here's how I would go about sending you an encrypted message, using certificate verification:

  1. I send a message to you, asking for your certificate.

  2. You send your certificate to me.

  3. I check with the certificate issuer to see whether the certificate is valid. Specifically, I want to know that the issuer really did issue the certificate on your behalf and that your public key is the same as the public key stated in the certificate.

  4. I receive confirmation from the certificate issuer that the certificate is valid.

  5. I encrypt the original message text, using your public key, and then send the encrypted message to you.

  6. You receive the message in encrypted form and then decrypt it, using your private key.

As is true of all Internet communications, the possibility always exists that someone will attempt to impersonate the entity with which you are dealing, so as to eavesdrop on your communications. When you deal with a certificate authority, that company's reputation is your guarantee. Attempting to impersonate a certificate authority brings tremendous wrath down on the head of any malefactor, which is a great deterrent to that form of impersonation.

Now suppose that you receive a message from me, encrypted with my private key, and you want to decrypt it while making sure that it really did come from me. This is what would happen:

  1. I encrypt the original message text, using my private key, and then send the encrypted message to you, along with details of my certificate.

  2. You receive the message in encrypted form.

  3. You check with the certificate issuer to verify that my public key is correct.

  4. You decrypt the message, using my public key.

Remember that all this works because of the unique properties of the public-private key pair. In this example, information on my public key is freely available and verifiable. Only messages that are encrypted with the corresponding private key can be decrypted with this public key. That fact means that nobody can fake messages from me without knowing my private key.

Public key cryptography is an algorithmic method. The algorithms that do the real work (deriving public and private keys, and encrypting and decrypting data) were developed by RSA Data Security, Inc. RSA does not produce any end-user software for performing authentication; instead, it licenses its algorithms to other companies for incorporation into their products.

To ensure secure communications between a server and a browser, you need both the server and the browser to execute these algorithms automatically, behind the scenes, acting under an agreed protocol. The next two sections, "Secure HTTP" and "Secure Sockets Layer," discuss two products that use RSA's public key cryptography technology to authenticate Web communications.

Secure HTTP
Secure HTTP (S-HTTP) is an extension of the HTTP protocol developed by Enterprise Integration Technologies (EIT); the National Center for Supercomputing Applications (NCSA); and RSA Data Security, Inc. S-HTTP uses public key cryptography to guarantee the authenticity of signed transmissions, allowing for comprehensive user verification. Although the S-HTTP protocol specification is public, the toolkit necessary to build applications that use it is a commercial product. S-HTTP has not yet become prominent on the Web.

Secure Sockets Layer
Netscape Communications has approached the authentication issue from a different angle. Netscape has licensed RSA's public key cryptography technology and used it to develop a security protocol called Secure Sockets Layer (SSL). This layer resides between TCP/IP (the communications layer) and HTTP (the applications layer). Netscape states that SSL will support other application protocols, such as NNTP, but that support has not materialized yet.

Netscape has developed another proprietary extension of the HTTP protocol to support SSL on Web servers. This extension is called https, and URLs that are to be delivered through SSL need to have the prefix https: instead of http:. A Web server that supports SSL normally watches for http requests on port 80 and https requests on port 443. The use of two separate ports makes it possible for a server to communicate securely with clients that support https while providing normal, unauthenticated communications with other browsers.

User Verification Summary

The fields of cryptography and secure communications are much too vast to cover in detail in this book, and doing so wouldn't be appropriate anyway; this is a Perl book, after all. But understanding the different types of user authentication is important, especially if you're going to introduce user authentication on your Web server. The last few pages should be enough to give you a flavor of the various types of user authentication and the current trends in authentication technology.

If your server uses S-HTTP or SSL technology, authentication becomes a matter of server configuration, so your Perl programs don't need to concern themselves with it. If your server doesn't use either technology, you must provide authentication yourself, in your Perl programs. Your Internet Service Provider may not be able or willing to provide support of this kind on its server, for example. Chapter 9, "Understanding CGI Security," describes a method for implementing user authentication on a Web server entirely by means of Perl. This method works with or without an authentication-aware protocol such as https or S-HTTP.

If you need to be absolutely certain of the identity of anyone who is accessing your CGI/Perl programs, you have to use a certificate authority via S-HTTP, SSL, or some other method. If you merely want to make it difficult for people to fake their identity, a simple user ID-password system may be more appropriate. You may decide to combine the two approaches, requiring a user ID and password for your Perl script even if it runs on a secure server. Ultimately, the level of security that you choose depends on the sensitivity of your data and your estimate of the risk involved.

User Authentication on the Apache Server

Assuming that you can satisfactorily verify the identity of all users who attach to your server, you need to implement a strategy that gives users of your server enough access to do the things that you want them to be able to do, but not enough access to do the things that you don't want them to be able to do. This section describes the specific details of user authentication on the Apache server.


This section focuses on access restrictions for the Apache httpd server only; the chapter can't cover all features of all Web servers. Apache is fairly representative, being a superset of the NCSA httpd server. Apache also is an excellent piece of work and currently is the most popular Web server software in the world.

Access Restrictions

You can use two basic parameters to restrict access to a service:

Access can be restricted for one or more HTTP access methods (GET, PUT, POST, and so on) for users, groups, IP addresses, subnets, or a combination.
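
For instance, a restriction that applies only to the methods that send data to the server might be sketched as follows, using the Directory and Limit delimiters and the require directive that are described later in this chapter. The directory, the file paths, the realm name Drafts, and the group name editors are all hypothetical.

```apache
# Access configuration fragment (hypothetical paths and names).
# GET requests are not restricted here; PUT and POST requests
# require a user who belongs to the "editors" group.
<Directory /usr/local/etc/httpd/htdocs/drafts>
AuthType Basic
AuthName Drafts
AuthUserFile /usr/local/etc/httpd/auth/users
AuthGroupFile /usr/local/etc/httpd/auth/groups
<Limit PUT POST>
require group editors
</Limit>
</Directory>
```

The user and group files in this sketch sit outside the document tree, so the server cannot serve them to clients.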

Apache Configuration Files

All aspects of configuration of the Apache server are controlled by a number of configuration files. Each file contains several configuration directives, each of which controls a specific aspect of Apache behavior in a specific directory tree. Table 8.1 lists the configuration files, in the order in which they are processed by the server. The default file specs shown in the table are relative to the server root directory.

Table 8.1 - Apache Server Configuration Files

File                    Default File Spec  Override With                   Controls
Server configuration    conf/httpd.conf    httpd's -d command-line switch  Server daemon
Resource configuration  conf/srm.conf      ResourceConfig directive        Document provision
Access configuration    conf/access.conf   AccessConfig directive          Access permissions

Additional configuration directives can be stored in a special file in each directory to provide a fine level of access control. The per-directory configuration file is called .htaccess by default, but you can override this name with the AccessFileName directive (described in "Apache Server Configuration Directives" later in this chapter).

Filtering of Rights

The directives in the .htaccess files control server behavior with regard to files in the directory tree in which the .htaccess file is stored. Notice that the directives in a .htaccess file propagate through subdirectories. An attempt to access a file causes the server to look for a file called .htaccess in the directory in which the file is stored, in the parent directory of that subdirectory, in the parent's parent directory, and so on up to the server's document root directory. The .htaccess files found in this fashion are parsed in sequence, with directives in .htaccess files in lower-level subdirectories overriding directives in higher-level directories.
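
To illustrate the override order, suppose that two hypothetical .htaccess files exist: one in a directory called reports under the document root and one in its subdirectory reports/board. The realm name, the file path, and the user IDs are invented for the example, and the sketch assumes that the server configuration lets .htaccess files set authentication directives (the default AllowOverride setting of All does).

```apache
# File: htdocs/reports/.htaccess
# Any user listed in the password file may read this tree.
AuthType Basic
AuthName Reports
AuthUserFile /usr/local/etc/httpd/auth/users
<Limit GET>
require valid-user
</Limit>

# File: htdocs/reports/board/.htaccess
# Parsed after the file above, so this require directive is the
# one that applies to the board subdirectory. The authentication
# directives are repeated so that this file does not depend on
# inheriting them from the parent directory.
AuthType Basic
AuthName Reports
AuthUserFile /usr/local/etc/httpd/auth/users
<Limit GET>
require user bilbo gandalf
</Limit>
```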

Realms, Users, and Groups

The information used to determine whether a user has access to a particular directory on the server is specific to the httpd server. The access-control mechanism used by the system on which httpd executes is not involved. So on UNIX systems, the contents of /etc/passwd are not relevant.

Instead, user information is stored in several user and group files, which can be either plain text or DBM files. Group definitions can be omitted if access is to be defined on a user-by-user basis.


User and group files should be stored in a location that is not exported by the Web server. Otherwise, users may be able to download them and thereby breach your server's security.

User and group definitions apply to a particular authorization realm. An authorization realm is a set of directories for which access rights are evaluated as a unit. The concept of authorization realms allows a user to access any directory in a designated set on the basis of a single authentication pass. This means that users are prompted for their user ID and password only one time during a session: the first time that they attempt to access a URL within the realm.
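
As a sketch of how a realm ties directories together, the following access configuration fragment places two hypothetical directories in one realm by giving them the same AuthName and the same user file. The directory names, the file path, and the realm name Members are assumptions made for the example. Because the browser associates the user ID and password that it has collected with the realm name, a user who has authenticated for one directory normally is not prompted again for the other.

```apache
# Both directories belong to the (hypothetical) realm "Members".
<Directory /usr/local/etc/httpd/htdocs/members>
AuthType Basic
AuthName Members
AuthUserFile /usr/local/etc/httpd/auth/users
<Limit GET POST>
require valid-user
</Limit>
</Directory>

<Directory /usr/local/etc/httpd/htdocs/downloads>
AuthType Basic
AuthName Members
AuthUserFile /usr/local/etc/httpd/auth/users
<Limit GET POST>
require valid-user
</Limit>
</Directory>
```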

Configuration Delimiters

Configuration directives appear, one per line, in any of these configuration files. Directives can be grouped by means of the <Directory>...</Directory> and <Limit>...</Limit> delimiters, as follows:

<Directory /usr/local/projects>
<Limit GET>
Options FollowSymLinks
</Limit>
</Directory>

Directory groups can contain Limit groups, but no other nesting of delimiters is permitted. This means that Limit groups may not contain either Limit or Directory groups, and Directory groups may not contain Directory groups.

Configuration Directives

The authentication-related configuration directives for the Apache httpd are listed in Tables 8.2 through 8.4. Directives related to server configuration are listed in Table 8.2; directory-specific configuration directives are listed in Table 8.3; and directives that can be used in local .htaccess files are listed in Table 8.4.

Notice that a certain amount of overlap occurs among these tables, because some directives can be used in more than one context. Those directives that are relevant to user authentication and access restriction are described in separate sections after each table. Refer to the Apache server documentation for detailed information on all directives, including the ones described in this chapter.

Apache Server Configuration Directives

Table 8.2 lists the directives that can be used in the server configuration files.

Table 8.2 - Apache HTTPD Server Configuration Directives

Directive            Argument Type     Default Value                Purpose
AccessConfig         File name         conf/access.conf             Name of file containing access-control directives
AccessFileName       File name         .htaccess                    Name of per-directory access-control file
BindAddress          IP address        * (all IP addresses)         IP address of server to listen on
DefaultType          MIME type         text/html                    Default type for documents with no MIME type specifier
DocumentRoot         Directory name    /usr/local/etc/httpd/htdocs  Name of top-level directory from which files will be served
ErrorDocument        Error code        -                            Specifies which document to return in the event of a given error code
ErrorLog             File name         logs/error_log               Name of server error log file
Group                Unix group ID     #-1                          Name or number of user group under which server will run
IdentityCheck        on/off            off                          Whether to try to log remote user names
MaxClients           Number            150                          Maximum number of clients that the server will support
MaxRequestsPerChild  Number            -                            Maximum number of requests that each child process will handle before it exits
MaxSpareServers      Number            10                           Maximum number of desired idle processes
MinSpareServers      Number            5                            Minimum number of desired idle processes
Options              List of options   -                            Defines which server features are allowed
PidFile              File name         logs/httpd.pid               Name of file where server daemon process ID is stored
Port                 Port number       80                           Port number where server listens for requests
ResourceConfig       File name         conf/srm.conf                Name of file to read for server resource configuration details
ServerAdmin          E-mail address    -                            E-mail address quoted by server when reporting errors to client
ServerName           Host name         -                            Server's host name
ServerRoot           Directory name    /usr/local/etc/httpd         Name of directory where httpd is stored
ServerType           inetd/standalone  standalone                   Whether to run as one process per HTTP connection (inetd) or one process to handle all connections (standalone)
StartServers         Number            5                            Number of child processes to create at startup
TimeOut              Number            200                          Maximum server wait time
User                 User ID           #-1                          User ID under which server will run

The server configuration directives that are relevant to user authentication are explained in the following sections.

AccessConfig
This directive overrides the default access configuration file specification, conf/access.conf, where access-control directives (such as directory-specific restrictions) are supposed to be stored. In fact, you can store these directives either in the access configuration file or in the resource configuration file.

The following directive in the server configuration file tells the server to read conf/access_test.conf for directives instead of conf/access.conf:

AccessConfig conf/access_test.conf

You can tell the server not to look for an access configuration file by using the file spec /dev/null with the AccessConfig directive.

AccessFileName
Before the server sends any file to a client, it looks in the directory in which the file is stored for that directory's optional local configuration file. You can override the default file name, .htaccess, by using the AccessFileName directive.
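
A one-line sketch of the override follows; the file name .acl is a hypothetical choice, not an Apache convention.

```apache
# Fragment of the server configuration file.
# Look for per-directory directives in a file called .acl
# (a hypothetical name) instead of .htaccess.
AccessFileName .acl
```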

Group
Use the Group directive in conjunction with the User directive to control the access rights of the server process. If you start the httpd server process as root, the Group and User directives cause the server to become the designated user in the designated group before answering any requests. By specifying a user ID and group that have access only to those files that you want to export onto the Web, you can avoid accidental exposure of sensitive information.

Apache recommends that you set up a special user ID and user group to run the server process. This user ID normally should have access only to the documents directory within the httpd directory tree (normally, /usr/local/etc/httpd/htdocs). You may want to grant read access to the users' home directory tree as well if you want to allow your users to maintain Web material in their home areas.

IdentityCheck
Some Web clients run a daemon that allows the client to provide the user name of the remote user to the Web server on request. This identification is not secure and should not be taken seriously; it may be useful in some cases for crude access counts, but such counts will be incomplete, because most clients do not provide identification of this sort.

Setting the IdentityCheck directive to on instructs httpd to ask clients to identify the remote user and, if an identity is provided, to log this information in the server log file.
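
In the server configuration file, the setting might look like this:

```apache
# Fragment of the server configuration file.
# Ask each client host for the remote user name and log it when
# one is supplied. The value is informational only and should not
# be relied on for access control.
IdentityCheck on
```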

Options
The Apache httpd allows a great deal of control of the use of extra server features on a directory-by-directory level. The Options directive allows you to turn extra server features on for all directories (if used outside a Directory group) or for a specific directory (if used within a Directory group).

The Options directive takes any combination of the arguments in the following list and turns on the specific feature described by that argument. The directive has two special arguments: All turns on all extra server features, and None turns them all off.
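
A minimal sketch of both uses follows. The FollowSymLinks argument is taken from the delimiter example earlier in this chapter, and the directory name is hypothetical.

```apache
# Outside any Directory group: turn all extra features off
# for every directory.
Options None

# Inside a Directory group: allow symbolic links to be followed
# in this one directory tree only.
<Directory /usr/local/etc/httpd/htdocs/shared>
Options FollowSymLinks
</Directory>
```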

ResourceConfig
This directive is quite similar to AccessConfig and is provided largely for backward compatibility. ResourceConfig overrides the default resource configuration file specification, conf/srm.conf, which is where resource control directives are supposed to be stored. In fact, you can store the directives either in the resource configuration file or in the server configuration file.

The following directive in the server configuration file tells the server to read conf/srm_test.conf for directives instead of conf/srm.conf:

ResourceConfig conf/srm_test.conf

You can tell the server not to look for a resource configuration file by using the file spec /dev/null with the ResourceConfig directive.

User
Use the User directive in conjunction with the Group directive to control the access rights of the server process. If you start the httpd server process as root, the Group and User directives cause the server to become the designated user in the designated group before answering any requests. By specifying a user ID and group that have access only to those files that you want to export onto the Web, you can avoid accidental exposure of sensitive information.

The argument to the User directive can be either a user ID or a user number preceded by a pound sign (#).
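
The two forms can be sketched as follows. Both the name www and the number 8080 are hypothetical; take the real values from your own system's user database.

```apache
# Two ways to name the account that httpd switches to.
# Only one User line should be active at a time.

# By user ID (name):
User www

# By user number, preceded by a pound sign (commented out here):
# User #8080
```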

Apache recommends that you set up a special user ID and user group to run the server process. This user ID normally should have access only to the documents directory within the httpd directory tree (normally, /usr/local/etc/httpd/htdocs). You may want to grant read access to the users' home directory tree as well if you want to allow your users to maintain Web material in their home areas.

Apache Directory Directives

Table 8.3 lists the directives that may be applied to individual directories. These directives are used in server configuration files to override default settings for a particular directory.

Table 8.3-Apache HTTPD Directory Authorization Directives
Directive           Argument Type      Default Value   Purpose
allow from          List of hosts      -               Allows access to this directory from the designated IP hosts
deny from           List of hosts      -               Denies access to this directory from the designated IP hosts
order               Evaluation order   deny,allow      Sets the order in which deny and allow directives are applied
require user        List of user IDs   -               List of IDs of users who can access a directory
require group       List of groups     -               List of groups that can access a directory
require valid-user  -                  -               Allows access to all users who provide a valid user ID and password
AuthName            Domain name        -               Name of authorization domain for a directory
AuthType            Basic              Basic           Type of user authorization (only Basic available)
AuthUserFile        File name          -               Name of text file containing list of users and passwords
AuthDBMUserFile     File name          -               Name of DBM file containing list of users and passwords
AuthGroupFile       File name          -               Name of text file containing list of user groups
AuthDBMGroupFile    File name          -               Name of DBM file containing list of user groups
Options             List of options    -               Defines which server features are allowed
AllowOverride       List               All             Specifies which directives can be overridden by local .htaccess file

The directory configuration directives that are relevant to user authentication are explained in the following sections.

allow from
Use the allow from directive to specify which IP hosts are allowed to access a given directory. This directive takes a series of host names as arguments and allows access from each of the designated hosts. Host names may be fully qualified (as in bilbo.tolkien.org) or partially qualified (as in .tolkien.org). A partially qualified host name (such as .tolkien.org) allows access from all hosts whose name ends in the string supplied (bilbo.tolkien.org, gandalf.tolkien.org, and so on).

Use the order directive (described later in this chapter) to determine the sequence in which the allow from and deny from directives are evaluated.

deny from
Use the deny from directive to specify which IP hosts are not allowed to access a given directory. This directive takes a series of host names as arguments and denies access to the directory from each of the designated hosts. Host names may be fully qualified (as in bilbo.tolkien.org) or partially qualified (as in .tolkien.org). A partially qualified host name (such as .tolkien.org) denies access from all hosts whose name ends in the string supplied (bilbo.tolkien.org, gandalf.tolkien.org, and so on).

Use the order directive (described in the following section) to determine the sequence in which the allow from and deny from directives are evaluated.

order
The allow from and deny from directives have opposite effects; they can be used in tandem to control exactly which IP hosts can and cannot access a particular directory. The order in which these directives are evaluated for a particular directory is significant, however.

Consider the effect on frodo.tolkien.org of allow from .tolkien.org followed by deny from frodo.tolkien.org. The net result is to allow access from all hosts in .tolkien.org except frodo. Now consider the effect of deny from frodo.tolkien.org followed by allow from .tolkien.org. The net result in this case is to allow access from all hosts in .tolkien.org, including frodo.

The order directive allows you to specify whether the allow from or deny from directives are evaluated first. Its argument is either allow,deny or deny,allow. In the first case, deny from directives can override allow from directives; in the second case, allow from directives can override deny from directives.

The following example allows access from frodo.tolkien.org but from no other hosts within the tolkien.org domain:

order deny,allow
deny from .tolkien.org
allow from frodo.tolkien.org
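The evaluation rules can be modeled in a few lines of Perl. The following is only a sketch, not the server's own logic: host_matches() and access_allowed() are hypothetical names, only host-name arguments are handled, and the treatment of hosts that match neither list (allowed under deny,allow, denied under allow,deny) is assumed from Apache's documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: does $host match one host spec?
# A spec with a leading dot (".tolkien.org") matches any host name that
# ends in it; any other spec must match the whole name.
sub host_matches {
    my ($host, $spec) = @_;
    if ($spec =~ /^\./) {
        return $host =~ /\Q$spec\E\z/i ? 1 : 0;
    }
    return lc($host) eq lc($spec) ? 1 : 0;
}

# Hypothetical helper: apply an order setting to lists of allow from
# and deny from arguments. Returns 1 if $host gets in, 0 otherwise.
sub access_allowed {
    my ($order, $host, $allow, $deny) = @_;
    my $allowed = grep { host_matches($host, $_) } @$allow;
    my $denied  = grep { host_matches($host, $_) } @$deny;

    if ($order eq 'deny,allow') {
        # deny is evaluated first, so a matching allow has the last word;
        # hosts that match neither list are let in.
        return ($denied && !$allowed) ? 0 : 1;
    }
    # allow,deny: allow is evaluated first, so a matching deny has the
    # last word; hosts that match neither list are kept out.
    return ($allowed && !$denied) ? 1 : 0;
}

# Deny the whole domain, then allow frodo back in.
for my $host ('frodo.tolkien.org', 'bilbo.tolkien.org') {
    my $ok = access_allowed('deny,allow', $host,
                            ['frodo.tolkien.org'], ['.tolkien.org']);
    print "$host: ", ($ok ? "allowed" : "denied"), "\n";
}
```

Swapping the first argument to 'allow,deny' and exchanging the two lists reproduces the opposite case, in which every host in the domain except frodo is admitted.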
require
Use the require directive to restrict access to a directory to one or more designated users. Any user who attempts to access a restricted directory is challenged; the user must provide a valid user ID and password before the server returns the requested URL.

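The require directive can be used to restrict access in three distinct ways: to a list of named users (require user), to the members of one or more groups (require group), or to any user who supplies a valid user ID and password (require valid-user).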

The following directive restricts access to users JohnB and DaveD only:

require user JohnB DaveD
This set of directives restricts access to the membership domain to members of the leaders group:

AuthType Basic
AuthName membership
AuthUserFile /www/staffmembers
AuthGroupFile /www/staffgroups
require group leaders
AuthName
Used with the AuthType, require, and AuthUserFile directives, the AuthName directive sets the authorization realm of the current directory when used inside a Directory group. For an example of using AuthName, see the section on the require directive earlier in this chapter.

AuthType
Apache currently has only one type of user authentication: Basic. The AuthType directive was introduced to allow for the anticipated introduction of other methods at a later stage.

Use the AuthType directive with the AuthName and require directives. For an example, see the require directive section earlier in this chapter.

AuthUserFile
Use the AuthUserFile directive to specify the name of the text file containing user IDs and passwords that is to be used to verify access to the current directory.

Each line of a user definition file contains a user ID, followed by a colon and a password encrypted with the crypt() function, as in the following example:

jeremiah:sn/A4bkdRjylI
ruth:1H.yzi5xcMPbk
This directive should be used in conjunction with the AuthName, AuthType, AuthGroupFile, and require directives.

AuthDBMUserFile
UNIX DBM files are a more efficient way than plain text files of storing user IDs and passwords. Use DBM files if you are dealing with more than a handful of users. Use the AuthDBMUserFile directive to specify the name of the DBM file containing user IDs and passwords that is to be used to verify access to the current directory. This directive should be used in conjunction with the AuthName, AuthType, AuthDBMGroupFile, and require directives.
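To get a feel for how a DBM user file differs from a text file, the sketch below stores and retrieves an encrypted password through a tied hash. It uses SDBM_File only because that module is bundled with Perl; which DBM flavor your server actually reads depends on how it was built, so treat the file format here as an assumption. The names dbm_set_pword() and dbm_get_pword() are hypothetical.

```perl
use strict;
use warnings;
use Fcntl;          # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;      # DBM flavor bundled with Perl (an assumption, see above)

# Hypothetical helper: store one user and an already-encrypted password
# in a DBM file, creating the file if necessary.
sub dbm_set_pword {
    my ($dbmfile, $user, $crypted) = @_;
    my %users;
    tie(%users, 'SDBM_File', $dbmfile, O_RDWR | O_CREAT, 0640)
        or die "Could not open DBM file \"$dbmfile\": $!\n";
    $users{$user} = $crypted;
    untie %users;
}

# Hypothetical helper: fetch the encrypted password for a user.
# Returns undef if the file cannot be opened or the user is not in it.
sub dbm_get_pword {
    my ($dbmfile, $user) = @_;
    my %users;
    tie(%users, 'SDBM_File', $dbmfile, O_RDONLY, 0640)
        or return undef;
    my $crypted = $users{$user};
    untie %users;
    return $crypted;
}
```

Looking up one user is a single keyed fetch rather than a scan of every line, which is where the efficiency gain over a text file comes from.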

AuthGroupFile
Use the AuthGroupFile directive to specify the name of the text file containing group definitions.

Each line of a group definition file contains a group name, followed by a colon and a list of the users in the group, as in the following example:

admin:  henry martha dave
AuthDBMGroupFile
The group file for a given directory may be a UNIX DBM file rather than a plain text file. If so, use the AuthDBMGroupFile directive, rather than the AuthGroupFile directive, to specify the group definition file.

Options
The Options directive described earlier in this chapter can also be used within a Directory group to control behavior for that directory. For details, refer to the "Options" section earlier in this chapter.

AllowOverride
Directives in the server configuration files can be overridden by directives in local .htaccess files, as described in the following section. As a server administrator, you may not want users to override all server directives. In such a case, use the AllowOverride directive in a Directory group to specify which directives can be overridden.

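The default behavior is to allow the user to override all directives, which is the equivalent of using AllowOverride with an argument of All. Using an argument of None has the opposite effect, telling the server to ignore the contents of any .htaccess files. You can use the following arguments to fine-tune override behavior related to access control: AuthConfig permits the user authorization directives (AuthName, AuthType, AuthUserFile, AuthGroupFile, require, and so on), and Limit permits the host access directives (allow from, deny from, and order).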

The directive AllowOverride Limit in the server configuration file, for example, allows local .htaccess files to control which hosts can access files.

Apache .htaccess Directives

Table 8.4 lists the directives that can be used in the local .htaccess files. These files override server configuration directives for the directory in which they reside.

Table 8.4-Apache httpd .htaccess Configuration Directives
Directive           Argument Type      Default Value   Purpose
allow from          List of hosts      -               Allows access to this directory from the designated IP hosts
deny from           List of hosts      -               Denies access to this directory from the designated IP hosts
order               Evaluation order   deny,allow      Sets the order in which deny and allow directives are applied
require user        List of user IDs   -               List of IDs of users who can access a directory
require group       List of groups     -               List of groups that can access a directory
require valid-user  -                  -               Allows access to all users who provide a valid user ID and password
AuthGroupFile       File name          -               Name of file containing list of user groups
AuthName            Domain name        -               Name of authorization domain for a directory
AuthType            Basic              Basic           Type of user authorization (only Basic available)
AuthUserFile        File name          -               Name of file containing list of users and passwords
Options             List of options    -               Defines which server features are allowed in a given directory

These directives can be used in the .htaccess files as well as in the server configuration files. For details on each of these directives, refer to "Apache Directory Directives" earlier in this chapter.


Be careful not to give users too much leeway with .htaccess files. Local .htaccess directives can be used for purposes such as exporting files that would not otherwise be available. The best way to provide security is to use the AllowOverride directive in the server configuration file. The following example provides reasonable protection against accidental or deliberate security breaches:

<Directory>
AllowOverride None
Options None
<Limit GET PUT POST>
allow from all
</Limit>
</Directory>
This code prevents any overriding of directives by means of .htaccess files; explicitly turns off extra server features by means of the Options directive; and allows accesses from all hosts, but only by means of the GET, PUT, and POST methods.

User Administration

The extra layer of access control required on a Web server that uses user-related access restrictions has a certain amount of maintenance overhead. Aside from setting up the configuration files that define a realm (and the users and groups that have access to it), you need to be able to add and delete users in the various realms, change passwords for users who lose or forget them, and so on.

Fortunately, that task is just the kind of task for which Perl was brought into this world. The remainder of this chapter describes some sample Perl scripts that make it easy to administer the httpd's user accounts.


The code for the samples in this chapter is available on the CD-ROM that comes with this book. Copy these files into a directory in your path, and make sure that UserUtil.pl is in your Perl library directory (/usr/local/perl/lib, for example). This file contains the shared subroutines that do all the work.

Adding Users

The first task is to define a user, which means adding a line to a user file that contains the user name, a colon, and the user's password in encrypted format.

Encrypting Passwords

Getting the password into encrypted form is fairly straightforward when you use Perl. This task is one that you're going to want to perform again (in your script for setting passwords for existing users), so write a subroutine to do the job for you and then store it in UserUtil.pl, where it can be shared by several scripts.

Listing 8.1 shows the source for the GetPword() subroutine. The subroutine takes no arguments, prompts the user for the password (twice, to prevent errors), and returns the encrypted password.

Listing 8.1-The GetPword Subroutine

# Subroutine to prompt for and return (encrypted) password.
sub GetPword {

    my ( $pwd1, $pwd2, $salt, $crypted );
    my @saltchars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9');

    print "Enter password: ";
    $pwd1 = <STDIN>;
    chomp($pwd1);
    length($pwd1) >= 8 ||
        die "Password length must be eight characters or more.\n";

    print "Enter the password again: ";
    $pwd2 = <STDIN>;
    chomp($pwd2);

    # Check that they match:
    ($pwd1 eq $pwd2 ) || die
        "Sorry, the two passwords you entered do not match.\n";

    # Generate a random salt value for encryption.
    # Seed with a bitwise OR of the time and the process ID:
    srand(time | $$);
    # rand(@saltchars), not rand($#saltchars), so that the last
    # character in the array can be picked too:
    $salt = $saltchars[rand(@saltchars)] . $saltchars[rand(@saltchars)];

    return crypt($pwd1, $salt);
}

The crypt() Function

In the UNIX world, the crypt() function looks after the job of encrypting passwords. The function takes two arguments: the clear-text password and a two-character salt value.

The crypt() function applies an encryption algorithm to the password, using the salt value. Then the function returns the encrypted password, which consists of the salt value followed by 11 other characters. The password password, encrypted with the salt Xb, is Xbs.myqnmA.bI.

Decrypting a password from the encrypted form of the password is almost impossible, but comparing a given string with the password is easy.
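The comparison works because the encrypted form begins with its own salt, so the stored value can be passed back to crypt() as the salt argument. The sketch below shows the idea; password_matches() is a hypothetical name, and the example assumes that the crypt() on your system supports the traditional two-character salt.

```perl
use strict;
use warnings;

# Hypothetical helper: check a clear-text guess against a stored
# crypt() value. The stored value starts with the salt that produced
# it, so it doubles as the salt argument.
sub password_matches {
    my ($guess, $stored) = @_;
    my $check = crypt($guess, $stored);
    return (defined $check && $check eq $stored) ? 1 : 0;
}

# What the user definition file would hold for this user:
my $stored = crypt('sesame', 'Xb');

print password_matches('sesame', $stored) ? "match\n" : "no match\n";
print password_matches('Sesame', $stored) ? "match\n" : "no match\n";
```

The clear-text password is never stored or recovered; the guess is simply encrypted the same way and the two results are compared.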

The following list steps through the code to show you how it works:

  1. An array (@saltchars) of characters suitable for use in the salt value is defined.

  2. The user is prompted for a password, which is stored in $pwd1. If the password has fewer than eight characters, the subroutine dies.

  3. The user is prompted to re-enter the password. If the two passwords do not match, the program aborts with an error message.

  4. The random-number generator is seeded with a bitwise OR combination of the current time and Perl's process ID.

  5. A character from the set of salt characters is selected at random by using rand() to compute an index into @saltchars.

  6. A second character is similarly selected, and the two are combined with the . (period) concatenation operator. The result is the two-character encryption salt value.

  7. Finally, the password specified by the user and the salt value that you generated are passed to Perl's crypt() function. The value returned by crypt() is passed, intact, back to the calling subroutine.
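The salt-generation steps can also be pulled out into a subroutine of their own, as in the following sketch. The name make_salt() is hypothetical, and the character set is widened to include the period and slash, which traditional crypt() also accepts in a salt. Perl's rand() is not a cryptographic-quality generator, so regard this as an illustration of the technique rather than a hardened implementation.

```perl
use strict;
use warnings;

# Characters that traditional crypt() accepts in a salt.
my @saltchars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9', '.', '/');

# Hypothetical helper: return a random two-character salt.
sub make_salt {
    # rand(@saltchars) yields a number from 0 up to (but not including)
    # the array size; using it as an index truncates it to an integer,
    # so every character in the array can be chosen.
    return join '', map { $saltchars[rand(@saltchars)] } 1 .. 2;
}

print "salt: ", make_salt(), "\n";
```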

Adding a User

Adding a user amounts to no more than adding a line that contains the user name and password to the user definition file. You can perform this task by using the SetPword() function, the code for which appears in Listing 8.2.

Listing 8.2-The SetPword Subroutine

# Store a user's password in a user definition file
# Arguments:
# - user file spec
# - user name
# - password
sub SetPword  {
    my( $filespec, $user, $pword ) = @_;

    # Open user file for appending:

    open(USERFILE, ">>$filespec") ||
     die "Could not open user file \"$filespec\" for appending: $!\n";

    # Write to the user file
    # ("or" rather than "||", so that die applies to print, not to the string):
    print USERFILE "$user:$pword\n" or
     die "Failed to write the user/password to file \"$filespec\": $!\n";

    # Tidy up:

    close USERFILE;
}
This code opens the named user definition file for appending by including the >> append operator in the file specification argument to the open() function. If the file does not already exist, perl creates it.

The code then writes the supplied user ID and encrypted password (with a colon between them and a new line at the end), closes the user definition file, and returns.
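If more than one administrator might run these scripts at the same time, you may want to lock the file while appending to it. The following sketch shows one way to do that with flock(); append_user_locked() is a hypothetical name, the three-argument open() with a lexical filehandle needs a newer Perl than the listings in this chapter assume, and the lock is advisory, so it protects only against other programs that also call flock().

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # LOCK_EX

# Hypothetical variant of SetPword: append one "user:password" line
# while holding an exclusive (advisory) lock on the file.
sub append_user_locked {
    my ($filespec, $user, $pword) = @_;

    open(my $fh, '>>', $filespec)
        or die "Could not open user file \"$filespec\" for appending: $!\n";
    flock($fh, LOCK_EX)
        or die "Could not lock \"$filespec\": $!\n";
    print {$fh} "$user:$pword\n"
        or die "Failed to write to \"$filespec\": $!\n";

    # Closing the handle flushes the line and releases the lock.
    close($fh)
        or die "Failed to close \"$filespec\": $!\n";
}
```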

Putting It All Together: Aaddu

The script that you invoke when you actually want to add a user is relatively simple, because most of the work has been separated out into reusable subroutines. Listing 8.3 shows the code for Aaddu.

Listing 8.3-The Aaddu Script

#!/usr/local/bin/perl -I. -T

# Script to add a user to an Apache user file.

require "UserUtil.pl";  # Need utilities

# Takes two arguments:
# - username to add
# - file to add to

# Get the arguments:
($user, $file) = @ARGV;

# Check that we got two arguments:
$file ||
    die "Aaddu: Add user utility for Apache (text) user files.\n",
    "Usage: Aaddu username filespec\n";

$file =~ /(.+)/;
$safefile = $1;

# Get the encrypted password:
$password = &GetPword;

# Store the new username and password:
&SetPword($safefile, $user, $password);

# End
This script simply takes the user name and user definition file as arguments; gets the password interactively, using the GetPword() subroutine; and then calls SetPword() to add the user.

One more detail here. Examine the following mysterious lines:

$file =~ /(.+)/;
$safefile = $1;
These lines are here because the script turns on Perl's taint checking with the -T switch. In this mode, Perl does not allow you to pass an argument from the command line (namely, $file) to a function such as open(), because doing so might compromise security. Making a copy of $file won't work either, because the copy will be similarly tainted.

So how do you get the file name from $file in a way that won't upset Perl's taint-checking sensibilities? One way is to perform a regular expression match on the contents of $file and then store what was matched. Perl allows this method because it assumes that if you go to this much trouble in your own code, you know what you're doing.

The statement $file =~ /(.+)/ carries out a regular expression match on $file, using the expression (.+). This expression simply matches the entire contents of $file and returns what it found as $1. The script then stashes this result in the new variable $safefile. If you are writing scripts to be executed by other users, you may want to use a more elaborate regular expression to eliminate any suspicious characters from the variable before passing it to open.
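As a sketch of what a more elaborate expression might look like, the hypothetical safe_filespec() subroutine below accepts only letters, digits, underscores, periods, hyphens, and slashes, and rejects any path that contains a .. component. The permitted character set is an assumption; adjust it to the naming rules on your own system.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $file if it looks
# like a plain path, or undef otherwise.
sub safe_filespec {
    my ($file) = @_;
    return undef unless defined $file;

    # Refuse ".." as a path component, so the path cannot climb upward:
    return undef if $file =~ m{(?:^|/)\.\.(?:/|$)};

    # Capture the value only if every character is in the allowed set.
    # Because the result comes from a capture, it is no longer tainted.
    return $file =~ m{\A([\w./-]+)\z} ? $1 : undef;
}

for my $candidate ('/www/staffmembers', '../etc/passwd', 'users; rm -rf /') {
    my $safe = safe_filespec($candidate);
    print "$candidate: ", (defined $safe ? "accepted" : "rejected"), "\n";
}
```

A script could then die with a usage message when safe_filespec() returns undef, instead of passing a suspicious name on to open().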

Avoiding Duplicate User Names

The major difficulty with simply appending new users to a user definition file is that there is no safeguard against the possibility of adding the same user name more than once. The procedure would be much safer if the Aaddu script determined whether a user already existed in a user definition file before trying to add the user.

The UserDefined() function in UserUtil.pl makes just that determination. Listing 8.4 shows the code.

Listing 8.4-The UserDefined Subroutine

# Return 1 if user defined in named text file
sub UserDefined  {

    my ( $username, $filespec ) = @_;

    # No file, no user
    open(USERFILE, $filespec) || return 0;

    # Check each line for username:
    while (<USERFILE>)  {
     if ( /^\Q$username\E:/ )  {
         close USERFILE;
         return 1;
     }
    }
    close USERFILE;
    return 0;
}
The function takes a user name and a user definition file specification as arguments; it returns 1 if the user exists in that file and 0 if the user doesn't exist. This function will also be useful in the opposite context when you want to change the passwords of existing users.

The operation of this function is quite straightforward: It opens the user definition file for reading, and checks each line in the file. If the line starts with the user name followed immediately by a colon, the function returns 1 to confirm that the user is defined.

Notice the use of the close() function before both return 1 and return 0. Placing a single close() statement at the end of a subroutine is not sufficient, because a return statement earlier in the subroutine may prevent the close() statement from being reached. It is, therefore, important to place close() statements immediately before every return point in the subroutine.
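Newer versions of Perl offer a way around this bookkeeping: a filehandle held in a lexical (my) variable is closed automatically when the variable goes out of scope, including on an early return. The following sketch rewrites the check in that style; user_defined() is a hypothetical name, and the technique assumes a Perl recent enough to support lexical filehandles and the three-argument form of open().

```perl
use strict;
use warnings;

# Hypothetical variant of UserDefined that uses a lexical filehandle.
# Returns 1 if the user is defined in the named text file, 0 otherwise.
sub user_defined {
    my ($username, $filespec) = @_;

    # No file, no user:
    open(my $fh, '<', $filespec) or return 0;

    while (my $line = <$fh>) {
        # \Q...\E makes any punctuation in the name match literally.
        # $fh is closed automatically when the subroutine returns.
        return 1 if $line =~ /^\Q$username\E:/;
    }
    return 0;
}
```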

Listing 8.5 shows the new, improved Aaddu script.

Listing 8.5-The Aaddu Script with Duplicate Checking

#!/usr/local/bin/perl -I. -T

# Script to add a user to an Apache user definition file.
# Prevents duplicate entries.

require "UserUtil.pl";  # Need utilities

# Takes two arguments:
# - username to add
# - file to add to

# Get the arguments:
($user, $file) = @ARGV;

# Check that we got two arguments:
$file ||
    die "Aaddu: Add user utility for Apache user definition files.\n",
    "Usage: Aaddu username filespec\n";

$file =~ /(.+)/;
$safefile = $1;

# First check that the user does not already exist:
&UserDefined($user, $safefile) &&
    die "User \"$user\" already exists in file \"$safefile\".\n";

# Get the encrypted password:
$password = &GetPword;

# Store the new username and password:
&SetPword($safefile, $user, $password);

# End
The only change from the earlier version of Aaddu is the addition of a call to UserDefined() to check for the existence of the user.

Deleting Users

Deleting users is a little less straightforward than adding them. Adding a user is simply a matter of sticking a new user line at the end of a file. Deleting a user, however, involves finding that user in the file and then rewriting the file without that user line but with all others left intact.

The simplest way to perform this task in Perl is to read the entire contents of the user definition file into an associative array, delete the entry that corresponds to the user that you want to drop, and then write the whole array out to the same file. This approach may not be immediately intuitive if you're not used to working with associative arrays, but it will become familiar to you in a short time, as you learn to leverage the power of associative arrays.

Listing 8.6 shows the code for the DeleteUser() subroutine.

Listing 8.6-The DeleteUser Subroutine

# Subroutine to delete a user from a user file
# Input: Username, filespec

sub DeleteUser  {

    my ($user, $filespec) = @_;
    my ($thisusr, $thispw, $elem, %passwords);

    # Open the file for reading:
    open(USERFILE, "$filespec") ||
        die "Could not open user file \"$filespec\" for reading: $!\n";

    # Grab the contents of the user file in an associative array:
    while (<USERFILE>)  {
        chop;
        ($thisusr, $thispw) = split(':', $_) ;
        $passwords{$thisusr} = $thispw;
    }
    close USERFILE;

    # Check that the named user exists:
    exists($passwords{$user}) ||
        die "User \"$user\" not found in file \"$filespec\".\n";

    # Now delete the user from the array:
    delete $passwords{$user};

    # Now write the whole user/password array to the user file:

    # First re-open the user file for writing:
    open(USERFILE, ">$filespec") ||
        die "Could not open user file \"$filespec\" for writing: $!\n";

    # Now write each element of the array in the correct format
    # ("or" rather than "||", so that die applies to print):
    foreach $elem ( keys %passwords )  {
        print USERFILE $elem, ":", $passwords{$elem}, "\n" or
            die "Failed to write user/password to file \"$filespec\": $!.\n";
    }

    close USERFILE;
}
The following list goes through this script a step at a time:

  1. The named user definition file is opened for reading.

  2. The contents of the file are read into the %passwords associative array by the following three lines, in a while(<>) loop:

    chop;
    ($thisusr, $thispw) = split(':', $_) ;
    $passwords{$thisusr} = $thispw;

     The chop statement drops the new-line character from each line as it is read in. The split() function breaks the line into the user name ($thisusr) and password ($thispw) components. Then a new entry is created in the %passwords associative array, with $thisusr as the key and $thispw as the value.

  3. With the associative array complete, the script throws away the entry that corresponds to the user that you want to drop, using the delete command.

  4. The script re-opens the user definition file for writing and iterates through all elements of the %passwords array, writing one at a time in the correct user:password format.
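One drawback of rewriting the file in place is that a failure partway through the write leaves a truncated user file, and the keys of an associative array come back in no particular order. The sketch below shows an alternative under stated assumptions: it copies every line except the unwanted one to a scratch file in the same directory and then renames the scratch file over the original. The name delete_user_safely() and the .new suffix are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variant of DeleteUser: copy every line except the named
# user's to "$filespec.new", then rename that file over the original.
# Returns the number of lines dropped.
sub delete_user_safely {
    my ($user, $filespec) = @_;
    my $tmp = "$filespec.new";   # assumed scratch name, same directory
    my $dropped = 0;

    open(my $in, '<', $filespec)
        or die "Could not open \"$filespec\" for reading: $!\n";
    open(my $out, '>', $tmp)
        or die "Could not open \"$tmp\" for writing: $!\n";

    while (my $line = <$in>) {
        if ($line =~ /^\Q$user\E:/) {
            $dropped++;          # skip the user being deleted
            next;
        }
        print {$out} $line
            or die "Failed to write to \"$tmp\": $!\n";
    }
    close($in);
    close($out) or die "Failed to close \"$tmp\": $!\n";

    # The original is replaced only after the copy is complete.
    rename($tmp, $filespec)
        or die "Could not replace \"$filespec\": $!\n";
    return $dropped;
}
```

Because lines are copied as they are read, the remaining users keep their original order in the file.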

Changing Passwords

Changing the password of an existing user is trivial now; you've already written the code that does all the work. All you need is a simple wrapper, Asetpw, that calls subroutines that are already in UserUtil.pl, as shown in Listing 8.7.

Listing 8.7-The Asetpw Script

#!/usr/local/bin/perl -I. -T

# Script to change a password in an Apache user file.

require "UserUtil.pl";  # Need utilities

# Takes two arguments:
# - username to change
# - file containing userid, password

# Get the arguments:
($user, $file) = @ARGV;

# Check that we got two arguments:
$file ||
    die "Asetpw: Change password utility for Apache user definition files.\n",
    "Usage: Asetpw username filespec\n";

$file =~ /(.+)/;
$safefile = $1;

# First check that the user exists:
&UserDefined($user, $safefile) ||
    die "User \"$user\" does not exist in file \"$safefile\".\n";

# Get the encrypted password:
$password = &GetPword;

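# Remove the old entry, then store the username with the new password
# (SetPword only appends, so the old line must go first):
&DeleteUser($user, $safefile);
&SetPword($safefile, $user, $password);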

# End
This listing illustrates just how useful a modular approach to code design can be.

Adding Users to Groups

A group definition file consists of a series of lines, one per group, each of which contains the name of the group, followed by a colon and a list of space-separated member names. This format is somewhat similar to the format of a user definition file, but the task of adding or deleting a user is more complex, because user definition lines have a single name and a single password, which is ideal material for an associative array. Group definition files, on the other hand, have a single group name and multiple member names.

You will still use associative arrays to deal with groups, but you need to do a little extra work to allow for the storage of a list of members as a single value. The approach that you take in this section is to deal with a group file as a whole, reading its contents to and from an associative array. This approach allows you to modularize your code into neat functional elements. This method lends itself particularly well to working with UNIX DBM files, should you decide to use them at a later stage.

Reading Groups

Listing 8.8 shows the source code for GetGroupMembers(), which is stored in UserUtil.pl. GetGroupMembers() is a subroutine that reads the entire contents of a group file into an associative array.

Listing 8.8-The GetGroupMembers Subroutine

# Subroutine to extract group member list from group file
# Input: file spec of group membership file
# Returns: Associative array of groups, members.

sub GetGroupMembers {
    my( $filespec ) = @_;
    my ($thisgrp, $grpmembers, %groupmembers);

    # Just return now if file does not exist:
    -e $filespec || return;

    # Open the group file:
    open(GFILE, "$filespec") ||
     die "Could not open group file \"$filespec\" for reading: $!\n";

    while (<GFILE>)  {
     chomp;
     ($thisgrp, $grpmembers) = split(':' , $_);
     # Drop the space(s) after the colon so they don't pile up on rewrite:
     $grpmembers =~ s/^\s+//;
     $groupmembers{$thisgrp} = $grpmembers;
    }

    close GFILE;

    return %groupmembers;
}
When the input file has been opened, each line is read and split at the colon into a group name ($thisgrp) and a member list ($grpmembers). The member list is a single string containing the user names of all group members, separated by spaces. The associative array %groupmembers is built by adding $thisgrp as a key and $grpmembers as a corresponding value for each line read from the file. Then the entire associative array is returned to the calling routine.

Writing Groups

Listing 8.9 shows the source code for SetGroupMembers(), which is stored in UserUtil.pl and which is similar to SetPword().

Listing 8.9-The SetGroupMembers Subroutine

# Subroutine to store group member list in group file
# Input: file spec of group membership file,
#        associative array of groups/users

sub SetGroupMembers {
    my( $filespec, %groups ) = @_;
    my ($grp);

    # Open the group file:
    open(GFILE, ">$filespec") ||
     die "Could not open group file \"$filespec\" for writing: $!\n";

    foreach $grp ( keys %groups )  {
     print GFILE "$grp: $groups{$grp}\n";
    }

    close GFILE;
}
This function does the opposite of GetGroupMembers(): It opens the group definition file for writing and writes out one line per entry in the %groups associative array. Each line is written as the key, followed by a colon and then the corresponding value.

Putting It All Together: Agrpaddu

The source code for Agrpaddu is relatively short, making use of the functionality in UserUtil.pl, as shown in Listing 8.10.

Listing 8.10-The Agrpaddu Script

#!/usr/local/bin/perl -I. -T

# Script to add a user to a group in an Apache group file.

require "UserUtil.pl";  # Need utilities

# Takes three arguments:
# - group to add to
# - username to add
# - file to add to

# Get the arguments:
($group, $user, $file) = @ARGV;

# Check that we got three arguments:
$file ||
    die "Agrpaddu: Utility for adding users to Apache group files.\n",
    "Usage: Agrpaddu groupname username filespec\n";

# Extract filename:
$file =~ /(.+)/;
$safefile = $1;

# Read the current group membership into an associative array:
%groups = &GetGroupMembers($safefile);

# Check if user already in group:
$groups{$group} =~ /\b$user\b/ &&
    die "User \"$user\" is already a member of group \"$group\".\n";

# Add the user to the group:
$groups{$group} .= " $user";

# Write the array out to the groups file:
&SetGroupMembers($safefile, %groups);

# End
The following list describes what this script does:

  1. The script reads the designated group definition file into the %groups associative array.

  2. The script checks to see whether the user is already in the named group.

  3. If the user is not in the named group, the script appends a space and the user name to the value of the entry in %groups that has the group name for a key. (If the group was not defined, this step defines it.)

  4. The script writes the %groups associative array out to the group definition file.
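The membership check in step 2 relies on a word-boundary pattern, which can be fooled by names that contain punctuation: a search for dave also matches inside dave-jones. The sketch below shows a stricter test that compares whole names only; in_group() is a hypothetical name and is not part of UserUtil.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many of the space-separated names in
# $members are exactly equal to $user (0 means "not a member").
sub in_group {
    my ($user, $members) = @_;
    return 0 unless defined $members;

    # split ' ' ignores leading spaces and splits on runs of whitespace.
    return scalar grep { $_ eq $user } split ' ', $members;
}

print "dave in 'henry martha dave': ",
      (in_group('dave', 'henry martha dave') ? "yes" : "no"), "\n";
print "dave in 'henry dave-jones': ",
      (in_group('dave', 'henry dave-jones') ? "yes" : "no"), "\n";
```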

Deleting Users from Groups

The task of deleting users from groups is a little more involved than adding them. Listing 8.11 shows the source for Agrpdelu.

Listing 8.11-The Agrpdelu Script

#!/usr/local/bin/perl -I. -T

# Script to delete a user from a group in an Apache group file.

require "UserUtil.pl";  # Need utilities

# Takes three arguments:
# - group to delete from
# - username to delete
# - file to delete from

# Get the arguments:
($group, $user, $file) = @ARGV;

# Check that we got three arguments:
$file ||
    die "Agrpdelu: Utility for deleting users from Apache group files.\n",
    "Usage: Agrpdelu groupname username filespec\n";

# Extract filename:
$file =~ /(.+)/;
$safefile = $1;

# Read the current group membership into an associative array:
%groups = &GetGroupMembers($safefile);

# Check if user is in group:
$groups{$group} =~ /\b$user\b/ ||
    die "User \"$user\" is not a member of group \"$group\".\n";

# First make an array of all members of this group:
(@oldmembers) = $groups{$group} =~ /(\w+)/g;

# Clear down the current member list for this group:
$groups{$group} = "";

# now add all but the member to be deleted to a new string:
foreach $member (@oldmembers)  {
    if ( $member ne $user )  {
     $groups{$group} .= " $member";
    }
}

# Write the array out to the groups file:
&SetGroupMembers($safefile, %groups);

# End
The following list shows the essential steps:

  1. The script reads the group definition file into the %groups associative array.

  2. The script stores the membership of the named group in the @oldmembers array, using this statement:

         (@oldmembers) = $groups{$group} =~ /(\w+)/g;

     This statement is worth examining. $groups{$group} is the string that contains all group member names, separated by spaces. In list context, the /(\w+)/g match returns every full word that it finds in the member list, and the resulting list is assigned to @oldmembers.

  3. The script obliterates the group membership for the named group by setting it to an empty string.

  4. The script rebuilds the member list for the named group, $groups{$group}, by looping through all elements of the @oldmembers array, adding each element to $groups{$group} unless it matches the user name that is to be dropped.

  5. The script writes the %groups associative array back out to the group definition file.

Again, you get to reuse the GetGroupMembers() and SetGroupMembers() functions.
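The split-filter-rebuild sequence can also be written as a single expression with split, grep, and join. The following is a sketch of that alternative rather than the code on the CD-ROM; remove_member() is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: return the space-separated member list $members
# with every occurrence of $user removed.
sub remove_member {
    my ($user, $members) = @_;
    return join ' ', grep { $_ ne $user } split ' ', $members;
}

print remove_member('martha', 'henry martha dave'), "\n";   # prints "henry dave"
```

In Agrpdelu, the result would be assigned back to $groups{$group} before the call to SetGroupMembers().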

From Here...

This chapter describes how user authentication combines user verification with access restrictions to ensure that your server is as open as it needs to be, and no more. This book has a great deal more to say about server security and Perl:


Copyright © 1996, Que Corporation
Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290.

Notice: This material is from Special Edition, Using Perl for Web Programming, ISBN: 0-7897-0659-8. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.
