Basic Perl security — taint checking


Why and when security matters

There are times when programs need to give particular thought to security. Any program that communicates with a potentially untrusted user, works with files or material supplied by external sources, or deals with sensitive data should be written with care for how things may go wrong.

In the world of security, prevention is much better than cure. The effort required to clean up after even a minor incident is usually much greater than the effort required to prevent that incident from occurring. While this article covers only a very small part of security, it addresses security for a very common operation -- opening files securely.

What is a security sensitive context?

A security sensitive context exists whenever special thought must be given to security. In particular, a security sensitive context exists in the following situations:

Throughout this article, we will assume that we are working in such a context, and that the data we are working with has been provided by an untrusted party. When dealing with security, paranoia is a virtue, and as such we wish our programs to treat all untrusted data as carefully as possible.

Taint checks

We all know the importance of validating our input. The old saying, "Garbage In, Garbage Out", reflects the truism that computer programs cannot provide meaningful output unless they are provided with meaningful input. When working securely, validating our input is more important than just being able to provide the correct results; it's also important to ensure that unexpected input does not cause our program to work in unintended and dangerous ways.

Perl has a special mode of operation known as taint mode. In this mode, Perl treats any data from the user, the environment, or external sources (such as files) as tainted, and unsuitable for certain operations. The philosophy behind taint is as follows:

You may not use data derived from outside your program to affect something else outside your program -- at least, not by accident.

Tainted data is communicable. This means that the result of any expression containing tainted data is also considered tainted. Just because the data has been reversed, mangled, and converted into a uuencoded string doesn't mean it's considered to be 'clean'. Tainting is applied at the scalar level, meaning that an array or hash may contain some elements that are tainted, and some that are clean.
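The following is a minimal sketch of how taint spreads through expressions. It uses the tainted() function from the core Scalar::Util module; the taint_report() helper and its field names are hypothetical, written only for this illustration. When run under -T with a tainted argument, the derived values would be expected to report as tainted while the literal stays clean; without -T nothing is tainted at all.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: report the taint status of a value and of
# several things derived from it.
sub taint_report {
    my ($value) = @_;

    my $reversed = reverse $value;   # scalar context: reverses the string
    my $wrapped  = "<$value>";       # string interpolation

    # Taint is tracked per scalar, so one hash can hold both kinds.
    my %record;
    $record{derived} = $value;
    $record{literal} = 'constant';

    return {
        original => tainted($value)           ? 1 : 0,
        reversed => tainted($reversed)        ? 1 : 0,
        wrapped  => tainted($wrapped)         ? 1 : 0,
        derived  => tainted($record{derived}) ? 1 : 0,
        literal  => tainted($record{literal}) ? 1 : 0,
    };
}

# Environment variables are tainted when taint mode is on.
my $report = taint_report( defined $ENV{PATH} ? $ENV{PATH} : 'no PATH set' );
print "$_: $report->{$_}\n" for sort keys %$report;
```

Running the same file with and without the -T command-line switch is a quick way to see the difference.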

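Data that is considered tainted cannot be used, directly or indirectly, in any command that invokes a sub-shell, or in any command that modifies files, directories, or processes.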
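As a sketch of what a refusal looks like: Perl's own security documentation (perlsec) counts commands that modify processes among the operations closed to tainted data, and kill is one of them. The probe_kill() helper below is hypothetical. It sends signal 0, which delivers nothing, and does nothing at all unless taint mode is on.

```perl
use strict;
use warnings;

# Hypothetical helper: try to send signal 0 to a process id and report
# whether Perl allowed the call. Signal 0 delivers nothing; it only
# checks that the process could be signalled.
sub probe_kill {
    my ($pid) = @_;
    return 'skipped: taint mode is off' unless ${^TAINT} > 0;

    # Under -T, a tainted $pid makes kill die before anything is sent;
    # eval traps that error so we can report it.
    my $ok = eval { kill 0, $pid; 1 };
    return $ok ? 'allowed' : "refused: $@";
}

# Our own process id is not tainted.
print probe_kill($$), "\n";

# Command-line arguments are tainted under -T.
print probe_kill($ARGV[0]), "\n" if @ARGV;
```

When run as "perl -T" with a process id on the command line, the second call would be expected to report a refusal with an "Insecure dependency" error.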

Taint checks are automatically enabled when Perl detects that it's running with differing real and effective user or group ids -- which most commonly occurs when the program is running setuid or setgid.

Taint mode can also be explicitly turned on by using the -T switch on the shebang line or command line.

        #!/usr/bin/perl -wT

It's highly recommended that taint mode be enabled for any program that's running on behalf of someone else, such as a CGI script or a daemon that accepts connections from the outside world. Once taint checks are enabled, they cannot be turned off.
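A program can check at run time whether taint mode is active by reading the built-in, read-only variable ${^TAINT}. The sketch below is one possible way to report it; the taint_mode_description() name and its wording are made up for this example.

```perl
use strict;
use warnings;

# ${^TAINT} is 1 under -T, -1 under -t (taint warnings only),
# and 0 when taint mode is off.
sub taint_mode_description {
    return ${^TAINT} > 0 ? 'taint checks enabled (-T)'
         : ${^TAINT} < 0 ? 'taint warnings only (-t)'
         :                 'taint checks disabled';
}

print taint_mode_description(), "\n";
```

A script that must never run without taint checks could use the same variable to die early when ${^TAINT} is not greater than zero.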

Untainting data

Once tainted data has entered your program, whether from a user, a file or some other tainted source, it needs to first be untainted before it can be used for any potentially dangerous operation.

The usual way to untaint data is to capture it from a regular expression match:

        # We expect our data to only contain word characters.
        # That is, letters, numbers, and underscores.
        unless( ($untainted) = ( $tainted =~ /^(\w+)$/ ) ) {
                die "Failed to untaint $tainted\n";
        }

        # or, using $1
        unless( $tainted =~ /^(\w+)$/ )  {
                die "Failed to untaint $tainted\n";
        }
        $untainted = $1;

Nothing prevents you from using (.*) to match everything, thus side-stepping the protection afforded by taint. Hopefully you'll think carefully before doing that.

When untainting data, concentrate on specifying what are valid values for that data rather than what are invalid. For example, it is much easier to require a filename to have one to eight word characters or dots, than it is to ensure it doesn't have directory separators, control characters, shell characters, whitespace, null bytes, or other unexpected data. Likewise, it is easier to check that a potential shell command argument only contains word characters and spaces than to eliminate all possible shell meta-characters.
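The two rules just described can be sketched as small validation helpers. The names untaint_filename() and untaint_shell_arg() are hypothetical, and the extra check that rejects names made only of dots (such as "." and "..") is an assumption added for this sketch rather than part of the rule as stated above.

```perl
use strict;
use warnings;

# Hypothetical helper: accept a filename of one to eight word characters
# or dots. Returns the captured (and therefore untainted) copy, or undef.
sub untaint_filename {
    my ($name) = @_;
    return unless defined $name;

    # Assumption for this sketch: "." and ".." are never acceptable.
    return if $name =~ /\A\.+\z/;

    # \A and \z anchor the whole string, so a trailing newline is rejected.
    my ($clean) = $name =~ /\A([\w.]{1,8})\z/;
    return $clean;
}

# Hypothetical helper: accept a shell argument made only of word
# characters and spaces.
sub untaint_shell_arg {
    my ($arg) = @_;
    return unless defined $arg;

    my ($clean) = $arg =~ /\A([\w ]+)\z/;
    return $clean;
}

for my $candidate ('data.txt', '../etc', 'x; rm -rf /') {
    my $file = untaint_filename($candidate);
    print "$candidate: ", ( defined $file ? "accepted\n" : "rejected\n" );
}
```

Both helpers say what is allowed and reject everything else, so nothing has to be known in advance about which characters might be dangerous.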

A safety belt

Using taint does not guarantee that your program is secure; it only makes it more difficult for you to do something unsafe. In the vast majority of invocations of your program, your users will probably behave. They'll probably put letters in the name slots and numbers in the age slots. They will probably not try to see if they can break your SQL or manipulate your server by clever crafting of entered data.

However, you only need one bad user to be successful for your machine to be compromised. Using taint and other security measures will help you make it harder for them to succeed.

Taint also has many applications even when security is not a concern. At a very fundamental level, using taint mode requires you to validate your input. This can assist greatly when testing and debugging your program at a later date, and can help reveal assumptions and limitations earlier in the development cycle.


This Perl tip and associated text is copyright Perl Training Australia. You may freely distribute this text so long as it is distributed in full with this copyright notice attached.

