Are those command-line arguments safe?

Are those arguments safe? For this, I’m using bash, but the example I show has the same problem in csh and zsh. I’ve known about this for a long time because I’d show off in Perl classes by creating filenames full of special or otherwise odd characters. This stuff didn’t show up in the “Secure Programming Techniques” chapter, but it’s been on my to-do list for a long time.

Consider this one-liner that you might write to go through all of the files on the command line and print each argument:

$ perl -le 'print for @ARGV' *

Seems reasonable, right? What could go wrong? So why do I get this output?

This is perl 5, version 30, subversion 1 (v5.30.1) built for darwin-2level

Copyright 1987-2019, Larry Wall

Perl may be copied only under the terms of either the Artistic License
or the GNU General Public License, which may be found in the Perl 5
source kit.

Complete documentation for Perl, including FAQ lists, should be found
on this system using "man perl" or "perldoc perl".  If you have access
to the Internet, point your browser at http://www.perl.org/, the Perl
Home Page.

Look at the list of files:

$ ls
-v		args.pl		barney.txt	fred.txt

There’s a file whose name starts with a dash!

The shell expands the glob first, then runs the resulting command line, so these two are the same command:

$ perl -le 'print for @ARGV' *
$ perl -le 'print for @ARGV' -v args.pl barney.txt fred.txt

The -v looks like an option, so perl treats it as one: it prints its version message and exits without running my program. The classic example is a file named -rf waiting for someone to run rm *.
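To see exactly which words a program will receive, I can have the shell print the expanded glob without handing it to anything that parses options. This is a minimal sketch, assuming bash; the scratch directory and the show_words name are mine, for illustration only:

```shell
#!/bin/bash
# show_words is a hypothetical helper: print each word it receives on its own line.
# printf takes the format as its first argument, so the words after it are
# never parsed as options, even when they start with a dash.
show_words() {
    printf '%s\n' "$@"
}

# Recreate the directory from the example in a throwaway location.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch ./-v args.pl barney.txt fred.txt   # ./ keeps touch from reading -v as an option

# The shell expands * before show_words runs, so -v arrives as an ordinary word.
show_words *
```

Every name in the directory shows up in the list, including -v. Whether a given program then treats that word as a filename or as an option is entirely up to the program.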

This isn’t a big deal once I remember it. I rarely run into this, so I hardly ever account for it, but rare, unaccounted-for things are ripe for exploitation.

From the shell, I can stop the options list with a double dash, --:

$ perl -le 'print for @ARGV' -- *
-v
args.pl
barney.txt
fred.txt

If I want those all to be filenames, I can glob slightly differently. Prepending ./ to the glob puts the current directory in front of each filename. Since the first character is no longer -, it’s not an option:

$ perl -le 'print for @ARGV' ./*
./-v
./args.pl
./barney.txt
./fred.txt

I don’t want that leading ./, so I get rid of it:

$ perl -le 'print for @ARGV' ./* | perl -pe 's|\A\./||'
-v
args.pl
barney.txt
fred.txt
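When the consumer is a shell loop rather than another program, the same clean-up can happen without a second perl process. Here is a rough sketch, assuming bash; strip_dot_slash is a name I made up and is not part of the one-liners above:

```shell
#!/bin/bash
# strip_dot_slash is a hypothetical helper: print each path with one
# leading "./" removed.
strip_dot_slash() {
    local path
    for path in "$@"; do
        # ${path#./} removes a leading "./" if there is one and leaves
        # every other path untouched.
        printf '%s\n' "${path#./}"
    done
}

# Glob with ./ so nothing looks like an option, then strip the prefix for display.
strip_dot_slash ./*
```

The stripped names are only safe to display. Anything I pass on to another command should keep its ./ prefix.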

Bash’s GLOBIGNORE can help by removing names that match a pattern from the results of a glob. I can ignore files that start with a dash:

$ echo *
-v args.pl barney.txt fred.txt
$ export GLOBIGNORE="-*"
$ echo *
args.pl barney.txt fred.txt
$ perl -le 'print for @ARGV' *
args.pl
barney.txt
fred.txt
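Setting GLOBIGNORE for the whole session changes every later glob, and any non-empty value also makes bash match dotfiles as if dotglob were on. One way to contain both effects is to set it inside a subshell and add .* to the ignore list. This is a sketch under those assumptions; list_plain_files is a hypothetical name:

```shell
#!/bin/bash
# list_plain_files is a hypothetical helper. Its body is wrapped in ( ) rather
# than { }, so it runs in a subshell and GLOBIGNORE never leaks to the caller.
list_plain_files() (
    cd -- "${1:-.}" || exit 1
    # Ignore names that start with a dash, and keep dotfiles hidden:
    # a non-empty GLOBIGNORE otherwise makes * match them too.
    GLOBIGNORE='-*:.*'
    printf '%s\n' *
)

list_plain_files .
```

If nothing survives the filter, bash leaves the * unexpanded unless nullglob is set, so a real script would want to check for that case.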

There are many other bash tricks, but that’s not the point. These all rely on the person running the command knowing these things, which I find to be rare. Not only that, a person who wants to trick my program isn’t going to do these things.

Preventing it on purpose

Bump it up a notch. Even if I know enough to trap that word that looks like a command-line option, if I pass it to something else, it might turn into an option for a different command. Even though I’m using the safer, list form of system, I still have a problem:

$ perl -e 'system( q(/bin/echo), @ARGV )' foo
foo
$ perl -e 'system( q(/bin/echo), @ARGV )' -- -n foo
foo$

If I try the -- inside the echo, the -n word is nothing special, but the -- shows up in the output because echo doesn’t treat it as the end of the options:

$ perl -e 'system( q(/bin/echo), q(--), @ARGV )' -- -n foo
-- -n foo
$ echo -- -n foo
-- -n foo
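echo is the odd one out here. Utilities that parse their options in the usual getopt style, such as the GNU and BSD versions of cat, do stop at the -- and swallow it. Here is a minimal sketch, assuming bash and one of those cat implementations; cat_files and the scratch directory are hypothetical:

```shell
#!/bin/bash
# cat_files is a hypothetical wrapper: forward every argument to cat as a
# filename. The -- ends cat's option list, so a name like -n is opened as a
# file instead of turning on line numbering.
cat_files() {
    cat -- "$@"
}

# Work in a throwaway directory that holds a file literally named -n.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'hello\n' > ./-n

cat_files -n
```

This only helps when I know that the command on the other end honors --, which I have to check for each command I call.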

The “Secure Programming Techniques” chapter discusses taint checking, which could (could!) catch this if you know to limit the arguments to words that don’t start with -. You might already check for characters you’ll allow in a filename. This seems reasonable at first:

#!perl -T
use v5.10;

delete $ENV{PATH};

my @args = map { /\A ( [a-z0-9-]+ ) \z/ix ? $1 : () } @ARGV;
say "args: @args";

system '/bin/echo', @args;

The output contains the -v because it only uses the allowed characters:

$ perl -T args.pl -v foo bar
args: -v foo bar

This demonstrates the limits of taint checking. It doesn’t prevent bad things from happening, as many people think. It’s a tool that allows you to solve a small part of the problem of preventing bad things. Taint checking merely tells you when you forgot to check something, but it doesn’t tell you how well you checked it. Use it poorly and bad things can still happen.

I can improve the pattern by limiting the characters that can show up in the first position:

#!perl -T
use v5.10;

delete $ENV{PATH};

my @args = map { /\A ( [a-z0-9] [a-z0-9-]* ) \z/ix ? $1 : () } @ARGV;
say "args: @args";

system '/bin/echo', @args;

But, having said that, I don’t want to continue if some of the input is invalid. There are times where I write a tool where I want to clean up input because users repeatedly make an innocent mistake (App::cpan does this because someone guessed that “install” was a command and then many people re-posted it). Other times, if any part of the input is invalid, I want to force the person to correct it. One way is to show the invalid arguments in the error:

#!perl -T
use v5.10;

delete $ENV{PATH};

my $pattern = qr/\A ( [a-z0-9] [a-z0-9-]* ) \z/xi;

my @invalid = grep { ! /$pattern/        } @ARGV;
die "Invalid argument: @invalid\n" if @invalid;

# now untaint
my @valid   = map  {   /$pattern/ and $1 } @ARGV;

say "args: @valid";

system '/bin/echo', @valid;
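The same all-or-nothing check can live on the shell side when a wrapper script collects the arguments before perl ever sees them. This is a sketch, assuming bash, of a shell counterpart to the Perl above and not a replacement for it; check_args is a hypothetical name and the pattern is my translation of the Perl one:

```shell
#!/bin/bash
# check_args is a hypothetical helper: succeed only if every argument is made
# of ASCII letters, digits, and dashes, and does not start with a dash.
check_args() {
    local LC_ALL=C                               # keep the bracket ranges ASCII-only
    local pattern='^[A-Za-z0-9][A-Za-z0-9-]*$'
    local arg
    local -a invalid=()
    for arg in "$@"; do
        [[ $arg =~ $pattern ]] || invalid+=("$arg")
    done
    if (( ${#invalid[@]} > 0 )); then
        # Report every bad argument at once, then refuse to continue.
        printf 'Invalid argument: %s\n' "${invalid[*]}" >&2
        return 1
    fi
}

check_args foo bar && echo "args: foo bar"
check_args -v foo  || echo "refused to run"
```

As with the Perl version, the check is only as good as the pattern. Loosen the first character class and -v gets through again.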

There’s no single feature that’s going to protect you from this. You need to think about the input and decide how you want to handle it.

More geeking out

David Wheeler (author of some really cool Perl stuff) recently updated “Filenames and Pathnames in Shell: How to do it Correctly”, which goes through it in all its gory details. I also ran across oilshell, which solves this by not doing it that way.