Tuesday, June 1, 2010

Using Configuration Files With Shell Scripts

If you have worked with the command line for a while, you have no doubt noticed that many programs use text configuration files of one sort or another.  In this lesson, we will look at how we can control shell scripts with external configuration files.

Why Use Configuration Files?

Since shell scripts are just ordinary text files, why should we bother with additional text configuration files?  There are a couple of reasons that you might want to consider them:

  1. Having configuration files removes the need to make changes to a script.  There may be cases where you want to ensure that a script remains in its original form.
  2. In particular, you may want to have a script that is shared by multiple users, each of whom wants a different configuration.  Using individual configuration files avoids the need to keep multiple copies of the script, thus making administration easier.

Sourcing Files

Implementing configuration files in most programming languages is a fairly complicated undertaking, as you must write code to parse the configuration file's content.  In the shell, however, parsing is automatic because you can use regular shell syntax.

The shell builtin command that makes this trick work is named source.  The source command reads a file and processes its content as if it were coming from the keyboard.  Let's create a very simple shell script to demonstrate sourcing in action.  We'll use the cat command to create the script:

me@linuxbox:~$ cat > bin/cd_script
#!/bin/bash
cd /usr/local
echo $PWD


Press Ctrl-d to signal end-of-file to the cat command.  Next, we will set the file attributes to make the script executable:

me@linuxbox:~$ chmod +x bin/cd_script

Finally, we will run the script:

me@linuxbox:~$ cd_script
/usr/local
me@linuxbox:~$

The script executes, changes the directory to /usr/local, and then outputs the name of the current working directory, which is /usr/local.  Notice, however, that when the shell prompt returns, we are still in our home directory.  Why is this?  While it may appear at first that the script did not change directories, it did, as evidenced by the output of the PWD shell variable.  So why isn't the directory still changed when the script terminates?

The answer lies in the fact that when you execute a shell script, a new copy of the shell is launched and with it comes a new copy of the environment.  When the script finishes, the copy of the shell is destroyed and so is its environment.  As a general rule, a child process, such as the shell running a script, is not permitted to modify the environment of the parent process.
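
We can watch this rule at work with a small experiment.  The following is only a sketch: the GREETING variable is made up for illustration, and bash -c stands in for running a script, since both start a child shell.

```shell
#!/bin/bash
# Sketch: a child shell works on its own copy of the environment.

GREETING="hello"        # illustrative variable, set in the parent shell
export GREETING         # exported, so the child receives a copy of it
start_dir=$PWD

# bash -c starts a child shell, much as running a script does.
# The child changes its copy of GREETING and its own working directory.
child_report=$(bash -c 'GREETING="changed"; cd /; echo "$GREETING in $PWD"')

echo "child saw:  $child_report"
echo "parent has: $GREETING in $PWD"
```

The child reports "changed in /", while the parent still has its original value of GREETING and is still in the directory where it started.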

So if we actually wanted to change the working directory in the current shell, we would need to use the source command to read the contents of our script.  Note that the name of the source command may be abbreviated as a single dot followed by a space.

me@linuxbox:~$ . cd_script
/usr/local
me@linuxbox:/usr/local$

By sourcing the file, the working directory is changed in the current shell, as we can see from the trailing portion of the shell prompt.  Be aware that, by default, when the file name does not contain a slash, the shell will search the directories listed in the PATH variable for the file to be read.  Files that are read by source do not have to be executable, nor do they need to start with the shebang (i.e. #!) mechanism.
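
Here is a minimal sketch of those last two points.  The file name comes from mktemp and the MESSAGE variable is invented for the example; neither appears in our scripts.

```shell
#!/bin/bash
# Sketch: source a file that is not executable and has no shebang line.

settings_file=$(mktemp)         # temporary file, created without execute permission
echo 'MESSAGE="set by the sourced file"' > "$settings_file"

# The name contains a slash, so the shell reads it directly
# instead of searching the directories in PATH.
. "$settings_file"              # same as: source "$settings_file"

echo "$MESSAGE"
rm -f "$settings_file"
```

Because the assignment ran in the current shell, MESSAGE is still set after the temporary file has been deleted.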

Implementing Configuration Files In Scripts

Now that we have seen how sourcing works, let's try our hand at writing a script that uses the source command to read a configuration file.

In part 4 of the Getting Ready For Ubuntu 10.04 series, we wrote a script to perform a backup of our system to an external USB disk drive.  The script looked like this:

#!/bin/bash

# usb_backup # backup system to external disk drive

SOURCE="/etc /usr/local /home"
DESTINATION=/media/BigDisk/backup

if [[ -d $DESTINATION ]]; then
    sudo rsync -av \
        --delete \
        --exclude '/home/*/.gvfs' \
        $SOURCE $DESTINATION
fi 

You will notice that the source and destination directories are hard-coded into the SOURCE and DESTINATION constants at the beginning of the script.  We will remove these and modify the script to read a configuration file instead:

#!/bin/bash

# usb_backup2 # backup system to external disk drive

CONFIG_FILE=~/.usb_backup.conf

if [[ -f $CONFIG_FILE ]]; then
        . "$CONFIG_FILE"
fi

if [[ -d $DESTINATION ]]; then
        sudo rsync -av \
                --delete \
                --exclude '/home/*/.gvfs' \
                $SOURCE $DESTINATION
fi

Now we can create a configuration file named ~/.usb_backup.conf (the name assigned to CONFIG_FILE in the script) that contains these two lines:

SOURCE="/etc /usr/local /home"
DESTINATION=/media/BigDisk/backup

When we run the script, the contents of the configuration file are read and the SOURCE and DESTINATION constants are defined in the script's shell just as though the lines were in the text of the script itself.  The

if [[ -f $CONFIG_FILE ]]; then
        . "$CONFIG_FILE"
fi

construct is a common way to set up the reading of a file.  In fact, if you look at your ~/.profile or ~/.bash_profile startup files, you will probably see something like this:

if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi

which is how your environment is established when you log in at the console.

While our script in its current form requires the configuration file to define the SOURCE and DESTINATION constants, it's easy to make the use of the file optional by setting default values for the constants in case the configuration file is missing or does not contain the required definitions.  We will modify our script to set default values and to support a command line option (-c) that specifies an alternate configuration file name:

#!/bin/bash

# usb_backup3 # backup system to external disk drive

# Look for alternate configuration file
if [[ $1 == -c ]]; then
    CONFIG_FILE=$2
else
    CONFIG_FILE=~/.usb_backup.conf
fi

# Source configuration file
if [[ -f $CONFIG_FILE ]]; then
    . "$CONFIG_FILE"    # quoted so that a path containing spaces still works
fi

# Fill in any missing values with defaults
SOURCE=${SOURCE:-"/etc /usr/local /home"}
DESTINATION=${DESTINATION:-/media/BigDisk/backup}

if [[ -d $DESTINATION ]]; then
    sudo rsync -av \
        --delete \
        --exclude '/home/*/.gvfs' \
        $SOURCE $DESTINATION
fi
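
The defaults come from the ${parameter:-word} expansion, which yields word when the parameter is unset or empty and the parameter's own value otherwise.  The sketch below tries all three cases in isolation; the EXCLUDE variable and its values are made up for the example.

```shell
#!/bin/bash
# Sketch of the ${parameter:-word} expansion with illustrative values.

unset SOURCE                    # case 1: not defined at all
DESTINATION=""                  # case 2: defined but empty
EXCLUDE="/home/*/.cache"        # case 3: already set (hypothetical setting)

SOURCE=${SOURCE:-"/etc /usr/local /home"}           # default is used
DESTINATION=${DESTINATION:-/media/BigDisk/backup}   # default is used
EXCLUDE=${EXCLUDE:-"/home/*/.gvfs"}                 # existing value is kept

echo "SOURCE=$SOURCE"
echo "DESTINATION=$DESTINATION"
echo "EXCLUDE=$EXCLUDE"
```

Only EXCLUDE keeps its original value.  This is why a configuration file may define just one of the constants and leave the other to its default.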

Code Libraries

Since the files read by the source command can contain any valid shell commands, source is often used to load collections of shell functions into scripts.  This allows central libraries of common routines to be shared by multiple scripts.  This can make code maintenance considerably easier.
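
As a small sketch of the idea, the example below writes a one-function library to a temporary file and then sources it.  The log_msg function is hypothetical, and a real library would live in a permanent, shared location rather than in a file made by mktemp.

```shell
#!/bin/bash
# Sketch: load a shell function from a separate library file.

lib_file=$(mktemp)              # stands in for a shared library file

cat > "$lib_file" <<'EOF'
# Shared routines (hypothetical)
log_msg() {
    # Usage: log_msg LEVEL MESSAGE
    printf '[%s] %s\n' "$1" "$2"
}
EOF

# After sourcing, the function is defined in this script's shell.
. "$lib_file"

log_msg INFO "backup started"
rm -f "$lib_file"
```

Any number of scripts could source the same library file, so a fix to log_msg would need to be made in only one place.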

Security Considerations

On the other hand, since sourced files can contain any valid shell command, care must be taken to make sure that nothing malicious is placed in a file that is to be sourced.  This holds especially true for any script that is to be run by the superuser.  When writing such scripts, make sure that the superuser owns the file to be sourced and that the file is not world-writable.  Some code like this could do the trick.  It sources the file only if the file is owned by the user running the script and its permissions are exactly 600, meaning that only the owner may read or write it:

if [[ -O $CONFIG_FILE ]]; then
    if [[ $(stat --format %a "$CONFIG_FILE") == 600 ]]; then
        . "$CONFIG_FILE"
    fi
fi
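
If insisting on a single permission setting is too strict, we could test only the write bits for group and others instead.  The following is a sketch under a few assumptions: safe_source is a made-up helper name, the stat command is the GNU version (as in the code above), and the temporary file stands in for a real configuration file.

```shell
#!/bin/bash
# Sketch: source a file only if we own it and nobody else can write to it.

safe_source() {
    local file=$1 perms
    # Must be a regular file owned by the user running the script.
    [[ -f $file && -O $file ]] || return 1
    perms=$(stat --format %a "$file") || return 1
    # Octal 022 masks the group-write and other-write permission bits.
    (( (8#$perms & 8#022) == 0 )) || return 1
    . "$file"
}

conf=$(mktemp)                  # stands in for a real configuration file
echo 'DESTINATION=/media/BigDisk/backup' > "$conf"

chmod 600 "$conf"
safe_source "$conf" && echo "loaded: $DESTINATION"

chmod 666 "$conf"
safe_source "$conf" || echo "refused: others can write to the file"

rm -f "$conf"
```

With this test, a file with permissions 644 would be accepted as well, since others can read it but not change it.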

Further Reading

The bash man page:
  • BUILTIN COMMANDS (source command)
  • CONDITIONAL EXPRESSIONS (testing file attributes)

The Linux Command Line:
  • Chapter 35 - Strings And Numbers (parameter expansions to set default values)