I Invented the Question Mark: March 2014

How can I write shell code to run safely under cron?

One of the problems with running shell (ksh/bash) scripts under cron is finding that your brilliant script suddenly no longer behaves the way it did when run from the shell.

There are a number of causes for this; the most common are incorrect assumptions about the environment the script will run in.

When a script runs under cron, its PATH may differ from the PATH in your interactive shell. Always set an explicit PATH at the beginning of the shell script.

e.g.
PATH=/usr/bin:/usr/local/bin

You can put export at the start of the line if you want, but cron already exports a default PATH, so you only need to change its value. On Solaris you may also need to set the search paths for loading dynamic libraries, if this is applicable.
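As a small sketch of that point, assuming PATH is already exported when the script starts (as it is under cron), a plain assignment is enough for child processes to see the new value. The variable name child_path is only for illustration:

```shell
#!/bin/sh
# Assumes PATH is already in the environment, as it is under cron.
# The work is done in a subshell so the caller's PATH is left alone.
child_path=$(
    PATH=/usr/bin:/usr/local/bin          # plain assignment, no export
    /bin/sh -c 'printf "%s\n" "$PATH"'    # child process reports what it inherited
)
echo "child sees: $child_path"
```

The child shell is started by absolute path so that the demonstration does not itself depend on the PATH being tested.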

Further (and sometimes fatal) problems will arise if you have environment variables in your shell environment that do not exist at all when running under cron.

Consider this scenario in your script:

cd $TMP; rm -rf *

If TMP is not defined, $TMP expands to nothing and cd with no argument changes to the home directory. On older systems that use / as the home directory for root, this can cause the entire filesystem tree to be deleted - something I have seen happen because of this exact problem in someone's startup script. It is always a good idea for root to have its own home directory and not use / itself.
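One defensive way to write that step, as a sketch only: refuse to run without a directory name, and only delete if the cd actually succeeded. The function name clean_scratch is hypothetical and not from the original script.

```shell
#!/bin/sh
# clean_scratch is a hypothetical helper, not part of the original script.
# It empties a directory only when a name was supplied and cd succeeded.
clean_scratch() {
    if [ -z "${1:-}" ]; then
        echo "clean_scratch: no directory given" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged;
    # rm runs only if cd worked, and ./* keeps the glob relative.
    ( cd "$1" && rm -rf ./* )
}

# Possible usage (the empty default avoids an unset-variable error):
# clean_scratch "${TMP:-}" || exit 1
```

Note that ./* does not match dot files, which is the same behaviour as the * in the original line.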

Always set the following option in your shell scripts:

set -u

This makes the shell treat the expansion of an undefined variable as an error and halt execution of the script.
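To see the effect without risking anything, here is a minimal sketch that runs the dangerous line in a child shell with TMP deliberately unset. The rm is replaced by an echo, and the variable names out and status are just for the demonstration:

```shell
#!/bin/sh
# Run the risky line in a child shell with set -u and TMP unset.
# rm is replaced by echo so nothing is deleted in this demonstration.
if out=$(/bin/sh -c 'unset TMP; set -u; cd $TMP; echo "would run rm here"' 2>/dev/null); then
    status=0
else
    status=$?
fi
echo "child exit status: $status, output: '$out'"
```

The child shell should stop at the expansion of $TMP with a non-zero exit status, so the echo standing in for rm is never reached. The exact status value and error message differ between shells.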

By the way, both of these techniques should be used in all your scripts, regardless of whether they are to be run from the command line or cron. Remember, the scripts you write may also be run by someone else and in these cases the environment may be sufficiently different to cause problems.

How to make tables from multiple files

pr -t -m /tmp/a /tmp/b

This will read a line from file 'a' and place it in the first column, followed by a line from file 'b' placed in the second column.

More files result in more columns. Be careful with long lines as individual columns may be truncated.
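A self-contained sketch of the same command, using two small sample files created in a scratch directory. The file contents are invented for illustration, and mktemp -d is assumed to be available on your system:

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/a"
printf 'one\ntwo\n'    > "$dir/b"

# Merge the files side by side: one column per file, no page headers.
table=$(pr -t -m "$dir/a" "$dir/b")
printf '%s\n' "$table"

rm -rf "$dir"
```

Each output line holds one line from each file. If columns are being truncated, the -w option of pr, which sets the overall line width, may help.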

Solaris cat is Super-Fast!!!

A colleague asked someone at work to do a test of disk performance by cat-ing a 32GB file to /dev/null to determine why we had slow backups. It only took a fraction of a second - and he wondered why that could be the case.

So I looked into it. Firstly I used:

dd if=/path/to/largefile > /dev/null

to see if it exhibited the same behaviour as cat. It didn't. Then I truss-ed both cat and dd to find out what the difference was. I could see the data as an argument to the write system call in both processes, but it turns out that cat uses mmap to map chunks of the file into the process address space rather than using the read system call.

So why does this make it really quick to read the file? Well, it doesn't. I tried it this way:

cat /path/to/largefile | cat > /dev/null

Now it's much slower; the speed is far more in line with what you would expect when reading a large file from a disk array.

So what is going on? Well, when you mmap a chunk of data into a process it's not really reading the data, it's just making it available to page in on demand. When the only thing you do with that data pointer is pass it to the write system call, and write is pointing its output to /dev/null, the kernel just throws the pointer away. Under normal circumstances, if the data were written to stdout or another file, it would be the write call that causes the data to be paged in. But since a write to /dev/null does nothing and just returns, the data is never paged in.

But, you say, if the data is not being read, how could we have seen the data in the truss output? Well, truss itself causes a small amount of each mmap-ed segment to be read from disk: just enough to read a few bytes from the start of each pointed-to block, a tiny amount compared to the size of the mmap-ed part. The kernel pages in this small amount of data only because truss wants to display it; if we weren't truss-ing the cat command the file wouldn't be read at all. The exhaustive list of system calls that truss prints masks the slight slowdown that this small number of reads adds to the overall run-time of cat. So take away truss and yes, the process is very nearly instant.

Thursday, March 20, 2014
