I have a new shell lecture in ../New-Sh.

Scripts and Utilities -- Sh lecture

This file: http://www.cs.utk.edu/~plank/plank/classes/cs494/494/notes/Sh/lecture.html

This lecture will cover the basic mechanics of using the Bourne shell (/bin/sh). Although most people use the csh as an interactive interpreter to execute programs, I find the Bourne shell to be much simpler for writing simple programs.

The man page for the Bourne shell (``man sh'') is excellent. After this lecture, you should be able to read it without problems and learn the things not covered here.

#!/bin/sh, protection, simple commands

To write a shell script, the first line of your program should be ``#!/bin/sh''. Moreover, the protection mode of your program should have the executable bit set. After the first line, you may execute programs pretty much just like in the csh. For example, the lshome program prints your home directory, and then lists its contents. Try it out:

UNIX> ls -l lshome
-rwxr-xr-x  1 plank          21 Jun  2 10:24 lshome
UNIX> cat lshome
#!/bin/sh

cd
pwd
ls
UNIX> lshome
/mahogany/homes/plank
BLACS                   driverfile              picks
Jumpstart               fball                   pics
LU.dat                  flight                  process_trace
...
UNIX>

You can also run the shell script by typing ``sh lshome'' (and then you don't need the ``#!/bin/sh'' line). Doing ``sh -x script'' prints each command as it is executed, which can help you debug a shell script.

Redirection, pipes

Redirection and pipes should be second nature to you by now:

> f: Writes standard output to the file f.
>> f: Appends standard output to the file f.
< f: Takes standard input from the file f.
2> f: Writes standard error to the file f.
2>&1: Writes standard error to wherever standard out is going when it reaches this statement.
>&2: Writes standard output to wherever standard error is going when it reaches this statement.

So, suppose the file f1 contains the bytes ``This is f1'', and suppose that f3 does not exist. Then what happens when you do the following?

$ cat < f1
$ cat f1 f3 > f2
$ cat f1 f3 2> f2
$ cat f1 f3 2>&1 > f2
$ cat f1 f3 > f2 2>&1 
$ cat f1 f3 >&2 2> f2
$ cat f1 f3 2> f2 >&2
$ cat f1 f3 2>&1 >f2 | cat > f5

Make sure you understand the output of each of these (when you test it, make sure you're running sh and not csh).
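
Two of the cases above differ only in the order of the redirections, which is the part that most often trips people up. The following is a minimal sketch, not one of the lecture's own files, that runs both orders in a scratch directory. The use of mktemp and the extra file name f4 are my own choices for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so that no real files are touched.
# (Assumption: mktemp -d is available; it is common but not universal.)
dir=`mktemp -d`
cd "$dir"
echo "This is f1" > f1          # f1 exists; f3 deliberately does not

# Order 1: standard error is pointed at wherever standard output is going
# right now (the terminal), and only then is standard output moved to f2.
# The error message about f3 therefore does not end up in f2.
cat f1 f3 2>&1 > f2
echo "--- contents of f2 (2>&1 > f2):"
cat f2

# Order 2: standard output is moved to f4 first, then standard error is
# pointed at the same place. Both the text of f1 and the error are in f4.
cat f1 f3 > f4 2>&1
echo "--- contents of f4 (> f4 2>&1):"
cat f4
```

The scratch directory is left in place so that you can look at the files; remove it by hand when you are done.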

Pipes pipe standard output of one command into standard input of another. Again, I assume that this is something you already know. For example, a simple way of printing the 5th line of the file f is to do the following:

UNIX> head -5 f | tail -1

(Yes, there are better ways of doing that).
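
One candidate for a better way is sed, which can print a single line by number. This is only a sketch: I am assuming sed is available, and the seven-line test file built here is made up for the example.

```shell
#!/bin/sh
# Build a small numbered file named f in a scratch directory.
dir=`mktemp -d`
cd "$dir"
for n in 1 2 3 4 5 6 7 ; do
  echo "line $n"
done > f

head -5 f | tail -1     # the pipeline from the text: first 5 lines, then the last of those
sed -n 5p f             # -n suppresses normal output; 5p prints only line 5
```

Both commands print the same line; the sed version starts one process instead of two.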

Like the csh, the Bourne shell waits for each command to finish before continuing. To execute a command without waiting for it to finish, append an ampersand to the command. For example:

$ xterm &

This executes the xterm program, but lets you continue without waiting for the xterm program to finish.

Combining commands: semi-colon and ()

Besides pipes, there are two other ways that the shell lets you combine commands. First is the semi-colon -- this allows you to combine multiple commands on one line. Thus lshome2 is really the same as lshome, except all three commands are on one line.

In and of itself, this is not very exciting. However, it becomes more powerful when you combine it with parentheses. With parentheses, you combine multiple commands and execute them in a sub-shell. For example, suppose you would like to time how long it takes to ping a machine at Princeton. One way to do this is to use the time command. However, a more primitive way is to simply call date before and after the command. If you combine them all on one line with semi-colons, then you get a fairly accurate timing:

UNIX> sh
$ date ; /usr/etc/ping www.cs.princeton.edu ; date
Fri Jun  6 08:56:44 EDT 1997
engram.CS.Princeton.EDU is alive
Fri Jun  6 08:56:46 EDT 1997
$

Now, suppose you'd like the output of the three commands to go to a file. Of course, one way is to redirect each command to the output file:

$ date > out ; /usr/etc/ping www.cs.princeton.edu >> out ; date >> out

Another way is to bundle up the three commands inside parentheses, and redirect the output of the composite command to the file:

$ ( date ; /usr/etc/ping www.cs.princeton.edu ; date ) > out

You can also use parentheses to put a composite command into the background:

$ ( date ; /usr/etc/ping www.cs.princeton.edu ; date ) > out &
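
Because the commands inside parentheses run in a sub-shell, anything they do to the shell's own state, such as changing directory, does not carry over to the commands that follow. A minimal sketch of that, using nothing beyond cd, pwd and echo (the variable name start is mine):

```shell
#!/bin/sh
start=`pwd`                                     # remember where we began

# The cd happens inside the sub-shell only.
( cd / ; echo "inside the parentheses:  `pwd`" )

# Back in the original shell, the working directory is unchanged.
echo "outside the parentheses: `pwd`"
```

Without the parentheses, the same two commands would leave the shell sitting in /.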

echo

echo is a shell command that simply prints its arguments on standard output. For example:

$ echo Jim
Jim
$ echo Jim    Plank
Jim Plank
$ echo

$

Note that echo separates multiple arguments by a single space.

Filename expansion: * and ?

You can type any string into the shell, but certain characters are special (like '>', '<', '|', etc). You should already know about the star -- this expands as a wildcard to match any filename. E.g. ``echo *'' echoes all filenames in the current directory (except those that start with '.'). ``echo lshome*'' echoes both ``lshome'' files.

$ echo *
bq1 bq2 count doubleprintarg1 f1 f2 ifcat input1 lecture.html logo.gif lshome lshome2 outline printarg1 setexample simple sortword specialvar testfor testfor2 testforone tfo2 whatsmyname wmn2
$ echo lshome*
lshome lshome2
$

The question mark is a wild card that will match any one character. Thus, ``echo ??'' will print out all filenames in the current directory that are composed of two characters, and ``echo lshome?'' will print out ``lshome2'', since the question mark must match one character:

$ echo ??
f1 f2
$ echo lshome?
lshome2
$
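
If you want to experiment without depending on the contents of this directory, the following sketch builds a few empty files in a scratch directory and tries the same patterns. The file names are borrowed from the listing above; .hidden and the nomatch pattern are additions of mine to show two edge cases.

```shell
#!/bin/sh
dir=`mktemp -d`                      # scratch directory (assumes mktemp exists)
cd "$dir"
touch f1 f2 lshome lshome2 .hidden   # empty files to match against

echo *          # every name that does not start with '.'
echo lshome*    # names beginning with lshome, including lshome itself
echo ??         # names that are exactly two characters long
echo lshome?    # lshome followed by exactly one more character
echo nomatch*   # nothing matches, so sh passes the pattern through unchanged
```

The last line is worth remembering: an unmatched pattern is not an error in sh, it simply stays as typed.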

Single and double quotes

Quotes allow you to:

Use special characters in strings
Bundle up multiple words into one argument

Single quotes allow you to use almost any character in a string. For example, you can use *, ?, (, ), >, <, |, ", $, & and space in single quotes without having the shell do anything special to them:

$ echo Do you have $100 (so I can borrow it?)
syntax error: `(' unexpected
$ echo 'Do you have $100 (so I can borrow it?)'
Do you have $100 (so I can borrow it?)
$ echo 'Hey   'Jim'!'
Hey   Jim!
$ echo 'Hey   ' Jim'!'
Hey    Jim!
$ echo 'Hey   '               Jim'!'
Hey    Jim!

Double quotes are less powerful. They work like single quotes, except that $ still gets expanded (see below), as do backquotes and backslash escapes. You can build strings simply by concatenating them as above. To get a single quote in a quoted string, you need to use double quotes, and to get a double quote in a quoted string, you need to use single quotes. For example, to get the string "'", you do the following:

$ echo '"'"'"'"'
"'"
$ echo 'She said "This is Jim'"'s course!!"'"'
She said "This is Jim's course!!"

Differences between csh and the Bourne shell

There are many differences between the Bourne shell and csh. Some major ones are that the Bourne shell has no command aliasing, no history, no '~' expansion, no 'setenv', and slightly different syntax for redirection. For example, csh has no ``2>&1''; instead, if you want standard output and standard error to go to the same place in csh, you use ``>&''.

There are other major differences in expression syntax, etc. Since I don't ever program with csh, I don't know the differences. However, in general, you can't run a Bourne shell script with csh.

Fortunately, the use of single and double quotes is nearly identical in both the Bourne shell and csh.

Environment variables

Environment variables are a simple associative matching between names and strings. The Bourne shell will inherit the environment variables in its calling environment (i.e. any setenv's that you have done in the csh), and it lets you set your own.

Environment variables are expanded by using the dollar sign. For example, your home directory and user name are always in the environment variables HOME and USER respectively:

$ echo $HOME
/mahogany/homes/plank
$ echo $USER
plank
$

As stated above, environment variables are expanded in double quotes, but are not in single quotes. You can use quotes to build strings out of environment variables.

$ echo "$HOME"
/mahogany/homes/plank
$ echo '$HOME'
$HOME
$ echo $HOME$USER
/mahogany/homes/plankplank
$ echo $HOME $USER
/mahogany/homes/plank plank
$ echo "$HOME  "bigjim$USER
/mahogany/homes/plank  bigjimplank
$ echo "$USER's home directory is $HOME"
plank's home directory is /mahogany/homes/plank
$

When you're running a shell script, you can get at the command line arguments using $1, $2, $3, up to $9. For example, printarg1 prints out the first command line argument:

UNIX> printarg1 1
1
UNIX> printarg1 Jim Plank
Jim
UNIX> printarg1 "Jim Plank"
Jim Plank
UNIX>

When programming with the Bourne shell, you often have to be careful to use double quotes whenever you may get a space in a string. For example, look at doubleprintarg1. Make sure you understand why the output of doubleprintarg1 below is as it is:

UNIX> doubleprintarg1 "Jim Plank"
Jim
Jim Plank
UNIX>
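
The two scripts are not reproduced here, so the following is only a guess at what they contain, consistent with the output above. I have written them as shell functions (see ``Function definition'' under Other stuff) so that both fit in one listing; inside a function, $1 is the function's first argument, just as it is the script's first argument in a script file.

```shell
#!/bin/sh
# Hypothetical stand-in for the printarg1 script: print the first argument.
printarg1() {
  echo "$1"
}

# Hypothetical stand-in for doubleprintarg1: call printarg1 twice,
# once without quotes and once with.
doubleprintarg1() {
  printarg1 $1      # unquoted: "Jim Plank" is split into two arguments, Jim and Plank
  printarg1 "$1"    # quoted: "Jim Plank" stays a single argument
}

doubleprintarg1 "Jim Plank"
```

The first call hands printarg1 two arguments, so its $1 is just Jim; the second hands it one argument containing a space.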

To set an environment variable, you do ``var=string''. Note that there must be no space on either side of the equals sign. The only way to use spaces in the string is to enclose it in quotes. (Strictly speaking, this sets a variable in the shell itself; programs that the shell runs see it only after you do ``export var''.) Examples:

$ a=Jim
$ b=Plank
$ echo $a $b
Jim Plank
$ c="Jim Plank"
$ echo $c
Jim Plank
$ d=$a $b
Plank: not found
$ d="$a $b"
$ echo $d
Jim Plank
$ d=$d$d
$ echo $d
Jim PlankJim Plank
$

Special environment variables: $#, $$, $!, $*

There are a few special environment variables:

$# is the number of command line arguments.
$$ is the process id of the shell script.
$* is a string containing all of the command line arguments.
$! is the process id of the last process that the shell script has executed in the background.

Look at specialvar, and make sure you understand the output when executed with the following:

UNIX> specialvar
Number of arguments: 0
Process id: 4699
Arguments: 
Forked process pid: 4700
UNIX> specialvar a1 a2 a3
Number of arguments: 3
Process id: 4701
Arguments: a1 a2 a3
Forked process pid: 4702
UNIX> specialvar GIVE     HIM    SIX
Number of arguments: 3
Process id: 4705
Arguments: GIVE HIM SIX
Forked process pid: 4706
UNIX>
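
The specialvar script is not reproduced here. A sketch that would produce output of the same shape is below, written as a shell function so that it can be called several times in one listing. The background command (true) is a placeholder of mine; I do not know what the real script runs in the background.

```shell
#!/bin/sh
# Hypothetical reconstruction of specialvar.
specialvar() {
  echo "Number of arguments: $#"
  echo "Process id: $$"
  echo "Arguments: $*"
  true &                          # placeholder background command
  echo "Forked process pid: $!"   # $! is the pid of that background process
}

specialvar GIVE     HIM    SIX
```

Note that the extra spaces between the arguments never reach the function: the shell splits the command line into three arguments before the function runs, and $* joins them with single spaces.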

If statements

The syntax of if statements is:

if bool
  then
     statements
fi

Usually, you use a semicolon and put the then on the same line:

if bool ; then
     statements
fi

You can have an else clause, and any number of elif clauses. Now, what is the boolean statement? Unix processes return a number to their caller (for those who have taken CS360, this is the exit value returned by the wait() system call). The boolean statement is a Unix command, and if it returns zero, then the then part of the clause is executed. Otherwise, the next elif or else clause is executed.

For example, cat returns 0 if it runs successfully, and a nonzero value (typically 1) if it encounters an error. Look at ifcat:

UNIX> cat ifcat
#!/bin/sh

if cat $1 > /dev/null 2>&1 ; then 
  echo "cat $1 worked just fine"
else
  echo "cat $1 returned with an error"
fi
UNIX>

This executes cat on the argument, and uses the exit value of cat to report whether it was successful. Try it out:

UNIX> ifcat f1
cat f1 worked just fine
UNIX> ifcat /usr/dict/words             
cat /usr/dict/words worked just fine
UNIX> ifcat no-such-file
cat no-such-file returned with an error
UNIX>
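
To show an elif clause as well, here is a sketch that extends ifcat with one more test. The name ifcat2 and the use of ``ls -d'' to check that a name exists are my own choices, not part of the lecture's files, and the body is wrapped in a shell function so that it can be called more than once.

```shell
#!/bin/sh
# Hypothetical variant of ifcat with an elif clause.
ifcat2() {
  if cat "$1" > /dev/null 2>&1 ; then
    # cat exited with 0
    echo "cat $1 worked just fine"
  elif ls -d "$1" > /dev/null 2>&1 ; then
    # cat failed, but ls -d exited with 0, so the name exists
    echo "$1 exists, but cat returned with an error"
  else
    # both commands exited with a nonzero value
    echo "$1 does not exist"
  fi
}

ifcat2 /
ifcat2 /no/such/file
```

Each condition is just a command; the shell only looks at its exit value, which is why the output of cat and ls is thrown away.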

The Test Program

There is a program called test whose purpose is to evaluate boolean expressions. Do man test to learn the complete syntax. I'll go over a few of its operators:

= and !=: string equality and inequality
-eq: numerical equality
-gt, -ge, -lt and -le: numerical comparison
-f and -d: file and directory existence
-a, -o and !: boolean and, or and not.

As an example, whatsmyname is a simple shell script that has you guess my name on the command line:

UNIX> whatsmyname Jim
Right!
UNIX> whatsmyname James
Right, although I prefer to be called Jim
UNIX> whatsmyname Peyton
Nope
UNIX> whatsmyname Frank
"Frank Plank" -- Are you kidding?!?!?!?!
UNIX>

To improve readability, the Bourne shell lets you enclose your arguments to test in square brackets. Then if statements look much better. wmn2 is pretty much just like whatsmyname, except that it uses the square brackets, and handles some conditions better than whatsmyname:

UNIX> whatsmyname
whatsmyname: test: argument expected
UNIX> wmn2
usage: wmn2 name
UNIX> wmn2 Jim
Right!
UNIX> whatsmyname Jim Plank
Right!
UNIX> wmn2 Jim Plank
usage: wmn2 name
UNIX> whatsmyname "Jim Plank"
whatsmyname: test: unknown operator Plank
UNIX> wmn2 "Jim Plank"
Nope
UNIX>
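
The errors from whatsmyname are what you would expect if it passes $1 to test without quotes: with no argument the left-hand side of = disappears, and with "Jim Plank" test sees an extra word. The following sketch of wmn2 avoids both by checking $# first and quoting "$1". It is a reconstruction from the output above, not the original file; sending the usage message to standard error is an assumption, and it is written as a function, so it uses return where a script would use exit.

```shell
#!/bin/sh
# Hypothetical reconstruction of wmn2.
wmn2() {
  if [ $# -ne 1 ] ; then
    echo "usage: wmn2 name" >&2
    return 1                      # a script file would say: exit 1
  fi

  if [ "$1" = Jim ] ; then
    echo 'Right!'
  elif [ "$1" = James ] ; then
    echo 'Right, although I prefer to be called Jim'
  elif [ "$1" = Frank ] ; then
    echo '"Frank Plank" -- Are you kidding?!?!?!?!'
  else
    echo 'Nope'
  fi
}

wmn2 Jim
wmn2 "Jim Plank"
```

Remember that [ is a command like any other, so the spaces after [ and before ] are required.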

While

While's syntax is similar to if's:

while bool ; do
     statements
done

Shift

Shift is a simple command that shifts the command line arguments by one. The $# and $* variables are changed as well. For example, testforone uses shift in a while loop to test if any of its command line arguments equal one. Note that in this implementation of test, the numerical value of a string is its atoi() value, which is an integer, not a floating point number. (Other implementations of test may instead report an error for operands such as 1.0 or ``1 Jim''.)

UNIX> testforone 
UNIX> testforone 3 2 1 1.0 1.5 "1 Jim" "Jim Plank"
No:  3 does not equal 1
No:  2 does not equal 1
Yes: 1 equals 1
Yes: 1.0 equals 1
Yes: 1.5 equals 1
Yes: 1 Jim equals 1
No:  Jim Plank does not equal 1
UNIX>
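
A sketch of how testforone might be written with while and shift is below. It is a reconstruction from the output above, wrapped in a function so that it can be called from the same listing. On a shell whose test rejects non-integer operands, arguments such as 1.0 will produce an error message and fall into the else branch, so I only call it with integers here.

```shell
#!/bin/sh
# Hypothetical reconstruction of testforone.
testforone() {
  while [ $# -gt 0 ] ; do           # loop until no arguments are left
    if [ "$1" -eq 1 ] ; then        # numerical comparison of the first argument
      echo "Yes: $1 equals 1"
    else
      echo "No:  $1 does not equal 1"
    fi
    shift                           # drop $1; the old $2 becomes $1, and $# shrinks by one
  done
}

testforone 3 2 1
```

With no arguments, $# is 0 from the start, the loop body never runs, and nothing is printed, which matches the first run above.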

Set

The set command lets you set the command line arguments to something else. This can be very convenient. As an example, look at setexample. It sets the command line arguments and then prints them out one by one with a while/shift loop. Note the use of quotes in "Rocky Top".

UNIX> setexample
Once I Had a Girl on Rocky Top
Once
I
Had
a
Girl
on
Rocky Top
UNIX> setexample XXX OOO
Once I Had a Girl on Rocky Top
Once
I
Had
a
Girl
on
Rocky Top
UNIX>
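
The output above is consistent with a script along the following lines. This is a reconstruction, not the original file, and it is wrapped in a function so that it can be called with arguments from the same listing; inside a function, set replaces the function's arguments in the same way that it replaces a script's.

```shell
#!/bin/sh
# Hypothetical reconstruction of setexample.
setexample() {
  # Throw away whatever arguments were passed and install seven new ones.
  # The quotes make "Rocky Top" a single argument.
  set Once I Had a Girl on "Rocky Top"

  echo $*                      # all of the arguments on one line

  while [ $# -gt 0 ] ; do      # then one argument per line
    echo "$1"
    shift
  done
}

setexample XXX OOO
```

XXX and OOO never appear in the output because set has already replaced them by the time anything is printed.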

For

You can do for loops in one of two ways:

for var do
  statements
done

for var in strings ; do
  statements
done

The former way loops through all command line arguments, each time setting the var to be the argument. The latter way loops through each string, each time setting the var to be the string.

For example, tfo2 uses the first kind of for loop to implement a program equivalent to the testforone program:

UNIX> tfo2
UNIX> tfo2 3 2 1 1.0 1.5 "1 Jim" "Jim Plank"
No:  3 does not equal 1
No:  2 does not equal 1
Yes: 1 equals 1
Yes: 1.0 equals 1
Yes: 1.5 equals 1
Yes: 1 Jim equals 1
No:  Jim Plank does not equal 1
UNIX>

Similarly, testfor is a very simple example of the second kind of for loop:

UNIX> testfor
Once
I
had
a
girl
on
Rocky Top
UNIX>
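
Sketches of both kinds of loop are below. They are reconstructions from the output above rather than the original files, written as functions so that both fit in one listing; the variable names arg and word are mine.

```shell
#!/bin/sh
# Hypothetical reconstruction of tfo2: the first kind of for loop.
# With no "in" list, arg takes the value of each command line argument in turn.
tfo2() {
  for arg do
    if [ "$arg" -eq 1 ] ; then
      echo "Yes: $arg equals 1"
    else
      echo "No:  $arg does not equal 1"
    fi
  done
}

# Hypothetical reconstruction of testfor: the second kind of for loop.
# word takes the value of each string in the list; the quotes keep
# "Rocky Top" together as one string.
testfor() {
  for word in Once I had a girl on "Rocky Top" ; do
    echo "$word"
  done
}

tfo2 3 2 1
testfor
```

Unlike the while/shift version, the first kind of loop leaves the command line arguments intact, so they can still be used after the loop.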

Line Continuations

You can end a line with a backslash to continue it to the next line, as in testfor2.
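
The testfor2 script is not reproduced here. As a sketch, assuming it is testfor with the word list split over two lines, it might look like this (again wrapped in a function for convenience):

```shell
#!/bin/sh
# Hypothetical reconstruction of testfor2.
testfor2() {
  # The backslash must be the very last character on the line;
  # the shell then reads the next line as part of the same command.
  for word in Once I had a girl \
              on "Rocky Top" ; do
    echo "$word"
  done
}

testfor2
```

The output is the same as that of testfor, since the shell joins the two lines before it looks at the words.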

Backquotes

Backquotes are very important in the Bourne Shell. They execute the command in the backquotes, and then treat the output like a list of strings. If the output is on multiple lines, it is treated as one big list of strings with no newlines. For example, bq1 takes a file as its argument and prints out the content of the file all on one line:

UNIX> bq1 f1
This is f1
UNIX> bq1 bq1
#!/bin/sh if [ $# -ne 1 -o ! -f "$1" ]; then echo "usage: bq1 filename" >& 2 exit 1 fi b=`cat "$1"` echo $b
UNIX>
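
Since bq1 has just printed itself on one line, here is the same text laid out on separate lines. The line breaks and indentation are my reconstruction, and I have wrapped it in a function (with return standing in for the script's exit) so that it can be tried from one listing. The two-line test file is made up for the example.

```shell
#!/bin/sh
bq1() {
  # Require exactly one argument, and it must be an existing file.
  if [ $# -ne 1 -o ! -f "$1" ]; then
    echo "usage: bq1 filename" >& 2
    return 1                 # the script itself says: exit 1
  fi
  b=`cat "$1"`               # b holds the whole file, newlines included
  echo $b                    # unquoted, so the shell splits b into words
                             # and echo joins them with single spaces
}

# Try it on a two-line scratch file.
tmp=`mktemp`
echo "This is" > "$tmp"
echo "f1" >> "$tmp"
bq1 "$tmp"
```

Writing echo "$b" with quotes instead would keep the newlines and print the file as it is.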

And bq2 takes a file as its argument and prints out each word on its own line:

UNIX> bq2 f1
This
is
f1
UNIX> bq2 input1
Once
I
had
a
girl
on
Rocky
Top
Half
bear
the
other
have
cat
Mean
as
a
snake
but
sweet
as
soda
pop
I
still
think
about
that

You can use parentheses around larger blocks of code to get some nice effects. For example, sortword tweaks bq2 to sort the words in a file. How long would it have taken you to write that in C?

UNIX> sortword input1
Half
I
Mean
Once
Rocky
Top
a
about
as
bear
but
cat
girl
had
have
on
other
pop
snake
soda
still
sweet
that
the
think
UNIX>
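
Neither bq2 nor sortword is reproduced here. A sketch of how they might be written is below, as functions, with a small made-up input file. I have kept the input in lower case because the order in which sort places capital letters (first, in the listing above) depends on the locale settings of your system.

```shell
#!/bin/sh
# Hypothetical reconstruction of bq2: one word per line.
bq2() {
  for word in `cat "$1"` ; do    # backquotes turn the file into a list of words
    echo "$word"
  done
}

# Hypothetical reconstruction of sortword: the same loop, bundled up in
# parentheses so that the output of the whole loop can be piped into sort.
sortword() {
  ( for word in `cat "$1"` ; do
      echo "$word"
    done ) | sort
}

# A two-line scratch file to try them on.
words=`mktemp`
echo "but sweet as" > "$words"
echo "soda pop" >> "$words"

bq2 "$words"
sortword "$words"
```

The parentheses are what make the second function work as a unit: the pipe applies to everything the loop prints, not to a single echo.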

bc

bc is a simple infix calculator that often gets used to update induction variables (loop counters) in sh. For example, inc prints out the command line argument plus one using bc:

UNIX> inc 4
5
UNIX>

Note that using bc in loops is pretty slow (for example, the square program that you'll write as part of your lab is much, much slower than a C version would be). That's because each time you call bc and test you're forking off a new process. This is expensive compared to doing everything in a C program. That's why it's best to think of shell scripts as something that you write when efficiency of writing the program is more important than efficiency of the program itself.

expr is another program that lets you do math in shell scripts. It's nicer than bc because it lets you specify the arguments on the command line, and it performs integer arithmetic, logical comparisons, and string manipulation. However, it does not do floating point arithmetic. Read the man page for more information. For example, einc is just like inc except it uses expr instead of bc. You'll notice that you cannot increment decimal numbers with einc, but you can with inc.
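
Neither inc nor einc is reproduced here. The sketch below shows one way each might be written, followed by expr used as a loop counter. Feeding bc its expression through a pipe is my assumption about how inc works, and the functions stand in for the script files.

```shell
#!/bin/sh
# Hypothetical reconstruction of inc: bc reads the expression from standard input.
inc() {
  echo "$1 + 1" | bc
}

# Hypothetical reconstruction of einc: expr takes the expression as arguments.
# The spaces around + are required, since each piece must be its own argument.
einc() {
  expr "$1" + 1
}

einc 4

# expr as an induction variable: count from 1 to 3.
i=1
while [ $i -le 3 ] ; do
  echo "i is $i"
  i=`expr $i + 1`       # backquotes capture the output of expr
done
```

Every pass through the loop starts one process for the test and another for expr, which is the cost described above.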

Other stuff

You can read about other stuff in the man page for sh. Other things that you may want to know about are:

The << redirection operator (here documents)
The wait command
The read command
The -c command line option
The case statement
Function definition
$? and $-