Scripts and Utilities -- Sh lecture


  • Jim Plank
  • Directory: /home/cs494/notes/Sh
  • This file: http://www.cs.utk.edu/~plank/plank/classes/cs494/494/notes/Sh/lecture.html
    This lecture will cover basic mechanics of using the Bourne shell (/bin/sh). Although most people use the csh as an interactive interpreter to execute programs, I find the Bourne shell to be much simpler for writing simple programs.

    The man page for the Bourne shell (``man sh'') is excellent. After this lecture, you should be able to read it without problems, in order to learn things not covered in this lecture.


    #!/bin/sh, protection, simple commands

    To write a shell script, the first line of your program should be ``#!/bin/sh''. Moreover, the protection mode of your program should have the executable bit set. After the first line, you may execute programs pretty much just like in the csh. For example, the lshome program prints your home directory, and then lists its contents. Try it out:
    UNIX> ls -l lshome
    -rwxr-xr-x  1 plank          21 Jun  2 10:24 lshome
    UNIX> cat lshome
    #!/bin/sh
    
    cd
    pwd
    ls
    UNIX> lshome
    /mahogany/homes/plank
    BLACS                   driverfile              picks
    Jumpstart               fball                   pics
    LU.dat                  flight                  process_trace
    ...
    UNIX>
    
    You can also run the shell script by typing ``sh lshome'' (and then you don't need the ``#!/bin/sh'' line). Doing ``sh -x script'' can help you debug a shell script.
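
    As a minimal sketch of the two ways of running a script, the following builds a throwaway script and runs it both ways. The script name hello and the use of mktemp -d are assumptions made for illustration; they are not part of the lecture directory.

```shell
# Assumption: mktemp -d is available (it is on most current systems).
dir=`mktemp -d`

# Write a two-line script; the name "hello" is made up for this example.
cat > "$dir/hello" << 'EOF'
#!/bin/sh
echo hello from a script
EOF

# No executable bit yet, so name the interpreter explicitly.
out1=`sh "$dir/hello"`

# Set the executable bit; now the #! line says which interpreter to use.
chmod +x "$dir/hello"
out2=`"$dir/hello"`

echo "$out1"
echo "$out2"
rm -r "$dir"
```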

    Indirection, pipes

    Indirection and pipes should be second nature to you by now. Suppose the file f1 contains the bytes ``This is f1'', and suppose that f3 does not exist. Then what happens when you do the following?
    $ cat < f1
    $ cat f1 f3 > f2
    $ cat f1 f3 2> f2
    $ cat f1 f3 2>&1 > f2
    $ cat f1 f3 > f2 2>&1 
    $ cat f1 f3 >&2 2> f2
    $ cat f1 f3 2> f2 >&2
    $ cat f1 f3 2>&1 >f2 | cat > f5
    
    Make sure you understand the output of each of these (when you test it, make sure you're running sh and not csh).
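
    If the ordering cases are unclear, the sketch below may help. It relies on the rule that redirections are processed left to right, and compares two of the commands above in a scratch directory. The directory comes from mktemp -d, which is an assumption about your system, and the ``|| true'' is only there because cat exits with an error when f3 is missing.

```shell
dir=`mktemp -d`
echo "This is f1" > "$dir/f1"      # f3 is deliberately never created

# stdout is sent to f2 first, then stderr is pointed at the same place:
# both the file contents and cat's complaint end up in f2.
cat "$dir/f1" "$dir/f3" > "$dir/f2" 2>&1 || true
both=`wc -l < "$dir/f2" | tr -d ' '`

# stderr is pointed at the current stdout (captured by the backquotes),
# and only afterwards is stdout moved to f2: f2 holds just the contents of f1.
err=`cat "$dir/f1" "$dir/f3" 2>&1 > "$dir/f2"` || true
only=`cat "$dir/f2"`

echo "lines in f2 after the first command: $both"
echo "f2 after the second command: $only"
echo "error text captured from the second command: $err"
rm -r "$dir"
```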

    Pipes pipe standard output of one command into standard input of another. Again, I assume that this is something you already know. For example, a simple way of printing the 5th line of the file f is to do the following:

    UNIX> head -5 f | tail -1
    
    (Yes, there are better ways of doing that).
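
    For comparison, here is a sketch of two single-process alternatives, using sed and awk. The sample file is generated on the spot and is only an illustration.

```shell
# Build a ten-line sample file: "line 1" through "line 10".
sample=`mktemp`
for i in 1 2 3 4 5 6 7 8 9 10 ; do echo "line $i" ; done > "$sample"

a=`head -n 5 "$sample" | tail -n 1`   # the two-process pipeline, modern spelling
b=`sed -n '5p' "$sample"`             # sed: suppress default output, print line 5
c=`awk 'NR == 5' "$sample"`           # awk: print the record numbered 5

echo "$a" ; echo "$b" ; echo "$c"
rm "$sample"
```

    All three print the same line. The sed and awk versions still read the whole file; that only matters for very large inputs.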

    Like the csh, the Bourne Shell waits for commands to finish before continuing on. To execute a command and not wait until it finishes, append an ampersand to the command. E.g:

    $ xterm &
    
    This executes the xterm program, but lets you continue without waiting for the xterm program to finish.

    Combining commands: semi-colon and ()

    Besides pipes, there are two other ways that the shell lets you combine commands. First is the semi-colon -- this allows you to combine multiple commands on one line. Thus lshome2 is really the same as lshome, except all three commands are on one line.

    In and of itself, this is not very exciting. However, it becomes more powerful when you combine it with parentheses. With parentheses, you combine multiple commands and execute them in a sub-shell. For example, suppose you would like to time how long it takes to ping a machine at Princeton. One way to do this is to use the time command. However, a more primitive way is to simply call date before and after the command. If you combine them all on one line with semi-colons, then you get a fairly accurate timing:

    UNIX> sh
    $ date ; /usr/etc/ping www.cs.princeton.edu ; date
    Fri Jun  6 08:56:44 EDT 1997
    engram.CS.Princeton.EDU is alive
    Fri Jun  6 08:56:46 EDT 1997
    $
    
    Now, suppose you'd like the output of the three commands to go to a file. Of course, one way is to redirect each command to the output file:
    $ date > out ; /usr/etc/ping www.cs.princeton.edu >> out ; date >> out
    
    Another way is to bundle up the three commands inside parentheses, and redirect the output of the composite command to the file:
    $ ( date ; /usr/etc/ping www.cs.princeton.edu ; date ) > out
    
    You can also use parentheses to put a composite command into the background:
    $ ( date ; /usr/etc/ping www.cs.princeton.edu ; date ) > out &
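
    One consequence of running in a sub-shell is that changes made inside the parentheses do not leak out. A small sketch (the variable name is made up for illustration):

```shell
start=`pwd`
name=outer

# The cd and the assignment happen in a child shell and vanish when it exits.
( cd / ; name=inner ; echo "in the sub-shell: $name, `pwd`" )

echo "afterwards:       $name, `pwd`"
```

    The second echo still reports the original directory and the value outer.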
    

    echo

    echo is a shell command that simply prints its arguments on standard output. For example:
    $ echo Jim
    Jim
    $ echo Jim    Plank
    Jim Plank
    $ echo
    
    $
    
    Note that echo separates multiple arguments by a single space.

    Filename expansion: * and ?

    You can type any string into the shell, but certain characters are special (like '>', '<', '|', etc). You should already know about the star -- this expands as a wildcard to match any filename. E.g. ``echo *'' echoes all filenames in the current directory (except those that start with '.'). ``echo lshome*'' echoes both ``lshome'' files.
    $ echo *
    bq1 bq2 count doubleprintarg1 f1 f2 ifcat input1 lecture.html logo.gif lshome lshome2 outline printarg1 setexample simple sortword specialvar testfor testfor2 testforone tfo2 whatsmyname wmn2
    $ echo lshome*
    lshome lshome2
    $ 
    
    The question mark is a wild card that will match any one character. Thus, ``echo ??'' will print out all filenames in the current directory that are composed of two characters, and ``echo lshome?'' will print out ``lshome2'', since the question mark must match one character:
    $ echo ??
    f1 f2
    $ echo lshome?
    lshome2
    $
    

    Single and double quotes

    Quotes allow you to:
    1. Use special characters in strings
    2. Bundle up multiple words into one argument
    Single quotes allow you to use almost any character in a string. For example, you can use *, ?, (, ), >, <, |, ", $, & and space in single quotes without having the shell do anything special to them:
    $ echo Do you have $100 (so I can borrow it?)
    syntax error: `(' unexpected
    $ echo 'Do you have $100 (so I can borrow it?)'
    Do you have $100 (so I can borrow it?)
    $ echo 'Hey   'Jim'!'
    Hey   Jim!
    $ echo 'Hey   ' Jim'!'
    Hey    Jim!
    $ echo 'Hey   '               Jim'!'
    Hey    Jim!
    
    Double quotes are less powerful. They work like single quotes except that $ gets expanded (see below), and backquotes and backslashes also keep their special meaning. You can build strings simply by concatenating them as above. To get a single quote in a string, you need to use double quotes, and to get a double quote in a string, you need to use single quotes. For example, to get the string "'", you do the following:
    $ echo '"'"'"'"'
    "'"
    $ echo 'She said "This is Jim'"'s course!!"'"'
    She said "This is Jim's course!!"
    

    Differences between csh and the Bourne shell

    There are many differences between the Bourne shell and csh. Some major ones are that the Bourne shell has no command aliasing, no history, no '~' expansion, no 'setenv', and slightly different syntax for redirection. For example, csh has no ``2>&1''; instead, if you want standard output and standard error to go to the same place in csh, you use ``>&''.

    There are other major differences in expression syntax, etc. Since I don't ever program with csh, I don't know the differences. However, in general, you can't run a Bourne shell script with csh.

    Fortunately, the use of single and double quotes is nearly identical in both the Bourne shell and csh.


    Environment variables

    Environment variables are a simple associative matching between names and strings. The Bourne shell will inherit the environment variables in its calling environment (i.e. any setenv's that you have done in the csh), and it lets you set your own.

    Environment variables are expanded by using the dollar sign. For example, your home directory and user name are always in the environment variables HOME and USER respectively:

    $ echo $HOME
    /mahogany/homes/plank
    $ echo $USER
    plank
    $
    
    As stated above, environment variables are expanded in double quotes, but are not in single quotes. You can use quotes to build strings out of environment variables.
    $ echo "$HOME"
    /mahogany/homes/plank
    $ echo '$HOME'
    $HOME
    $ echo $HOME$USER
    /mahogany/homes/plankplank
    $ echo $HOME $USER
    /mahogany/homes/plank plank
    $ echo "$HOME  "bigjim$USER
    /mahogany/homes/plank  bigjimplank
    $ echo "$USER's home directory is $HOME"
    plank's home directory is /mahogany/homes/plank
    $
    
    When you're running a shell script, you can get at the command line arguments using $1, $2, $3, up to $9. For example, printarg1 prints out the first command line argument:
    UNIX> printarg1 1
    1
    UNIX> printarg1 Jim Plank
    Jim
    UNIX> printarg1 "Jim Plank"
    Jim Plank
    UNIX>
    
    When programming with the Bourne shell, you often have to be careful to use double quotes whenever you may get a space in a string. For example, look at doubleprintarg1. Make sure you understand why the output of doubleprintarg1 below is as it is:
    UNIX> doubleprintarg1 "Jim Plank"
    Jim
    Jim Plank
    UNIX>
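
    The source of doubleprintarg1 is not shown here, so the following is only a guess at its shape. It uses two shell functions with made-up names to stand in for the two scripts: the first call passes $1 unquoted, the second passes it in double quotes.

```shell
# Hypothetical stand-in for printarg1: print only the first argument.
first_arg() {
  echo "$1"
}

# Hypothetical stand-in for doubleprintarg1.
double_first_arg() {
  first_arg $1      # unquoted: "Jim Plank" is split into two words, Jim and Plank
  first_arg "$1"    # quoted: passed along as the single argument "Jim Plank"
}

result=`double_first_arg "Jim Plank"`
echo "$result"
```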
    
    To set an environment variable, you do ``var=string''. Note that there must be no space on either side of the equals sign. In setting the environment variable, the only way that you can use spaces in the string is to enclose it in quotes. A variable set this way is passed on to programs that the shell runs only after you export it (``export var''). Examples:
    $ a=Jim
    $ b=Plank
    $ echo $a $b
    Jim Plank
    $ c="Jim Plank"
    $ echo $c
    Jim Plank
    $ d=$a $b
    Plank: not found
    $ d="$a $b"
    $ echo $d
    Jim Plank
    $ d=$d$d
    $ echo $d
    Jim PlankJim Plank
    $
    

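    Special environment variables: $#, $$, $!, $*

    There are a few special environment variables: $# is the number of command line arguments, $$ is the process id of the shell, $* is the list of all the arguments, and $! is the process id of the most recent command put into the background. Look at specialvar, and make sure you understand the output when executed with the following: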
    UNIX> specialvar
    Number of arguments: 0
    Process id: 4699
    Arguments: 
    Forked process pid: 4700
    UNIX> specialvar a1 a2 a3
    Number of arguments: 3
    Process id: 4701
    Arguments: a1 a2 a3
    Forked process pid: 4702
    UNIX> specialvar GIVE     HIM    SIX
    Number of arguments: 3
    Process id: 4705
    Arguments: GIVE HIM SIX
    Forked process pid: 4706
    UNIX>
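
    The source of specialvar is not reproduced in these notes. A plausible reconstruction, written as a shell function with a made-up name so that it can be pasted into any sh session, might look like this:

```shell
# Hypothetical reconstruction; inside a function, $# and $* refer to the
# function's own arguments, while $$ is still the shell's process id.
special_vars() {
  echo "Number of arguments: $#"
  echo "Process id: $$"
  echo "Arguments: $*"
  true &                            # start any command in the background
  echo "Forked process pid: $!"     # $! is the pid of that background command
  wait
}

report=`special_vars GIVE     HIM    SIX`
echo "$report"
```

    The process ids will differ from run to run, as they do in the transcript above.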
    

    If statements

    The syntax of if statements is:
    if bool
      then
         statements
    fi
    
    Usually, you use a semicolon and put the then on the same line:
    if bool ; then
         statements
    fi
    
    You can have an else clause, and any number of elif clauses. Now, what is the boolean statement? Unix processes return a number to their caller (for those who have taken CS360, this is the exit value returned by the wait() system call). The boolean statement is a Unix command, and if it returns zero, then the then part of the clause is executed. Otherwise, the next elif or else clause is executed.

    For example, cat returns 0 if it runs successfully, and 1 if it encounters an error. Look at ifcat:

    UNIX> cat ifcat
    #!/bin/sh
    
    if cat $1 > /dev/null 2>&1 ; then 
      echo "cat $1 worked just fine"
    else
      echo "cat $1 returned with an error"
    fi
    UNIX>
    
    This executes cat on the argument, and uses the exit value of cat to report whether it was successful. Try it out:
    UNIX> ifcat f1
    cat f1 worked just fine
    UNIX> ifcat /usr/dict/words             
    cat /usr/dict/words worked just fine
    UNIX> ifcat no-such-file
    cat no-such-file returned with an error
    UNIX> 
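
    To show an elif clause as well, here is a sketch in the same spirit as ifcat. The function name describe and the search word root are made up for illustration. Each condition is an ordinary command whose exit value picks the branch.

```shell
# Hypothetical helper: report on a file using only exit values.
describe() {
  if grep root "$1" > /dev/null 2>&1 ; then
    echo "$1 mentions root"
  elif cat "$1" > /dev/null 2>&1 ; then
    echo "$1 is readable but does not mention root"
  else
    echo "$1 cannot be read"
  fi
}

tmp=`mktemp`
echo "root beer" > "$tmp"  ; r1=`describe "$tmp"`
echo "ginger ale" > "$tmp" ; r2=`describe "$tmp"`
rm "$tmp"                  ; r3=`describe "$tmp"`

echo "$r1" ; echo "$r2" ; echo "$r3"
```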
    

    The Test Program

    There is a program called test whose purpose is to evaluate boolean functions. Do man test to learn the complete syntax. I'll go over a few of its features here. As an example, whatsmyname is a simple shell script that has you guess my name on the command line:
    UNIX> whatsmyname Jim
    Right!
    UNIX> whatsmyname James
    Right, although I prefer to be called Jim
    UNIX> whatsmyname Peyton
    Nope
    UNIX> whatsmyname Frank
    "Frank Plank" -- Are you kidding?!?!?!?!
    UNIX>
    
    To improve readability, the Bourne shell lets you enclose your arguments to test in square brackets. Then if statements look much better. wmn2 is pretty much just like whatsmyname, except that it uses the square brackets, and handles some conditions better than whatsmyname:
    UNIX> whatsmyname
    whatsmyname: test: argument expected
    UNIX> wmn2
    usage: wmn2 name
    UNIX> wmn2 Jim
    Right!
    UNIX> whatsmyname Jim Plank
    Right!
    UNIX> wmn2 Jim Plank
    usage: wmn2 name
    UNIX> whatsmyname "Jim Plank"
    whatsmyname: test: unknown operator Plank
    UNIX> wmn2 "Jim Plank"
    Nope
    UNIX> 
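
    The source of wmn2 is not included above. The sketch below is a guess at how such a script could be written with square brackets, wrapped in a function with a made-up name. Quoting "$1" is what lets an argument containing a space reach test as a single string.

```shell
# Hypothetical reconstruction of wmn2 as a function.
guess_name() {
  if [ $# -ne 1 ] ; then
    echo "usage: guess_name name" >&2
    return 1
  fi
  if [ "$1" = "Jim" ] ; then
    echo "Right!"
  elif [ "$1" = "James" ] ; then
    echo "Right, although I prefer to be called Jim"
  elif [ "$1" = "Frank" ] ; then
    echo '"Frank Plank" -- Are you kidding?!?!?!?!'
  else
    echo "Nope"
  fi
}

r1=`guess_name Jim`
r2=`guess_name James`
r3=`guess_name "Jim Plank"`
echo "$r1" ; echo "$r2" ; echo "$r3"

# With no argument (or too many) the function complains and returns nonzero.
guess_name > /dev/null 2>&1 || echo "wrong number of arguments: usage error"
```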
    

    While

    While's syntax is similar to if's:
    while bool ; do
         statements
    done
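
    A minimal example, assuming nothing beyond test and string concatenation: the loop keeps appending a star until the string is three stars long.

```shell
stars=""
while [ "$stars" != "***" ] ; do
  stars="$stars*"      # the * is inside double quotes, so it is not expanded
  echo "$stars"
done
```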
    

    Shift

    Shift is a simple command that shifts the command line arguments by one. The $# and $* variables are changed as well. For example, testforone uses shift in a while loop to test if any of its command line arguments equal one. Note that the numerical value of a string is its atoi() value. This is an integer, not a floating point number.
    UNIX> testforone 
    UNIX> testforone 3 2 1 1.0 1.5 "1 Jim" "Jim Plank"
    No:  3 does not equal 1
    No:  2 does not equal 1
    Yes: 1 equals 1
    Yes: 1.0 equals 1
    Yes: 1.5 equals 1
    Yes: 1 Jim equals 1
    No:  Jim Plank does not equal 1
    UNIX>
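
    The while/shift idiom itself can be sketched as follows; the function name is made up. The sketch deliberately avoids the numeric comparison, because many current shells reject a string such as 1.5 as an operand of -eq instead of applying atoi() to it.

```shell
# Hypothetical helper: consume the arguments one at a time with shift.
walk_args() {
  while [ $# -gt 0 ] ; do
    echo "$# left, first is: $1"
    shift                # drop $1; $2 becomes $1, and $# goes down by one
  done
}

walk=`walk_args 3 2 "Jim Plank"`
echo "$walk"
```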
    

    Set

    The set command lets you set the command line arguments to something else. This can be very convenient. As an example, look at setexample. It sets the command line arguments and then prints them out one by one with a while/shift loop. Note the use of quotes in "Rocky Top".
    UNIX> setexample
    Once I Had a Girl on Rocky Top
    Once
    I
    Had
    a
    Girl
    on
    Rocky Top
    UNIX> setexample XXX OOO
    Once I Had a Girl on Rocky Top
    Once
    I
    Had
    a
    Girl
    on
    Rocky Top
    UNIX> 
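
    The source of setexample is not shown, but the behavior above can be approximated with the sketch below. The first set merely simulates arguments given on the command line.

```shell
set XXX OOO                            # pretend these came from the command line
set Once I Had a Girl on "Rocky Top"   # replace them wholesale

count=$#
seventh=$7
echo "$*"

# Print the new arguments one by one.
while [ $# -gt 0 ] ; do
  echo "$1"
  shift
done
```

    If the first new argument might start with a dash, write ``set -- ...'' so that it is not taken as an option to set.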
    

    For

    You can do for loops in one of two ways:
    for var do
      statements
    done
    
    or
    for var in strings ; do
      statements
    done
    
    The former way loops through all command line arguments, each time setting the var to be the argument. The latter way loops through each string, each time setting the var to be the string.

    For example, tfo2 uses the first kind of for loop to implement a program equivalent to the testforone program:

    UNIX> tfo2
    UNIX> tfo2 3 2 1 1.0 1.5 "1 Jim" "Jim Plank"
    No:  3 does not equal 1
    No:  2 does not equal 1
    Yes: 1 equals 1
    Yes: 1.0 equals 1
    Yes: 1.5 equals 1
    Yes: 1 Jim equals 1
    No:  Jim Plank does not equal 1
    UNIX> 
    
    Similarly, testfor shows a very simple instance of the second kind of for loop:
    UNIX> testfor
    Once
    I
    had
    a
    girl
    on
    Rocky Top
    UNIX> 
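
    Both forms can be tried side by side with the sketch below. The function and variable names are made up; the sources of tfo2 and testfor are not shown here.

```shell
# First form: no "in", so the loop runs over the (function's) arguments.
show_args() {
  for arg do
    echo "arg: $arg"
  done
}
first=`show_args 3 "Jim Plank"`
echo "$first"

# Second form: loop over an explicit list of strings.
second=""
for word in Once I had a girl on "Rocky Top" ; do
  second="$second<$word>"
done
echo "$second"
```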
    

    Line Continuations

    You can end a line with a backslash to continue it to the next line, as in testfor2.
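
    The source of testfor2 is not shown, so here is a small stand-alone illustration. The backslash must be the very last character on the line.

```shell
words=""
for word in Once I had a girl \
            on "Rocky Top" ; do
  words="$words[$word]"
done
echo "$words"
```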

    Backquotes

    Backquotes are very important in the Bourne Shell. They execute the command in the backquotes, and then treat the output like a list of strings. If the output is on multiple lines, it is treated as one big list of strings with no newlines. For example, bq1 takes a file as its argument and prints out the content of the file all on one line:
    UNIX> bq1 f1
    This is f1
    UNIX> bq1 bq1
    #!/bin/sh if [ $# -ne 1 -o ! -f "$1" ]; then echo "usage: bq1 filename" >& 2 exit 1 fi b=`cat "$1"` echo $b
    UNIX> 
    
    And bq2 takes a file as its argument and prints out each word on its own line:
    UNIX> bq2 f1
    This
    is
    f1
    UNIX> bq2 input1
    Once
    I
    had
    a
    girl
    on
    Rocky
    Top
    Half
    bear
    the
    other
    have
    cat
    Mean
    as
    a
    snake
    but
    sweet
    as
    soda
    pop
    I
    still
    think
    about
    that
    
    You can use parentheses around larger blocks of code to get some nice effects. For example, sortword tweaks bq2 to sort the words in a file. How long would it have taken you to write that in C?
    UNIX> sortword input1
    Half
    I
    Mean
    Once
    Rocky
    Top
    a
    about
    as
    bear
    but
    cat
    girl
    had
    have
    on
    other
    pop
    snake
    soda
    still
    sweet
    that
    the
    think
    UNIX>
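
    The sources of bq2 and sortword are not shown, so the following is a sketch of how they might be written, using functions with made-up names. The sort is run in the C locale on the assumption that this is what produced the ordering above, with capitals before lowercase.

```shell
# Hypothetical stand-in for bq2: the backquoted output is split into words.
words_of() {
  for w in `cat "$1"` ; do
    echo "$w"
  done
}

# Hypothetical stand-in for sortword: the parentheses bundle the whole loop
# so that its combined output can be piped into sort.
sorted_words_of() {
  ( for w in `cat "$1"` ; do echo "$w" ; done ) | LC_ALL=C sort
}

tmp=`mktemp`
echo "snake but sweet" > "$tmp"
echo "as soda" >> "$tmp"
plain=`words_of "$tmp"`
sorted=`sorted_words_of "$tmp"`
rm "$tmp"

echo "$plain"
echo "$sorted"
```

    Because the backquotes are unquoted, a word such as * in the file would also be expanded as a filename pattern.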
    

    bc

    bc is a simple infix calculator that often gets used to update loop counters (induction variables) in sh. For example, inc prints out the command line argument plus one using bc:
    UNIX> inc 4
    5
    UNIX>
    
    Note that using bc in loops is pretty slow (for example, the square program that you'll write as part of your lab is much, much slower than a C version would be). That's because each time you call bc and test you're forking off a new process. This is expensive compared to doing everything in a C program. That's why it's best to think of shell scripts as something that you write when efficiency of writing the program is more important than efficiency of the program itself.

    expr is another program that lets you do math in shell scripts. It's nicer than bc because it lets you specify the arguments on the command line, and it performs math, logical arithmetic, and string manipulation. However, it does not do floating point arithmetic. Read the man page for more information. For example, einc is just like inc except it uses expr instead of bc. You'll notice that you cannot increment decimal numbers with einc, but you can with inc.
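
    The sources of inc and einc are not shown. As a rough sketch of the difference, the two functions below (named after the scripts, but written here as guesses) add one with bc and with expr. The fallback text for a missing bc is an addition for robustness, not something from the lecture.

```shell
# Guess at inc: bc reads the expression from standard input.
inc() {
  echo "$1 + 1" | bc
}

# Guess at einc: expr takes the expression as separate arguments.
einc() {
  expr "$1" + 1
}

e=`einc 4`
b=`inc 4.5 2> /dev/null` || b="(bc is not installed)"
echo "einc 4  gives $e"
echo "inc 4.5 gives $b"

# A loop counter driven by expr; each iteration forks a new process.
i=0
while [ "$i" -lt 3 ] ; do
  i=`expr "$i" + 1`
done
echo "counted to $i"
```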


    Other stuff

    You can read about other stuff in the man page for sh. Other things that you may want to know about are: