For years, that philosophy served me well. I was able to write shell scripts that worked on any flavor of Unix, because they all had the Bourne Shell in /bin/sh. I would use whatever default shell was on my machine (csh, tcsh, ztcsh, bash, etc) for interactivity, and it typically had whatever features I needed. However, when it came to scripts, I could trust that the Bourne Shell was there and would run my shell scripts correctly.
Around 2010, the Bourne shell started to disappear. Instead, /bin/sh was hard-linked to a backward compatible shell, typically Bash. I didn't notice this, because my shell scripts were all Bourne shell compatible, so they would work on any shell that was an extension of the Bourne shell. As life marched on, different system administrators linked /bin/sh to different backward compatible shells, to the point that when you run /bin/sh, you don't really know what shell you're getting. So long as you write nice Bourne shell scripts, you're good (kind of), but what has become of learning to write Bourne shell scripts? That has become increasingly difficult, because it's hard to know when you are writing Bourne shell compatible code.
Why am I prattling on about this? Because the current shell balkanization is a disaster for portability. If you write bash scripts, without knowing whether your script is a Bourne shell script, then you don't really have any assurances that your script will run on a different Bourne shell-compatible shell. And when the GNU Project decides to change bash yet again, or when Apple decides to hard link /bin/sh to some other shell, all your scripts will break...
The script scripts/basic.sh shows some shell basics. I don't bother starting it with "#!/bin/sh", because I run it by handing it to sh explicitly (the ".sh" extension is just a naming convention). Go ahead and read the comments.
# This is a comment
i=Fred                   # When you do variable assignment, you can't have any spaces. Everything is a string.
echo $i                  # You access a variable with the dollar sign.
                         # Echo prints its command line arguments to standard output.
j=Fred$i$i               # You can concatenate strings when there's nothing special about them (like spaces).
echo $j
echo $j i $i Jim Plank   # Everything's a string unless it uses special characters like $.
When we run it, everything is straightforward:
UNIX> sh scripts/basic.sh
Fred
FredFredFred
FredFredFred i Fred Jim Plank
UNIX>
When strings, and mixtures of strings and variables, become more complex, you need to use quotes to help the shell figure out what you mean. You can use single quotes or double quotes to help you delineate strings and variables, and to put things like spaces and semicolons into your strings. Rules of thumb with single and double quotes are:
# These three lines show how you put a space, and quotes into a string:
i="Jim Plank"
j="It's Cold Outside"
k='He said, "Hey!"'
echo $i
echo $j
echo $k
echo ""       # Output a blank line.

# You expand variables inside double quotes, but put dollar signs in single quotes.
echo "$i"
echo '$i'
echo ""

# Mixing single and double quotes in a string is a pain
echo 'But she said, "You aren'"'t being polite!!"'"'

# Sometimes it looks a little less confusing to use variables to clean it up:
dq='"'
yabp="You aren't being polite!!"
echo But she said, $dq$yabp$dq
Here's the output:
UNIX> sh scripts/quotes.sh
Jim Plank
It's Cold Outside
He said, "Hey!"

Jim Plank
$i

But she said, "You aren't being polite!!"
But she said, "You aren't being polite!!"
UNIX>
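Here is a small sketch of why the quotes matter once variables are involved. An unquoted variable is split into words by the shell, so runs of spaces collapse when echo puts the words back together, while a double-quoted variable is passed along as a single argument. The variable names are made up for this illustration, and it uses backquotes to capture the output of each echo.

```shell
# Hypothetical variable with two spaces between the words.
name="Jim  Plank"

unquoted=`echo $name`      # $name is split into two words; echo joins them with one space.
quoted=`echo "$name"`      # Double quotes keep the string as one argument, spacing intact.
literal=`echo '$name'`     # Single quotes suppress variable expansion entirely.

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "literal:  $literal"
```

The first line of output has a single space between the two words, the second keeps both spaces, and the third prints the dollar sign and the variable name untouched.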
I assume you're familiar with pipes, too. What you may not be familiar with are the standard Unix programs that are so powerful when you use pipes. These are:
Here's an example -- suppose I want to reverse the lines of the file in files/input-1.txt. Here we go -- I'll demonstrate command by command:
UNIX> cat files/input-1.txt                                     # Here's the file
Emma Fur
Cameron Transliterate
Kate Clasp
Olivia Digit
Nathan Martinique
UNIX> cat -n files/input-1.txt                                  # Prepend with the line number
     1	Emma Fur
     2	Cameron Transliterate
     3	Kate Clasp
     4	Olivia Digit
     5	Nathan Martinique
UNIX> cat -n files/input-1.txt | sort -nr                       # Sort in reverse order, by number (that's -n)
     5	Nathan Martinique
     4	Olivia Digit
     3	Kate Clasp
     2	Cameron Transliterate
     1	Emma Fur
UNIX> cat -n files/input-1.txt | sort -nr | sed 's/.......//'   # Strip out the initial 7 characters on each line
Nathan Martinique
Olivia Digit
Kate Clasp
Cameron Transliterate
Emma Fur
UNIX>
Inside a shell script, if you end a line with '|' or '\', then it will continue the command on the next line. Otherwise, commands are executed line by line.
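To make the continuation rule concrete, here is a minimal sketch of the same line-reversing pipeline written across several lines of a script. It builds its own throwaway input file (the file names are hypothetical) so that it doesn't depend on files/input-1.txt.

```shell
# Throwaway input file; the name is hypothetical.
tmp=${TMPDIR:-/tmp}/continue-demo.$$
printf 'one\ntwo\nthree\n' > "$tmp"

# A trailing '|' continues the pipeline on the next line.
cat -n "$tmp" |
  sort -nr |
  sed 's/.......//' > "$tmp.rev"

# A trailing '\' continues any command on the next line.
echo one two \
     three > "$tmp.echo"

reversed=`cat "$tmp.rev"`
joined=`cat "$tmp.echo"`
echo "$reversed"
echo "$joined"
rm -f "$tmp" "$tmp.rev" "$tmp.echo"
```

The first echo prints the three lines in reverse order, and the second prints "one two three" on a single line, because the backslash glued the two halves of the echo command together.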
In and of itself, this is not very exciting. However, it becomes more powerful when you combine it with parentheses. With parentheses, you combine multiple commands and execute them in a sub-shell. For example, suppose I have two files:
files/program-1-timings.txt
0.19953
0.24622
0.24946
0.29696
0.31996
0.33349
0.36809
0.65309
0.73067
0.75717
files/program-2-timings.txt
0.03814
0.21831
0.26013
0.29111
0.64541
0.67802
0.70485
0.79115
0.91172
0.96959
Suppose I want to create a new file that has two words per line -- line 1 has line 1's from both files, then line 2 has line 2's from both files, etc. Try on scripts/merge_p12.sh for size:
( sed 's/^/A /' files/program-1-timings.txt | cat -n ;     # put line-number-tab-A in front of each line
  sed 's/^/B /' files/program-2-timings.txt | cat -n ) |   # put line-number-tab-B in front of each line
  sort |                                                   # Sorting will do 1-tab-A, 1-tab-B, 2-tab-A, 2-tab-B, etc.
  awk '{ printf " %s", $3 ; if ($2 == "B") print "" }'     # Print each word, and then a newline after every second word.
We execute the first two lines in a subshell so that both sets of commands go to standard output, and are sorted by the sort command. We then use awk to print the third word on each line, and then a newline on the lines whose second words are "B". Powerful, no?
UNIX> sh scripts/merge_p12.sh
 0.19953 0.03814
 0.24622 0.21831
 0.24946 0.26013
 0.29696 0.29111
 0.31996 0.64541
 0.33349 0.67802
 0.36809 0.70485
 0.65309 0.79115
 0.73067 0.91172
 0.75717 0.96959
UNIX>
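If the role of the parentheses in that script isn't clear, this stripped-down sketch may help. It compares a pipeline fed by a sub-shell with the same commands written without parentheses. The words being sorted are arbitrary.

```shell
# Both echo commands run in one sub-shell, so sort sees the output of both.
grouped=`( echo banana ; echo apple ) | sort`

# Without the parentheses, only the second echo feeds the pipe.
ungrouped=`echo banana ; echo apple | sort`

echo "With parentheses:"
echo "$grouped"
echo "Without parentheses:"
echo "$ungrouped"
```

With the parentheses, sort receives both lines and prints apple before banana. Without them, banana is printed directly and sort only ever sees apple, so the order is unchanged.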
UNIX> ls
files index.html scripts
UNIX> echo *
files index.html scripts
UNIX>
The question mark is a wild card that will match any one character.
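Here's a quick sketch that contrasts the two wild cards. It creates a scratch directory with a few empty files (all of the names are hypothetical) so that the matches are predictable, and it runs each echo inside backquotes so that the cd doesn't affect the rest of the script.

```shell
# Scratch directory with hypothetical file names.
dir=${TMPDIR:-/tmp}/glob-demo.$$
mkdir "$dir"
touch "$dir/a1.txt" "$dir/a2.txt" "$dir/a33.txt" "$dir/b1.txt"

star=`cd "$dir" ; echo a*`           # '*' matches any string: all three a-files.
question=`cd "$dir" ; echo a?.txt`   # '?' matches exactly one character: not a33.txt.
nomatch=`cd "$dir" ; echo z*`        # When nothing matches, sh leaves the pattern as it is.

echo "$star"
echo "$question"
echo "$nomatch"
rm -rf "$dir"
```

The last case is worth remembering: when a pattern matches nothing, sh passes it through unchanged rather than expanding it to an empty list.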
UNIX> echo $PWD
/home/jplank
UNIX> cd bin
UNIX> echo $PWD
/home/jplank/bin
UNIX> echo $HOME
/home/jplank
UNIX> echo $USER
jplank
UNIX>
If you set a variable in a shell script, and you then export it, then it will be an environment variable in any program that you launch from the shell. Here's an example of launching a program that accesses the variable i without exporting it first, and then again after we do export it. The program is in scripts/set_environment.sh:
echo "Setting i to 5, and then launching a second shell that prints $i. It won't print anything."
i=5
echo 'echo $i' | sh

echo "If we export i, then the second shell will print i."
export i
echo 'echo $i' | sh
When we run it, we see that the second launch has i in its environment:
UNIX> sh scripts/set_environment.sh
Setting i to 5, and then launching a second shell that prints . It won't print anything.

If we export i, then the second shell will print i.
5
UNIX>
UNIX> cat files/input-1.txt
Emma Fur
Cameron Transliterate
Kate Clasp
Olivia Digit
Nathan Martinique
UNIX> echo `cat files/input-1.txt`
Emma Fur Cameron Transliterate Kate Clasp Olivia Digit Nathan Martinique
UNIX> c=`cat files/input-1.txt`
UNIX> echo $c
Emma Fur Cameron Transliterate Kate Clasp Olivia Digit Nathan Martinique
UNIX> b=`grep n files/input-1.txt | wc | awk '{ print $1 }'`    # How many lines contain the character n?
UNIX> echo $b
2
UNIX>
# Arithmetic on its own
echo $((1+1))

# Arithmetic with environment variables
echo ""
b=5
c=10
b=$(($b*$c))
echo $b

# Floating point doesn't work. Sorry -- you have to use awk:
echo ""
b=5.5
c=10.9
echo $b $c | awk '{ print $1 - $2 }'
UNIX> sh scripts/arithmetic.sh
2

50

-5.4
UNIX>
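As a slightly larger sketch of handing floating point to awk, the following computes an average of a few made-up timings, and also shows that integer division in the shell truncates. The numbers are invented for the example.

```shell
# Hypothetical timings, one per line, fed to awk on standard input.
avg=`( echo 0.25 ; echo 0.50 ; echo 0.75 ) |
     awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'`

# Integer arithmetic stays in the shell; note that division truncates.
half=$((7/2))

echo "Average: $avg"
echo "7/2 in the shell: $half"
```

Here awk does the summing and the division, and the shell just captures the formatted result in a variable with backquotes.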
The syntax of if is as follows:
if program-whose-exit-code-will-be-evaluated
then
  stuff1
else
  stuff2
fi
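For example, I have a program in src/exitcode.cpp, which simply reads an integer on standard input, and then exits with that integer as its exit code. An exit code of zero counts as true, and anything else counts as false. We'll use it to test if: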
UNIX> if echo 0 | bin/exitcode
> then
>   echo "True"
> else
>   echo "False"
> fi
True
UNIX> if echo 1 | bin/exitcode; then echo "True"; else echo "False"; fi
False
UNIX>
As you can see, in the second command, I used semi-colons judiciously to put it all on one line.
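If you don't have bin/exitcode compiled, the sketch below imitates it with a shell function. The function is a stand-in written for this illustration, not the C++ program from src/exitcode.cpp. The last test shows that any ordinary program can serve as the condition, since grep exits with zero only when it finds a match.

```shell
# Stand-in for bin/exitcode (an assumption for illustration):
# read an integer from standard input and use it as the exit code.
exitcode() {
  read n
  return $n
}

if echo 0 | exitcode ; then r0=True ; else r0=False ; fi
if echo 1 | exitcode ; then r1=True ; else r1=False ; fi

# Any program works as the condition; grep exits 0 only when it finds a match.
if echo "Emma Fur" | grep Fur > /dev/null ; then found=yes ; else found=no ; fi

echo $r0 $r1 $found
```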
When you enclose an expression in square brackets, it executes the test program on it. Read the man page on test -- it's got a lot of convenient features, and can be used on integers, strings, and files/directories:
UNIX> if [ -f index.html ]; then echo "index.html exists"; fi
index.html exists
UNIX> if [ `ls | wc | awk '{ print $1 }'` -gt 4 ]; then echo "More than 4 files"; fi
More than 4 files
UNIX>
You can do simple loops with while -- you end it with done:
UNIX> b=0
UNIX> while [ $b -lt 5 ]; do
> echo b is $b
> b=$(($b+1))
> done
b is 0
b is 1
b is 2
b is 3
b is 4
UNIX>
And you use for to run through space-separated lists of things:
UNIX> b=`ls`
UNIX> echo $b
bin files index.html makefile scripts src
UNIX> for i in $b ; do
> echo $i
> done
bin
files
index.html
makefile
scripts
src
UNIX>
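The sketch below pulls if, test, while and for together in one place. It classifies a few entries in a scratch directory using test's file operators (-d for a directory, -s for a non-empty file, -f for a regular file), and then counts down with an integer comparison. The directory and file names are made up for the example.

```shell
# Scratch directory with hypothetical contents.
dir=${TMPDIR:-/tmp}/test-demo.$$
mkdir "$dir"
echo hello > "$dir/full.txt"
: > "$dir/empty.txt"
mkdir "$dir/sub"

# Classify each name with test's file operators.
report=""
for f in full.txt empty.txt sub missing ; do
  if [ -d "$dir/$f" ]; then kind=directory
  elif [ -s "$dir/$f" ]; then kind=non-empty
  elif [ -f "$dir/$f" ]; then kind=empty
  else kind=absent
  fi
  report="$report$f:$kind "
done

# Count down with while and an integer comparison.
n=3
countdown=""
while [ $n -gt 0 ]; do
  countdown="$countdown$n "
  n=$(($n-1))
done

echo $report
echo $countdown
rm -rf "$dir"
```

The order of the tests matters here: the directory check comes first, so the later checks only ever have to classify regular files.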
n=$#
echo "# Command line arguments: $n"
echo "Arg 1 is $1"
echo "The whole command line is $*"
UNIX> sh scripts/command-line.sh 0 1 2 3 4 5
# Command line arguments: 6
Arg 1 is 0
The whole command line is 0 1 2 3 4 5
UNIX> sh scripts/command-line.sh
# Command line arguments: 0
Arg 1 is
The whole command line is
UNIX> sh scripts/command-line.sh "Are you ready" Freddie'?'
# Command line arguments: 2
Arg 1 is Are you ready                  # The double quotes put the spaces into a single argument.
The whole command line is Are you ready Freddie?
UNIX>
You can use the shift command to shift the command line arguments by one -- this deletes argument 1, and then turns argument 2 into 1, 3 into 2, etc. That also changes $* and $#. Here is scripts/shift.sh:
while [ $# -gt 0 ]; do
  echo "Number of arguments: $#. Argument 1 is $1."
  shift
done
UNIX> sh scripts/shift.sh 0 1 2
Number of arguments: 3. Argument 1 is 0.
Number of arguments: 2. Argument 1 is 1.
Number of arguments: 1. Argument 1 is 2.
UNIX> sh scripts/shift.sh One, Two, "Buckle My Shoe"
Number of arguments: 3. Argument 1 is One,.
Number of arguments: 2. Argument 1 is Two,.
Number of arguments: 1. Argument 1 is Buckle My Shoe.
UNIX> sh scripts/shift.sh `echo One, Two, "Buckle My Shoe"`
Number of arguments: 5. Argument 1 is One,.
Number of arguments: 4. Argument 1 is Two,.
Number of arguments: 3. Argument 1 is Buckle.
Number of arguments: 2. Argument 1 is My.
Number of arguments: 1. Argument 1 is Shoe.
UNIX>
Finally, you can use set to "set" the command line, and if you don't specify "in" in your for loops, it will traverse the command line. Here is scripts/for_set.sh:
a=1                  # Traverse the command line printing arguments
for i
do
  echo "Argument $a: $i"
  a=$(($a+1))
done

set `ls`             # Traverse the output of ls, printing files
a=1
for i
do
  echo "File $a: $i"
  a=$(($a+1))
done
UNIX> sh scripts/for_set.sh 0 1 2
Argument 1: 0
Argument 2: 1
Argument 3: 2
File 1: bin
File 2: files
File 3: index.html
File 4: makefile
File 5: scripts
File 6: src
UNIX>
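Here is a compact sketch that combines set, a for loop without "in", and shift, so you can watch $# and $1 change. It uses echo in place of ls so that the words are predictable. The words themselves are arbitrary.

```shell
# Replace the command line with the three words printed by echo.
set `echo alpha beta gamma`
count=$#
first=$1

# "for" with no "in" walks the (new) command line.
joined=""
for w
do
  joined="$joined<$w>"
done

# shift discards argument 1 and renumbers the rest.
shift
after_shift="$# $1"

echo "$count $first $joined $after_shift"
```

After the set, there are three arguments and the first is alpha. After the shift, two remain and beta has moved into position 1.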
while read a ; do
  echo $a
done
UNIX> echo 1 2 3 4 | sh scripts/read.sh                             # It reads all of the line into a
1 2 3 4
UNIX> ( echo 1 ; echo 2 ; echo 3 ; echo 4 ) | sh scripts/read.sh    # This does one at a time.
1
2
3
4
UNIX> echo fred | sh scripts/read.sh
fred
UNIX>
Finally, using '<< EOF' allows you to put standard input to a command into your shell script. It will stop when it sees EOF. I use this most often when I have to use ed to do fancier editing than I can do with sed. Here's an example in scripts/duplicate.sh. You can read it to see what it does:
# Error check that you have two arguments

if [ $# -ne 2 ]; then
  echo 'usage: duplicate f1 f2 - makes two copies of f1 into f2 using ed' >&2
  exit 1
fi

# Error check that the first file exists.
# Note the use of double-quotes in case the filename has a space in it.

if [ ! -f "$1" ]; then
  echo "$1 doesn't exist"
  exit 1
fi

# Copy the first file to the second, then use ed to duplicate all of the lines.
# The "EOF" ends the input to ed. You need to backslash the dollar signs, because
# the shell will attempt to do variable substitution if you don't.

cp "$1" "$2"
ed -s "$2" << EOF     # -s makes ed not print out diagnostics
1,\$t\$
w
q
EOF
Here we call it on files/input-1.txt:
UNIX> cat files/input-1.txt
Emma Fur
Cameron Transliterate
Kate Clasp
Olivia Digit
Nathan Martinique
UNIX> sh scripts/duplicate.sh
usage: duplicate f1 f2 - makes two copies of f1 into f2 using ed
UNIX> sh scripts/duplicate.sh files/input-1.txt tmp.txt
UNIX> cat tmp.txt
Emma Fur
Cameron Transliterate
Kate Clasp
Olivia Digit
Nathan Martinique
Emma Fur
Cameron Transliterate
Kate Clasp
Olivia Digit
Nathan Martinique
UNIX> rm tmp.txt
UNIX>
Here's a second example that illustrates putting a variable inside the ed program. The script is in scripts/substitute.sh:
# Error check that you have three arguments

if [ $# -ne 3 ]; then
  echo 'usage: substitute p1 p2 f - substitutes all instances of p1 with p2 in the file f using ed' >&2
  exit 1
fi

# Error check that the file exists.
# Note the use of double-quotes in case the filename has a space in it.

if [ ! -f "$3" ]; then
  echo "$3 doesn't exist"
  exit 1
fi

# Use ed to do the pattern substitution with the command line arguments $1 and $2

ed -s "$3" << EOF
1,\$s/$1/$2/g
w
q
EOF
As you can see, you put a backslash in front of '$' when you want an actual '$'. Otherwise, it treats the '$' as denoting a variable (or in this case, command line arguments).
Here it is in use:
UNIX> cat files/input-2.txt
Victoria Bedbug PhD
Victoria Incredible
Victoria Abigail Expire
Victoria Childish
Elizabeth Victoria Lookup
Victoria Moo
UNIX> sh scripts/substitute.sh
usage: substitute p1 p2 f - substitutes all instances of p1 with p2 in the file f using ed
UNIX> cp files/input-2.txt tmp
UNIX> sh scripts/substitute.sh Victoria Peach tmp    # Substitute "Peach" for "Victoria"
UNIX> cat tmp
Peach Bedbug PhD
Peach Incredible
Peach Abigail Expire
Peach Childish
Elizabeth Peach Lookup
Peach Moo
UNIX> sh scripts/substitute.sh '.*' Fred tmp         # Change every line to "Fred"
UNIX> cat tmp
Fred
Fred
Fred
Fred
Fred
Fred
UNIX>
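If you want to see exactly what the shell does to a '<< EOF' block before the command reads it, here is a minimal sketch that uses cat as a stand-in for ed. Since cat simply copies its standard input, the output file holds the text after the shell has done its substitutions. The variable and file names are hypothetical.

```shell
# Hypothetical scratch file and variable.
tmp=${TMPDIR:-/tmp}/heredoc-demo.$$
who=world

# cat stands in for ed: it copies the here-document to the file unchanged.
cat > "$tmp" << EOF
Hello, $who
Literal: \$who
EOF

first=`sed -n 1p "$tmp"`     # The shell expanded $who before cat saw it.
second=`sed -n 2p "$tmp"`    # The backslash protected this dollar sign.

echo "$first"
echo "$second"
rm -f "$tmp"
```

The first line comes out with the variable's value filled in, and the second keeps a literal dollar sign, which is the same treatment that "1,\$t\$" gets on its way to ed.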