CS360 Lecture notes -- Shell Redirection


This set of lecture notes has some forward references to things that you will learn later in the semester, such as dup2(). For now, you can ignore them when you see them. You may want to come back later in the semester to reinforce your knowledge of this material.
There are many different shells that people use under Unix. The lecture focuses on the Bourne Shell, with some references to the C shell. The "Bash" shell is derived from the Bourne Shell, so your Bourne Shell scripts will work on bash. The converse is not true, since bash supports an ever evolving collection of crap. Such is life. The examples below will be on a bash shell.

There are lecture notes for writing shell scripts with the Bourne shell from my CS494/594 class: http://web.eecs.utk.edu/~jplank/plank/classes/cs494/494/notes/New-Sh/index.html.

What we are concerned with here are the redirection primitives. Many of these are simple, and are the same in pretty much all shells.

So, for example, suppose the file f1 contains the bytes "This is f1". The following redirections should not confuse you. In each case, ask yourself what the output of the command should be:
UNIX> cat f1
This is f1
UNIX> cat < f1
This is f1
UNIX> < f1 cat      You can put the redirection anywhere in the command line.
This is f1
UNIX> < cat f1      This is the same as f1 < cat - it can't find the file "cat".
cat: No such file or directory.
UNIX> cat f1 > f2
UNIX> cat f2
This is f1
UNIX> cat f1 >> f2
UNIX> cat f2
This is f1
This is f1
UNIX> > f2 < f1 cat     This is the same as cat < f1 > f2
UNIX> cat f2
This is f1
UNIX> 
Now, suppose there is no file f3. When we say cat f1 f3, it will print the contents of f1 to standard output and an error message to standard error. Typically, both of these go to the screen:
UNIX> cat f1 f3
This is f1
cat: f3: No such file or directory
UNIX> 
However, if you redirect standard output to a file, then the contents of f1 will go to the file, and the error message will go to the screen. Why? Because the shell opens the output file and calls dup2(fd, 1) so that file descriptor 1 refers to it, but it does nothing to file descriptor 2:
UNIX> cat f1 f3 > f2
cat: f3: No such file or directory
UNIX> cat f2
This is f1
UNIX> 
With the C shell, you can redirect both standard output and standard error to the same file by using >& :
UNIX> csh -c 'cat f1 f3 >& f2'
UNIX> cat f2
This is f1
cat: f3: No such file or directory
UNIX> 
The Bourne shell has different primitives for dealing with standard output and standard error. Whenever you say x>, it will redirect file descriptor x. For example, another way to redirect standard output to a file under the Bourne shell is:
UNIX> cat f1 f3 1>f2
cat: cannot open f3
UNIX> cat f2
This is f1
UNIX> 
And we can redirect standard output and standard error to different files very easily:
UNIX> cat f1 f3 1>f2 2>f5
UNIX> cat f2
This is f1
UNIX> cat f5
cat: cannot open f3
UNIX> 
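Here is a small sketch of the same idea that you can run as a script. It assumes a Bourne-style sh and works in a scratch directory; the names out.txt and err.txt are made up for illustration. The exact wording of cat's error message differs from system to system, so the script only checks that something arrived on file descriptor 2.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "This is f1" > f1

# f3 does not exist: cat writes f1's contents to descriptor 1 and a
# complaint to descriptor 2. Each descriptor gets its own file.
cat f1 f3 1>out.txt 2>err.txt
status=$?                      # non-zero, because f3 could not be opened

out_text=$(cat out.txt)
err_text=$(cat err.txt)        # wording varies between systems
echo "exit status: $status"
echo "out.txt: $out_text"
echo "err.txt: $err_text"

cd / && rm -rf "$dir"
```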
The shell processes these statements left to right, so I can redirect standard output multiple times. Not only will the shell allow it without complaint, it will create all of the files that you specify:
UNIX> rm f2
UNIX> cat f1 f3 1>f2 1>f5
cat: cannot open f3
UNIX> cat f2
UNIX> cat f5
This is f1
UNIX> 
As you can see, f2 was created and is empty. This is because the first redirection opened it for writing (which created it), and the second redirection then pointed file descriptor 1 at f5 instead. Nothing was ever written to f2.
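The following sketch checks that behavior in a script. It assumes a Bourne-style sh such as bash or dash (zsh, with its default settings, sends the output to both files instead). The names first.txt and second.txt are hypothetical.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "This is f1" > f1

# Descriptor 1 is pointed at first.txt, then re-pointed at second.txt.
# Both files are created, but only the last one receives the output.
cat f1 1>first.txt 1>second.txt

if [ -e first.txt ] && [ ! -s first.txt ]; then
  first_state="created but empty"
else
  first_state="missing or non-empty"
fi
second_text=$(cat second.txt)
echo "first.txt: $first_state"
echo "second.txt: $second_text"

cd / && rm -rf "$dir"
```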

Can we redirect standard input multiple times? It depends on your shell:

UNIX> echo "This is f1" > f1
UNIX> echo "This is f2" > f2
UNIX> sh -c "cat < f1 < f2"
This is f2
UNIX> sh -c "cat < f2 < f1"
This is f1
UNIX> csh -c "cat < f1 < f2"
Ambiguous input redirect.
UNIX> 

Look at what happens when you try to have input and output come from the same file:

UNIX> cat f2
This is f2
UNIX> head f2 > f2
UNIX> cat f2
UNIX> 
The shell does its redirection before executing head. That means that f2 is truncated before head is called, and when head is called, f2 is empty. Therefore, head exits, and f2 remains empty.

The same thing happens if you redirect both head's standard input and its standard output to the same file:

UNIX> echo "This is f2" > f2
UNIX> head < f2 > f2
UNIX> cat f2
UNIX> 
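The sketch below replays both cases in a script and then shows one common way around the problem: write to a different file and rename it afterwards. The workaround and the name f2.tmp are assumptions added for illustration, not something these notes prescribe.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1

echo "This is f2" > f2
head f2 > f2                       # the shell truncates f2 before head runs
if [ -s f2 ]; then after_out="non-empty"; else after_out="empty"; fi

echo "This is f2" > f2
head < f2 > f2                     # same story with standard input from f2
if [ -s f2 ]; then after_both="non-empty"; else after_both="empty"; fi

# Workaround: send the output somewhere else, then rename it over f2.
echo "This is f2" > f2
head f2 > f2.tmp && mv f2.tmp f2
kept_text=$(cat f2)

echo "after 'head f2 > f2':   f2 is $after_out"
echo "after 'head < f2 > f2': f2 is $after_both"
echo "after the workaround:   $kept_text"

cd / && rm -rf "$dir"
```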

Now consider x>y again. If you specify y as &y, then the shell will make sure that file descriptor x in the program is identical to file descriptor y. Another way of saying this is: "Wherever file descriptor y is currently going, file descriptor x now goes there too, and the two are identical." (You will learn later how this works with the dup2() system call.) Thus, look at the following:

UNIX> cat f1 f3 > f2 2>&1
UNIX> cat f2
This is f1
cat: cannot open f3
UNIX>
What is going on? First, you redirect standard output to f2. That means file descriptor 1 is going to f2. Then the 2>&1 part says to make file descriptor 2 identical to file descriptor 1. This means that standard error will go to f2 as well.

Again, these are processed by the shell from left to right. Suppose you reverse the order of the statements:

UNIX> cat f1 f3 2>&1 > f2
cat: cannot open f3
UNIX> cat f2
This is f1
UNIX> 
Now, the 2>&1 part says to make file descriptor 2 identical to file descriptor 1, which, at the time the shell processes this redirection, is going to the screen. Next, the shell redirects file descriptor 1 to f2. So, standard error goes to the screen, and standard output goes to f2.
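To compare the two orderings side by side, here is a sketch in which an outer brace group plays the role of the screen: whatever the inner command leaves on descriptors 1 and 2 is collected in a file. The names a.txt, b.txt, screen_a.txt and screen_b.txt are hypothetical, and the script assumes a Bourne-style sh.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "This is f1" > f1             # f3 deliberately does not exist

# Order A: stdout goes to a.txt first, then stderr becomes a copy of it.
{ cat f1 f3 > a.txt 2>&1; } > screen_a.txt 2>&1

# Order B: stderr becomes a copy of stdout while stdout is still the
# "screen", and only then is stdout moved to b.txt.
{ cat f1 f3 2>&1 > b.txt; } > screen_b.txt 2>&1

a_out=$(grep -x "This is f1" a.txt)      # the stdout line
a_err=$(grep -v -x "This is f1" a.txt)   # anything else is the error message
b_text=$(cat b.txt)
if [ -s screen_a.txt ]; then screen_a="non-empty"; else screen_a="empty"; fi
if [ -s screen_b.txt ]; then screen_b="non-empty"; else screen_b="empty"; fi

echo "order A: file has stdout and stderr, screen is $screen_a"
echo "order B: file has '$b_text', screen is $screen_b"

cd / && rm -rf "$dir"
```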

Look at:

UNIX> cat f1 f3 >f2 2>&1 1>f5
UNIX> cat f2
cat: cannot open f3
UNIX> cat f5
This is f1
UNIX> 
Now, standard output first goes to f2, then the 2>&1 part makes standard error identical to standard output. In other words, both are going to f2. Then the 1>f5 part makes standard output go to f5. Therefore, that line is equivalent to: "cat f1 f3 >f5 2>f2".
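If you want to check that equivalence rather than take it on faith, a sketch like the following runs both forms and compares the resulting files. The names x2, x5, y2 and y5 are hypothetical stand-ins for f2 and f5, so that the two runs do not overwrite each other.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "This is f1" > f1             # f3 deliberately does not exist

cat f1 f3 >x2 2>&1 1>x5            # the three-step form
cat f1 f3 >y5 2>y2                 # the claimed equivalent

# cmp -s is silent and succeeds only when the two files are identical.
if cmp -s x2 y2 && cmp -s x5 y5; then same="yes"; else same="no"; fi
stdout_text=$(cat x5)

echo "same results: $same"
echo "standard output ended up holding: $stdout_text"

cd / && rm -rf "$dir"
```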

You can make use of other file descriptors if you want:

UNIX> cat f1 f3 3>f2 1>f5 2>&1 1>&3
UNIX> cat f2
This is f1
UNIX> cat f5
cat: cannot open f3
UNIX> 
Figure that one out for yourself.
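A related example, not the answer to the one above: a spare descriptor can be used to swap standard output and standard error. The sketch below assumes a Bourne-style sh; the file names was_stdout.txt and was_stderr.txt are hypothetical and record where descriptors 1 and 2 pointed before the swap.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "This is f1" > f1             # f3 deliberately does not exist

# Inside the braces, processed left to right:
#   3>&1  descriptor 3 remembers where stdout was going
#   1>&2  stdout now goes where stderr was going
#   2>&3  stderr now goes where stdout used to go
{ cat f1 f3 3>&1 1>&2 2>&3; } > was_stdout.txt 2> was_stderr.txt

on_stderr=$(cat was_stderr.txt)    # cat's normal output lands here
on_stdout=$(cat was_stdout.txt)    # cat's error message lands here
echo "old stderr destination got: $on_stderr"
echo "old stdout destination got: $on_stdout"

cd / && rm -rf "$dir"
```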

You can use this technique to do some pretty ugly stuff. Look at src/badbadcode.c:

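/* This is a program that assumes file descriptor 3 is open, and writes to it. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  char *s = "Hi!\n";
  int i;

  /* Write to file descriptor 3 without ever opening it. Whoever runs
     the program must have set it up, for example with 3>f5. */
  i = write(3, s, strlen(s));
  printf("%d\n", i);              /* Bytes written, or -1 on failure. */
  if (i < 0) perror("write");
  return 0;
}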

And now check the following out. Is this a good thing or a bad thing?

UNIX> bin/badbadcode
-1
write: Bad file number
UNIX> bin/badbadcode 3>f5
4
UNIX> cat f5
Hi!
UNIX> bin/badbadcode 3>&1
Hi!
4
UNIX> 
I'd say it's a good thing, as long as you don't tell anyone about it. In fact, I've done the following to help me debug: Suppose you have a subtle bug in a fairly large piece of code. And you'd like to create some output to help you, but you've already junked up standard output so much that you can't use it. Worse, you know that the bug is nested in a bazillion procedure calls, and you don't want to worry about the flow of control getting there or passing FILE *'s to all of those procedure calls.

Then, you do something like src/dont_admit_i_taught_you_this.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void v(int i) 
{ 
  char s[100];

  sprintf(s, "Here's my error message.  V was called with %d\n", i);
  write(9, s, strlen(s));
}

void u() { printf("Blah blah blah blah blah blah.\n");  v(4); }
void t() { printf("Blah blah blah blah blah blah.\n");  u(); }
void s() { printf("Blah blah blah blah blah blah.\n");  u(); t(); }
void r() { printf("Blah blah blah blah blah blah.\n");  u(); s(); }
void q() { printf("Blah blah blah blah blah blah.\n");  u(); r(); }
void p() { printf("Blah blah blah blah blah blah.\n");  q(); q(); v(5); }
void o() { printf("Blah blah blah blah blah blah.\n");  p(); }
void n() { printf("Blah blah blah blah blah blah.\n");  o(); }
void m() { printf("Blah blah blah blah blah blah.\n");  n(); }
void l() { printf("Blah blah blah blah blah blah.\n");  m(); }
void k() { printf("Blah blah blah blah blah blah.\n");  l(); m(); v(6); }
void j() { printf("Blah blah blah blah blah blah.\n");  k(); }
void i() { printf("Blah blah blah blah blah blah.\n");  j(); }
void h() { printf("Blah blah blah blah blah blah.\n");  i(); }
void g() { printf("Blah blah blah blah blah blah.\n");  h(); }
void f() { printf("Blah blah blah blah blah blah.\n");  g(); }
void e() { printf("Blah blah blah blah blah blah.\n");  f(); v(7); }
void d() { printf("Blah blah blah blah blah blah.\n");  e(); }
void c() { printf("Blah blah blah blah blah blah.\n");  d(); }
void b() { printf("Blah blah blah blah blah blah.\n");  c(); }
void a() { printf("Blah blah blah blah blah blah.\n");  b(); v(8); }


int main()
{
  a();
  return 0;
}

As you can see, v() is getting called a lot. Each time it gets called, I'm writing a string into file descriptor 9. What's file descriptor 9? Well, you decide that in the shell when you run the program:

UNIX> bin/dont_admit_i_taught_you_this > /dev/null 9>txt/elog.txt
UNIX> cat txt/elog.txt
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 5
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 4
Here's my error message.  V was called with 5
Here's my error message.  V was called with 6
Here's my error message.  V was called with 7
Here's my error message.  V was called with 8
UNIX> bin/dont_admit_i_taught_you_this > /dev/null
UNIX> 
In that last call, I didn't redirect file descriptor 9. Therefore, each write() call failed and returned -1. Since v() ignores the return value, it's as if nothing happened.

Remember that trick -- it may come in handy sometime.
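The same trick carries over to shell scripts. The sketch below is an illustration under assumptions, not part of the original example: noisy.sh, debug, inner and outer are hypothetical names. One difference from the C version is that the shell itself complains when a redirection to a closed descriptor fails, so debug() discards that complaint to stay silent.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1

# A script that chatters on stdout and sends debug lines to descriptor 9.
cat > noisy.sh <<'EOF'
# If descriptor 9 is not open, the redirection fails; the shell's own
# error message is thrown away so the script keeps quiet about it.
debug() { { echo "debug: $*" >&9; } 2>/dev/null; }
inner() { echo "Blah blah blah."; debug "inner called with $1"; }
outer() { echo "Blah blah blah."; inner 4; inner 5; }
outer
EOF

# The caller decides what descriptor 9 is.
sh noisy.sh > /dev/null 9>elog.txt
log_text=$(cat elog.txt)

# With descriptor 9 explicitly closed, the debug lines simply vanish.
quiet_output=$(sh noisy.sh 2>&1 > /dev/null 9>&-)

echo "$log_text"
echo "output with descriptor 9 closed: '$quiet_output'"

cd / && rm -rf "$dir"
```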


(Often, at this point in the lecture, I have time leftover, so I go over some pointer exam questions. Good ones are questions 3 and 5 from 2017 Midterm 1).