CS360 Jshell Lab

What you turn in

Submit the program jshell.c. The TAs will compile it with libfdr.

Introduction

Time to write a shell. Your job is to write jshell -- your own shell, which is going to be very primitive. jshell is like csh / sh / bash: it is a command line interpreter that lets you execute commands and redirect their input/output.

The syntax is not like that of "regular" shells such as sh, bash and csh. It is more of a pain to use, but easier to write!

The command format

jshell reads lines of text from standard input. Commands are composed of multiple lines. Here's how to interpret each line (use the fields library):

Blank line, or a line that begins with a '#': Ignore
A line whose first word is "<" -- this should have a single word after it, which you will interpret as a filename. You will redirect standard input of the first child process in the command from that file.
A line whose first word is ">" -- this should have a single word after it, which you will interpret as a filename. You will redirect standard output of the last child process in the command to that file.
A line whose first word is ">>" -- this should have a single word after it, which you will interpret as a filename. You will redirect standard output of the last child process in the command to that file, appending to it rather than truncating it.
A line composed of the word "NOWAIT". This means that when you run the command, you will not wait for any of the child processes to exit.
A line composed of the word "END". This means that the command is over and you should go about executing it.
Any other line is interpreted as an argv array for a child process in the command. You can specify any number of these, and when you execute the command, each child process should be connected to the next via pipes -- standard output of child i goes to standard input of child i+1.
These lines can be specified in any order; however, when you read "END", you execute the command that you have been reading.
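
As a rough sketch of the rules above (not the reference implementation), the function below classifies one line after it has been split into words. The names LineKind and classify_line are made up for illustration, and nf and fields stand in for the word count and word array that the fields library provides. Two choices are assumptions: a redirection line without exactly one filename is reported as LINE_BAD, and a '#' counts as a comment even when whitespace precedes it.

```c
#include <string.h>

/* Kinds of input line, following the list above. */
typedef enum {
  LINE_IGNORE,   /* blank line or comment */
  LINE_STDIN,    /* "<" filename */
  LINE_STDOUT,   /* ">" filename */
  LINE_APPEND,   /* ">>" filename */
  LINE_NOWAIT,   /* the single word NOWAIT */
  LINE_END,      /* the single word END */
  LINE_ARGV,     /* anything else: an argv array for one child */
  LINE_BAD       /* assumption: redirection without exactly one filename */
} LineKind;

/* nf is the number of words on the line; fields[0..nf-1] are the words. */
LineKind classify_line(int nf, char **fields)
{
  if (nf == 0 || fields[0][0] == '#') return LINE_IGNORE;
  if (strcmp(fields[0], "<") == 0)  return (nf == 2) ? LINE_STDIN  : LINE_BAD;
  if (strcmp(fields[0], ">") == 0)  return (nf == 2) ? LINE_STDOUT : LINE_BAD;
  if (strcmp(fields[0], ">>") == 0) return (nf == 2) ? LINE_APPEND : LINE_BAD;
  if (nf == 1 && strcmp(fields[0], "NOWAIT") == 0) return LINE_NOWAIT;
  if (nf == 1 && strcmp(fields[0], "END") == 0)    return LINE_END;
  return LINE_ARGV;
}
```

A read loop would call this once per line and either update the pending command or, on LINE_END, execute it.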

The first command line argument of my jshell

My jshell allows you to specify some letters on its first command line argument:

'r' -- a "READY" prompt will be printed at the beginning of the program, and after each "END" (after the commands corresponding to that "END" have been executed).
'p' -- Before executing the command, this will print out what the command is.
'n' -- The command will not be executed. Combine this with 'p' to simply see what the commands are without executing them.

See the examples below for more information about how these are specified and used.

BTW, your shell does not have to implement these. You may want to, to help you with debugging, but I won't test them. Here are some examples:

UNIX> bin/jshell r       # The 'r' on command line means that it will
READY                    # print "READY" when it's ready to receive a command
cat f1.txt   
END                      # You need an "END" line to make it execute the command
Andrew Sundry
Brandon Aperiodic
Gianna Coralberry
Sydney Roundoff
Brandon Canvas
Julia Suffocate
Amelia Chantey
Isaiah Aidan Plait
Lucy Clamp
Arianna Infant
READY
             # I'm putting an extra line after the READY's to make it easier to read.

< f1.txt     # Here, we redirect standard input from f1.txt
head -n 2
> f2.txt     # And standard output to f2.txt
END
READY

cat f2.txt   # Here's f2.txt
END
Andrew Sundry
Brandon Aperiodic
READY

> f2.txt     # It doesn't matter what order you specify the
< f1.txt     # redirections with respect to the commands.
head -n 2
END
READY

cat f2.txt
END
Andrew Sundry
Brandon Aperiodic
READY

head -n 2 f1.txt    # Test appending to a file
>> f2.txt
END
READY

cat f2.txt
END
Andrew Sundry
Brandon Aperiodic
Andrew Sundry
Brandon Aperiodic
READY

cat -n             # Here's where we pipe together three commands
sed s/[a-z]/x/g
tail -n 2
< f1.txt
END
     9	Lxxx Cxxxx
    10	Axxxxxx Ixxxxx
READY

cat sleep_fred.c   # sleep_fred.c sleeps 10 seconds and then 
END                # prints Fred on standard output.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
  sleep(10);
  printf("Fred\n");
  exit(0);
}
READY

gcc sleep_fred.c    # We compile it and run it
END
READY

a.out
END
Fred         # You have to wait 10 seconds for it to print Fred.
READY        # After 10 seconds, it prints Fred and you get the READY prompt.

a.out        # Now I run it, but specify NOWAIT
NOWAIT
END
READY        # I get the prompt back instantly, with no Fred.

cat f2.txt   # I call this command in under 10 seconds
END
Andrew Sundry
Brandon Aperiodic
READY

Fred         # And finally Fred appears.
     
a.out        # I call a.out > f2.txt, but don't wait.
> f2.txt
NOWAIT
END
READY

cat f2.txt   # Within 10 seconds, f2.txt has been opened and
END          # truncated, but nothing written yet.
READY

cat f2.txt   # (Wait 10 seconds): After 10 seconds, f2.txt contains "Fred".
END
Fred
READY

<CNTL-D>
UNIX>

When you call my jshell and include 'p' in the first argument, it will print information about the command:

UNIX> bin/jshell rp
READY
cat f1.txt
END
Stdin:   None                  # After each command, you can see my internal data structure.
Stdout:  None (Append=0)
N_Commands:  1
Wait:        1
  0: argc: 2   argv: cat f1.txt

Andrew Sundry
Brandon Aperiodic
Gianna Coralberry
Sydney Roundoff
Brandon Canvas
Julia Suffocate
Amelia Chantey
Isaiah Aidan Plait
Lucy Clamp
Arianna Infant
READY

cat -n
sed s/[a-z]/x/g
tail -n 2
< f1.txt
> f2.txt
END
Stdin:   f1.txt
Stdout:  f2.txt (Append=0)
N_Commands:  3
Wait:        1
  0: argc: 2   argv: cat -n
  1: argc: 2   argv: sed s/[a-z]/x/g
  2: argc: 3   argv: tail -n 2

READY

cat f2.txt
END
Stdin:   None
Stdout:  None (Append=0)
N_Commands:  1
Wait:        1
  0: argc: 2   argv: cat f2.txt

     9	Lxxx Cxxxx
    10	Axxxxxx Ixxxxx
READY

head -n 1 f1.txt 
>> f2.txt
END
Stdin:   None
Stdout:  f2.txt (Append=1)
N_Commands:  1
Wait:        1
  0: argc: 4   argv: head -n 1 f1.txt

READY

cat f2.txt
END
Stdin:   None
Stdout:  None (Append=0)
N_Commands:  1
Wait:        1
  0: argc: 2   argv: cat f2.txt

     9	Lxxx Cxxxx
    10	Axxxxxx Ixxxxx
Andrew Sundry
READY

NOWAIT  
a.out
END
Stdin:   None
Stdout:  None (Append=0)
N_Commands:  1
Wait:        0
  0: argc: 1   argv: a.out

READY
Fred            # This comes 10 seconds later
<CNTL-D>
UNIX>

You don't have to implement the first command line argument

The gradescripts do not call jshell with a command line argument, so you don't have to implement 'r', 'p' or 'n'. However, were I you, I would. They don't have to match mine, but they are useful for code development and debugging.

Some General Advice

Incremental Programming

I would program in the following order:

Executing one command.
Implementing NOWAIT
Redirecting stdin.
Redirecting stdout with ">".
Redirecting stdout with ">>".
Pipes

Remember -- program slowly and test, test, test. Think of things you should test, and then test them. Don't try to have the gradescripts be your debugger -- you'll be more efficient thinking of things to test and testing them on your own before you move to the gradescripts.
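
As a starting point for the first step in that list, here is a minimal sketch of executing one command and waiting for it. The name run_one is hypothetical, and the sketch assumes argv is NULL-terminated, as execvp requires. It uses waitpid() for brevity, which sidesteps the zombie handling described under "Zombies" rather than addressing it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the NULL-terminated argument array argv and wait
   for it to finish.  Returns the status reported by waitpid(), or -1
   if fork() or waitpid() fails. */
int run_one(char **argv)
{
  int status;
  pid_t pid;

  fflush(stdout);               /* do not let the child inherit buffered output */
  fflush(stderr);
  pid = fork();
  if (pid == -1) {
    perror("fork");
    return -1;
  }
  if (pid == 0) {
    execvp(argv[0], argv);
    perror(argv[0]);            /* only reached if execvp failed */
    _exit(1);
  }
  if (waitpid(pid, &status, 0) == -1) return -1;
  return status;
}
```

The child calls _exit() rather than exit() after a failed execvp so that it does not flush any stdio buffers it inherited from the shell.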

Flush before fork

Before you call fork, you should call fflush() on stdin, stdout and stderr. Trust me.

Zombies

You should try to minimize the number of zombie processes that exist (this applies to every part of the lab). This is not to say that they can't exist for a little while, but they should not exist forever. When you call wait() for a shell command, it might return the pid of a zombie process, and not the process you thought would return. This is fine -- you just have to be able to deal with it. For example, consider the following sequence:

cat f1
> /dev/null
NOWAIT
END

vi lab3.c
END

You are going to call wait() to wait for the vi command to terminate, but it will return with the status of the zombie process from the cat call. This is all fine -- you just need to be aware that these things may happen, and that you may have to call wait() again to wait for vi to complete.
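
One way to handle this, sketched under the assumption that you know the pid you are waiting for, is to keep calling wait() until that pid comes back. The name wait_for_pid is made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Call wait() until it reports the child we care about.  Any other pid
   that comes back is an earlier NOWAIT child being reaped, which is
   fine.  Returns 0 and fills *status on success, or -1 if wait() fails
   (for example, because there are no children left). */
int wait_for_pid(pid_t target, int *status)
{
  pid_t p;
  int s;

  for (;;) {
    p = wait(&s);
    if (p == -1) {
      if (errno == EINTR) continue;   /* interrupted by a signal: retry */
      return -1;
    }
    if (p == target) {
      if (status != NULL) *status = s;
      return 0;
    }
  }
}
```

A side effect is that every call also cleans up any zombies that finished in the meantime, so they do not accumulate.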

Open files

You must make sure that when you call execvp, there are only three files open -- 0, 1, and 2. If there are others open, then you have a bug in your shell.

Also, when a command is done, and the shell prints out its prompt, then it should only have three files open -- 0, 1, and 2. Otherwise, you have forgotten to close a file descriptor or two and have a bug in your code. Check for this. My jshell never uses a file descriptor higher than 5.
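
To check for this while debugging, you can count the descriptors above 2 that are still open. This is only a sketch: count_stray_fds is a made-up name, and the upper bound on the scan is an assumption you pass in.

```c
#include <fcntl.h>

/* Count open file descriptors in the range 3..max_fd.
   fcntl(fd, F_GETFD) fails with -1 when fd is not open. */
int count_stray_fds(int max_fd)
{
  int fd, n = 0;

  for (fd = 3; fd <= max_fd; fd++) {
    if (fcntl(fd, F_GETFD) != -1) n++;
  }
  return n;
}
```

Calling it just before printing the prompt, and printing a warning to stderr when the result is not zero, catches a forgotten close() early.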

Waiting in jshell

If you do not specify "NOWAIT", then your shell should not continue until all the processes in the pipe have completed. You'll need a red-black tree for this.
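
The bookkeeping amounts to keeping the set of pids in the pipe and removing each one as wait() reports it. In the sketch below a plain array stands in for the red-black tree, purely to keep the example free of libfdr; a tree keyed on pid supports the same find-and-delete step. The name wait_for_all is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Wait until every pid in pids[0..n-1] has been reaped.  Pids that
   wait() reports but that are not in the array (older NOWAIT children)
   are discarded.  Entries are overwritten with -1 as they finish.
   Returns 0 on success, -1 if wait() fails before all are accounted for. */
int wait_for_all(pid_t *pids, int n)
{
  int remaining = n, i;
  pid_t p;

  while (remaining > 0) {
    p = wait(NULL);
    if (p == -1) {
      if (errno == EINTR) continue;   /* interrupted by a signal: retry */
      return -1;
    }
    for (i = 0; i < n; i++) {
      if (pids[i] == p) {
        pids[i] = -1;                 /* mark this child as done */
        remaining--;
        break;
      }
    }
  }
  return 0;
}
```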

Errors

Your code should work in the face of errors. For example, if you specify a bad output file at the end of a multi-stage pipe, then the error should be noted, and your shell should continue working. Make sure you check for all the error conditions that you can think of.

The Gradescripts

The first 5 gradescripts test single commands.
The next 15 test NOWAIT.
The next 40 test redirection.
The next 40 test everything.

As I said above, you'll do better to develop your own testing code than to use the gradescripts when you develop code. The gradescripts make use of the following programs, which are in the lab directory:

cattostde.c: This works like cat, but it prints what it receives on standard input, or from its input files, to standard error.
strays.c: This checks for open file descriptors and will flag an error if any file descriptor higher than three is open. Then it works just like cat.
strays-files.c: This works like strays except it copies the first argument to the second.
strays-fsleep.c: This works like strays-files except it sleeps for a fifth of a second before starting.
strays-sleep.c: This works like strays except it sleeps for a fifth of a second before starting.

The gradescripts use all of these to test various features of your shells. Apart from the first few gradescripts, each gradescript call will take between 1 and 20 seconds. The gradescripts are time sensitive, too, so the output of your program may change as time passes -- for that reason, the gradescripts can be a little hard to parse.

To help you out, I have made videos to explain gradescripts 6, 21 and 61. They are here:

Gradescript 6: https://youtu.be/5bncxZZkzmU
Gradescript 21: https://youtu.be/DmRRy_h3TY4
Gradescript 61: https://youtu.be/ndeR8cdct6c

My command data structure

You don't have to use this data structure, but I put this here in case you'll find it helpful. This was the data structure that I used to store a command:

typedef struct {
  char *stdin;          /* Filename from which to redirect stdin.  NULL if empty.*/ 
  char *stdout;         /* Filename to which to redirect stdout.  NULL if empty.*/ 
  int append_stdout;    /* Boolean for appending.*/ 
  int wait;             /* Boolean for whether I should wait.*/ 
  int n_commands;       /* The number of commands that I have to execute*/ 
  int *argcs;           /* argcs[i] is argc for the i-th command*/ 
  char ***argvs;        /* argvs[i] is the argv array for the i-th command*/ 
  Dllist comlist;       /* I use this to incrementally read the commands.*/ 
} Command;

A little commentary -- before I read END, I put commands into comlist, and keep track of the number of commands with n_commands. I build the argv array when I read the command, and that's what I put onto the comlist.

When I read END, I create argcs and argvs from comlist, and then delete comlist. Since I'm storing the actual argv arrays in comlist, this is a very simple process of calculating argc, and then copying the pointer to the argv array.

I have a procedure free_command() that frees everything in the data structure at the end of every command. I use this to handle errors while reading.