CS140 Lecture notes -- Recursion

Directory: ~cs140/www-home/notes/Recursion

Lecture notes: http://www.cs.utk.edu/~cs140/notes/Recursion

Wed Nov 11 11:01:24 EST 1998

Recursion

Recursion is an extremely important programming technique -- one that students seem to have trouble with early. It's a very simple concept. If a language supports recursion (and most of them do, Fortran 77 being a notable exception), then whenever you make a procedure call, the computer stores a few things:

All your arguments and local variables.
Your place in the current procedure.

It actually stores these things by pushing them onto a stack. Thus, whenever a procedure call returns, it knows what to do by popping off where you are and what your arguments and local variables are.

This lets you do something very important. It lets you make a call to the same procedure that you are currently in. This runs a second copy of the procedure, which will restore the first copy when it returns.

Let's take a simple example (in rec1.c):

/* 1 */     a(int i)
/* 2 */     {
/* 3 */       printf("In procedure a: i = %d\n", i);
/* 4 */       if (i == 10) a(9);
/* 5 */     }
/* 6 */
/* 7 */     main()
/* 8 */     {
/* 9 */       a(10);
/* 10 */    }

You'll note, if i equals 10, then a() calls itself. Let's look at what happens when this is executed. First, we are in main(), and it calls a(10). What happens here is that the computer stores its current context (where it is, and what its local variables are) on the stack. The stack looks like:

top -->	[main(): line 9]

Then a(10) is executed. It will print:

In procedure a: i = 10

and then it will call a(9). Once again, the computer stores its current context on the stack. The stack now looks like:

top -->	[a(): line 4, i = 10]
	[main(): line 9]

Then a(9) is executed. It will print:

In procedure a: i = 9

and then it will return. When it returns, it pops where it should return off the stack -- this is in procedure a() at line 4, with i equal to 10. The stack once again looks like:

top -->	[main(): line 9]

Now, the first thing that happens is that a(10) returns. Again, it pops where it should return off the stack -- this is in procedure main() at line 9. Of course, what happens is that main() exits, and the program ends. Thus, the output is:

In procedure a: i = 10
In procedure a: i = 9

A slightly more complex example

Now, look at rec2.c:

/*     1 */   a(int i)
/*     2 */   {
/*     3 */     int j;
/*     4 */   
/*     5 */     j = i*5;
/*     6 */     printf("In procedure a: i = %d, j = %d\n", i, j);
/*     7 */     if (i > 0) a(i-1);
/*     8 */     printf("Later In procedure a: i = %d, j = %d\n", i, j);
/*     9 */   }
/*    10 */   
/*    11 */   main()
/*    12 */   {
/*    13 */     int i;
/*    14 */     
/*    15 */     i = 16;
/*    16 */     a(3);
/*    17 */     printf("main: %d\n", i);
/*    18 */   }

Again, let's see what happens when it is executed. First, we're in main() which sets i to 16 and calls a(3). This pushes the current context on the stack:

top -->	[main(): line 16, i = 16]

Now, we execute a(3). This sets j to 15, and prints out:

In procedure a: i = 3, j = 15

It then calls a(2). This pushes the current context on the stack:

top -->	[a(): line 7, i = 3, j = 15]
	[main(): line 16, i = 16]

And then we call a(2). This sets j to 10, and prints out:

In procedure a: i = 2, j = 10

And then it calls a(1). Once again, the current context is pushed onto the stack:

top -->	[a(): line 7, i = 2, j = 10]
	[a(): line 7, i = 3, j = 15]
	[main(): line 16, i = 16]

And then we execute a(1). This sets j to 5, and prints out:

In procedure a: i = 1, j = 5

And then it calls a(0). Once again, the current context is pushed onto the stack:

top -->	[a(): line 7, i = 1, j = 5]
	[a(): line 7, i = 2, j = 10]
	[a(): line 7, i = 3, j = 15]
	[main(): line 16, i = 16]

And then we execute a(0). This sets j to 0, and prints out:

In procedure a: i = 0, j = 0

Since i is zero, it skips the body of the if statement, prints out:

Later In procedure a: i = 0, j = 0

and returns. Now what returning does is restore the top context on the stack, which means that we are in a() at line 7 with i = 1 and j = 5. The stack is now:

top -->	[a(): line 7, i = 2, j = 10]
	[a(): line 7, i = 3, j = 15]
	[main(): line 16, i = 16]

It prints out:

Later In procedure a: i = 1, j = 5

and a(1) returns. Once again, we restore the top context on the stack, which means that we are in a() at line 7 with i = 2 and j = 10. The stack is now:

top -->	[a(): line 7, i = 3, j = 15]
	[main(): line 16, i = 16]

It prints out:

Later In procedure a: i = 2, j = 10

and a(2) returns. Once again, we restore the top context on the stack, which means that we are in a() at line 7 with i = 3 and j = 15. The stack is now:

top -->	[main(): line 16, i = 16]

It prints out:

Later In procedure a: i = 3, j = 15

and a(3) returns. Finally, we restore the last context on the stack, which means that we are in main() at line 16 with i = 16. The stack is now empty. It prints out:

main: 16

and exits. Thus, the whole output is:

In procedure a: i = 3, j = 15
In procedure a: i = 2, j = 10
In procedure a: i = 1, j = 5
In procedure a: i = 0, j = 0
Later In procedure a: i = 0, j = 0
Later In procedure a: i = 1, j = 5
Later In procedure a: i = 2, j = 10
Later In procedure a: i = 3, j = 15
main: 16
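
To make the pushing and popping concrete, here is a minimal sketch that produces the same lines as a(3) without any recursive call, by keeping the saved contexts in an array. The names a_explicit, struct frame and MAX_FRAMES are made up for this illustration and are not part of rec2.c, and a real call stack also stores the return address, which this sketch leaves out:

```c
#include <stdio.h>

#define MAX_FRAMES 100   /* arbitrary limit chosen for this sketch */

/* One saved context: the values of a()'s variables at the call on line 7. */
struct frame {
  int i;
  int j;
};

/* Prints what a(start) in rec2.c prints, but keeps the saved contexts
   in an explicit array instead of on the call stack. It stops going
   deeper if the array fills up. Returns the largest number of contexts
   that were saved at one time. */
int a_explicit(int start)
{
  struct frame stack[MAX_FRAMES];
  int top = 0;        /* number of saved contexts */
  int deepest = 0;
  int i = start;
  int j;

  /* Going down: each pass through this loop stands for one call to a(). */
  for (;;) {
    j = i * 5;
    printf("In procedure a: i = %d, j = %d\n", i, j);
    if (i <= 0 || top == MAX_FRAMES) break;
    stack[top].i = i;          /* push the current context */
    stack[top].j = j;
    top++;
    if (top > deepest) deepest = top;
    i = i - 1;                 /* the argument of the next call */
  }

  /* Coming back: each pass stands for one return from a(). */
  for (;;) {
    printf("Later In procedure a: i = %d, j = %d\n", i, j);
    if (top == 0) break;
    top--;                     /* pop the caller's context */
    i = stack[top].i;
    j = stack[top].j;
  }
  return deepest;
}
```

Calling a_explicit(3) saves three contexts at the deepest point, one for each of a(3), a(2) and a(1), which is the same as the three a() entries in the largest stack picture above.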

Using gdb to look at the stack

See this web page for an example of using gdb to look at the stack while rec2.c is running.

Infinite recursion

Obviously, just like you can write a program that goes into an infinite for() loop, you can write one that goes into an infinite recursive loop, like rec3.c:

a(int i)
{
  printf("In procedure a: i = %d\n", i);
  a(i);
}

main()
{
  a(10);
}

When you run it, it looks like an infinite loop:

UNIX> rec3
In procedure a: i = 10
In procedure a: i = 10
In procedure a: i = 10
In procedure a: i = 10
....

One difference between infinite recursion and most infinite loops is that you will run out of stack space eventually with infinite recursion and the program will exit. On my machine, if you remove the print statement from rec3.c and run it, it eventually seg faults.
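
What rec3.c is missing is a base case, a condition under which a() stops calling itself. As a sketch only, here is a variant with a depth counter added. The function a_limited and its depth and limit parameters are invented for this example and do not appear in rec3.c:

```c
/* Like a() in rec3.c without the print statement, but with a depth
   counter and a base case. Returns the depth at which it stopped. */
int a_limited(int i, int depth, int limit)
{
  if (depth >= limit) return depth;        /* the base case that rec3.c lacks */
  return a_limited(i, depth + 1, limit);   /* otherwise go one level deeper */
}
```

A call such as a_limited(10, 0, 1000) stops after 1000 nested calls instead of running until the stack is exhausted.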

Standard recursion examples - factorial

One standard recursion example is computing a factorial of a number. This can be done with a simple while loop as in fact1.c:

int factorial(int i)
{
  int f;

  f = 1;
  while (i > 0) {
    f *= i;
    i--;
  }
  return f;
}

However, you can also do it recursively. Remember the definition of factorial:

0! = 1
If n > 0, n! = n * (n-1)!

You can write factorial() recursively so that it looks just like that definition. This is in fact2.c:

int factorial(int n)
{
  if (n <= 0) return 1;
  return n * factorial(n-1);
}

Go ahead and run fact1 and fact2 and see that they return the same output. Use gdb to look at the state of fact2 if you're still a little leery of recursion.
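
If you would rather check the two versions against each other in one program, something like the following sketch would do it. The names factorial_loop, factorial_rec and factorials_agree are made up here so that both versions can live in the same file, and the sketch assumes a 32-bit int:

```c
/* Iterative version, in the style of fact1.c. */
int factorial_loop(int i)
{
  int f = 1;

  while (i > 0) {
    f *= i;
    i--;
  }
  return f;
}

/* Recursive version, in the style of fact2.c. */
int factorial_rec(int n)
{
  if (n <= 0) return 1;
  return n * factorial_rec(n-1);
}

/* Returns 1 if both versions agree for every n from 0 to max_n,
   and 0 otherwise. Keep max_n at 12 or less: 13! does not fit in
   a 32-bit int. */
int factorials_agree(int max_n)
{
  int n;

  for (n = 0; n <= max_n; n++) {
    if (factorial_loop(n) != factorial_rec(n)) return 0;
  }
  return 1;
}
```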

Efficiency

You should be warned that recursion is not as efficient as using a for() (or while()) loop. As an extreme example, you could implement integer multiplication with a while() loop like the following (in mult1.c):

int imult(int a, int b)
{
  int product;

  product = 0;

  while (b > 0) {
    product += a;
    b--;
  }
  return product;
}

Try it:

UNIX> mult1 4 10
40
UNIX> mult1 10 4
40
UNIX>

Or you could do that recursively (in mult2.c):

int imult(int a, int b)
{
  if (b <= 0) return 0;
  return a + imult(a, b-1);
}

They both work, but mult1 runs faster because it doesn't have to do the stack operations that mult2 has to. Unfortunately, this is hard to time because if you try to use large values of b, mult2 will run out of stack space and seg fault. One way to time it is to run both a lot of times. The shell scripts mult1.sh and mult2.sh run mult1 and mult2 20 times each. If you time them, you'll see that mult1 is much faster:

UNIX> time sh mult1.sh
1000000
...
1000000
0.0u 0.1s 0:00.33 54.5% 128+225k 0+0io 0pf+0w
UNIX> time sh mult2.sh
1000000
...
1000000
0.7u 0.4s 0:01.36 86.0% 28+1852k 0+0io 0pf+0w
UNIX>

What this means is that:

You shouldn't go crazy with recursion -- don't use recursion in tight loops which could easily be written as for() or while() loops.
However, there are times when the ease of using recursion far outweighs the inefficiencies of stack management. This is especially true when the body of the recursive procedure does a non-trivial amount of work, so that the stack management is not a significant factor in the running time of the program.
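
Another way to compare the two, sketched here under some assumptions, is to time repeated calls inside a single program with the standard clock() function. The names imult_loop, imult_rec and time_imult are invented for this example. The numbers will vary by machine, and an optimizing compiler may narrow the gap or even turn the recursion into a loop:

```c
#include <time.h>

/* Multiplication by repeated addition, with a loop (as in mult1.c). */
int imult_loop(int a, int b)
{
  int product = 0;

  while (b > 0) {
    product += a;
    b--;
  }
  return product;
}

/* The same thing, recursively (as in mult2.c). */
int imult_rec(int a, int b)
{
  if (b <= 0) return 0;
  return a + imult_rec(a, b-1);
}

/* Calls f(a, b) reps times and returns the CPU time used, in seconds. */
double time_imult(int (*f)(int, int), int a, int b, long reps)
{
  volatile int sink = 0;   /* stores each result so the calls are not discarded */
  clock_t start;
  long r;

  start = clock();
  for (r = 0; r < reps; r++) sink = f(a, b);
  return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

For example, you might compare time_imult(imult_loop, 100, 10000, 10000) with time_imult(imult_rec, 100, 10000, 10000). Keep b small enough that the recursive version does not run out of stack space.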

Standard recursion examples - Fibonacci numbers

Fibonacci numbers are a certain class of numbers that have interesting properties. We won't discuss the properties, but we can write programs to calculate them. The definition of fibonacci numbers is recursive:

fib(0) = 1.
fib(1) = 1.
If n > 1, fib(n) = fib(n-1) + fib(n-2).

This leads to a very simple recursive implementation of the fibonacci numbers (in fib1.c):

int fibonacci(int n)
{
  if (n <= 1) return 1;
  return fibonacci(n-1) + fibonacci(n-2);
}

It works fine for small values of n:

UNIX> fib1 1
1
UNIX> fib1 2
2
UNIX> fib1 3
3
UNIX> fib1 4
5
UNIX> fib1 5
8
UNIX> fib1 6
13
UNIX>

However, if you think about it, the running time of this program is brutal. Suppose we only care about the number of times that fibonacci() is called. Let this be F(n). F(0) = 1 and F(1) = 1. For n > 1, F(n) is 1+F(n-1)+F(n-2), which means that F(n) is greater than F(n-1)+F(n-2). So, F(2) > 2, F(3) > 3, F(4) > 5, etc. You'll see that F(n) is greater than fib(n) for n > 1. As the book will tell you, fib(n) grows exponentially: it is less than (5/3)^n, and at least (3/2)^n for n > 4. This means that the running time of fib1.c is exponential, which is terrible. You'll notice that when you get to values of n in the 30's, fib1 starts slowing down incredibly.
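
You can measure F(n) directly by counting the calls. This is a small sketch, not one of the files in the directory. The names ncalls, fibonacci_counted and count_calls are made up for it:

```c
/* Number of calls made so far. count_calls() resets it. */
static long ncalls = 0;

/* Same as fibonacci() in fib1.c, except that it counts its own calls. */
int fibonacci_counted(int n)
{
  ncalls++;
  if (n <= 1) return 1;
  return fibonacci_counted(n-1) + fibonacci_counted(n-2);
}

/* Returns F(n), the number of calls it takes to compute fib(n). */
long count_calls(int n)
{
  ncalls = 0;
  fibonacci_counted(n);
  return ncalls;
}
```

For instance, count_calls(4) returns 9, which is indeed greater than 5. Try it with larger values of n to watch the count blow up.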

Of course, it doesn't have to be that way. A simple while loop does it in O(n) time. This is in fib2.c, and this one can do fib(40), for example, blazingly fast.

int fibonacci(int n)
{
  int fibim1, fibim2, fibi, i;

  if (n <= 1) return 1;

  fibim1 = 1;
  fibim2 = 1;
  i = 1;

  while (1) {
    i++;
    fibi = fibim1 + fibim2;
    if (i == n) return fibi;
    fibim2 = fibim1;
    fibim1 = fibi;
  }
}

Note that the book discusses recursion in section 1.3, and fibonacci numbers in section 2.4.2.

The Towers of Hanoi

See this link for a description of the Towers of Hanoi (I found this on Swarthmore's web site). Suppose you have n disks that you want to move from tower 0 to tower 1, and that you are able to use tower 2 in the process. You are only allowed to move one disk at a time, and you can only move a disk from one tower to another if the destination tower is empty or that disk is smaller than the top disk on the destination tower.

There is a very elegant solution to this. If n is one, then you simply move the disk. Otherwise, you solve the problem for n-1, moving the top n-1 disks to tower 2, and then you move the bottom disk to tower 1. Finally, you use the solution for n-1 to move the n-1 disks from tower 2 to tower 1.

This maps very well into a recursive subroutine. But before we do that, we need to actually code up the towers of Hanoi.

Implementation

Look at towers.h and towers.c. I am going to represent the towers as an array of three dllists. Each node on the dllist is a disk, and is represented by an integer which is the disk's size. The first element of the list will be the top disk on the tower, and the last element will be the bottom disk.

In towers.c, I implement three procedures. The first is:

Dllist *new_towers(int n);

which creates and returns the array of three dllists. Tower 0 will have n disks on it. This is, of course, straightforward.

Dllist *new_towers(int npiece)
{
  Dllist *t;
  int i;
  int piece;

  /* Allocate the array */
  t = (Dllist *) malloc(sizeof(Dllist)*3);

  /* Create the dllists */
  for (i = 0; i < 3; i++)  t[i] = new_dllist();

  /* Put the disks onto tower 0 */
  for (piece = 1; piece <= npiece; piece++) {
    dll_append(t[0], new_jval_i(piece));
  }

  /* Return the towers */
  return t;
}

Next, we write make_move() which moves the top piece from one tower to another. This one has to do error checking to make sure that the source tower is not empty, and that the top disk on the destination tower is not smaller than the disk being moved. The code below does this. It also prints out the move:

void make_move(Dllist *towers, int from, int to)
{
  int piece, topofto;

  /* Error check -- is the first tower empty? */

  if (dll_empty(towers[from])) {
    printf("Illegal move of tower %d to %d\n", from, to);
    return;
  }

  piece = jval_i(dll_val(dll_first(towers[from])));

  /* Error check -- is the piece too big to go on the destination tower? */

  if (!dll_empty(towers[to])) {
    topofto = jval_i(dll_val(dll_first(towers[to])));
    if (piece > topofto) {
      printf("Illegal move of tower %d to %d\n", from, to);
      return;
    }
  }

  /* Move the piece from the first tower to the second */

  dll_delete_node(dll_first(towers[from]));
  dll_prepend(towers[to], new_jval_i(piece));

  /* Print that the piece has moved */

  printf("Moved piece %3d from tower %d to tower %d\n",
          piece, from, to);
}

Finally, we write print_towers() which prints out the towers. This one is really simple:

void print_towers(Dllist *towers)
{
  int i;
  Dllist tmp;

  for (i = 0; i < 3; i++) {
    printf("Tower %d:", i);
    dll_rtraverse(tmp, towers[i]) {
      printf("%3d", jval_i(dll_val(tmp)));
    }
    printf("\n");
  }
}

One of the points of all this is that you should be able to take a problem description like the Towers of Hanoi, and code it up. If you have your data structures right (like here, using an array of three dllists), writing the code is simple.
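
If you don't have the dllist library handy, the same idea can be sketched with plain arrays. This is an alternative representation for illustration only, not the code in towers.c. The names struct tower, MAX_DISKS, init_towers and try_move are all made up, and here the top disk is the last element of the array rather than the first element of a list:

```c
#define MAX_DISKS 64   /* arbitrary capacity chosen for this sketch */

/* An array-based tower: disks[0] is the bottom disk and
   disks[count-1] is the top disk. */
struct tower {
  int disks[MAX_DISKS];
  int count;
};

/* Puts n disks on tower 0, largest at the bottom, and empties the
   other two towers. Assumes n <= MAX_DISKS. */
void init_towers(struct tower t[3], int n)
{
  int i;

  for (i = 0; i < 3; i++) t[i].count = 0;
  for (i = n; i >= 1; i--) t[0].disks[t[0].count++] = i;
}

/* Moves the top disk of tower from onto tower to. Returns 1 on
   success, and 0 if the move is illegal, in which case nothing changes. */
int try_move(struct tower t[3], int from, int to)
{
  int piece;

  if (t[from].count == 0) return 0;             /* source tower is empty */
  piece = t[from].disks[t[from].count - 1];

  if (t[to].count > 0 && piece > t[to].disks[t[to].count - 1]) {
    return 0;                                   /* piece is too big */
  }
  if (t[to].count == MAX_DISKS) return 0;       /* no room in the array */

  t[from].count--;
  t[to].disks[t[to].count++] = piece;
  return 1;
}
```

Unlike make_move(), try_move() reports an illegal move through its return value instead of printing a message, which makes it easy to check from a caller.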

Interactive Tower Game

In tower_play.c, I've written up a little interactive Towers of Hanoi game. It is very, very simple, and uses the routines from towers.c. Try it out.

The Solution

In tower_solution.c is the solution to the Towers of Hanoi. It is a beautiful use of recursion that follows the plan -- move n-1 disks to the other tower, move the bottom disk to the destination tower, then move the n-1 disks to the destination tower. Here's the code:

void solve_tower(Dllist *towers, int from, int to, int npieces)
{
  int other;
  int i;

  /* Find the identity of the other tower */

  for (i = 0; i < 3; i++) {
    if (from != i && to != i) other = i;
  }

  /* Move the top n-1 pieces to the other tower */

  if (npieces > 1) {
    solve_tower(towers, from, other, npieces-1);
  }

  /* Move the remaining piece to the destination */

  make_move(towers, from, to);

  /* Move the top n-1 pieces onto the destination */

  if (npieces > 1) {
    solve_tower(towers, other, to, npieces-1);
  }
}

Try it out. Can you figure out the number of calls to make_move() as a function of n?

UNIX> tower_solution 1
Tower 0:  1
Tower 1:
Tower 2:
Moved piece   1 from tower 0 to tower 1
Tower 0:
Tower 1:  1
Tower 2:
UNIX> tower_solution 2
Tower 0:  2  1
Tower 1:
Tower 2:
Moved piece   1 from tower 0 to tower 2
Moved piece   2 from tower 0 to tower 1
Moved piece   1 from tower 2 to tower 1
Tower 0:
Tower 1:  2  1
Tower 2:
UNIX> tower_solution 3
Tower 0:  3  2  1
Tower 1:
Tower 2:
Moved piece   1 from tower 0 to tower 1
Moved piece   2 from tower 0 to tower 2
Moved piece   1 from tower 1 to tower 2
Moved piece   3 from tower 0 to tower 1
Moved piece   1 from tower 2 to tower 0
Moved piece   2 from tower 2 to tower 1
Moved piece   1 from tower 0 to tower 1
Tower 0:
Tower 1:  3  2  1
Tower 2:
UNIX>
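
If you want to check your answer for larger values of n without reading through pages of output, one option is a function that follows the same three steps as solve_tower() but only counts the moves. The function count_moves below is a sketch made up for this purpose and is not in tower_solution.c:

```c
/* Counts the calls to make_move() that solve_tower() would make for
   npieces disks. It uses the same three steps but moves nothing. */
long count_moves(int npieces)
{
  long moves = 0;

  /* Move the top n-1 pieces to the other tower */
  if (npieces > 1) moves += count_moves(npieces-1);

  /* Move the remaining piece to the destination */
  moves += 1;

  /* Move the top n-1 pieces onto the destination */
  if (npieces > 1) moves += count_moves(npieces-1);

  return moves;
}
```

For example, count_moves(3) returns 7, which matches the seven "Moved piece" lines in the run above.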