CS140 Lecture notes -- Recursion

  • James S. Plank
  • Directory: ~plank/cs140/Notes/Recursion
  • Lecture notes: http://web.eecs.utk.edu/~plank/plank/classes/cs140/Notes/Recursion/index.html
  • Last modification: Mon Apr 25 13:52:17 EDT 2011

    Recursion

    Recursion is an extremely important programming technique -- one that students often have trouble with early on. The concept itself is very simple. If a language supports recursion (and most of them do, early versions of Fortran being a notable exception), then whenever you make a procedure call, the computer stores a few things: where you currently are in the calling procedure, and what that procedure's arguments and local variables are. It stores these things by pushing them onto a stack. Thus, whenever a procedure call returns, the computer knows what to do: it pops the stack to recover where you were and what your arguments and local variables were.

    This lets you do something very important. It lets you make a call to the same procedure that you are currently running. This runs a second copy of the procedure, which will restore the first copy when it returns.

    Let's take a simple example (in rec1.cpp):

    /* 1 */     void a(int i)
    /* 2 */     {
    /* 3 */       printf("In procedure a: i = %d\n", i);
    /* 4 */       if (i == 10) a(9);
    /* 5 */     }
    /* 6 */
    /* 7 */     int main()
    /* 8 */     {
    /* 9 */       a(10);
    /* 10 */    }
    

    You'll note, if i equals 10, then a() calls itself. Let's look at what happens when this is executed. First, we are in main(), and it calls a(10). What happens here is that the computer stores its current context (where it is, and what its local variables are) on the stack. The stack looks like:

    top --> [main(): line 9]

    Then a(10) is executed. It will print:

    In procedure a: i = 10
    
    and then it will call a(9). Once again, the computer stores its current context on the stack. The stack now looks like:

    top --> [a(): line 4, i = 10]
    [main(): line 9]

    Then a(9) is executed. It will print:

    In procedure a: i = 9
    
    Its context is:

    top --> [a(): line 4, i = 9]
    [a(): line 4, i = 10]
    [main(): line 9]

    and then it will return. When it returns, it pops the stack frame for the call to a(9) off the stack. It returns control to the procedure and line that are now at the top of the stack--this is in procedure a() at line 4, with i equal to 10. The stack once again looks like:

    top --> [a(): line 4, i = 10]
    [main(): line 9]

    Now, the first thing that happens is that a(10) returns. Again, it pops the stack frame for the returning procedure off the stack and returns to the line in the procedure at the top of the stack. This is in procedure main() at line 9. Of course, what happens is that main() exits, and the program ends. Thus, the output is:

    In procedure a: i = 10
    In procedure a: i = 9 
    

    A slightly more complex example

    Now, look at rec2.cpp:

    /*     1 */   void a(int i)
    /*     2 */   {
    /*     3 */     int j;
    /*     4 */   
    /*     5 */     j = i*5;
    /*     6 */     printf("In procedure a: i = %d, j = %d\n", i, j);
    /*     7 */     if (i > 0) a(i-1);
    /*     8 */     printf("Later In procedure a: i = %d, j = %d\n", i, j);
    /*     9 */   }
    /*    10 */   
    /*    11 */   int main()
    /*    12 */   {
    /*    13 */     int i;
    /*    14 */     
    /*    15 */     i = 16;
    /*    16 */     a(3);
    /*    17 */     printf("main: %d\n", i);
    /*    18 */   }
    

    Again, let's see what happens when it is executed. First, we're in main() which sets i to 16 and calls a(3). This pushes the current context on the stack:

    top --> [main(): line 16, i = 16]

    Now, we execute a(3). This sets j to 15, and prints out:

    In procedure a: i = 3, j = 15
    
    It then calls a(2). This pushes the current context on the stack:

    top --> [a(): line 7, i = 3, j = 15]
    [main(): line 16, i = 16]

    And then we execute a(2). This sets j to 10, and prints out:

    In procedure a: i = 2, j = 10
    
    And then it calls a(1). Once again, the current context is pushed onto the stack:

    top --> [a(): line 7, i = 2, j = 10]
    [a(): line 7, i = 3, j = 15]
    [main(): line 16, i = 16]

    And then we execute a(1). This sets j to 5, and prints out:

    In procedure a: i = 1, j = 5
    
    And then it calls a(0). Once again, the current context is pushed onto the stack:

    top --> [a(): line 7, i = 1, j = 5]
    [a(): line 7, i = 2, j = 10]
    [a(): line 7, i = 3, j = 15]
    [main(): line 16, i = 16]

    And then we execute a(0). This sets j to 0, and prints out:

    In procedure a: i = 0, j = 0 
    
    Since i is zero, it skips the body of the if statement, prints out:
    Later In procedure a: i = 0, j = 0
    
    and returns. Now what returning does is restore the top context on the stack, which means that we are in a() at line 7 with i = 1 and j = 5. The stack is now:

    top --> [a(): line 7, i = 1, j = 5]
    [a(): line 7, i = 2, j = 10]
    [a(): line 7, i = 3, j = 15]
    [main(): line 16, i = 16]

    It prints out:

    Later In procedure a: i = 1, j = 5
    
    and a(1) returns. Once again, we restore the top context on the stack, which means that we are in a() at line 7 with i = 2 and j = 10. The stack is now:

    top --> [a(): line 7, i = 2, j = 10]
    [a(): line 7, i = 3, j = 15]
    [main(): line 16, i = 16]

    It prints out:

    Later In procedure a: i = 2, j = 10
    
    and a(2) returns. Once again, we restore the top context on the stack, which means that we are in a() at line 7 with i = 3 and j = 15. The stack is now:

    top --> [a(): line 7, i = 3, j = 15]
    [main(): line 16, i = 16]

    It prints out:

    Later In procedure a: i = 3, j = 15
    
    and a(3) returns. Finally, we restore the last context on the stack, which means that we are in main() at line 16 with i = 16. The stack is now:

    top --> [main(): line 16, i = 16]

    main prints out:

    main: 16
    
    and exits. Thus, the whole output is:
    UNIX> rec2
    In procedure a: i = 3, j = 15
    In procedure a: i = 2, j = 10
    In procedure a: i = 1, j = 5
    In procedure a: i = 0, j = 0
    Later In procedure a: i = 0, j = 0
    Later In procedure a: i = 1, j = 5
    Later In procedure a: i = 2, j = 10
    Later In procedure a: i = 3, j = 15
    main: 16
    UNIX> 
    

    Using gdb to look at the stack

    See this web page for an example of using gdb to look at the stack while rec2.cpp is running.

    Infinite recursion

    Obviously, just like you can write a program that goes into an infinite for() loop, you can write one that goes into an infinite recursive loop, like rec3.cpp:
    void a(int i)
    {
      printf("In procedure a: i = %d\n", i);
      a(i);
    }
    
    int main()
    {
      a(10);
    }
    
    When you run it, it looks like an infinite loop:
    UNIX> rec3
    In procedure a: i = 10
    In procedure a: i = 10
    In procedure a: i = 10
    In procedure a: i = 10
    ....
    
    One difference between infinite recursion and most infinite loops is that infinite recursion eventually runs out of stack space, at which point the program crashes. On my machine, if you remove the print statement from rec3.cpp and run it, it eventually seg faults.

    Standard recursion examples - factorial

    One standard recursion example is computing a factorial of a number. This can be done with a simple while loop as in fact1.cpp:
    int factorial(int i)
    {
      int f;
    
      f = 1;
      while (i > 0) {
        f *= i;
        i--;
      }
      return f;
    }
    
    However, you can also do it recursively. Remember the definition of factorial: factorial(0) = 1, and factorial(n) = n * factorial(n-1) for n > 0. You can write factorial() recursively so that it looks just like that definition. This is in fact2.cpp:
    int factorial(int n)
    {
      if (n <= 0) return 1;
      return n * factorial(n-1);
    }
    
    Go ahead and run fact1 and fact2 and see that they produce the same output. Use gdb to look at the state of fact2 if you're still a little leery of recursion.

    Efficiency

    You should be warned that recursion is generally not as efficient as using a for() (or while()) loop, because every call has to push a context onto the stack and pop it off again. As an extreme example, you could implement integer multiplication with a while() loop like the following (in mult1.cpp):
    int imult(int a, int b)
    {
      int product;
    
      product = 0;
    
      while (b > 0) {
        product += a;
        b--;
      }
      return product;
    }
    
    Try it:
    UNIX> mult1 4 10
    40
    UNIX> mult1 10 4
    40
    UNIX> 
    
    Or you could do that recursively (in mult2.cpp):
    int imult(int a, int b)
    {
      if (b <= 0) return 0;
      return a + imult(a, b-1);
    }
    
    They both work, but mult1 runs faster because it doesn't have to do those stack operations like mult2 has to. Unfortunately, this is hard to time because if you try to use large values of b, mult2 will run out of stack space and seg fault. One way to time it is to run both a lot of times. If you look at mult1.sh and mult2.sh, these are shell scripts that run mult1 and mult2 20 times. If you time them, you'll see that mult1 is much faster:
    UNIX> time sh mult1.sh
    1000000
    ...
    1000000
    0.0u 0.1s 0:00.33 54.5% 128+225k 0+0io 0pf+0w
    UNIX> time sh mult2.sh
    1000000
    ...
    1000000
    0.7u 0.4s 0:01.36 86.0% 28+1852k 0+0io 0pf+0w
    UNIX> 
    
    What this means is that when a simple loop does the job, the loop is usually the better choice.

    Standard recursion example: fibonacci numbers

    Fibonacci numbers are a sequence of numbers that have interesting properties. We won't discuss the properties, but we can write programs to calculate them. The definition of Fibonacci numbers is recursive: fib(0) = 1, fib(1) = 1, and fib(n) = fib(n-1) + fib(n-2) for n > 1. This leads to a very simple recursive implementation of the Fibonacci numbers (in fib1.cpp):
    int fibonacci(int n)
    {
      if (n <= 1) return 1;
      return fibonacci(n-1) + fibonacci(n-2);
    }
    
    It works fine for small values of n:
    UNIX> fib1 1
    1
    UNIX> fib1 2
    2
    UNIX> fib1 3
    3
    UNIX> fib1 4
    5
    UNIX> fib1 5
    8
    UNIX> fib1 6
    13
    UNIX> 
    
    However, if you think about it, the running time of this program is brutal. Suppose we only care about the number of times that fibonacci() is called. Let this be F(n). F(0) = 1 and F(1) = 1. For n > 1, F(n) is 1+F(n-1)+F(n-2), which means that F(n) is greater than F(n-1)+F(n-2). So, F(2) > 2, F(3) > 3, F(4) > 5, etc. You'll see that F(n) is greater than fib(n) for n > 1. As you will learn later, fib(n) grows exponentially, roughly as 1.6^n, which means that the running time of fib1.cpp is exponential, which is terrible. You'll notice that when you get to values of n in the 30's, fib1 starts slowing down incredibly.

    Of course, it doesn't have to be that way. A simple while loop does it in linear time. This is in fib2.cpp, and this one can do fib(40), for example, blazingly fast.

    #include <iostream>
    #include <cstdio>
    #include <cstdlib>
    using namespace std;
    
    int fibonacci(int n)
    {
      int fibim1, fibim2, fibi, i;
    
      if (n <= 1) return 1;
    
      fibim1 = 1;
      fibim2 = 1;
      i = 1;
    
      while (1) {
        i++;
        fibi = fibim1 + fibim2;
        if (i == n) return fibi;
        fibim2 = fibim1;
        fibim1 = fibi;
      }
    }
    
    int main(int argc, char **argv)
    {
      if (argc != 2) {
        fprintf(stderr, "usage: fib2 n\n");
        exit(1);
      }
      
      printf("%d\n", fibonacci(atoi(argv[1])));
    }
    


    The Towers of Hanoi

    See this link for a description of the towers of Hanoi (I found this on Swarthmore's web site). This is good conversational fodder for the next time that you are bored at a restaurant -- you can play "Towers of Hanoi" with the onion rings.

    More formally, suppose you have n disks that you want to move from tower 0 to tower 1, and that you are able to use tower 2 in the process. You are only allowed to move one disk at a time, and you can only move a disk from one tower to another if the destination tower is empty or the disk is smaller than the top disk on the destination tower.

    There is a very elegant solution to this. If n is one, then you simply move the disk. Otherwise, you solve the problem for n-1, moving the top n-1 disks to tower 2, and then you move the bottom disk to tower 1. Finally, you use the solution for n-1 to move the n-1 disks from tower 2 to tower 1.

    This maps very well into a recursive subroutine. But before we do that, we need to actually code up the towers of Hanoi.


    Implementation

    Look at towers.h:

    #include <iostream>
    #include <deque>
    #include <cstdio>
    #include <cstdlib>
    using namespace std;
    
    class Towers {
      public:
        Towers(int n);
        void Make_Move(int from, int to);
        void Print();
      protected:
        deque <int> T[3];
    };
    

    This defines a simple Towers class where each tower is represented by a deque. Ideally we'd use a stack, but we want to print the towers, which means looking at every element, so using a deque is easier. We'll push elements on the front and pop them from the front too. Thus, the constructor and Make_Move are pretty simple procedures (in towers.cpp):

    Towers::Towers(int n)
    {
      int i;
    
      for (i = 1; i <= n; i++) T[0].push_back(i);
    }
    
    void Towers::Make_Move(int from, int to)
    {
      if (from < 0 || from > 2) {
        printf("Bad Source Tower (%d)\n", from);
    
      } else if (to < 0 || to > 2) {
        printf("Bad Destination Tower (%d)\n", to);
    
      } else if (T[from].empty()) {
        printf("Can't move from tower %d to %d -- tower %d is empty\n", from, to, from);
    
      } else if (!T[to].empty() && T[from][0] > T[to][0]) {
        printf("Can't move from tower %d to %d -- piece is too big (%d can't go on top of %d)\n", 
          from, to, T[from][0], T[to][0]);
    
      } else {
        printf("Moving %d from tower %d to tower %d\n", T[from][0], from, to);
        T[to].push_front(T[from][0]);
        T[from].pop_front();
      }
    }
    

    To print the towers, I'm going to use ASCII art. See if you can trace through this one:

    void Towers::Print()
    {
      int mx, mxe, dots, spaces;
      int i, j, k;
    
      mx = T[0].size() + T[1].size() + T[2].size();
      mxe = T[0].size();
      if (T[1].size() > mxe) mxe = T[1].size();
      if (T[2].size() > mxe) mxe = T[2].size();
    
      for (i = mxe-1; i >= 0; i--) {
        for (j = 0; j < 3; j++) {
          if (T[j].size() > i) {
            dots = T[j][T[j].size()-i-1];
          } else {
            dots = 0;
          }
          spaces = mx - dots + 1;
          for (k = 0; k < dots; k++) printf(".");
          for (k = 0; k < spaces; k++) printf(" ");
        }
        printf("\n");
      }
      for (j = 0; j < 3; j++) {
        for (i = 0; i < mx; i++) printf("-");
        printf(" ");
      }
      printf("\n");
    }
    

    I have a very simple interactive tower program in tower_play.cpp, which starts with all the rings on tower zero, and then allows you to move a ring at a time by entering the source and destination towers. It prints the towers at the beginning and after each move:

    #include "towers.h"
    
    int main(int argc, char **argv)
    {
      Towers *t;
      int npieces;
      int from, to;
    
      if (argc != 2) {
        fprintf(stderr, "usage: tower_play size\n");
        exit(1);
      }
    
      npieces = atoi(argv[1]);
    
      t = new Towers(npieces);
      t->Print();
    
      while (cin >> from >> to) {
        t->Make_Move(from, to);
        printf("\n");
        t->Print();
        printf("\n");
      }
    }
    

    Here's a simple example with four rings:

    UNIX> tower_play 4
    .              
    ..             
    ...            
    ....           
    ---- ---- ---- 
    0 1
    Moving 1 from tower 0 to tower 1
    
    ..             
    ...            
    .... .         
    ---- ---- ---- 
    
    0 2
    Moving 2 from tower 0 to tower 2
    
    ...            
    .... .    ..   
    ---- ---- ---- 
    
    1 2
    Moving 1 from tower 1 to tower 2
    
    ...       .    
    ....      ..   
    ---- ---- ---- 
    
    0 2
    Can't move from tower 0 to 2 -- piece is too big (3 can't go on top of 1)
    
    ...       .    
    ....      ..   
    ---- ---- ---- 
    <CNTL-D>
    UNIX> 
    

    The recursive solution

    In tower_solution.cpp, we solve an instance of the Towers of Hanoi recursively:

    #include <stdio.h>
    #include "towers.h"
    
    void Solve(Towers *t, int from, int to, int npieces)
    {
      int i, other;
    
      if (npieces == 1) {
        t->Make_Move(from, to);
        t->Print();
        return;
      }
    
      for (i = 0; i < 3; i++) if (i != from && i != to) other = i;
      Solve(t, from, other, npieces-1);
      t->Make_Move(from, to);
      t->Print();
      Solve(t, other, to, npieces-1);
    }
    
    int main(int argc, char **argv)
    {
      int npieces;
      Towers *t;
    
      if (argc != 2) {
        fprintf(stderr, "usage: tower_solution size\n");
        exit(1);
      }
    
      npieces = atoi(argv[1]);
      if (npieces <= 0) exit(1);
      t = new Towers(npieces);
    
      t->Print();
      Solve(t, 0, 1, npieces);
    }
    

    That's pretty elegant, isn't it? Here it is on a 3-piece tower:

    UNIX> tower_solution 3
    .           
    ..          
    ...         
    --- --- --- 
    Moving 1 from tower 0 to tower 1
    ..          
    ... .       
    --- --- --- 
    Moving 2 from tower 0 to tower 2
    ... .   ..  
    --- --- --- 
    Moving 1 from tower 1 to tower 2
            .   
    ...     ..  
    --- --- --- 
    Moving 3 from tower 0 to tower 1
            .   
        ... ..  
    --- --- --- 
    Moving 1 from tower 2 to tower 0
    .   ... ..  
    --- --- --- 
    Moving 2 from tower 2 to tower 1
        ..      
    .   ...     
    --- --- --- 
    Moving 1 from tower 0 to tower 1
        .       
        ..      
        ...     
    --- --- --- 
    UNIX> 
    
    One of the things that I love about the Towers of Hanoi is that once you figure out the recursion, the solution nearly writes itself. You don't have to think about where you are moving the pieces -- the recursion takes care of it automatically.

    Can you figure out the number of calls to Make_Move() as a function of n? Let's use those great Unix programs grep and wc to help us:

    UNIX> tower_solution 1 | grep Moving
    Moving 1 from tower 0 to tower 1
    UNIX> tower_solution 2 | grep Moving
    Moving 1 from tower 0 to tower 2
    Moving 2 from tower 0 to tower 1
    Moving 1 from tower 2 to tower 1
    UNIX> tower_solution 3 | grep Moving
    Moving 1 from tower 0 to tower 1
    Moving 2 from tower 0 to tower 2
    Moving 1 from tower 1 to tower 2
    Moving 3 from tower 0 to tower 1
    Moving 1 from tower 2 to tower 0
    Moving 2 from tower 2 to tower 1
    Moving 1 from tower 0 to tower 1
    UNIX> tower_solution 3 | grep Moving | wc
           7      56     231
    UNIX> tower_solution 4 | grep Moving | wc
          15     120     495
    UNIX> tower_solution 5 | grep Moving | wc
          31     248    1023
    UNIX> tower_solution 6 | grep Moving | wc
          63     504    2079
    UNIX> 
    
    Let MM(n) be the number of calls to Make_Move() as a function of n. It looks as though:

    MM(n) = 2^n - 1.

    Can we prove it? Well, it's easy to see that MM(1) = 1. For n > 1, you can see from Solve() that:

    MM(n) = 1 + 2(MM(n-1)).

    So, prove it by induction. If MM(k) = 2^k - 1 for all values k less than n, then:

    1 + 2(MM(n-1)) = 1 + 2(2^(n-1) - 1)
    = 1 + 2^n - 2
    = 2^n - 1.

    Awesome. Remember that proof. It will be all over those CS31x classes.