CS302 Lecture Notes - Recursion Review

This is review and reinforcement of recursion. We'll go over three problems that involve recursion in varying levels of detail. For primary lecture notes on recursion, please see my CS140 Lecture notes on recursion.

#1: Topcoder SRM 355, D2, 550-point problem

As always, I don't re-post their problems. If their server is down, read the problem description from http://community.topcoder.com/stat?c=problem_statement&pm=7759&rd=10712. The gist: given two numbers, l and h, find the minimum number of digits equal to eight over all numbers between l and h (inclusive). In other words, find the number in that range with the fewest eights, and return how many eights it has.

Examples:

The brain dead way to do this is to iterate from l to h, counting eights. Unfortunately, the constraints on the problem say that l and h can be up to 2,000,000,000, so that's too slow.
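For reference, here is a minimal sketch of that brain-dead approach. It is not part of the solution files, and the names CountEights() and BruteForce() are made up for illustration. It is only usable on small ranges, but it can be handy for checking a faster solution:

```cpp
#include <algorithm>

// Count how many decimal digits of n equal 8 (n is assumed non-negative).
int CountEights(long long n)
{
  int count = 0;
  do {
    if (n % 10 == 8) count++;
    n /= 10;
  } while (n > 0);
  return count;
}

// Try every number in [low, high] and keep the smallest count of eights.
// The loop runs (high - low + 1) times, which is far too slow when the
// range spans billions of values.
int BruteForce(long long low, long long high)
{
  int best = CountEights(low);
  for (long long v = low + 1; v <= high; v++) {
    best = std::min(best, CountEights(v));
  }
  return best;
}
```

For example, BruteForce(8, 20) returns 0, because 9 is in the range and has no eights.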

The key insight is to look at l and h as strings with equal numbers of digits. Then, the common prefixes of l and h allow us to determine the minimum number of eights. The problem description says that h will be at most a 10-digit number, so simply convert both to 10-digit strings that represent the numbers with leading zeros.

In other words, if l equals 8 and h equals 20, then convert l to "0000000008" and h to "0000000020".

Now, look at the first digit of both numbers. Call them h[0] and l[0]. If both equal '8', then every number between h and l has to start with '8'. If we remove the '8' from both strings and solve the problem recursively, then our answer is one plus the answer of the recursive problem.

Instead, suppose neither digit equals '8', but the two digits do equal each other. If we remove the digit from both numbers and solve the problem recursively, then we have the answer.

Suppose they do not equal each other, and h[0] does not equal '8'. Since l is at most h, l[0] must be less than h[0]. Then the number beginning with h[0] and having zeros in every other digit is between l and h, and it has zero 8's. You can return zero.

Suppose they do not equal each other, and h[0] equals '8'. Then l[0] is less than '8', so the number beginning with l[0] and having nines in every other digit is between l and h, and it has zero 8's. You can return zero again.

This maps to a straightforward solution. It is in SRM-355-D2-550.cpp, and a main for compiling it is in SRM-355-D2-550-Main.cpp. Here's the core of SRM-355-D2-550.cpp:

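int NE(string &l, string &h, int index)
{
  if (index == (int) l.size()) return 0;   // Base case when we have no digits left.
  if (l[index] != h[index]) return 0;      // Digits differ: some number in the range adds no more eights.

  if (l[index] == '8') return 1 + (NE(l, h, index+1));   // Shared '8': every number in the range has it.
  return 0 + (NE(l, h, index+1));                        // Shared digit that is not '8'.
}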

int NoEights::smallestAmount(int low, int high)
{
  int i;
  char b[20];
  string l, h;

  sprintf(b, "%010d", low);    // Conversion to 10 digit strings with leading zeros
  l = b;
  sprintf(b, "%010d", high);
  h = b;
  return NE(l, h, 0);
}

Now, you could have solved that with a for loop, but sometimes it's easier to think recursively. What's the running time? It's O(n), where n is the length of the string.
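To make that concrete, here is one way the for-loop version might look. This is a sketch rather than code from SRM-355-D2-550.cpp: NELoop() and SmallestAmountLoop() are hypothetical names, and the sketch assumes 0 <= low <= high.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Loop-based equivalent of NE(): count the leading positions where both
// strings hold '8', stopping at the first position where they differ.
int NELoop(const std::string &l, const std::string &h)
{
  int eights = 0;
  for (std::size_t i = 0; i < l.size(); i++) {
    if (l[i] != h[i]) break;        // Digits differ: no more forced eights.
    if (l[i] == '8') eights++;      // Shared '8': every number has it.
  }
  return eights;
}

// Convert both numbers to 10-digit strings with leading zeros, then scan.
int SmallestAmountLoop(int low, int high)
{
  char b[20];

  std::snprintf(b, sizeof(b), "%010d", low);
  std::string l = b;
  std::snprintf(b, sizeof(b), "%010d", high);
  std::string h = b;
  return NELoop(l, h);
}
```

It makes the same O(n) pass over the digits; the recursion has simply been unrolled into a loop.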


#2: Topcoder SRM 351, D1, 250-point problem

The problem description is available at http://community.topcoder.com/stat?c=problem_statement&pm=7773&rd=10675.

We are given six numbers: G1, S1 and B1, representing the number of gold, silver and bronze coins that we currently have, and G2, S2 and B2, representing the number of gold, silver and bronze coins that we want to have. We have exchange rates:

We are to return the minimum number of exchanges that we need to perform to get at least G2/S2/B2 from G1/S1/B1.

This is a problem where you break it into sub-problems and solve them recursively. That's a lot easier than trying to reason through cases like: "If G2 is greater than G1 but 11*(G2-G1) is greater than (S1-S2), then.....".

Instead, concentrate on how to convert the problem into easier recursive problems. Start with gold. If G2 is greater than G1, then you need 11*(G2-G1) extra silver. So, simply solve the recursive problem where G1 and G2 equal zero, and S2 is increased by 11*(G2-G1). Let that solution be s. If s is -1, then there's no solution. Otherwise, return s plus (G2-G1) exchanges.

Similarly, now that we're done with gold, let's concentrate on bronze. If B2 is greater than B1, then you need enough silver to get (B2-B1) bronze. Let that amount be x. Solve the recursive problem with B1 and B2 equal to zero and S2 increased by x, and let that solution be s. If s is -1, then there's no solution. Otherwise, the final solution is s+x, since converting each of the x silver coins is one exchange.

Now we're done with gold and bronze. Let's concentrate on silver. If S2 is less than or equal to S1, then we're done -- return zero. Otherwise, we get the most bang for our buck by converting gold to silver. See if there's enough gold. If so, return how many conversions are necessary. If not, convert as much gold as you can and do the rest from bronze. If that's impossible, return -1.
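The three steps above can be sketched as a single recursive function. This is not the code in SRM-351-D1-250.cpp: MinExchanges() and CeilDiv() are made-up names, and the exchange rates are assumptions. Three of them (11 silver for 1 gold, 1 gold for 9 silver, 1 silver for 9 bronze) are consistent with the description and the traces in these notes; the fourth (11 bronze for 1 silver) is assumed by symmetry.

```cpp
// Assumed exchange rates (see the caveats in the text):
//   1 gold -> 9 silver,   11 silver -> 1 gold,
//   1 silver -> 9 bronze, 11 bronze -> 1 silver.

// Round-up integer division, for a >= 0 and b > 0.
int CeilDiv(int a, int b) { return (a + b - 1) / b; }

// Minimum number of exchanges to hold at least G2/S2/B2, or -1 if impossible.
int MinExchanges(int G1, int S1, int B1, int G2, int S2, int B2)
{
  // Gold first: each missing gold coin costs 11 silver and one exchange.
  if (G2 > G1) {
    int x = G2 - G1;
    int s = MinExchanges(0, S1, B1, 0, S2 + 11 * x, B2);
    return (s == -1) ? -1 : s + x;
  }

  // Bronze next: each silver coin exchanged yields 9 bronze.
  if (B2 > B1) {
    int x = CeilDiv(B2 - B1, 9);
    int s = MinExchanges(G1, S1, 0, G2, S2 + x, 0);
    return (s == -1) ? -1 : s + x;
  }

  // Only silver is left to satisfy.
  if (S2 <= S1) return 0;
  int need = S2 - S1;
  int spare_gold = G1 - G2;
  int from_gold = CeilDiv(need, 9);
  if (from_gold <= spare_gold) return from_gold;

  // Convert all the spare gold, then get the rest from spare bronze.
  need -= 9 * spare_gold;
  int spare_bronze = B1 - B2;
  if (11 * need <= spare_bronze) return spare_gold + need;
  return -1;
}
```

Under these assumptions, MinExchanges(1, 0, 0, 0, 0, 81) returns 10: nine exchanges turn nine silver into 81 bronze, and one more turns the gold coin into those nine silver.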

That solves all cases. The code is in SRM-351-D1-250.cpp. In SRM-351-D1-250-Main.cpp, I have annotated the code to print out what it's doing, in case you find this a little confusing. It is worth the effort to trace through the recursion:

UNIX> g++ SRM-351-D1-250-Main.cpp
UNIX> a.out 0
 We have:  G1:   1   S1:   0   B1:   0
 We want:  G2:   0   S2:   0   B2:  81
 Recursively trying to get 81 bronze from 9 silver

**** We have:  G1:   1   S1:   0   B1:   0
**** We want:  G2:   0   S2:   9   B2:   0
**** We can satisfy silver from gold, returning 1

 After recursively trying to get bronze, returning 10

10
UNIX> a.out 1
 We have:  G1:   1   S1: 100   B1:  12
 We want:  G2:   5   S2:  53   B2:  33
 Recursively trying to get 4 gold from 44 silver

**** We have:  G1:   0   S1: 100   B1:  12
**** We want:  G2:   0   S2:  97   B2:  33
**** Recursively trying to get 21 bronze from 3 silver

******** We have:  G1:   0   S1: 100   B1:   0
******** We want:  G2:   0   S2: 100   B2:   0
******** We have enough silver, returning 0

**** After recursively trying to get bronze, returning 3

 After recursively trying to get gold, returning 7

7
UNIX> a.out 2
 We have:  G1:   1   S1: 100   B1:  12
 We want:  G2:   5   S2:  63   B2:  33
 Recursively trying to get 4 gold from 44 silver

**** We have:  G1:   0   S1: 100   B1:  12
**** We want:  G2:   0   S2: 107   B2:  33
**** Recursively trying to get 21 bronze from 3 silver

******** We have:  G1:   0   S1: 100   B1:   0
******** We want:  G2:   0   S2: 110   B2:   0
******** We don't have enough gold and bronze to get silver: returning -1

**** After recursively trying to get bronze, returning -1

 After recursively trying to get gold, returning -1

-1
UNIX> 

#3: Solving Sudoku Puzzles

I assume everyone knows what sudoku is, but if you don't, read Wikipedia's page. Since the problems are pretty small, it's very easy to write a brain-dead recursive Sudoku solver, and for hard problems, it's easier to write the program than it is to solve the puzzle by hand!

We'll build a solution. First, we have to read a problem in. I'll do that from standard input: numbers are '1' through '9', empty cells are '-', whitespace is ignored, and any other character is an error. I store a puzzle in a vector of nine strings, each with nine characters. I do this in a Read() method of a class called Sudoku, and I also implement a Print() method in sudoku1.cpp:

class Sudoku {
  public:
    vector <string> puzzle;
    void Read();
    void Print();
};

void Sudoku::Read() 
{
  int i, j;
  char c;

  puzzle.clear();
  puzzle.resize(9);

  for (i = 0; i < 9; i++) {
    for (j = 0; j < 9; j++) {
      do {
        if (!(cin >> c)) { 
          cerr << "Not enough cells.\n";
          exit(1);
        }
      } while (isspace(c));
      if (c != '-' && (c < '1' || c > '9')) {
        cerr << "Bad character " << c << endl;
        exit(1);
      }
      puzzle[i].push_back(c);
    }
  }
}

void Sudoku::Print() 
{
  int i, j;

  for (i = 0; i < puzzle.size(); i++) {
    for (j = 0; j < puzzle[i].size(); j++) {
      cout << puzzle[i][j];
      if (j == 2 || j == 5) cout << " ";
    }
    cout << endl;
    if (i == 2 || i == 5) cout << endl;
  }
}
  
int main()
{
  Sudoku S;

  S.Read();
  S.Print();
}

I have the example from the Wikipedia page in two files: sudex1.txt and sudex2.txt. They differ in the amount of whitespace. However, when the program reads them in, they produce the same output:

UNIX> g++ -o sudoku1 sudoku1.cpp
UNIX> cat sudex1.txt
53--7----
6--195---
-98----6-
8---6---3
4--8-3--1
7---2---6
-6----28-
---419--5
----8--79
UNIX> sudoku1 < sudex1.txt
53- -7- ---
6-- 195 ---
-98 --- -6-

8-- -6- --3
4-- 8-3 --1
7-- -2- --6

-6- --- 28-
--- 419 --5
--- -8- -79
UNIX>
UNIX> cat sudex2.txt
5 3 -   - 7 -   - - - 
6 - -   1 9 5   - - - 
- 9 8   - - -   - 6 - 

8 - -   - 6 -   - - 3 
4 - -   8 - 3   - - 1 
7 - -   - 2 -   - - 6 

- 6 -   - - -   2 8 - 
- - -   4 1 9   - - 5 
- - -   - 8 -   - 7 9 
UNIX> sudoku1 < sudex2.txt
53- -7- ---
6-- 195 ---
-98 --- -6-

8-- -6- --3
4-- 8-3 --1
7-- -2- --6

-6- --- 28-
--- 419 --5
--- -8- -79
UNIX> 

As a next step, we implement methods to check whether rows, columns or panels are valid. They are straightforward. In sudoku2.cpp, we check to see whether the input matrix is indeed valid.

class Sudoku {
  public:
    vector <string> puzzle;
    void Read();
    void Print();
    int row_ok(int r);
    int column_ok(int c);
    int panel_ok(int pr, int pc);
};

int Sudoku::row_ok(int r)
{
  vector <int> checker;
  int c;

  checker.clear();
  checker.resize(10, 0);
  for (c = 0; c < 9; c++) {
    if (puzzle[r][c] != '-') {
      if (checker[puzzle[r][c]-'0']) return 0;
      checker[puzzle[r][c]-'0'] = 1;
    }
  }
  return 1;
}

int Sudoku::column_ok(int c)
{
  vector <int> checker;
  int r;

  checker.resize(10, 0);
  for (r = 0; r < 9; r++) {
    if (puzzle[r][c] != '-') {
      if (checker[puzzle[r][c]-'0']) return 0;
      checker[puzzle[r][c]-'0'] = 1;
    }
  }
  return 1;
}

int Sudoku::panel_ok(int pr, int pc)
{
  vector <int> checker;
  int r, c;
  int i, j;

  checker.resize(10, 0);
  for (i = 0; i < 3; i++) {
    for (j = 0; j < 3; j++) {
      r = pr*3+i;
      c = pc*3+j;
      if (puzzle[r][c] != '-') {
        if (checker[puzzle[r][c]-'0']) return 0;
        checker[puzzle[r][c]-'0'] = 1;
      }
    }
  }
  return 1;
}

int main()
{
  int r, c;
  Sudoku S;

  S.Read();

  for (r = 0; r < 9; r++) if (!S.row_ok(r)) printf("Bad row %d\n", r);
  for (c = 0; c < 9; c++) if (!S.column_ok(c)) printf("Bad col %d\n", c);
  for (r = 0; r < 3; r++) for (c = 0; c < 3; c++) {
    if (!S.panel_ok(r, c)) printf("Bad panel %d %d\n", r, c);
  }
}

I have some example puzzles (sudex3.txt, sudex4.txt & sudex5.txt) with errors, and the program correctly identifies them:

UNIX> g++ -o sudoku2 sudoku2.cpp
UNIX> sudoku2 < sudex3.txt
Bad row 3
UNIX> sudoku2 < sudex4.txt
Bad col 7
UNIX> sudoku2 < sudex5.txt
Bad panel 1 2
UNIX> 

Now, this gives us all the pieces to write a really brain-dead recursive solver. What it does is the following:

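It finds the first empty cell and tries each digit from '1' to '9' in it. Whenever a digit leaves the cell's row, column and panel valid, it calls itself recursively. If none of the digits leads to a solution, it sets the cell back to '-' and returns. If it is called on a filled puzzle, we're done -- we print it and exit. The code is in sudoku3.cpp: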

class Sudoku {
  public:
    vector <string> puzzle;
    void Read();
    void Print();
    void Solve();
    int row_ok(int r);
    int column_ok(int c);
    int panel_ok(int pr, int pc);
};

void Sudoku::Solve()
{
  int r, c, i;

  for (r = 0; r < 9; r++) {
    for (c = 0; c < 9; c++) {
      if (puzzle[r][c] == '-') {
        for (i = '1'; i <= '9'; i++) {
          puzzle[r][c] = i;
          if (row_ok(r) && column_ok(c) && panel_ok(r/3, c/3)) Solve();
        }
        puzzle[r][c] = '-';
        return;
      }
    }
  }
  Print();
  exit(0);
}

It works on our example, pretty quickly:

UNIX> time sudoku3 < sudex1.txt
534 678 912
672 195 348
198 342 567

859 761 423
426 853 791
713 924 856

961 537 284
287 419 635
345 286 179
0.043u 0.001s 0:00.04 100.0%	0+0k 0+0io 0pf+0w
UNIX> 

I find that a little depressing, actually: a program that brain-dead can solve, in a fraction of a second, a puzzle that may take me 10+ minutes of logic and head-scratching.

However, if you're like me, it seems like we could speed this up. Let's explore.


Speeding it up a little

First, we need a way to test speed. What I've done is grab six "Evil" puzzles from a web site. They are in test_puzzle_1.txt, test_puzzle_2.txt, test_puzzle_3.txt, test_puzzle_4.txt, test_puzzle_5.txt and test_puzzle_6.txt. I first compile the program using optimization, and then I time it on the six puzzles with a fancy shell command:

UNIX> g++ -O3 -o sudoku3 sudoku3.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do sudoku3 < test_puzzle_$i.txt > /dev/null; done'
0.430u 0.000s 0:00.42 102.3%  0+0k 0+0io 0pf+0w
UNIX> 

That's roughly 0.07 seconds for each puzzle. What's one easy way to speed this up? Well, it seems a bit inefficient to look for a blank space from the beginning each time we call Solve(). Let's instead parameterize Solve() with the row and column, and then when we call it recursively, we give it the next cell. The updated Solve() is in sudoku4.cpp:

void Sudoku::Solve(int r, int c)
{
  int i;

  if (c == 9) { c = 0; r++; }

  while (r < 9) {
    if (puzzle[r][c] == '-') {
      for (i = '1'; i <= '9'; i++) {
        puzzle[r][c] = i;
        if (row_ok(r) && column_ok(c) && panel_ok(r/3, c/3)) Solve(r, c+1);
      }
      puzzle[r][c] = '-';
      return;
    }
    c++;
    if (c == 9) { c = 0; r++; }
  }
  Print();
  exit(0);
}

We first call it with Solve(0, 0). Does it speed things up? A little:

UNIX> g++ -O3 -o sudoku4 sudoku4.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do sudoku4 < test_puzzle_$i.txt > /dev/null; done'
0.420u 0.000s 0:00.40 105.0%  0+0k 0+0io 0pf+0w
UNIX> 

I'm surprised that it doesn't speed matters up more. Whatever. Let's try something more drastic. For each row, column and panel, let's keep a set of the values that can still legally be entered. Then, when we want to fill an empty cell, we traverse only the legal values for the cell's row, and test the column & panel sets to see if each value is legal for those too. If so, we call Solve() recursively. That gives us two potential speed-ups: we skip values that are already known to be illegal for the row, and we eliminate the calls to row_ok(), column_ok() and panel_ok().

The code is a bit icky -- it's in sudoku5.cpp. First, here's the updated class definition:

typedef set <int> ISet;
typedef vector <ISet> VISet;

class Sudoku {
  public:
    vector <string> puzzle;
    void Read();
    void Print();
    void Solve(int r, int c);
    int row_ok(int r);
    int column_ok(int c);
    int panel_ok(int pr, int pc);
    vector <ISet> vrows;
    vector <ISet> vcols;
    vector <VISet> vpanels;
};

And here's Solve():

void Sudoku::Solve(int r, int c)
{
  int i, j, e;
  vector <int> to_try;
  ISet::iterator rit, cit, pit;

  if (r == 0 && c == 0) {
    vrows.resize(9);
    vcols.resize(9);
    vpanels.resize(3);
    for (i = 0; i < 3; i++) vpanels[i].resize(3);

    for (i = 0; i < 9; i++) {
      for (j = '1'; j <= '9'; j++) {
        vrows[i].insert(j);
        vcols[i].insert(j);
        vpanels[i/3][i%3].insert(j);
      }
    }
    for (i = 0; i < 9; i++) {
      for (j = 0; j < 9; j++) {
        if (puzzle[i][j] != '-') {
          e = puzzle[i][j];
          vrows[i].erase(vrows[i].find(e));
          vcols[j].erase(vcols[j].find(e));
          vpanels[i/3][j/3].erase(vpanels[i/3][j/3].find(e));
        }
      }
    }
  }

  if (c == 9) { c = 0; r++; }

  while (r < 9) {
    if (puzzle[r][c] == '-') {
      for(rit = vrows[r].begin(); rit != vrows[r].end(); rit++) to_try.push_back(*rit);
      for (i = 0; i < to_try.size(); i++) {
        e = to_try[i];
        cit = vcols[c].find(e);
        if (cit != vcols[c].end()) {
          pit = vpanels[r/3][c/3].find(e);
          if (pit != vpanels[r/3][c/3].end()) {
            rit = vrows[r].find(e);
            vrows[r].erase(rit);
            vcols[c].erase(cit);
            vpanels[r/3][c/3].erase(pit);
            puzzle[r][c] = e;
            Solve(r, c+1);
            vrows[r].insert(e);
            vcols[c].insert(e);
            vpanels[r/3][c/3].insert(e);
          }
        }
      }
      puzzle[r][c] = '-';
      return;
    }
    c++;
    if (c == 9) { c = 0; r++; }
  }
  Print();
  exit(0);
}

The code is rather straightforward. The only subtlety that I see is using to_try. Why did I do this? Why didn't I simply use rit to traverse vrows[r]? The reason is that I potentially erase the element that rit points to inside the loop. Once I do that, rit is invalid, which is a problem inside a for loop that increments rit. Yes, I could save a copy of the next iterator before erasing and change the loop to use it -- that's probably faster; however, using to_try doesn't seem like a bad alternative.
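For the curious, the "save the next iterator" alternative might look like the sketch below. It is not from sudoku5.cpp: VisitWithTemporaryErase() is a made-up name, and the push_back() is just a stand-in for the recursive call. It relies on the fact that erasing from or inserting into a std::set only invalidates iterators to the erased element.

```cpp
#include <set>
#include <vector>

// Visit every element of s in order. While an element is being visited it
// is temporarily removed from s and then put back, which is the same
// pattern that Solve() uses around its recursive call.
// Returns the elements in the order they were visited.
std::vector<int> VisitWithTemporaryErase(std::set<int> &s)
{
  std::vector<int> visited;
  std::set<int>::iterator it = s.begin();

  while (it != s.end()) {
    std::set<int>::iterator next = it;
    ++next;                  // Save the successor before invalidating 'it'.
    int e = *it;
    s.erase(it);             // 'it' is now invalid; 'next' is still valid.
    visited.push_back(e);    // Stand-in for the recursive call.
    s.insert(e);             // Put e back; 'next' still refers to the element after e.
    it = next;
  }
  return visited;
}
```

This avoids copying the row's set into a vector on every call, at the cost of a loop that is a little easier to get wrong.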

Is it faster? Let's see:

UNIX> g++ -O3 -o sudoku5 sudoku5.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do sudoku5 < test_puzzle_$i.txt > /dev/null; done'
0.210u 0.010s 0:00.18 122.2%  0+0k 0+0io 0pf+0w
UNIX> 

Well, it runs in 50 percent of the time of sudoku4, so I guess I should be happy. I'm not really, but I'll pretend. I don't think it's a good use of class time to keep twiddling with this to make it faster; however, I would encourage you to give it a try if it intrigues you. There are lots of things to try -- for example, sudoku6.cpp creates to_try from the smallest of vrows[r], vcols[c] and vpanels[r/3][c/3]:

UNIX> g++ -O3 -o sudoku6 sudoku6.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do sudoku6 < test_puzzle_$i.txt > /dev/null; done'
0.190u 0.000s 0:00.17 111.7%  0+0k 0+0io 0pf+0w
UNIX> 
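sudoku6.cpp is not listed in these notes. The sketch below shows one way the "start from the smallest set" idea could be coded. Candidates() is a made-up helper name, and unlike the Solve() above, it collects the full intersection of the three sets up front:

```cpp
#include <set>
#include <vector>

typedef std::set<int> ISet;

// Return the values that are legal for the row, the column and the panel.
// We walk whichever set is smallest and test membership in all three,
// so the loop runs as few times as possible.
std::vector<int> Candidates(const ISet &row, const ISet &col, const ISet &panel)
{
  const ISet *smallest = &row;
  if (col.size() < smallest->size()) smallest = &col;
  if (panel.size() < smallest->size()) smallest = &panel;

  std::vector<int> to_try;
  for (ISet::const_iterator it = smallest->begin(); it != smallest->end(); ++it) {
    if (row.count(*it) && col.count(*it) && panel.count(*it)) {
      to_try.push_back(*it);
    }
  }
  return to_try;
}
```

The saving is modest, since each set holds at most nine values, which is in line with the small improvement in the timing above.
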
Let's try something else....

Using integers and bit operations as sets

Before you read this, make sure you understand bit operations. Please see the lecture notes on bit operations if you need to brush up.

As illustrated in those lectures, you can use integers and bit operations to represent sets. This actually simplifies the code above. In sudoku7.cpp, we do this. Actually, we do quite a bit more. First, we use the numbers 1 through 9 in the puzzle rather than their characters, and we represent '-' with 0. Each set is now a single integer, with bit i set when the number i is in the set, so we keep vectors of integers rather than vectors of STL sets. We also have the panel sets be a flat vector of nine elements, and we use the procedure rctoindex() to convert row and column indices to a single index for this vector.
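The body of rctoindex() is not shown in these notes. A minimal sketch of what it might look like, assuming the nine panels are numbered 0 through 8 in row-major order (left to right, then top to bottom):

```cpp
// Assumed numbering: panel index = 3 * (panel row) + (panel column),
// where each panel covers three rows and three columns of cells.
int rctoindex(int r, int c)
{
  return (r / 3) * 3 + (c / 3);
}
```

With this numbering, the cell at row 4, column 4 (the center of the puzzle) maps to panel 4. Any numbering works, as long as it is used consistently when building and updating the panel sets.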

Now, to create the initial sets for rows, columns and panels, we do the following: We first create sets of the numbers that are in each row/column/panel, and then we take their complement so that we have sets of the numbers that are not in each row/column/panel:

// In the constructor:

  RS.resize(9, 0);
  CS.resize(9, 0);
  PS.resize(9, 0);

  for (i = 0; i < 9; i++) {
    for (j = 0; j < 9; j++) {
      if (P[i][j] != 0) {
        RS[i] |= (1 << P[i][j]);
        CS[j] |= (1 << P[i][j]);
        PS[rctoindex(i, j)] |= (1 << P[i][j]);
      }
    }
  }

  for (i = 0; i < 9; i++) {
    RS[i] = ~RS[i];
    CS[i] = ~CS[i];
    PS[i] = ~PS[i];
  }
}

Then the solver takes the intersection of the three sets, and only puts elements that are in that intersection into the recursive tester:

int Sudoku::Solve(int r, int c)
{
  int i, j;

  while (r < 9) {
    while (c < 9) {
      if (P[r][c] == 0) {
        j = (RS[r] & CS[c] & PS[rctoindex(r, c)]);   // j is the intersection of the three sets
        for (i = 1; i <= 9; i++) {
          if (j & (1 << i)) {

            P[r][c] = i;
            RS[r] &= (~(1 << i));             // Remove bit i from RS, CS and PS
            CS[c] &= (~(1 << i));
            PS[rctoindex(r, c)] &= (~(1 << i));

            if (Solve(r, c)) return 1;

            RS[r] |= (1 << i);                // Put bit i back into RS, CS and PS
            CS[c] |= (1 << i);
            PS[rctoindex(r, c)] |= (1 << i);
          }
        }
        P[r][c] = 0;
        return 0;
      }
      c++;
    }
    if (c == 9) { r++; c = 0; }
  }
  return 1;
}

Now we're talking speed improvements!

UNIX> g++ -O3 -o sudoku7 sudoku7.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do sudoku7 < test_puzzle_$i.txt > /dev/null; done'
0.070u 0.000s 0:00.02 350.0%  0+0k 0+0io 0pf+0w
UNIX> 

The improvement comes because for small sets, bit operations are much faster than using balanced binary trees (which is how the STL implements sets).
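To make the comparison concrete, here are the basic set operations written with bits. The function names and the BitSet typedef are made up for this sketch (the excerpts above write the expressions inline); as in the solver, bit i stands for the number i:

```cpp
// A set of the numbers 1..9 stored in one int: bit i is 1 when i is a member.
typedef int BitSet;

BitSet Insert(BitSet s, int i)       { return s | (1 << i); }    // Add i to the set.
BitSet Erase(BitSet s, int i)        { return s & ~(1 << i); }   // Remove i from the set.
bool   Contains(BitSet s, int i)     { return (s & (1 << i)) != 0; }
BitSet Intersect(BitSet a, BitSet b) { return a & b; }
BitSet Complement(BitSet s)          { return ~s; }              // Also flips bits outside 1..9.

// Count the members among the numbers 1..9, ignoring all other bits.
int Size(BitSet s)
{
  int n = 0;
  for (int i = 1; i <= 9; i++) {
    if (Contains(s, i)) n++;
  }
  return n;
}
```

Each operation is a handful of machine instructions, with no memory allocation and no pointers to follow. The complement sets extra bits outside 1..9, but that is harmless as long as the code only ever tests bits 1 through 9, as Solve() does.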

My final version

Try sudoku8.cpp. I've made the following improvements here: That shaves another hundredth of a second off. I'm guessing that the overhead of launching the program and reading the puzzle is getting in the way, but I'm sick of twiddling with this......