CS302 Lecture Notes - Recursion Review

September, 2011
Latest Revision: September, 2021
James S. Plank
Directory: /home/plank/cs302/Notes/Recursion-Review

This is a review and reinforcement of recursion. We'll go over three problems that involve recursion in varying levels of detail. For primary lecture notes on recursion, please see my CS202 Lecture notes on recursion.

Youtube video of me explaining the coins_exchange problem on 9/3/2020.

#1: Topcoder SRM 355, D2, 550-point problem: NoEights

As always, I don't re-post their problems. If their server is down, read the problem description from http://community.topcoder.com/stat?c=problem_statement&pm=7759&rd=10712. The gist is, given two numbers, l and h, find the number between l and h (inclusive) that has the fewest digits equal to eight, and return how many eights it has.

Examples:

l=1, h=10: The answer is zero, because every number but 8 has zero eights.
l=8, h=8: The answer is one. Duh.
l=83848, h=83888: The answer is two -- every number has to start with 838.
l=83848, h=84888: The answer is one -- every number has to start with 8, but there are numbers, like 84000, that have only one 8.

Header and Main

The class definition is in include/no_eights.hpp:

/* Definition of the NoEights class from Topcoder SRM 355, D2, 550 pointer. */

class NoEights {
  public:
    int smallestAmount(int low, int high);
};

I have a driver program in src/no_eights_main.cpp. If you give it 0, 1, 2 or 3 on the command line, it will do that Topcoder example. If you give it "-", then it will read l and h from standard input.

The makefile compiles src/no_eights_main.cpp with src/no_eights.cpp to make bin/no_eights:

UNIX> make clean
rm -f a.out obj/* bin/*
UNIX> make bin/no_eights
g++ -std=c++98 -O3 -Wall -Wextra -Iinclude -c -o obj/no_eights_main.o src/no_eights_main.cpp
g++ -std=c++98 -O3 -Wall -Wextra -Iinclude -c -o obj/no_eights.o src/no_eights.cpp
g++ -std=c++98 -O3 -Wall -Wextra -Iinclude -o bin/no_eights obj/no_eights_main.o obj/no_eights.o
UNIX>

Solving this with recursion

The brain-dead way to do this is to iterate from l to h, counting eights. Unfortunately, the constraints on the problem say that l and h can be up to 2,000,000,000, so that's too slow.

The key insight is to look at l and h as strings with equal numbers of digits. Then, the common prefixes of l and h allow us to determine the minimum number of eights. The problem description says that h will be at most a 10-digit number, so simply convert both to 10-digit strings that represent the numbers with leading zeros.

In other words, if l equals 8 and h equals 20, then convert l to "0000000008" and h to "0000000020".

Now, look at the first digit of both numbers. Call them h[0] and l[0]. If both equal '8', then every number between h and l has to start with '8'. If we remove the '8' from both strings and solve the problem recursively, then our answer is one plus the answer of the recursive problem.

Instead, suppose they both do not equal '8', but they do equal each other. If we remove the digit from both numbers and solve it recursively, then we have the answer.

Suppose they do not equal each other, and h[0] does not equal '8'. Then, you know the number beginning with h[0] and having zeros in every other digit is between l and h, and it has zero 8's. You can return zero.

Suppose they do not equal each other, and h[0] equals '8'. Then, you know the number beginning with l[0] and having nines in every other digit is between l and h, and it has zero 8's. You can return zero again.

This maps itself to a straightforward solution, in src/no_eights.cpp:

int NE(const string &l, const string &h, size_t index)
{
  if (index == l.size()) return 0;     // Base case when we have no digits left.
  if (l[index] != h[index]) return 0;

  if (l[index] == '8') return 1 + (NE(l, h, index+1));
  return 0 + (NE(l, h, index+1));
}

int NoEights::smallestAmount(int low, int high)
{
  char b[20];
  string l, h;

  sprintf(b, "%010d", low);    // Conversion to 10 digit strings with leading zeros
  l = b;
  sprintf(b, "%010d", high);
  h = b;
  return NE(l, h, 0);
}

Now, you could have solved that with a for loop, but sometimes it's easier to think recursively. What's the running time? It's O(n), where n is the length of the string.

UNIX> bin/no_eights 0
0
UNIX> bin/no_eights 1
2
UNIX> bin/no_eights 2
1
UNIX> bin/no_eights 3
2
UNIX> echo 80888 80899 | bin/no_eights -
2
UNIX>

#2: Topcoder SRM 351, D1, 250-point problem: CoinsExchange

The problem description is available at http://community.topcoder.com/stat?c=problem_statement&pm=7773&rd=10675.

We are given six numbers: G1, S1 and B1, representing the number of gold, silver and bronze coins that we currently have, and G2, S2 and B2, representing the number of gold, silver and bronze coins that we want to have. We have exchange rates:

If you give the bank one gold, you get 9 silver
If you give the bank 11 silver, you get one gold
If you give the bank one silver, you get 9 bronze
If you give the bank 11 bronze, you get one silver

We are to return the minimum number of exchanges that we need to perform to get at least G2/S2/B2 from G1/S1/B1. Here are the examples:

Example   G1  S1  B1    G2  S2  B2   Answer
  0        1   0   0     0   0  81     10: One gold to 9 silver.  9 silver to 81 bronze.
  1        1 100  12     5  53  33      7: 44 silver to 4 gold.  3 silver to 27 bronze.
  2        1 100  12     5  63  33     -1: Impossible.
  3        5  10  12     3   7   9      0: Got already.

Header and Driver

As with the other topcoder problems in these lectures, I have a header and driver. The header is in include/coins_exchange.hpp:

/* Header file for Topcoder SRM 351, D1, 250-Pointer: CoinsExchange */

#include <string>

class CoinsExchange {
  public:
    int countExchanges(int G1, int S1, int B1, int G2, int S2, int B2);

  protected:               /* I've added this variable to help print out the state. */
    std::string nest;
};

And the driver is src/coins_exchange_main.cpp. You can give it the coins on standard input if you give it a dash on the command line. Otherwise, you can give it example numbers on the command line.

Working Up To A Solution

In 2020, I messed up this program for something like the 10th time in the 13 times that I have taught this lecture. So, yet another rewrite -- I vow never to mess this program up again!!!

I have two versions of this program. The version in src/coins_exchange.cpp is commented, but doesn't print out anything. The version in src/coins_exchange_print.cpp prints out what it's doing. I suggest, when you are studying this program, you work up to a solution with me in the way I'm doing below. You can assure yourself of what's going on with the version that prints information.

The approach that I take is to break the problem into sub-problems and then use recursion to solve the sub-problems. Each time I do so, I make the sub-problems easier. That's a lot easier than trying to think things like: "If G2 is greater than G1 but 11*(G2-G1) is greater than (S1-S2), then.....".

I'm going to show you here how I solve the problem. To start with, I'm going to declare some extra variables which help me think through the problem. These are the excess gold, silver and bronze that I have, and the deficits of gold, silver and bronze that I have. I'm going to start by calculating them, and if I have no deficits, then I'm already done -- I'll return zero. Otherwise, I'll return -1.

int CoinsExchange::countExchanges(int G1, int S1, int B1, int G2, int S2, int B2)
{
  int gold_excess, silver_excess, bronze_excess;      // Excess coins
  int gold_deficit, silver_deficit, bronze_deficit;   // Coins where I have a deficit

  /* Determine our deficit coins and our excess coins. */

  gold_deficit = (G2 - G1 > 0) ? G2 - G1 : 0;
  gold_excess  = (G1 - G2 > 0) ? G1 - G2 : 0;

  silver_deficit = (S2 - S1 > 0) ? S2 - S1 : 0;
  silver_excess  = (S1 - S2 > 0) ? S1 - S2 : 0;

  bronze_deficit = (B2 - B1 > 0) ? B2 - B1 : 0;
  bronze_excess  = (B1 - B2 > 0) ? B1 - B2 : 0;

  /* Base case -- if there are no deficits, then return 0. */

  if (gold_deficit == 0 && silver_deficit == 0 && bronze_deficit == 0) return 0;

  /* If we have reached this point, then it's impossible, or we haven't implemented it. */

  return -1;
}

We'll test this four times -- comments inline:

UNIX> echo 10 20 30   5 5 30 | bin/coins_exchange -       # We have enough of everything
0
UNIX> echo 10 20 30   11 5 30 | bin/coins_exchange -      # Too little gold
-1
UNIX> echo 10 20 30   5 21 30 | bin/coins_exchange -      # Too little silver
-1
UNIX> echo 10 20 30   5 5 31 | bin/coins_exchange -       # Too little bronze
-1
UNIX>

Start with Gold

Now, we'll start with gold. If we have a deficit of gold, then we'll need to get it from silver. We'll introduce a new variable, need_silver, to represent how much silver we need (it will be 11 times our gold deficit). And we'll call countExchanges() recursively, removing gold from the equation and adding need_silver to our silver needs. That will return how many transactions were needed to make sure we have enough silver (or that it was impossible). We add gold_deficit to that to account for the exchanges of silver to gold, and return the sum.

Here's the code:

  /* First issue -- if we need gold, we have to get it from silver.
     So, calculate how much silver we need, and make a recursive call to
     see how many transactions are needed to get it.  If it's possible,
     then add the number of transactions for the gold (which is the number of
     gold we need) and return it.  */

  if (gold_deficit > 0) {
    need_silver = 11 * gold_deficit;
    rv = countExchanges(0, S1, B1, 0, S2+need_silver, B2);
    if (rv == -1) return -1;
    return rv+gold_deficit;
  }
}

Let's test:

UNIX> echo 10 111 0   20 0 0 | bin/coins_exchange -    # We have enough silver.
10
UNIX> echo 10 109 0   20 0 0 | bin/coins_exchange -    # We don't have enough silver.
-1
UNIX>

If you want more detail on the recursion, call bin/coins_exchange_print. This tells you about the recursive calls:

UNIX> echo 10 111 0   20 0 0 | bin/coins_exchange_print -
We have:  G1:  10   S1: 111   B1:   0
We want:  G2:  20   S2:   0   B2:   0

Our gold deficit is 10 and we need 110 silver.  Making a recursive call.

We have:  G1:   0   S1: 111   B1:   0    # The recursive call removes all of the gold,
We want:  G2:   0   S2: 110   B2:   0    # and adds the silver needed to the silver

Our needs are met -- returning 0         # Since we have enough silver, we can return with zero exchanges

We recursively got 110 silver to convert to 10 gold.  RV=0.  Returning 0+10 = 10

10
UNIX> echo 10 109 0   20 0 0 | bin/coins_exchange_print -
We have:  G1:  10   S1: 109   B1:   0
We want:  G2:  20   S2:   0   B2:   0

Our gold deficit is 10 and we need 110 silver.  Making a recursive call.

We have:  G1:   0   S1: 109   B1:   0     # Now, in the recursive call, we don't have enough silver.
We want:  G2:   0   S2: 110   B2:   0

It's impossible
-1
UNIX>

Bronze

Let's do the same thing with bronze -- if we have a deficit of bronze, then we need to get it from silver. Integer division helps us here. Since we get 9 bronze for each silver, we need:

1 silver to get between 1 and 9 bronze.
2 silver to get between 10 and 18 bronze.
3 silver to get between 19 and 27 bronze.
And so on.

If you think about it a bit, you calculate your silver needs by adding 8 to the bronze deficit and dividing by nine using integer division. The bronze code looks similar to the gold code, except the number of transactions is the number of silver that you convert:

  /* Second issue -- if we need bronze, then we also have to get it from silver.
     So, calculate how much silver we need, and make a recursive call to
     see how many transactions are needed to get it.  If it's possible,
     then add the number of transactions for the bronze (which is the number of
     silver that we exchanged) and return it.  */

  if (bronze_deficit > 0) {
    need_silver = (bronze_deficit + 8) / 9;
    rv = countExchanges(G1, S1, 0, G2, S2+need_silver, 0);
    if (rv == -1) return -1;
    return rv+need_silver;
  }

Test. If you want more detail, this link has the calls with printing.

UNIX> echo 0 5 0   0 0 44 | bin/coins_exchange -     # We get 45 bronze for 5 silver.
5
UNIX> echo 0 5 0   0 0 45 | bin/coins_exchange -     # This works, too
5
UNIX> echo 0 5 0   0 0 46 | bin/coins_exchange -     # Now, we need 6 silver.
-1
UNIX>

Silver

Now we need to work on silver. This is a little more subtle. Since we can get more silver with gold, we first see if we can satisfy silver with gold. If we can, we're done. If we can't, then we need to get more silver from bronze. We do that recursively:

  /* If we have reached this point, we need silver.  If we have excess gold,
     let's get as much silver as we can from gold.  If that solves the problem,
     then we return.  If it doesn't then we recursively solve it, taking gold
     out of the equation. */

  if (gold_excess > 0) {
    need_gold = (silver_deficit + 8) / 9;
    if (need_gold <= gold_excess) return need_gold;
    rv = countExchanges(0, S1, B1, 0, S2-gold_excess*9, B2);
    if (rv == -1) return -1;
    return gold_excess+rv;
  }
}

Test (with printing is in this link):

UNIX> echo 10 0 0   0 90 0 | bin/coins_exchange -         # Get 90 silver from 10 gold.
10
UNIX> echo 10 0 0   0 0 810 | bin/coins_exchange -        # Bronze makes a recursive call for
100                                                       # 90 silver, which takes 10 gold.
UNIX> echo 10 0 0   0 91 0 | bin/coins_exchange -         # This one is impossible
-1
UNIX> echo 10 0 0   0 0 811 | bin/coins_exchange -        # As is this one.
-1
UNIX>

And finally, the last case is getting silver from bronze. If we don't have enough bronze, then we fail.

  /* Now, if we have reached this point, we need silver and we have no gold.  We have
     to get it from bronze. */


  need_bronze = silver_deficit * 11;
  if (need_bronze <= bronze_excess) return silver_deficit;

  /* If we have reached this point, then it's impossible. */
  return -1;
}

Test -- we'll do all of the topcoder tests, too! (with printing here).

UNIX> echo 0 0 110    0 10 0  | bin/coins_exchange -
10
UNIX> echo 0 0 109    0 10 0  | bin/coins_exchange -
-1
UNIX> echo 0 0 121    1 0 0  | bin/coins_exchange -
12
UNIX> echo 0 0 120    1 0 0  | bin/coins_exchange -
-1
UNIX> bin/coins_exchange 0
10
UNIX> bin/coins_exchange 1
7
UNIX> bin/coins_exchange 2
-1
UNIX> bin/coins_exchange 3
0
UNIX>

We're done -- my goal here was to show you how recursion can help you break a problem into subproblems that you can solve recursively.

#3: Solving Sudoku Puzzles

I assume everyone knows what sudoku is, but if you don't, read Wikipedia's page. Since the problems are pretty small, it's very easy to write a brain-dead recursive Sudoku solver, and for hard problems, it's easier to write the program than it is to solve the puzzle by hand! I go over this program rather quickly, because we've done Sudoku in CS202 as well. It's a nice use of recursion.

We'll build a solution. First, we have to read a problem in -- I'll do that from standard input. Digits are '1' through '9', empty cells are '-', whitespace is ignored, and any other character is an error. I store a puzzle in a vector of nine strings, each with nine characters. I do this in a Read() method of a class called Sudoku, and I also implement a Print() method in src/sudoku1.cpp:

class Sudoku {
  public:
    void Read();            // Read from standard input
    void Print() const;     // Print to standard output
  protected:
    vector <string> puzzle; // Hold the puzzle in a vector of 9 strings
};

void Sudoku::Read()        
{
  int i, j;
  char c;

  puzzle.clear();
  puzzle.resize(9);

  for (i = 0; i < 9; i++) {  // Read the puzzle, error checking.
    for (j = 0; j < 9; j++) {
      do {
        if (!(cin >> c)) { 
          cerr << "Not enough cells.\n";
          exit(1);
        }
      } while (isspace(c));
      if (c != '-' && (c < '1' || c > '9')) {
        cerr << "Bad character " << c << endl;
        exit(1);
      }
      puzzle[i].push_back(c);
    }
  }
}

void Sudoku::Print() const
{
  int i, j;

  for (i = 0; i < puzzle.size(); i++) {
    for (j = 0; j < puzzle[i].size(); j++) {
      cout << puzzle[i][j];
      if (j == 2 || j == 5) cout << " ";
    }
    cout << endl;
    if (i == 2 || i == 5) cout << endl;
  }
}
  
int main()
{
  Sudoku S;

  S.Read();
  S.Print();
}

I have the example from the Wikipedia page in two files: txt/sudex1.txt and txt/sudex2.txt. They differ in the amount of whitespace. However, when the program reads them in, they produce the same output:

UNIX> make bin/sudoku1
g++ -std=c++98 -O3 -o bin/sudoku1 src/sudoku1.cpp 
UNIX> cat txt/sudex1.txt
53--7----
6--195---
-98----6-
8---6---3
4--8-3--1
7---2---6
-6----28-
---419--5
----8--79
UNIX> bin/sudoku1 < txt/sudex1.txt
53- -7- ---
6-- 195 ---
-98 --- -6-

8-- -6- --3
4-- 8-3 --1
7-- -2- --6

-6- --- 28-
--- 419 --5
--- -8- -79
UNIX>

UNIX> cat txt/sudex2.txt
5 3 -   - 7 -   - - - 
6 - -   1 9 5   - - - 
- 9 8   - - -   - 6 - 

8 - -   - 6 -   - - 3 
4 - -   8 - 3   - - 1 
7 - -   - 2 -   - - 6 

- 6 -   - - -   2 8 - 
- - -   4 1 9   - - 5 
- - -   - 8 -   - 7 9 
UNIX> bin/sudoku1 < txt/sudex2.txt
53- -7- ---
6-- 195 ---
-98 --- -6-

8-- -6- --3
4-- 8-3 --1
7-- -2- --6

-6- --- 28-
--- 419 --5
--- -8- -79
UNIX>

As a next step, we implement methods to check whether rows, columns or panels are valid. They are straightforward. In src/sudoku2.cpp, we check to see whether the input matrix is indeed valid.

In class, I pause here and ask you to write the row_ok() method. Read this page for a discussion of various bad ways to write row_ok().

class Sudoku {
  public:
    void Read();                        // Read from standard input
    void Print() const;                 // Print to standard output
    int row_ok(int r) const;            // Test row r for correctness
    int column_ok(int c) const;         // Test column c for correctness
    int panel_ok(int pr, int pc) const; // Test panel pr/pc (both 0,1,2) for correctness
  protected:
    vector <string> puzzle;             // Hold the puzzle in a vector of 9 strings
};

int Sudoku::row_ok(int r) const
{
  vector <int> checker;     /* Use this to make sure no digit is set twice. */
  int c;

  checker.clear();
  checker.resize(10, 0);
  for (c = 0; c < 9; c++) {
    if (puzzle[r][c] != '-') {
      if (checker[puzzle[r][c]-'0']) return 0;
      checker[puzzle[r][c]-'0'] = 1;
    }
  }
  return 1;
}
   
int Sudoku::column_ok(int c) const
{
  vector <int> checker;
  int r;

  checker.resize(10, 0);
  for (r = 0; r < 9; r++) {
    if (puzzle[r][c] != '-') {
      if (checker[puzzle[r][c]-'0']) return 0;
      checker[puzzle[r][c]-'0'] = 1;
    }
  }
  return 1;
}

int Sudoku::panel_ok(int pr, int pc) const
{
  vector <int> checker;
  int r, c;
  int i, j;

  checker.resize(10, 0);
  for (i = 0; i < 3; i++) {
    for (j = 0; j < 3; j++) {
      r = pr*3+i;
      c = pc*3+j;
      if (puzzle[r][c] != '-') {
        if (checker[puzzle[r][c]-'0']) return 0;
        checker[puzzle[r][c]-'0'] = 1;
      }
    }
  }
  return 1;
}

int main()
{
  int r, c;
  Sudoku S;

  S.Read();

  for (r = 0; r < 9; r++) if (!S.row_ok(r)) printf("Bad row %d\n", r);
  for (c = 0; c < 9; c++) if (!S.column_ok(c)) printf("Bad col %d\n", c);
  for (r = 0; r < 3; r++) for (c = 0; c < 3; c++) {
    if (!S.panel_ok(r, c)) printf("Bad panel %d %d\n", r, c);
  }
}

I have some example puzzles (txt/sudex3.txt, txt/sudex4.txt & txt/sudex5.txt) with errors: the program correctly identifies them:

UNIX> make bin/sudoku2
g++ -std=c++98 -O3 -o bin/sudoku2 src/sudoku2.cpp 
UNIX> bin/sudoku2 < txt/sudex3.txt
Bad row 3
UNIX> bin/sudoku2 < txt/sudex4.txt
Bad col 7
UNIX> bin/sudoku2 < txt/sudex5.txt
Bad panel 1 2
UNIX>

Now, this gives us all the pieces to write a really brain-dead recursive solver. What it does is the following:

It finds an empty cell.
It tests every value from 1 to 9 in the cell.
If the value yields a legal row, column and panel, it calls itself recursively.
If the recursive call reports that it failed, then it will remove the value from the cell, and test the next value. If there is no "next value", then it returns failure.

If it is called on a filled puzzle, we're done -- we print it and exit, and don't return to our callers. The code is in src/sudoku3.cpp:

class Sudoku {
  public:
    void Read();                        // Read from standard input
    void Print() const;                 // Print to standard output
    void Solve();                       // Solve the problem
    int row_ok(int r) const;            // Test row r for correctness
    int column_ok(int c) const;         // Test column c for correctness
    int panel_ok(int pr, int pc) const; // Test panel pr/pc (both 0,1,2) for correctness
  protected:
    vector <string> puzzle;             // Hold the puzzle in a vector of 9 strings
};

void Sudoku::Solve()
{
  int r, c, i;

  for (r = 0; r < 9; r++) {
    for (c = 0; c < 9; c++) {
      if (puzzle[r][c] == '-') {         /* Find the first empty cell. */
        for (i = '1'; i <= '9'; i++) {   /* Try every digit. */
          puzzle[r][c] = i;              /* If the digit is legal, call Solve() recursively */
          if (row_ok(r) && column_ok(c) && panel_ok(r/3, c/3)) Solve();
        }
        puzzle[r][c] = '-';
        return;
      }
    }
  }
  Print();             /* If we get here, the puzzle has been solved. */
  exit(0);
}

It works on our example, pretty quickly (at this point, I'm assuming that you have made all of the executables).

UNIX> time bin/sudoku3 < txt/sudex1.txt
534 678 912
672 195 348
198 342 567

859 761 423
426 853 791
713 924 856

961 537 284
287 419 635
345 286 179
0.043u 0.001s 0:00.04 100.0%	0+0k 0+0io 0pf+0w
UNIX>

I find it a little depressing, actually, that a program this brain-dead can solve in a fraction of a second a puzzle that may take me 10+ minutes of logic and head-scratching.

However, if you're like me, it seems like we could speed this up. Let's explore.

Speeding it up a little

First, we need a way to test speed. What I've done is grab six "Evil" puzzles from a web site. They are in txt/test_puzzle_1.txt, txt/test_puzzle_2.txt, txt/test_puzzle_3.txt, txt/test_puzzle_4.txt, txt/test_puzzle_5.txt and txt/test_puzzle_6.txt. I first compile the program using optimization, and then I time it on the six puzzles with a fancy shell command:

UNIX> g++ -O3 -o bin/sudoku3 src/sudoku3.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do bin/sudoku3 < txt/test_puzzle_$i.txt > /dev/null; done'
0.430u 0.000s 0:00.42 102.3%  0+0k 0+0io 0pf+0w
UNIX>

Roughly 0.07 seconds for each test. What's one easy way to speed this up? Well, it seems a bit inefficient to look for a blank space from the beginning each time we call Solve(). Let's instead parameterize Solve() with the row and column, and then when we call it recursively, we give it the next cell. The updated Solve() is in src/sudoku4.cpp:

void Sudoku::Solve(int r, int c)
{
  int i;

  if (c == 9) { c = 0; r++; }

  while (r < 9) {
    if (puzzle[r][c] == '-') {
      for (i = '1'; i <= '9'; i++) {
        puzzle[r][c] = i;
        if (row_ok(r) && column_ok(c) && panel_ok(r/3, c/3)) Solve(r, c+1);
      }
      puzzle[r][c] = '-';
      return;
    }
    c++;
    if (c == 9) { c = 0; r++; }
  }
  Print();
  exit(0);
}

We first call it with Solve(0, 0). Does it speed things up? A little:

UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do bin/sudoku4 < txt/test_puzzle_$i.txt > /dev/null; done'
0.420u 0.000s 0:00.40 105.0%  0+0k 0+0io 0pf+0w
UNIX>

I'm surprised that it doesn't speed matters up more. Whatever. Let's try something more drastic. For each row, column and panel, let's keep a set of valid numbers that can be entered. That gives us two potential speed-ups. First, when we want to test an empty cell, we traverse only the legal values for the cell's row, rather than all nine digits. Second, we test the column and panel sets to see if the value is legal for those too, which eliminates the calls to row_ok(), column_ok() and panel_ok(). If the value is legal, we call Solve() recursively.

The code is a bit icky -- it's in src/sudoku5.cpp. First, here's the updated class definition:

typedef set <int> ISet;
typedef vector <ISet> VISet;

class Sudoku {
  public:
    vector <string> puzzle;             // Hold the puzzle in a vector of 9 strings
    void Read();                        // Read from standard input
    void Print() const;                 // Print to standard output
    void Solve(int r, int c);           // Solve starting at the given row/col
    int row_ok(int r) const;            // Test row r for correctness
    int column_ok(int c) const;         // Test column c for correctness
    int panel_ok(int pr, int pc) const; // Test panel pr/pc (both 0,1,2) for correctness
    vector <ISet> vrows;       // Sets of legal values for each row.
    vector <ISet> vcols;       // Sets of legal values for each column.
    vector <VISet> vpanels;    // Sets of legal values for each panel.
};

And here's Solve():

void Sudoku::Solve(int r, int c)
{
  int i, j, e;
  vector <int> to_try;
  ISet::iterator rit, cit, pit;

  /* At the beginning, first put all values into the three vectors of sets: */

  if (r == 0 && c == 0) {
    vrows.resize(9);
    vcols.resize(9);
    vpanels.resize(3);
    for (i = 0; i < 3; i++) vpanels[i].resize(3);

    for (i = 0; i < 9; i++) {
      for (j = '1'; j <= '9'; j++) {
        vrows[i].insert(j);
        vcols[i].insert(j);
        vpanels[i/3][i%3].insert(j);
      }
    }
    
    /* Then, run through each row, column and panel of the puzzle,
       and remove values from the sets. */

    for (i = 0; i < 9; i++) {
      for (j = 0; j < 9; j++) {
        if (puzzle[i][j] != '-') {
          e = puzzle[i][j];
          vrows[i].erase(vrows[i].find(e));
          vcols[j].erase(vcols[j].find(e));
          vpanels[i/3][j/3].erase(vpanels[i/3][j/3].find(e));
        }
      }
    }
  }
      
  if (c == 9) { c = 0; r++; }

  /* Now, instead of trying every value and testing for legality, we instead create
     a vector from all of the legal values in the row.  We traverse that vector, and
     if a value is legal in the column and panel, then we add it to the puzzle and
     remove it from the three sets.  Then we make the recursive call, and add the value
     back to the sets.  This code is kind of a pain, isn't it? */

  while (r < 9) {
    if (puzzle[r][c] == '-') {
      for(rit = vrows[r].begin(); rit != vrows[r].end(); rit++) to_try.push_back(*rit);
      for (i = 0; i < (int) to_try.size(); i++) {
        e = to_try[i];
        cit = vcols[c].find(e);
        if (cit != vcols[c].end()) {
          pit = vpanels[r/3][c/3].find(e);
          if (pit != vpanels[r/3][c/3].end()) {
            rit = vrows[r].find(e);
            vrows[r].erase(rit);
            vcols[c].erase(cit);
            vpanels[r/3][c/3].erase(pit);
            puzzle[r][c] = e;
            Solve(r, c+1);
            vrows[r].insert(e);
            vcols[c].insert(e);
            vpanels[r/3][c/3].insert(e);
          }
        }
      }
      puzzle[r][c] = '-';
      return;
    }
    c++;
    if (c == 9) { c = 0; r++; }
  }
  Print();
  exit(0);
}

Although a bit spindly, the code is straightforward. The only subtlety that I see is using to_try. Why did I do this? Why didn't I simply use rit to traverse vrows[r]? The reason is that I potentially erase rit inside the loop -- once I do that, I invalidate rit, which would be problematic inside a for loop that uses rit. Yes, I could store rit++ and change the loop -- that's probably faster; however, using to_try doesn't seem like a bad alternative.

Is it faster? Let's see:

UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do bin/sudoku5 < txt/test_puzzle_$i.txt > /dev/null; done'
0.210u 0.010s 0:00.18 122.2%  0+0k 0+0io 0pf+0w
UNIX>

Well, it runs in 50 percent of the time of sudoku4, so I guess I should be happy. I'm not really, but I'll pretend. There are lots of things to try -- for example, src/sudoku6.cpp creates to_try from the smallest of vrows[r], vcols[c] and vpanels[r/3][c/3]:

UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do bin/sudoku6 < txt/test_puzzle_$i.txt > /dev/null; done'
0.190u 0.000s 0:00.17 111.7%  0+0k 0+0io 0pf+0w
UNIX>

Let's try something else....

Using integers and bit operations as sets

Before you read this, make sure you understand bit operations. Please see the lecture notes on bit operations if you need to brush up.

As illustrated in those lectures, you can use integers and bit operations to represent sets. This actually simplifies the code above. In src/sudoku7.cpp, we do this. (BTW, this code and the code in src/sudoku8.cpp are of an older style, and a little different than the code above -- you shouldn't have a hard time navigating them).

Actually, we do quite a bit more. First, we use the numbers 1 through 9 in the puzzle rather than their characters. We represent '-' with 0. Each set is now a single integer, and we hold them in vectors of integers rather than vectors of STL sets. We also have the panel sets be a flat vector of nine elements. We use the procedure rctoindex() to convert row and column indices to a single index for this vector.

Now, to create the initial sets for rows, columns and panels, we do the following: We first create sets of the numbers that are in each row/column/panel, and then we take their complement so that we have sets of the numbers that are not in each row/column/panel:

// In the constructor:

  RS.resize(9, 0);
  CS.resize(9, 0);
  PS.resize(9, 0);

  for (i = 0; i < 9; i++) {
    for (j = 0; j < 9; j++) {
      if (P[i][j] != 0) {
        RS[i] |= (1 << P[i][j]);
        CS[j] |= (1 << P[i][j]);
        PS[rctoindex(i, j)] |= (1 << P[i][j]);
      }
    }
  }

  for (i = 0; i < 9; i++) {
    RS[i] = ~RS[i];
    CS[i] = ~CS[i];
    PS[i] = ~PS[i];
  }
}

Then the solver takes the intersection of the three sets, and only puts elements that are in that intersection into the recursive tester:

int Sudoku::Solve(int r, int c)
{
  int i, j;

  while (r < 9) {
    while (c < 9) {
      if (P[r][c] == 0) {
        j = (RS[r] & CS[c] & PS[rctoindex(r, c)]);   // J is the intersection of the three sets
        for (i = 1; i <= 9; i++) {
          if (j & (1 << i)) {

            P[r][c] = i;
            RS[r] &= (~(1 << i));                    // Remove bit i from RS, CS and PS
            CS[c] &= (~(1 << i));
            PS[rctoindex(r, c)] &= (~(1 << i));

            if (Solve(r, c)) return 1;

            RS[r] |= (1 << i);                       // Put bit i back into RS, CS and PS
            CS[c] |= (1 << i);
            PS[rctoindex(r, c)] |= (1 << i);
          }
        }
        P[r][c] = 0;
        return 0;
      }
      c++;
    }
    if (c == 9) { r++; c = 0; }
  }
  return 1;
}

Now we're talking speed improvements!

UNIX> g++ -O3 -o bin/sudoku7 src/sudoku7.cpp
UNIX> time sh -c 'for i in 1 2 3 4 5 6 ; do bin/sudoku7 < txt/test_puzzle_$i.txt > /dev/null; done'
0.070u 0.000s 0:00.02 350.0%  0+0k 0+0io 0pf+0w
UNIX>

The reason for the improvement is that, for small sets, bit operations are much faster than using balanced binary trees (which is how the STL implements sets).

My final version (I don't do this in class)

Try src/sudoku8.cpp. I've made the following improvements here:

I've added an array called vecs, which I index by the intersection of RS/CS/PS. It contains an array of the elements of the set, terminated by zero. Therefore, if the intersection is 0x92, meaning that elements 1, 4 and 7 are in the intersection, then vecs[0x92] = { 1, 4, 7, 0 }. This saves me time figuring out what elements are in the set.
I've added a vector called Empty_Cells, which contains the row and column indices of the empty cells in the puzzle. That way I don't have to waste time looking at non-empty cells.
I precalculate rctoindex(r, c) in case the compiler can't figure it out.

That shaves another hundredth of a second off. I'm guessing that the overhead of launching the program and reading the puzzle is getting in the way, but I'm sick of twiddling with this......