The following are Topcoder problems that can be solved with enumeration. If you are finding enumeration difficult, then by all means, practice with these.
Let's do a small example. Suppose you want to enumerate all 2-digit numbers whose digits are 0, 1 or 2. In this example, n is 3 and l is 2, so you need to enumerate the numbers between 0 and 8 (which is 3^2-1), and convert each of those into the proper two-digit number using div and mod. Here's a table:
i   i%n   i/n   (i/n)%n   Result
0    0     0       0        00
1    1     0       0        10
2    2     0       0        20
3    0     1       1        01
4    1     1       1        11
5    2     1       1        21
6    0     2       2        02
7    1     2       2        12
8    2     2       2        22

The program src/divmod.cpp takes two command line arguments l and n, and enumerates all strings of length l that are composed of the first n letters of the alphabet. Here, I show the relevant code. All variables are ints.
  /* Calculate n^l.  This is inefficient, but since l is small, it's ok. */

  top = 1;
  for (i = 0; i < l; i++) top *= n;

  /* Enumerate the numbers from 0 to n^l-1, and for each of these numbers
     extract each digit when the number is considered in base n.  We do
     that by taking the number mod n, and then dividing the number by n. */

  for (i = 0; i < top; i++) {
    j = i;
    for (k = 0; k < l; k++) {
      digit = j % n;
      j /= n;
      printf("%c", 'a'+digit);
    }
    printf("\n");
  }
  return 0;
}
Let's do the example above:
UNIX> bin/divmod
usage: divmod l n
UNIX> bin/divmod 2 3
aa
ba
ca
ab
bb
cb
ac
bc
cc
UNIX>

In class, I did an example where n equals 5 and l equals four, and I used Unix tools to prove that the answer had to be correct:
UNIX> bin/divmod 4 5 | head                              // We're just taking a look here.
aaaa
baaa
caaa
daaa
eaaa
abaa
bbaa
cbaa
dbaa
ebaa
UNIX> bin/divmod 4 5 | tail                              // And here.
adee
bdee
cdee
ddee
edee
aeee
beee
ceee
deee
eeee
UNIX> bin/divmod 4 5 | wc                                // n^l is 625, so this prints the right number of lines.
     625     625    3125
UNIX> bin/divmod 4 5 | sort -u | wc                      // This ensures that there are no duplicate lines.
     625     625    3125
UNIX> bin/divmod 4 5 | sed 's/[a-e]/x/g' | sort -u       // This ensures that the only characters printed are a-e and x.
xxxx
UNIX> bin/divmod 4 5 | sed 's/[a-e]/y/g' | sort -u       // This ensures that there were in fact no x's.
yyyy
UNIX>
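The same div/mod idea can be wrapped in a function that returns the strings instead of printing them, which makes checks like the Unix ones above easy to repeat inside a program. This is only a sketch: the names enumerate_strings and all_distinct are hypothetical and are not part of src/divmod.cpp, and the code assumes that n is between 1 and 26 and that n^l fits in an int.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not in src/divmod.cpp): return all strings of length l
// over the first n letters, in the same order that divmod prints them.
// Assumes 1 <= n <= 26 and that n^l fits in an int.
std::vector<std::string> enumerate_strings(int l, int n)
{
  int top = 1;
  for (int i = 0; i < l; i++) top *= n;            // top = n^l

  std::vector<std::string> rv;
  for (int i = 0; i < top; i++) {
    int j = i;
    std::string s;
    for (int k = 0; k < l; k++) {
      s.push_back(static_cast<char>('a' + j % n)); // low-order base-n digit first
      j /= n;
    }
    rv.push_back(s);
  }
  return rv;
}

// Hypothetical check, mirroring "sort -u | wc": true if no string repeats.
bool all_distinct(const std::vector<std::string> &v)
{
  std::set<std::string> s(v.begin(), v.end());
  return s.size() == v.size();
}
```

Calling enumerate_strings(2, 3) returns the nine strings from aa to cc in the order shown earlier, and enumerate_strings(4, 5) returns 625 strings for which all_distinct() is true.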
You'll note, this is the same problem as above -- you want to generate all l-length vectors v whose elements are each zero or one. While we can use the same method as above with div and mod, when you are dealing with zeros and ones exclusively, you can use bit arithmetic to do the same task more quickly.
We'll motivate with an example. Suppose I have four people, Larry, Curly, Moe and Shemp, and I want to enumerate all possible ways that I can make a team from them, where a team is a collection of one or more people. Each potential team may be represented by a bit string as defined above, and as long as the bit string has a one in it, it represents a valid team. Here are all teams and their representative strings:
1000 Larry
0100 Curly
1100 Larry Curly
0010 Moe
1010 Larry Moe
0110 Curly Moe
1110 Larry Curly Moe
0001 Shemp
1001 Larry Shemp
0101 Curly Shemp
1101 Larry Curly Shemp
0011 Moe Shemp
1011 Larry Moe Shemp
0111 Curly Moe Shemp
1111 Larry Curly Moe Shemp
Enumerating these bit strings can be done as in the previous section: l equals 4 and n equals 2. Again, instead of using div and mod, we can use bit arithmetic. First, 2^l may be calculated with bit arithmetic as (1 << l). Next, if you want to see if the j-th digit of the bit string s is one, test whether (s & (1 << j)) is non-zero. This gives us all we need to know to enumerate sets.
For example, the program src/gen-teams.cpp reads in names from standard input, and then generates all possible teams from those names. It is what I used to generate the teams above.
int main()
{
  vector <string> people;
  string s;
  int i, j;

  while (cin >> s) people.push_back(s);

  /* Test the number of people, not the length of the last name read. */

  if (people.size() > 30) {
    cerr << "Sorry, not generating more than 2^30 teams\n";
    exit(1);
  }

  for (i = 1; i < (1 << people.size()); i++) {
    for (j = 0; j < people.size(); j++) {            // Print the bit strings
      printf("%c", (i & (1 << j)) ? '1' : '0');
    }
    for (j = 0; j < people.size(); j++) {            // Print the teams
      if (i & (1 << j)) printf(" %s", people[j].c_str());
    }
    printf("\n");
  }
  exit(0);
}
There's not much subtle here -- once you know how to generate 2^l and to test whether digit j is equal to one using bit arithmetic, the rest falls out naturally.
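To see the two bit operations in isolation, here is a small sketch. The helper names bit_string and num_teams are hypothetical (they do not appear in src/gen-teams.cpp), and the code assumes that l is at most 30 so that (1 << l) fits in an int.

```cpp
#include <string>

// Hypothetical helper: the l-character bit string for s, with bit 0 printed
// first, which is the order that gen-teams uses.  Assumes 0 <= l <= 30.
std::string bit_string(int s, int l)
{
  std::string rv;
  for (int j = 0; j < l; j++) {
    rv.push_back((s & (1 << j)) ? '1' : '0');   // is bit j of s set?
  }
  return rv;
}

// Hypothetical helper: the number of non-empty subsets of an l-element set.
int num_teams(int l)
{
  return (1 << l) - 1;                          // 2^l subsets, minus the empty one
}
```

With four people, num_teams(4) is 15, which is the number of rows in the table of teams above, and bit_string(6, 4) is "0110", the string listed there for the team of Curly and Moe.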
I'm reiterating here, but once again, what we are doing is representing sets using integers and bits rather than the STL. When I want to represent a set of distinct numbers between 0 and 31, I can do it using a 32-element bit string, where bit j is one if j is an element of the set. Since ints are 32 bits, I can represent each set with one integer. That's nice. I can represent a set of distinct numbers between 0 and 63 with a long long. That can also be nice.
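As a sketch of what that representation looks like in code, here are a few set operations on a single 32-bit word. All of the names below (BitSet and the bs_ functions) are hypothetical. I use uint32_t rather than int so that setting bit 31 is well defined; that is my choice for the sketch, not something the programs above do.

```cpp
#include <cstdint>

// Hypothetical type: a set of numbers in [0, 31], one bit per number.
typedef uint32_t BitSet;

// Return s with element j added.
BitSet bs_add(BitSet s, int j)         { return s | (BitSet(1) << j); }

// Return s with element j removed.
BitSet bs_remove(BitSet s, int j)      { return s & ~(BitSet(1) << j); }

// Is j an element of s?
bool bs_contains(BitSet s, int j)      { return (s & (BitSet(1) << j)) != 0; }

// Union and intersection are single bitwise operations.
BitSet bs_union(BitSet a, BitSet b)     { return a | b; }
BitSet bs_intersect(BitSet a, BitSet b) { return a & b; }

// Number of elements: clear the lowest set bit until none remain.
int bs_size(BitSet s)
{
  int n = 0;
  for (; s != 0; s &= s - 1) n++;
  return n;
}
```

The same functions work for numbers between 0 and 63 if BitSet is changed to uint64_t.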
A good example Topcoder problem that you can solve with a power set enumeration is SRM 604, D2, 500-pointer (PowerOfThreeEasy). I have lecture notes for that here, and I go over the program in class.
A second good Topcoder problem that uses this kind of enumeration is SRM 489, D2, 500-point problem. (Problem description: http://community.topcoder.com/stat?c=problem_statement&pm=11191&rd=14242). This one is a little harder. I have lecture notes about solving that problem here.
Instead, you can often write a recursive function to help you enumerate. An example is in src/gen-3-teams.cpp. This program generates all three-person teams from a collection of people entered on standard input. The program maintains two variables: people, which is the vector of people, and team, which is a team that gets incrementally constructed. The heart of the program is GenTeams(int index, int npeople), which is a recursive method that builds all possible teams. Its two parameters are as follows:

index is the index in people of the next person to consider.
npeople is the number of people that still need to be added to team.
GenTeams() first checks base cases -- if npeople is zero, then there are no people left to put on the team, so you can print team and return. Next, check to see if there are enough people left in people to fill the team. If not, return. Otherwise, you do two recursive calls. The first puts people[index] on the team and calls GenTeams(index+1, npeople-1). The second skips people[index] and simply calls GenTeams(index+1, npeople). In other words, you are enumerating all teams with people[index] on them, and all teams without people[index].
Here is the code:
class People {
  public:
    vector <string> people;
    vector <string> team;
    void GenTeams(int index, int npeople);
};

void People::GenTeams(int index, int npeople)
{
  int i;

  /* Base case -- if there are no more people to add,
     print out the team and return */

  if (npeople == 0) {
    cout << team[0];
    for (i = 1; i < team.size(); i++) cout << " " << team[i];
    cout << endl;
    return;
  }

  /* This is a second base case -- if there are fewer people left to add
     than there are places left on the team, then it's impossible to finish,
     so simply return.  Ask yourself why this is better than testing whether
     index is equal to people.size(), and returning if so. */

  if (npeople > people.size() - index) return;

  /* Now, put the person in people[index] onto the team, and call GenTeams()
     recursively.  Afterwards, take the person off of the team. */

  team.push_back(people[index]);
  GenTeams(index+1, npeople-1);
  team.pop_back();

  /* Finally, call GenTeams() recursively without putting
     people[index] on the team. */

  GenTeams(index+1, npeople);
}

int main()
{
  People P;
  string s;
  int i, j;

  while (cin >> s) P.people.push_back(s);
  P.GenTeams(0, 3);
}
Trace through the code if you find this confusing -- copy the code and put in some extra print statements.
UNIX> echo Larry Curly Moe Shemp | bin/gen-3-teams
Larry Curly Moe
Larry Curly Shemp
Larry Moe Shemp
Curly Moe Shemp
UNIX> echo Larry Curly Moe Shemp Baby-Daisy | bin/gen-3-teams
Larry Curly Moe
Larry Curly Shemp
Larry Curly Baby-Daisy
Larry Moe Shemp
Larry Moe Baby-Daisy
Larry Shemp Baby-Daisy
Curly Moe Shemp
Curly Moe Baby-Daisy
Curly Shemp Baby-Daisy
Moe Shemp Baby-Daisy
UNIX>
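If you want to check the recursion from inside a program rather than by eye, one option is a variant that collects the teams instead of printing them and takes the team size as a parameter. The sketch below follows the same two recursive calls as GenTeams(), but the names gen_teams, all_teams and Team are hypothetical, and it uses size_t for the indices to avoid mixing signed and unsigned values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Team;

// Hypothetical variant of GenTeams(): append each finished team to out.
void gen_teams(const std::vector<std::string> &people, size_t index,
               size_t npeople, Team &team, std::vector<Team> &out)
{
  // Base case: the team is full.
  if (npeople == 0) { out.push_back(team); return; }

  // Second base case: not enough people left to fill the team.
  if (npeople > people.size() - index) return;

  // All teams that include people[index].
  team.push_back(people[index]);
  gen_teams(people, index + 1, npeople - 1, team, out);
  team.pop_back();

  // All teams that do not include people[index].
  gen_teams(people, index + 1, npeople, team, out);
}

// Hypothetical wrapper: every k-person team, in the order gen-3-teams prints them.
std::vector<Team> all_teams(const std::vector<std::string> &people, size_t k)
{
  Team team;
  std::vector<Team> out;
  gen_teams(people, 0, k, team, out);
  return out;
}
```

With the five names above and k equal to 3, all_teams() returns ten teams, starting with Larry Curly Moe and ending with Moe Shemp Baby-Daisy, which matches the output of bin/gen-3-teams.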
A very similar recursion may be used to generate all permutations of a collection of elements. In the example, I again read in names, and this time print out all permutations. The structure of the enumeration is again a recursion, which works as follows (in src/gen-permutations.cpp):
class People {
  public:
    vector <string> people;
    void GenPermutations(int index);
};

void People::GenPermutations(int index)
{
  int i;
  string tmp;

  /* Base case -- we're done - print out the vector */

  if (index == people.size()) {
    cout << people[0];
    for (i = 1; i < people.size(); i++) cout << " " << people[i];
    cout << endl;
    return;
  }

  /* Otherwise, for each element of the vector, swap it with the element
     in index, call GenPermutations() recursively on the remainder of the
     vector, and then swap it back. */

  for (i = index; i < people.size(); i++) {

    tmp = people[i];             /* Swap people[index] with people[i] */
    people[i] = people[index];
    people[index] = tmp;

    GenPermutations(index+1);

    tmp = people[i];             /* Swap back */
    people[i] = people[index];
    people[index] = tmp;
  }
}
How many lines are there if you are permuting n items? Well, there are n ways to choose the first item, then n-1 to choose the second, etc. So, there are (n!) ways to permute n items, which will lead to n! lines of output:
UNIX> echo Larry Curly Moe | bin/gen-permutations
Larry Curly Moe
Larry Moe Curly
Curly Larry Moe
Curly Moe Larry
Moe Curly Larry
Moe Larry Curly
UNIX> echo 1 2 3 4 | bin/gen-permutations
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 3 2
1 4 2 3
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 3 1
2 4 1 3
3 2 1 4
3 2 4 1
3 1 2 4
3 1 4 2
3 4 1 2
3 4 2 1
4 2 3 1
4 2 1 3
4 3 2 1
4 3 1 2
4 1 3 2
4 1 2 3
UNIX> echo 1 2 3 | bin/gen-permutations | wc
       6      18      36
UNIX> echo 1 2 3 4 | bin/gen-permutations | wc
      24      96     192
UNIX> echo 1 2 3 4 5 | bin/gen-permutations | wc
     120     600    1200
UNIX> echo 1 2 3 4 5 6 | bin/gen-permutations | wc
     720    4320    8640
UNIX> echo 1 2 3 4 5 6 7 | bin/gen-permutations | wc
    5040   35280   70560
UNIX>
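The line counts above can also be checked in code. Here is a sketch of the same swap-based recursion that collects the permutations instead of printing them, together with a factorial function to compare against. The names gen_permutations, Perm and factorial are hypothetical, and I use std::swap in place of the three-line swap through tmp.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::string> Perm;

// Hypothetical variant of GenPermutations(): append each permutation to out.
void gen_permutations(Perm &people, size_t index, std::vector<Perm> &out)
{
  // Base case: every position has been fixed.
  if (index == people.size()) { out.push_back(people); return; }

  // Put each remaining element into position index in turn.
  for (size_t i = index; i < people.size(); i++) {
    std::swap(people[index], people[i]);
    gen_permutations(people, index + 1, out);
    std::swap(people[index], people[i]);      // swap back
  }
}

// n!, for comparing against the number of permutations generated.
unsigned long long factorial(unsigned n)
{
  unsigned long long rv = 1;
  for (unsigned i = 2; i <= n; i++) rv *= i;
  return rv;
}
```

For Larry, Curly and Moe this produces six permutations in the same order as bin/gen-permutations, and for any small n the size of out should equal factorial(n).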
For example, in the gen-3-teams problem, you have n people and you want to enumerate all of the three-person teams. That is equivalent to having n boxes, and balls with two colors. Call them black and white. There are three black balls and n-3 white balls. You want to enumerate all distinct ways to put the balls into the boxes. A configuration of balls and boxes is equivalent to a team -- associate each person with a box, and if the box has a black ball, then the person is on the team.
In the permutation example, b_i equals one for all i. Thus enumerating balls and boxes gives you permutations.
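It helps to know how many configurations to expect. Assuming that b_i denotes the number of balls of color i (that is my reading of the notation), that there are k colors, and that there are n boxes in total, the number of distinct ways to fill the boxes is the multinomial coefficient:

```latex
\[
  \frac{n!}{b_1!\, b_2! \cdots b_k!},
  \qquad \text{where } n = \sum_{i=1}^{k} b_i .
\]
```

For gen-3-teams there are two colors with counts 3 and n-3, so the count is n!/(3!(n-3)!), which is 4 when n is 4 and 10 when n is 5. When every b_i is one, the denominator is 1 and the count is n!, the number of permutations.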
In my last example, I'll write a general balls-in-boxes program. As with the others, it is recursive. To start with the explanation, here is the scaffolding -- a class named BallsInBoxes that has the definition of the balls, a vector to hold the boxes, and a method called GenInstances() that will do the enumeration. Its main simply reads in the ball colors from standard input and prints them out. The program is in src/balls-in-boxes-0.cpp:
#include <vector>
#include <map>
#include <string>
#include <iostream>
#include <cstdio>
#include <cstdlib>
using namespace std;

/* Our class definition for the "Balls in Boxes" problem. */

class BallsInBoxes {
  public:
    map <string,int> balls;            // Key = color of the ball. Val = # of balls with that color
    vector <string> boxes;             // We will put the colors into each of the boxes.
    void GenInstances(size_t index);   // Recursive method to solve the problem.
};

/* Recursive method to solve the problem.  We'll write this later. */

void BallsInBoxes::GenInstances(size_t index)
{
  (void) index;
}

int main()
{
  BallsInBoxes B;      /* The instance of the BallsInBoxes class. */
  string s;            /* This is for reading in the color of each ball. */
  int nb;              /* This stores the number of balls/boxes while reading */
  map <string, int>::iterator mit;

  nb = 0;

  /* Read the balls as strings on standard input. */

  while (cin >> s) {
    B.balls[s]++;
    nb++;
  }
  B.boxes.resize(nb);

  /* Print general info. */

  printf("Total balls & boxes: %d\n", nb);
  for (mit = B.balls.begin(); mit != B.balls.end(); mit++) {
    printf("Color: %-10s # Balls: %d\n", mit->first.c_str(), mit->second);
  }

  B.GenInstances(0);   // This does nothing for now.
  return 0;
}
Let's verify that it works as anticipated:
UNIX> echo red red blue yellow red blue | bin/balls-in-boxes-0
Total balls & boxes: 6
Color: blue       # Balls: 2
Color: red        # Balls: 3
Color: yellow     # Balls: 1
UNIX> echo a b c d a b a a a a a | bin/balls-in-boxes-0
Total balls & boxes: 11
Color: a          # Balls: 7
Color: b          # Balls: 2
Color: c          # Balls: 1
Color: d          # Balls: 1
UNIX>

Now, we write GenInstances(). At each call, there is a vector (which represents the boxes) that has been partially filled in with index boxes, and a map that contains the balls that have not been placed yet. For each color in the map, we place a ball into the next element of the vector, remove the ball from the map, and call the procedure recursively. After the recursive call, we replace the ball in the map.
Here it is, in src/balls-in-boxes-1.cpp:
void BallsInBoxes::GenInstances(int index)
{
  int i;
  map <string, int>::iterator bit;

  /* Base case -- if you have placed all of the balls in boxes,
     print them out, and return. */

  if (index == boxes.size()) {
    cout << boxes[0];
    for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
    cout << endl;
    return;
  }

  /* For each color where you haven't placed a ball yet, place the ball,
     make a recursive call, and remove the ball. */

  for (bit = balls.begin(); bit != balls.end(); bit++) {
    if (bit->second > 0) {
      boxes[index] = bit->first;
      bit->second--;
      GenInstances(index+1);
      bit->second++;

      /* I don't actually "remove" the ball here, because subsequent
         iterations of the loop, or subsequent recursive calls will
         overwrite boxes[index]. */
    }
  }
}
Verify to yourself that it is giving you correct output in all of the examples below:
UNIX> echo a a b b | bin/balls-in-boxes-1
a a b b
a b a b
a b b a
b a a b
b a b a
b b a a
UNIX> echo a a a a | bin/balls-in-boxes-1
a a a a
UNIX> echo a b c | bin/balls-in-boxes-1
a b c
a c b
b a c
b c a
c a b
c b a
UNIX> echo larry moe curly | bin/balls-in-boxes-1
curly larry moe
curly moe larry
larry curly moe
larry moe curly
moe curly larry
moe larry curly
UNIX> echo B B B W | bin/balls-in-boxes-1
B B B W
B B W B
B W B B
W B B B
UNIX>
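One way to cross-check output like this is with std::next_permutation from the standard library, which steps through the distinct orderings of a sorted sequence in lexicographic order, including sequences with repeated elements. To be clear, this is a different technique from the recursion in src/balls-in-boxes-1.cpp; I'm only sketching it as an independent check, and the names arrangements and Boxes are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

typedef std::vector<std::string> Boxes;

// Hypothetical cross-check: every distinct arrangement of the given balls,
// in lexicographic order.  The argument is taken by value so the caller's
// vector is left alone.
std::vector<Boxes> arrangements(Boxes balls)
{
  std::vector<Boxes> out;
  std::sort(balls.begin(), balls.end());     // start from the smallest ordering
  do {
    out.push_back(balls);
  } while (std::next_permutation(balls.begin(), balls.end()));
  return out;
}
```

Because the recursive program walks the map in sorted key order at every box, its lines also come out in lexicographic order, so the two lists should agree line for line: six arrangements for "a a b b", one for "a a a a", and four for "B B B W".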
void BallsInBoxes::GenInstances(int index)
{
  int i;
  map <string, int>::iterator bit;

  if (index == boxes.size()) {
    cout << boxes[0];
    for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
    cout << endl;
    return;
  }

  for (bit = balls.begin(); bit != balls.end(); bit++) {
    boxes[index] = bit->first;
    bit->second--;
    if (bit->second == 0) {
      balls.erase(bit);
      bit = balls.end();
    }
    GenInstances(index+1);
    if (bit == balls.end()) {
      bit = balls.insert(make_pair(boxes[index], 1)).first;
    } else bit->second++;
  }
}
When a ball's second field goes to zero, I remove it from the map and then set bit to equal balls.end(). After the recursion, if bit equals balls.end(), then I re-insert the ball into the map. Yes, that balls.insert() call is ugly. On a map, the insert() method returns a pair -- an iterator to the element, and true if the element was actually inserted, or false if it was there already.
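Here is a small sketch of that return value on its own. The helper name insert_reports_new is hypothetical; the call to insert() and the pair that it returns are standard std::map behavior.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: try to insert (color, count) into balls and report
// whether a new entry was actually created.
bool insert_reports_new(std::map<std::string, int> &balls,
                        const std::string &color, int count)
{
  std::pair<std::map<std::string, int>::iterator, bool> result =
      balls.insert(std::make_pair(color, count));

  // result.first is an iterator to the entry for color, whether it was just
  // inserted or was already there.  result.second says which case it was.
  return result.second;
}
```

Note that when the key is already present, insert() leaves the existing value alone, so inserting ("a", 5) into a map that already holds ("a", 1) returns false and the value stays 1.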
Try this with six balls -- three colors, where b_i equals 2 for each of the three colors:
UNIX> echo a a b b c c | bin/balls-in-boxes-2
a a b b c c
a a b c b c
a a b c c b
a a c b b c
a a c b c b
a a c c b b
....

You'll see that it runs forever, and eventually you'll see lines like "a b a c a b". Obviously, we have a bug, since there should be two a's and two c's. I'm not going to trace this bug in detail, because it's too hard. I'll simply tell you what's going on. Look at this snippet of the code:
  for (bit = balls.begin(); bit != balls.end(); bit++) {
    boxes[index] = bit->first;
    bit->second--;
    if (bit->second == 0) {
      balls.erase(bit);
      bit = balls.end();
    }
    GenInstances(index+1);
    if (bit == balls.end()) {
      bit = balls.insert(make_pair(boxes[index], 1)).first;
    } else bit->second++;
  }
The problem arises when you don't erase bit and then you make the recursive call. Suppose that bit points to the entry for "a" in the map. Later, in the recursion, you're going to delete the entry for "a", because its second field will be zero. When all the recursion returns, bit is an iterator to an erased entry. I wish we got a seg fault when we incremented bit->second. However, what happens in practice is that the C++ runtime system reuses memory, and bit ends up pointing to a valid entry, which may no longer be the one for "a".
This is a brutal bug, but one that can happen when you store iterators that may become deleted. It is similar to storing a pointer which is subsequently free()'d or deleted. I've fixed this in src/balls-in-boxes-3.cpp, although I'm not proud of it. Here's the relevant code snippet. After the recursive call, I make sure that bit points to the proper iterator:
  for (bit = balls.begin(); bit != balls.end(); bit++) {
    boxes[index] = bit->first;
    bit->second--;
    if (bit->second == 0) {
      balls.erase(bit);
      bit = balls.end();
    }
    GenInstances(index+1);
    if (bit == balls.end()) {
      bit = balls.insert(make_pair(boxes[index], 1)).first;
    } else {
      bit = balls.find(boxes[index]);
      bit->second++;
    }
  }
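Another way to sidestep the problem is to never hold a map iterator across the recursive call at all, and to refer to colors by key instead. The sketch below is my own illustration of that idea, not code from any of the src/balls-in-boxes programs; the names gen_instances and Boxes are hypothetical, and it collects the arrangements in a vector rather than printing them. It assumes that every entry in balls starts with a count of at least one.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> Boxes;

// Hypothetical variant: erase colors whose count reaches zero, but look them
// up by key after the recursion instead of trusting a saved iterator.
void gen_instances(std::map<std::string, int> &balls, Boxes &boxes,
                   size_t index, std::vector<Boxes> &out)
{
  if (index == boxes.size()) { out.push_back(boxes); return; }

  // Snapshot the colors that are available right now.  The map may change
  // during the recursion, but this copy of the keys cannot be invalidated.
  std::vector<std::string> colors;
  std::map<std::string, int>::iterator it;
  for (it = balls.begin(); it != balls.end(); it++) colors.push_back(it->first);

  for (size_t i = 0; i < colors.size(); i++) {
    const std::string &c = colors[i];
    boxes[index] = c;
    if (--balls[c] == 0) balls.erase(c);   // erase by key, not by iterator
    gen_instances(balls, boxes, index + 1, out);
    balls[c]++;                            // re-creates the entry if it was erased
  }
}
```

With two balls each of a, b and c, this produces 90 arrangements and leaves the map as it found it. It does more lookups than the iterator version, which is the price of not having to reason about which iterators are still valid.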
I think the code for src/balls-in-boxes-1.cpp is far superior to the others. On the flip side, I don't think it's the best way to solve this. We should use a vector instead of a map; however, I don't think that it's worth the class or study time to go through that level of detail.
class BallsInBoxes {
  public:
    map <string,int> balls;
    vector <string> colors;
    vector <int> nballs;
    vector <string> boxes;
    void GenInstances();
};

void BallsInBoxes::GenInstances()
{
  stack <int> Stack;
  int index, color, i;

  Stack.push(-1);

  while (!Stack.empty()) {
    color = Stack.top();
    Stack.pop();
    index = Stack.size();

    // Base case -- if we're at the end of boxes, print it out and "return"

    if (index == boxes.size()) {
      cout << boxes[0];
      for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
      cout << endl;
    } else {

      if (color != -1) {     // We have just finished enumerating with "color"
        nballs[color]++;
      }

      // Find the next color to enumerate.
      // Note how this works when color started at -1.

      for (color++; color < nballs.size() && nballs[color] == 0; color++) ;

      // If we still have a color to enumerate, put it into boxes, push
      // the color onto the stack, and push -1 on the stack to enumerate the next index.

      if (color < nballs.size()) {
        boxes[index] = colors[color];
        nballs[color]--;
        Stack.push(color);
        Stack.push(-1);
      }
    }
  }
}

int main()
{
  BallsInBoxes B;
  map <string, int>::iterator bit;
  string s;
  int nb;

  while (cin >> s) B.balls[s]++;

  nb = 0;
  for (bit = B.balls.begin(); bit != B.balls.end(); bit++) {
    B.colors.push_back(bit->first);
    B.nballs.push_back(bit->second);
    nb += bit->second;
  }
  B.boxes.resize(nb);
  B.GenInstances();
}
To make life easier, I've created the vectors colors and nballs. In this way, we can manage a stack of indices into the colors vector, which is less confusing than maintaining a stack of iterators of balls (I have that code in src/balls-in-boxes-5.cpp).
Concentrate on GenInstances(). It maintains a stack of indices of colors. Instead of making recursive calls, you push -1 on the stack. That's what we do initially. The stack contains indices of the colors that we're currently enumerating. Thus, the top of the stack is the color that we're enumerating in boxes[Stack.size()-1]. Our main processing loop therefore removes this element from the top of the stack and sets index to the new Stack.size().
If that element is -1, then we're starting a new enumeration. If that element is something else, then we've just finished enumerating that element. So we increment nballs to put that color back when we enumerate a new color. In either case, we now need to find the next color to enumerate, which is done by the for loop:
      for (color++; color < nballs.size() && nballs[color] == 0; color++) ;
If, after executing this for loop, color is equal to nballs.size(), then we're done enumerating this index. Otherwise, we need to enumerate the next color, which we do with the following code:
        boxes[index] = colors[color];
        nballs[color]--;
        Stack.push(color);
        Stack.push(-1);
In this way, we mimic the recursive version of the program, only instead of actually performing recursion, we simply manage the stack. As always, if you are confused by this, copy it, put in some print statements, and trace it. Frankly, I think that the recursive version of this is much easier to both read and write, but this is a good exercise in seeing how a stack can replace recursion.
A final piece of code is in src/balls-in-boxes-5.cpp. This code also manages a stack, but instead of managing integers, it manages iterators into balls. Where -1 was in the code previously, we use balls.end(). Try to trace through it -- it is very similar to src/balls-in-boxes-4.cpp, except you need to be unafraid of iterators!