The following are Topcoder problems that are solved with enumeration. If you are finding enumeration difficult, then by all means, practice with these.
One example is typified by Topcoder problem SRM-344-D2-500 (don't bother reading the description yet). This problem gives you an encrypted string and a procedure for encrypting and decrypting with a password, which is itself a three-letter string. However, you are not given the password. The problem description gives you a way to tell whether decryption is successful. Your job is to furnish the password that decrypts the string.
The way to solve this problem is to enumerate all possible passwords, and then test to see which of the possible ones decrypts successfully. Here, n is 26 (passwords are composed of lower-case letters) and l is three. Thus, there are 26^{3} = 17576 potential passwords. That's a pretty small number, so we should be able to perform the enumeration quickly.
Ok, now that I've given you an overview, let's solve the problem. Go ahead and read it (http://community.topcoder.com/stat?c=problem_statement&pm=7625&rd=10668). You are given a parameter cipherText which is a string of 3 to 50 characters that contains only lowercase letters and spaces. You are to discover a password, which is a three-letter string containing only lowercase letters, that would generate the cipherText from a legal clear text string. What is a legal clear text string? It is one composed of words and spaces, where words are contiguous lowercase letters between 2 and 8 characters in size that contain at least one vowel. Words are always separated by a single space. When you discover the password, you are to return the clear text.
To generate cipherText from clear text, you associate each potential character c in the clear text with a number. That number is (c-'a'+1) if c is a lowercase letter, and 0 if c is a space. To generate the first character of cipherText, you add the first character of the clear text to the first letter of the password, take the result modulo 27 and convert the number back to a character. To generate the second character of cipherText, you add the second character of the clear text to the second letter of the password, take the result modulo 27 and convert the number back to a character. Keep doing this, and when you run out of password letters, start back at the beginning.
Thus, for example, if the clear text is "a bee" and the password is "abz", then the cipherText will be calculated with the following table:
i | clear_text[i] | password[i%3] | ciphertext[i] |
0 | 'a' = 1 | 'a' = 1 | 2 = 'b' |
1 | ' ' = 0 | 'b' = 2 | 2 = 'b' |
2 | 'b' = 2 | 'z' = 26 | 28%27=1 = 'a' |
3 | 'e' = 5 | 'a' = 1 | 6 = 'f' |
4 | 'e' = 5 | 'b' = 2 | 7 = 'g' |
So, the ciphertext will be "bbafg".
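The encryption rule above can be sketched in a few lines. This is only an illustration of the rule as described: the names char_to_num(), num_to_char() and encrypt() are hypothetical helpers, not part of the Topcoder problem or of the solution that follows.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: space -> 0, 'a'..'z' -> 1..26.
int char_to_num(char c) { return (c == ' ') ? 0 : c - 'a' + 1; }

// Hypothetical helper, the inverse mapping: 0 -> space, 1..26 -> 'a'..'z'.
char num_to_char(int n) { return (n == 0) ? ' ' : (char) ('a' + n - 1); }

// Add each clear text character to the matching password letter, modulo 27.
// The password index wraps around with i % password.size().
string encrypt(const string &cleartext, const string &password)
{
  string rv = cleartext;
  for (size_t i = 0; i < cleartext.size(); i++) {
    int sum = char_to_num(cleartext[i]) + char_to_num(password[i % password.size()]);
    rv[i] = num_to_char(sum % 27);
  }
  return rv;
}
```

Calling encrypt("a bee", "abz") reproduces the table: it returns "bbafg".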
Armed with this information, what we want to do is generate all 17576 potential passwords, and for each password, we decrypt the message and test whether it corresponds to legal clear text. If so, we return the clear text. The problem description says that we are guaranteed that there will be one unique password that generates legal clear text, and that makes our life simpler.
We'll start by generating all of those passwords. An easy way to do that is to have three nested for loops. However, I'm not going to do that here, because I want to show you a more general way to do it. What I'm going to do is enumerate all numbers from 0 to 26^{3}-1 and convert each number to a three-letter string by treating it as a three-digit number in base 26. Let the number be i. Then password[0] will be the letter for i%26, password[1] the letter for (i/26)%26, and password[2] the letter for ((i/26)/26)%26. The code is in gen-passwords.cpp. Pay attention to how each iteration divides j by 26 so that the password digits are calculated correctly:

    #include <string>
    #include <iostream>
    using namespace std;

    int main()
    {
      string password;
      int i, j, k;

      password.resize(3);

      for (i = 0; i < 26*26*26; i++) {
        j = i;
        for (k = 0; k < 3; k++) {
          password[k] = 'a' + (j%26);
          j /= 26;
        }
        cout << password << endl;
      }
      return 0;
    }
We run it, and all looks good:
    UNIX> g++ gen-passwords.cpp
    UNIX> a.out | head
    aaa
    baa
    caa
    daa
    eaa
    faa
    gaa
    haa
    iaa
    jaa
    UNIX> a.out | tail
    qzz
    rzz
    szz
    tzz
    uzz
    vzz
    wzz
    xzz
    yzz
    zzz
    UNIX>

Now, we can use this to solve the problem -- for each password, we decrypt and test the output (in SimpleRotationDecoder.cpp):
    class SimpleRotationDecoder {
      public:
        string decode(string cipherText);
        int is_legal();
        string cleartext;
    };

    string SimpleRotationDecoder::decode(string ciphertext)
    {
      string password;
      string rv;
      int i, j, k;

      /* Initialize password and cleartext.
         Convert all spaces in ciphertext to 'a'-1. */

      password.resize(3);
      cleartext.resize(ciphertext.size()+1, 'a'-1);
      for (i = 0; i < ciphertext.size(); i++) {
        if (ciphertext[i] == ' ') ciphertext[i] = 'a'-1;
      }

      /* Next, enumerate all three-letter passwords */

      for (i = 0; i < 26*26*26; i++) {
        j = i;
        for (k = 0; k < 3; k++) {
          password[k] = 'a' + (j%26);
          j /= 26;
        }

        /* Use the password to decipher "ciphertext" into "cleartext" */

        for (j = 0; j < ciphertext.size(); j++) {
          k = (ciphertext[j]-('a'-1)) - (password[j%3]-('a'-1));
          if (k < 0) k += 27;
          k += ('a'-1);
          cleartext[j] = k;
        }

        /* If the cleartext is legal, convert 'a'-1 back to spaces,
           remove that extra space at the end, and return the cleartext. */

        if (is_legal()) {
          cleartext.resize(cleartext.size()-1);
          for (i = 0; i < cleartext.size(); i++) {
            if (cleartext[i] == 'a'-1) cleartext[i] = ' ';
          }
          return cleartext;
        }
      }

      /* Return the empty string if you have failed. */

      return "";
    }
A few things about this code. First, we change all spaces to the character ('a'-1) (it's a backquote). That way, we don't have to special-case the space, and we can simply add/subtract characters. Second, we sentinelize cleartext with a backquote at the end. It makes our life easier writing is_legal().
Third, we perform the decryption calculation using an integer k rather than a big arithmetic calculation on chars. That's because I don't want to worry about the intermediate values being constrained to a single char.
When we find a legal clear text, we have to convert those backquotes back into spaces, remove the sentinel, and return it.
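For comparison, here is a minimal sketch of the same decryption arithmetic written as a standalone function that works directly on spaces, without the backquote trick. The names to_num() and decrypt() are hypothetical and are not in SimpleRotationDecoder.cpp; the sketch only illustrates the subtract-and-wrap step.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: space -> 0, 'a'..'z' -> 1..26.
static int to_num(char c) { return (c == ' ') ? 0 : c - 'a' + 1; }

// Undo the encryption: subtract the password letter from the cipher text
// character, and add 27 if the difference went negative.
string decrypt(const string &ciphertext, const string &password)
{
  string rv = ciphertext;
  for (size_t i = 0; i < ciphertext.size(); i++) {
    int k = to_num(ciphertext[i]) - to_num(password[i % password.size()]);
    if (k < 0) k += 27;
    rv[i] = (k == 0) ? ' ' : (char) ('a' + k - 1);
  }
  return rv;
}
```

With the earlier example, decrypt("bbafg", "abz") gives back "a bee". The price of handling spaces explicitly is the two conditional expressions, which is what converting spaces to 'a'-1 avoids.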
Now, look at is_legal():
    /* This method tests to see if the string "cleartext" is legal, according
       to the topcoder definition of legal.  You should note first that I have
       changed spaces to 'a'-1, because then I don't have to put special case
       in to recognize spaces.  Also, you should note that I have put an extra
       space at the end of the string, again to reduce the amount of special
       code that I have to write.  That is a version of using a sentinel. */

    int SimpleRotationDecoder::is_legal()
    {
      string vowels = "aeiou";
      int i, last_space, vowel_present;

      last_space = -1;
      vowel_present = 0;

      for (i = 0; i < cleartext.size(); i++) {
        if (cleartext[i] == 'a'-1) {
          if (i-last_space <= 2 || !vowel_present) return 0;
          last_space = i;
          vowel_present = 0;
        } else {
          if (vowels.find(cleartext[i]) != string::npos) vowel_present = 1;
          if (i-last_space > 8) return 0;
        }
      }
      return 1;
    }

It is straightforward code, but note how the sentinel and the use of last_space help us, since we don't have to test explicitly that the first and last characters are lowercase letters. Also, note how I use the find() method of strings to test whether the character is a vowel.
Compile, test, submit!
Another good Topcoder problem of this ilk is the Topcoder Open 2007, Qualification Round 1, 500-pointer (The "Bigital" problem, which I give in lab). Problem description: http://community.topcoder.com/stat?c=problem_statement&pm=7619&rd=10730. (hints).
You'll note, this is the same problem as above -- you want to generate all l-length vectors v whose elements are either zero or one (that is, n equals two). While we can use the same method as above with div and mod, when you are dealing with zeros and ones exclusively, you can use bit arithmetic to do the same task more quickly.
We'll motivate with an example. Suppose I have four people, Larry, Curly, Moe and Shemp, and I want to enumerate all possible ways that I can make a team from them, where a team is a collection of one or more people. Each potential team may be represented by a bit string as defined above, and as long as the bit string has a one in it, it represents a valid team. Here are all teams and their representative strings:
    1000 Larry
    0100 Curly
    1100 Larry Curly
    0010 Moe
    1010 Larry Moe
    0110 Curly Moe
    1110 Larry Curly Moe
    0001 Shemp
    1001 Larry Shemp
    0101 Curly Shemp
    1101 Larry Curly Shemp
    0011 Moe Shemp
    1011 Larry Moe Shemp
    0111 Curly Moe Shemp
    1111 Larry Curly Moe Shemp
Enumerating these bit strings can be done as in the previous section: l equals 4 and n equals 2. Again, instead of using div and mod, we can use bit arithmetic. (Here is where you review the CS140 lecture notes on bit arithmetic if you are hazy on bit arithmetic). First, 2^{l} may be calculated with bit arithmetic as (1 << l). Next, if you want to see if the j-th digit of the bit string s is one, test whether (s & (1 << j)) is non-zero. This gives us all we need to know to enumerate sets.
For example, the program gen-teams.cpp reads in names from standard input, and then generates all possible teams from those names. It is what I used to generate the teams above.
    #include <cstdio>
    #include <cstdlib>
    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    int main()
    {
      vector <string> people;
      string s;
      int i, j;

      while (cin >> s) people.push_back(s);

      /* Test the number of people, not the length of the last name read. */

      if (people.size() > 30) {
        cerr << "Sorry, not generating more than 2^30 teams\n";
        exit(1);
      }

      for (i = 1; i < (1 << people.size()); i++) {
        for (j = 0; j < people.size(); j++) {        // Print the bit strings
          printf("%c", (i & (1 << j)) ? '1' : '0');
        }
        for (j = 0; j < people.size(); j++) {        // Print the teams
          if (i & (1 << j)) printf(" %s", people[j].c_str());
        }
        printf("\n");
      }
      exit(0);
    }
There's not much subtle here -- once you know how to generate 2^{l} and to test whether digit j is equal to one using bit arithmetic, the rest falls out naturally.
I'm reiterating here, but once again, what we are doing is representing sets using integers and bits rather than the STL. When I want to represent a set of distinct numbers between 0 and 31, I can do it using a 32-bit string, where bit j is one if j is an element of the set. Since ints are 32 bits, I can represent each set with one integer. That's nice. I can represent a set of distinct numbers between 0 and 63 with a long long. That can also be nice.
A good example Topcoder problem that you can solve with a power set enumeration is SRM 604, D2, 500-pointer (PowerOfThreeEasy). I have lecture notes for that here, and I go over the program in class.
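As a minimal sketch of that representation, the basic set operations map onto single bit operations. The type and function names below (Set32, Set64, set_insert() and so on) are hypothetical, and I am assuming unsigned types so that shifting into the top bit is well defined.

```cpp
#include <vector>
using namespace std;

typedef unsigned int Set32;          // Holds elements 0..31 (assumes 32-bit ints).
typedef unsigned long long Set64;    // Holds elements 0..63.

Set32 set_insert(Set32 s, int j)   { return s | (1U << j); }       // Add j.
Set32 set_erase(Set32 s, int j)    { return s & ~(1U << j); }      // Remove j.
bool  set_contains(Set32 s, int j) { return (s & (1U << j)) != 0; }

Set32 set_union(Set32 a, Set32 b)        { return a | b; }
Set32 set_intersection(Set32 a, Set32 b) { return a & b; }

// The 64-bit versions only differ in the constant being shifted.
Set64 set64_insert(Set64 s, int j)   { return s | (1ULL << j); }
bool  set64_contains(Set64 s, int j) { return (s & (1ULL << j)) != 0; }

// List the elements of the set in increasing order.
vector<int> set_elements(Set32 s)
{
  vector<int> rv;
  for (int j = 0; j < 32; j++) {
    if (set_contains(s, j)) rv.push_back(j);
  }
  return rv;
}
```

The one trap is the 64-bit case: writing (1 << j) with a plain int constant is not safe for j of 31 or more, which is why the sketch shifts 1ULL.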
A second good Topcoder problem that uses this kind of enumeration is SRM 489, D2, 500-point problem. (Problem description: http://community.topcoder.com/stat?c=problem_statement&pm=11191&rd=14242). This one is a little harder. I have lecture notes about solving that problem here.
Instead, you can often write a recursive function to help you enumerate. An example is in gen-3-teams.cpp. This program generates all three-person teams from a collection of people entered on standard input. The program maintains two variables: people, which is the vector of people, and team, which is a team that gets incrementally constructed. The heart of the program is GenTeams(int index, int npeople), which is a recursive method that builds all possible teams. Its two parameters are as follows: index is the index in people of the next person to consider, and npeople is the number of people that still need to be added to team.
GenTeams() first checks base cases -- if npeople is zero, then there are no people left to put on the team, so you print the team and return. Next, check to see if there are enough people left in people to make the team. If not, return. Otherwise, you do two recursive calls. The first puts people[index] on the team and calls GenTeams(index+1, npeople-1). The second skips people[index] and simply calls GenTeams(index+1, npeople). In other words, you are enumerating all teams with people[index] on them, and all teams without people[index].
Here is the code:
    class People {
      public:
        vector <string> people;
        vector <string> team;
        void GenTeams(int index, int npeople);
    };

    void People::GenTeams(int index, int npeople)
    {
      int i;

      /* Base case -- if there are no more people to add,
         print out the team and return */

      if (npeople == 0) {
        cout << team[0];
        for (i = 1; i < team.size(); i++) cout << " " << team[i];
        cout << endl;
        return;
      }

      /* This is a second base case -- if there are fewer people left to add
         than there are places left on the team, then it's impossible to finish,
         so simply return.  Ask yourself why this is better than testing whether
         index is equal to people.size(), and returning if so. */

      if (npeople > people.size() - index) return;

      /* Now, put the person in people[index] onto the team, and call GenTeams()
         recursively.  Afterwards, take the person off of the team. */

      team.push_back(people[index]);
      GenTeams(index+1, npeople-1);
      team.pop_back();

      /* Finally, call GenTeams() recursively without putting
         people[index] on the team. */

      GenTeams(index+1, npeople);
    }

    int main()
    {
      People P;
      string s;
      int i, j;

      while (cin >> s) P.people.push_back(s);
      P.GenTeams(0, 3);
    }
Trace through the code if you find this confusing -- copy the code and put in some extra print statements.
    UNIX> echo Larry Curly Moe Shemp | gen-3-teams
    Larry Curly Moe
    Larry Curly Shemp
    Larry Moe Shemp
    Curly Moe Shemp
    UNIX> echo Larry Curly Moe Shemp Baby-Daisy | gen-3-teams
    Larry Curly Moe
    Larry Curly Shemp
    Larry Curly Baby-Daisy
    Larry Moe Shemp
    Larry Moe Baby-Daisy
    Larry Shemp Baby-Daisy
    Curly Moe Shemp
    Curly Moe Baby-Daisy
    Curly Shemp Baby-Daisy
    Moe Shemp Baby-Daisy
    UNIX>
Finally, a very similar recursion may be used to generate all permutations of a collection of elements. In the example, I again read in names, and this time print out all permutations. The structure of the enumeration is again a recursion, which works as follows (in gen-permutations.cpp):
    class People {
      public:
        vector <string> people;
        void GenPermutations(int index);
    };

    void People::GenPermutations(int index)
    {
      int i;
      string tmp;

      /* Base case -- we're done - print out the vector */

      if (index == people.size()) {
        cout << people[0];
        for (i = 1; i < people.size(); i++) cout << " " << people[i];
        cout << endl;
        return;
      }

      /* Otherwise, for each element of the vector, swap it with the element
         in index, call GenPermutations() recursively on the remainder of the
         vector, and then swap it back. */

      for (i = index; i < people.size(); i++) {

        tmp = people[i];            /* Swap people[index] with people[i] */
        people[i] = people[index];
        people[index] = tmp;

        GenPermutations(index+1);

        tmp = people[i];            /* Swap back */
        people[i] = people[index];
        people[index] = tmp;
      }
    }
How many lines are there if you are permuting n items? Well, there are n ways to choose the first item, then n-1 to choose the second, etc. So, there are (n!) ways to permute n items, which will lead to n! lines of output:
    UNIX> echo Larry Curly Moe | gen-permutations
    Larry Curly Moe
    Larry Moe Curly
    Curly Larry Moe
    Curly Moe Larry
    Moe Curly Larry
    Moe Larry Curly
    UNIX> echo 1 2 3 4 | gen-permutations
    1 2 3 4
    1 2 4 3
    1 3 2 4
    1 3 4 2
    1 4 3 2
    1 4 2 3
    2 1 3 4
    2 1 4 3
    2 3 1 4
    2 3 4 1
    2 4 3 1
    2 4 1 3
    3 2 1 4
    3 2 4 1
    3 1 2 4
    3 1 4 2
    3 4 1 2
    3 4 2 1
    4 2 3 1
    4 2 1 3
    4 3 2 1
    4 3 1 2
    4 1 3 2
    4 1 2 3
    UNIX> echo 1 2 3 | gen-permutations | wc
           6      18      36
    UNIX> echo 1 2 3 4 | gen-permutations | wc
          24      96     192
    UNIX> echo 1 2 3 4 5 | gen-permutations | wc
         120     600    1200
    UNIX> echo 1 2 3 4 5 6 | gen-permutations | wc
         720    4320    8640
    UNIX> echo 1 2 3 4 5 6 7 | gen-permutations | wc
        5040   35280   70560
    UNIX>

A good example to test yourself is Topcoder SRM 554, Division 2, 500-pointer. I have written up lecture notes for it here.
A second problem is SRM 592, Division 2, 500-pointer. I used next_permutation() to solve it. Hints here.
A third, more difficult problem is Topcoder SRM 519, Division 2, 600-pointer. I have written up lecture notes for it here.
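The next_permutation() procedure mentioned above comes from the standard <algorithm> header. As a minimal sketch of how it can drive a permutation enumeration (the wrapper all_permutations() is a hypothetical name, and collecting the results into a vector rather than printing them is my choice for illustration):

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Return every permutation of items, in lexicographic order.
// next_permutation() rearranges its range into the next larger ordering and
// returns false once it wraps around to the smallest one, so we sort first
// to make sure that we start at the beginning.
vector< vector<string> > all_permutations(vector<string> items)
{
  vector< vector<string> > rv;

  sort(items.begin(), items.end());
  do {
    rv.push_back(items);
  } while (next_permutation(items.begin(), items.end()));
  return rv;
}
```

Two differences from gen-permutations.cpp are worth noting. The orderings come out sorted rather than in swap order, and when the input has repeated elements, each distinct ordering is produced only once.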
For example, in the gen-3-teams problem, you have n people and you want to enumerate all three-person teams. That is equivalent to having n boxes, and balls of two colors. Call them black and white. There are three black balls and n-3 white balls. You want to enumerate all distinct ways to put the balls into the boxes. A configuration of balls and boxes is equivalent to a team -- associate each person with a box, and if the box has a black ball, then the person is on the team.
In the permutation example, b_{i} equals one for all i. Thus enumerating balls and boxes gives you permutations.
In my last example, I'll write a general balls-in-boxes program. As with the others, it is recursive. At each call, there is a vector (which represents the boxes) that has been partially filled in, and a map that contains the balls that have not been placed yet. For each color in the map, we place a ball into the next element of the vector, remove the ball from the map, and call the procedure recursively. After the recursive call, we replace the ball in the map.
Here it is, in balls-in-boxes-1.cpp:
    class BallsInBoxes {
      public:
        map <string,int> balls;
        vector <string> boxes;
        void GenInstances(int index);
    };

    void BallsInBoxes::GenInstances(int index)
    {
      int i;
      map <string, int>::iterator bit;

      /* Base case -- if you have placed all of the balls in boxes,
         print them out, and return. */

      if (index == boxes.size()) {
        cout << boxes[0];
        for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
        cout << endl;
        return;
      }

      /* For each color where you haven't placed a ball yet, place the ball,
         make a recursive call, and remove the ball. */

      for (bit = balls.begin(); bit != balls.end(); bit++) {
        if (bit->second > 0) {
          boxes[index] = bit->first;
          bit->second--;
          GenInstances(index+1);
          bit->second++;

          /* I don't actually "remove" the ball here, because subsequent
             iterations of the loop, or subsequent recursive calls will
             overwrite boxes[index]. */
        }
      }
    }

    int main()
    {
      BallsInBoxes B;
      string s;
      int nb;

      nb = 0;
      while (cin >> s) {
        B.balls[s]++;
        nb++;
      }
      B.boxes.resize(nb);
      B.GenInstances(0);
    }
Verify to yourself that it is giving you correct output in all of the examples below:
    UNIX> echo a a b b | balls-in-boxes-1
    a a b b
    a b a b
    a b b a
    b a a b
    b a b a
    b b a a
    UNIX> echo a a a a | balls-in-boxes-1
    a a a a
    UNIX> echo a b c | balls-in-boxes-1
    a b c
    a c b
    b a c
    b c a
    c a b
    c b a
    UNIX> echo larry moe curly | balls-in-boxes-1
    curly larry moe
    curly moe larry
    larry curly moe
    larry moe curly
    moe curly larry
    moe larry curly
    UNIX> echo B B B W | balls-in-boxes-1
    B B B W
    B B W B
    B W B B
    W B B B
    UNIX>

    void BallsInBoxes::GenInstances(int index)
    {
      int i;
      map <string, int>::iterator bit;

      if (index == boxes.size()) {
        cout << boxes[0];
        for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
        cout << endl;
        return;
      }

      for (bit = balls.begin(); bit != balls.end(); bit++) {
        boxes[index] = bit->first;
        bit->second--;
        if (bit->second == 0) {
          balls.erase(bit);
          bit = balls.end();
        }
        GenInstances(index+1);
        if (bit == balls.end()) {
          bit = balls.insert(make_pair(boxes[index], 1)).first;
        } else bit->second++;
      }
    }
When a ball's second field goes to zero, I remove it from the map and then set bit to equal balls.end(). After the recursion, if bit equals balls.end(), then I re-insert the ball into the map. Yes, that balls.insert() call is ugly. On a map, the insert() method returns a pair -- an iterator to the element, and true if the element was actually inserted, or false if it was there already.
Try this with six balls -- three colors, where b_{i} equals 2 for each of the three colors:

    UNIX> echo a a b b c c | balls-in-boxes-2
    a a b b c c
    a a b c b c
    a a b c c b
    a a c b b c
    a a c b c b
    a a c c b b
    ....

You'll see that it runs forever, and eventually you'll see lines like "a b a c a b". Obviously, we have a bug, since there should be two a's and two c's. I'm not going to trace this bug in detail, because it's too hard. I'll simply tell you what's going on. Look at this snippet of the code:

    for (bit = balls.begin(); bit != balls.end(); bit++) {
      boxes[index] = bit->first;
      bit->second--;
      if (bit->second == 0) {
        balls.erase(bit);
        bit = balls.end();
      }
      GenInstances(index+1);
      if (bit == balls.end()) {
        bit = balls.insert(make_pair(boxes[index], 1)).first;
      } else bit->second++;
    }

The problem arises when you don't erase bit and then you make the recursive call. Suppose that bit points to the entry for "a" in the map. Later, in the recursion, you're going to erase the entry for "a", because its second field will be zero. When all the recursion returns, bit is an invalidated iterator: it points to an element that has been erased, and using it is undefined behavior. I wish we got a seg fault when we incremented bit->second. However, what happens is that the C++ runtime system reuses memory, and bit ends up pointing to a valid map entry, which may no longer be the one for "a".
This is a brutal bug, but one that can happen when you store iterators that may become deleted. It is similar to storing a pointer which is subsequently free()'d or deleted. I've fixed this in balls-in-boxes-3.cpp, although I'm not proud of it. Here's the relevant code snippet. After the recursive call, I make sure that bit points to the proper iterator:
    for (bit = balls.begin(); bit != balls.end(); bit++) {
      boxes[index] = bit->first;
      bit->second--;
      if (bit->second == 0) {
        balls.erase(bit);
        bit = balls.end();
      }
      GenInstances(index+1);
      if (bit == balls.end()) {
        bit = balls.insert(make_pair(boxes[index], 1)).first;
      } else {
        bit = balls.find(boxes[index]);
        bit->second++;
      }
    }
I think the code for balls-in-boxes-1.cpp is far superior to the others. On the flip side, I don't think it's the best way to solve this. We should use a vector instead of a map; however, I don't think that it's worth the class or study time to go through that level of detail.
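For the curious, here is one possible sketch of what a vector-based recursive version might look like. It is not one of the numbered balls-in-boxes programs: the class name BallsInBoxesVec and the results vector are hypothetical, and I am assuming that the caller has already split the input into parallel vectors of distinct colors and ball counts.

```cpp
#include <string>
#include <vector>
using namespace std;

class BallsInBoxesVec {
  public:
    vector<string> colors;    // The distinct colors.
    vector<int> nballs;       // nballs[c] = balls of colors[c] not yet placed.
    vector<string> boxes;     // The instance being built.
    vector<string> results;   // Each finished instance, joined with spaces.
    void GenInstances(size_t index);
};

void BallsInBoxesVec::GenInstances(size_t index)
{
  // Base case: every box is filled, so record the instance.
  if (index == boxes.size()) {
    string line;
    for (size_t i = 0; i < boxes.size(); i++) {
      if (i > 0) line += " ";
      line += boxes[i];
    }
    results.push_back(line);
    return;
  }

  // Try each color that still has a ball left.  Nothing is ever erased,
  // so there are no iterators to be invalidated by the recursion.
  for (size_t c = 0; c < colors.size(); c++) {
    if (nballs[c] > 0) {
      boxes[index] = colors[c];
      nballs[c]--;
      GenInstances(index + 1);
      nballs[c]++;
    }
  }
}
```

To use it, fill colors and nballs, resize boxes to the total number of balls, and call GenInstances(0). With colors "a" and "b" and two balls of each, results ends up holding the same six lines that balls-in-boxes-1 prints for "a a b b".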
    class BallsInBoxes {
      public:
        map <string,int> balls;
        vector <string> colors;
        vector <int> nballs;
        vector <string> boxes;
        void GenInstances();
    };

    void BallsInBoxes::GenInstances()
    {
      stack <int> Stack;
      int index, color, i;

      Stack.push(-1);

      while (!Stack.empty()) {
        color = Stack.top();
        Stack.pop();
        index = Stack.size();

        // Base case -- if we're at the end of boxes, print it out and "return"

        if (index == boxes.size()) {
          cout << boxes[0];
          for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
          cout << endl;
        } else {

          if (color != -1) {     // We have just finished enumerating with "color"
            nballs[color]++;
          }

          // Find the next color to enumerate.
          // Note how this works when color started at -1.

          for (color++; color < nballs.size() && nballs[color] == 0; color++) ;

          // If we still have a color to enumerate, put it into boxes, push
          // the color onto the stack, and push -1 on the stack to enumerate
          // the next index.

          if (color < nballs.size()) {
            boxes[index] = colors[color];
            nballs[color]--;
            Stack.push(color);
            Stack.push(-1);
          }
        }
      }
    }

    int main()
    {
      BallsInBoxes B;
      map <string, int>::iterator bit;
      string s;
      int nb;

      while (cin >> s) B.balls[s]++;

      nb = 0;
      for (bit = B.balls.begin(); bit != B.balls.end(); bit++) {
        B.colors.push_back(bit->first);
        B.nballs.push_back(bit->second);
        nb += bit->second;
      }

      B.boxes.resize(nb);
      B.GenInstances();
    }
To make life easier, I've created the vectors colors and nballs. In this way, we can manage a stack of indices into the colors vector, which is less confusing than maintaining a stack of iterators of balls (I have that code in balls-in-boxes-5.cpp).
Concentrate on GenInstances(). It maintains a stack of indices of colors. Instead of making recursive calls, you push -1 on the stack. That's what we do initially. The stack contains the indices of the colors that we're currently enumerating. Thus, the top of the stack is the color that we're enumerating in boxes[Stack.size()-1]. Our main processing loop therefore removes this element from the top of the stack and sets index to the new Stack.size().
If that element is -1, then we're starting a new enumeration. If that element is something else, then we've just finished enumerating that element, so we increment its entry in nballs to put the ball back before we enumerate a new color. In either case, we now need to find the next color to enumerate, which is done by the for loop:

    for (color++; color < nballs.size() && nballs[color] == 0; color++) ;
If, after executing this for loop, color is equal to nballs.size(), then we're done enumerating this index. Otherwise, we need to enumerate the next color, which we do with the following code:
    boxes[index] = colors[color];
    nballs[color]--;
    Stack.push(color);
    Stack.push(-1);
In this way, we mimic the recursive version of the program, only instead of actually performing recursion, we simply manage the stack. As always, if you are confused by this, copy it, put in some print statements, and trace it. Frankly, I think that the recursive version of this is much easier to both read and write, but this is a good exercise in seeing how a stack can replace recursion.
A final piece of code is in balls-in-boxes-5.cpp. This code also manages a stack, but instead of managing integers, it manages iterators into balls. Where -1 was in the code previously, we use balls.end(). Try to trace through it -- it is very similar to balls-in-boxes-4.cpp, except you need to be unafraid of iterators!