One example is typified by Topcoder problem SRM-344-D2-500 (don't bother reading the description yet). This problem gives you an encrypted string and a function for encrypting and decrypting given a password, which is itself a three-letter string. However, you are not given the password. The problem description gives you a way to tell whether decryption is successful. Your job is to furnish the password that decrypts the string.
The way to solve this problem is to enumerate all possible passwords, and then test to see which of them decrypts successfully. Here, n is 26 (passwords are composed of lower-case letters) and l is three. Thus, there are 26^3 = 17576 potential passwords. That's a pretty small number, so we should be able to perform the enumeration quickly.
Ok, now that I've given you an overview, let's solve the problem. Go ahead and read it. You are given a parameter cipherText which is a string of 3 to 50 characters that contains only lowercase letters and spaces. You are to discover a password, which is a three-letter string containing only lowercase letters, that would generate the cipherText from a legal clear text string. What is a legal clear text string? It is one composed of words and spaces, where words are contiguous lowercase letters between 2 and 8 characters in size that contain at least one vowel. Words are always separated by a single space. When you discover the password, you are to return the clear text.
To generate cipherText from clear text, you associate each potential character c in the clear text with a number. That number is (c-'a'+1) if c is a lowercase letter, and 0 if c is a space. To generate the first character of cipherText, you add the number of the first character of the clear text to the number of the first letter of the password, take the result modulo 27 and convert the number back to a character. To generate the second character of cipherText, you do the same with the second character of the clear text and the second letter of the password. Keep doing this, and when you run out of password letters, start back at the beginning of the password.
Thus, for example, if the clear text is "a bee" and the password is "abz", then the cipherText will be calculated with the following table:
| i | clear_text[i] | password[i%3] | ciphertext[i] |
| 0 | 'a' = 1 | 'a' = 1 | 2 = 'b' |
| 1 | ' ' = 0 | 'b' = 2 | 2 = 'b' |
| 2 | 'b' = 2 | 'z' = 26 | 28%27=1 = 'a' |
| 3 | 'e' = 5 | 'a' = 1 | 6 = 'f' |
| 4 | 'e' = 5 | 'b' = 2 | 7 = 'g' |
So, the ciphertext will be "bbafg".
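The table above can be reproduced with a few lines of code. This is only a sketch of the encryption rule as described here; the names encrypt(), to_number() and to_char() are mine and do not come from the Topcoder problem:

```cpp
#include <cstddef>
#include <string>

// Map a character to its number: space -> 0, 'a'..'z' -> 1..26.
static int to_number(char c) { return (c == ' ') ? 0 : c - 'a' + 1; }

// Map a number in 0..26 back to a character.
static char to_char(int n) { return (n == 0) ? ' ' : static_cast<char>('a' + n - 1); }

// Hypothetical helper: encrypt clear text with a (non-empty) password,
// cycling through the password letters.
std::string encrypt(const std::string &clear, const std::string &password)
{
  std::string cipher(clear.size(), ' ');
  for (std::size_t i = 0; i < clear.size(); i++) {
    int sum = to_number(clear[i]) + to_number(password[i % password.size()]);
    cipher[i] = to_char(sum % 27);
  }
  return cipher;
}
```

Calling encrypt("a bee", "abz") walks through exactly the five rows of the table.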
Armed with this information, what we want to do is generate all 17576 potential passwords, and for each password, we decrypt the message and test whether it corresponds to legal clear text. If so, we return the clear text. The problem description says that we are guaranteed that there will be one unique password that generates legal clear text, and that makes our life simpler.
We'll start by generating all of those passwords. The easiest way to do this is to simply enumerate all numbers from 0 to 26^3-1 and convert each number to a three-letter string. Let the number be i. Then password[0] is the letter with index i%26 (where 'a' has index 0), password[1] is the letter with index (i/26)%26, and password[2] is the letter with index ((i/26)/26)%26. The code is in gen-passwords.cpp. Pay attention to how each iteration divides j by 26 so that the password digits are calculated correctly:
#include <string>
#include <iostream>
using namespace std;

int main()
{
  string password;
  int i, j, k;

  password.resize(3);
  for (i = 0; i < 26*26*26; i++) {
    j = i;                            // j holds the digits not yet extracted
    for (k = 0; k < 3; k++) {
      password[k] = 'a' + (j%26);     // Lowest base-26 digit of j
      j /= 26;                        // Shift to the next digit
    }
    cout << password << endl;
  }
}
We run it, and all looks good:
UNIX> g++ gen-passwords.cpp
UNIX> a.out | head
aaa
baa
caa
daa
eaa
faa
gaa
haa
iaa
jaa
UNIX> a.out | tail
qzz
rzz
szz
tzz
uzz
vzz
wzz
xzz
yzz
zzz
UNIX>
Now, we can use this to solve the problem -- for each password, we decrypt and test the output (in SimpleRotationDecoder.cpp):
#include <string>
using namespace std;

class SimpleRotationDecoder {
  public:
    string decode(string cipherText);
    int is_legal();
    string cleartext;
};

string SimpleRotationDecoder::decode(string ciphertext)
{
  string password;
  string rv;
  int i, j, k;

  password.resize(3);
  cleartext.resize(ciphertext.size()+1, 'a'-1);   // The extra character is the sentinel
  for (i = 0; i < ciphertext.size(); i++) {       // Turn spaces into backquotes
    if (ciphertext[i] == ' ') ciphertext[i] = 'a'-1;
  }

  for (i = 0; i < 26*26*26; i++) {
    j = i;
    for (k = 0; k < 3; k++) {                     // Generate password number i
      password[k] = 'a' + (j%26);
      j /= 26;
    }
    for (j = 0; j < ciphertext.size(); j++) {     // Decrypt with that password
      k = (ciphertext[j]-('a'-1)) - (password[j%3]-('a'-1));
      if (k < 0) k += 27;
      k += ('a'-1);
      cleartext[j] = k;
    }
    if (is_legal()) {
      cleartext.resize(cleartext.size()-1);       // Remove the sentinel
      for (i = 0; i < cleartext.size(); i++) {    // Turn backquotes back into spaces
        if (cleartext[i] == 'a'-1) cleartext[i] = ' ';
      }
      return cleartext;
    }
  }
  return "";
}
A few things about this code. First, we change all spaces to the character ('a'-1) (it's a backquote). That way, we don't have to special-case the space, and we can simply add/subtract characters. Second, we sentinelize cleartext with a backquote at the end. It makes our life easier writing is_legal().
Third, we perform the decryption calculation using an integer k rather than a big arithmetic calculation on chars. That's because I don't want to worry about being constrained to a single char in the calculation.
When we find a legal clear text, we have to convert those backquotes back into spaces, remove the sentinel, and return it.
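If the backquote trick obscures the arithmetic, here is the same decryption step written as a standalone function that keeps spaces as spaces. It is a sketch under my own naming (decrypt() is not part of the Topcoder class), and it assumes a non-empty password of lowercase letters:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: undo the rotation for one candidate password.
std::string decrypt(const std::string &cipher, const std::string &password)
{
  std::string clear(cipher.size(), ' ');
  for (std::size_t i = 0; i < cipher.size(); i++) {
    int c = (cipher[i] == ' ') ? 0 : cipher[i] - 'a' + 1;   // number of the ciphertext char
    int p = password[i % password.size()] - 'a' + 1;        // number of the password letter
    int k = (c - p + 27) % 27;                              // subtract modulo 27
    clear[i] = (k == 0) ? ' ' : static_cast<char>('a' + k - 1);
  }
  return clear;
}
```

Adding 27 before taking the remainder plays the same role as the "if (k < 0) k += 27" line in decode().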
Now, look at is_legal():
int SimpleRotationDecoder::is_legal()
{
  string vowels = "aeiou";
  int i, last_space, vowel_present;

  last_space = -1;
  vowel_present = 0;
  for (i = 0; i < cleartext.size(); i++) {
    if (cleartext[i] == 'a'-1) {
      if (i-last_space <= 2 || !vowel_present) return 0;
      last_space = i;
      vowel_present = 0;
    } else {
      if (vowels.find(cleartext[i]) != string::npos) vowel_present = 1;
      if (i-last_space > 8) return 0;
    }
  }
  return 1;
}
It is straightforward code, but note how the sentinel and the use of last_space help us, since we don't have to test explicitly that the first and last characters are lowercase letters. Also, note how I use the find() method of strings to test whether the character is a vowel.
Compile, test, submit!
Another good Topcoder problem of this ilk is the Topcoder Open 2007, Qualification Round 1, 500-pointer (The "Bigital" problem, which I gave in lab). I have lecture notes on this here.
For example, suppose I have four people, Larry, Curly, Moe and Shemp, and I want to enumerate all possible ways that I can make a team from them, where a team is a collection of one or more people. Each potential team may be represented by a bit string as defined above, and as long as the bit string has a one in it, it represents a valid team. Here are all teams and their representative strings:
1000 Larry
0100 Curly
1100 Larry Curly
0010 Moe
1010 Larry Moe
0110 Curly Moe
1110 Larry Curly Moe
0001 Shemp
1001 Larry Shemp
0101 Curly Shemp
1101 Larry Curly Shemp
0011 Moe Shemp
1011 Larry Moe Shemp
0111 Curly Moe Shemp
1111 Larry Curly Moe Shemp
Enumerating these bit strings can be done as in the previous section: l equals N and n equals 2. However, instead of using div and mod, we can use bit arithmetic. (Here is where you review the CS140 lecture notes if you are hazy on bit arithmetic). First, 2^N may be calculated with bit arithmetic as (1 << N). Next, if you want to see if the j-th digit of the bit string s is one, test whether (s & (1 << j)) is non-zero. This gives us all we need to know to enumerate sets.
For example, the program gen-teams.cpp reads in names from standard input, and then generates all possible teams from those names. It is what I used to generate the teams above.
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
  vector <string> people;
  string s;
  int i, j;

  while (cin >> s) people.push_back(s);

  if (people.size() > 30) {        // Test the number of people, not the last name read
    cerr << "Sorry, not generating more than 2^30 teams\n";
    exit(1);
  }

  for (i = 1; i < (1 << people.size()); i++) {
    for (j = 0; j < people.size(); j++) {         // Print the bit strings
      printf("%c", (i & (1 << j)) ? '1' : '0');
    }
    for (j = 0; j < people.size(); j++) {         // Print the teams
      if (i & (1 << j)) printf(" %s", people[j].c_str());
    }
    printf("\n");
  }
  exit(0);
}
There's not much subtle here -- once you know how to generate 2^N and to test whether digit j is equal to one using bit arithmetic, the rest falls out naturally.
Viewing this in a slightly different way, we can represent sets using bits. For example, suppose I want to represent a set of distinct numbers between 0 and 31. I can do that using a 32-element bit string, where bit j is one if j is an element of the set. Since ints are typically 32 bits, I can represent each set with one integer. That's nice. I can represent a set of distinct numbers between 0 and 63 with a long long. That can also be nice.
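Here is a small sketch of that idea. The type name SmallSet and the helper functions are hypothetical names of mine; I use an unsigned 32-bit integer so that bit 31 can be set without touching a sign bit:

```cpp
#include <cstdint>

typedef std::uint32_t SmallSet;   // Hypothetical name: a set of numbers in 0..31

// Return the set s with element j added (0 <= j <= 31).
SmallSet add_element(SmallSet s, int j) { return s | (SmallSet(1) << j); }

// Return the set s with element j removed.
SmallSet remove_element(SmallSet s, int j) { return s & ~(SmallSet(1) << j); }

// Is j an element of s?
bool contains(SmallSet s, int j) { return (s & (SmallSet(1) << j)) != 0; }

// Count the elements of s by looking at each bit in turn.
int set_size(SmallSet s)
{
  int n = 0;
  for (; s != 0; s >>= 1) n += (s & 1);
  return n;
}
```

Union and intersection of two such sets are simply (a | b) and (a & b).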
A good example Topcoder problem that uses this kind of enumeration is the SRM 489, D2, 500-point problem. I have lecture notes about solving that problem here.
Instead, you can often write a recursive function to help you enumerate. An example is in gen-3-teams.cpp. This program generates all three-person teams from a collection of people entered on standard input:
#include <iostream>
#include <string>
#include <vector>
using namespace std;

class People {
  public:
    vector <string> people;
    vector <string> team;
    void GenTeams(int index, int npeople);
};

void People::GenTeams(int index, int npeople)
{
  int i;

  /* Base case -- the team is complete, so print it. */

  if (npeople == 0) {
    cout << team[0];
    for (i = 1; i < team.size(); i++) cout << " " << team[i];
    cout << endl;
    return;
  }

  /* Base case -- there aren't enough people left to fill the team. */

  if (npeople > people.size() - index) return;

  /* Think about why the statement above is better than:

     if (index == people.size()) return;
   */

  team.push_back(people[index]);     /* Put people[index] on the team. */
  GenTeams(index+1, npeople-1);
  team.pop_back();                   /* Leave people[index] off the team. */
  GenTeams(index+1, npeople);
}

int main()
{
  People P;
  string s;
  int i, j;

  while (cin >> s) P.people.push_back(s);
  P.GenTeams(0, 3);
}
The heart of the program is GenTeams, which is a recursive enumerator. It first checks for base cases. If the call is not a base case, then it makes two recursive calls. The first adds people[index] to the team, and then makes the recursive call with npeople-1. The second doesn't add that person, and merely makes the recursive call with npeople. Trace through it if you find this confusing -- copy the code and put in some extra print statements.
UNIX> echo Larry Curly Moe Shemp | gen-3-teams
Larry Curly Moe
Larry Curly Shemp
Larry Moe Shemp
Curly Moe Shemp
UNIX> echo Larry Curly Moe Shemp Baby-Daisy | gen-3-teams
Larry Curly Moe
Larry Curly Shemp
Larry Curly Baby-Daisy
Larry Moe Shemp
Larry Moe Baby-Daisy
Larry Shemp Baby-Daisy
Curly Moe Shemp
Curly Moe Baby-Daisy
Curly Shemp Baby-Daisy
Moe Shemp Baby-Daisy
UNIX>
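One way to convince yourself that the output above is complete is to count the teams with the same two-way recursion, without building them. This is a sketch with a hypothetical function name, count_teams(); it takes the number of people not yet considered rather than an index:

```cpp
// Hypothetical helper: count the teams of size npeople that can be chosen
// from remaining_people people, mirroring the structure of GenTeams().
long long count_teams(int remaining_people, int npeople)
{
  if (npeople == 0) return 1;                   // A complete team: count it
  if (npeople > remaining_people) return 0;     // Not enough people left
  return count_teams(remaining_people - 1, npeople - 1)   // Next person is on the team
       + count_teams(remaining_people - 1, npeople);      // Next person is left off
}
```

With four people and teams of three it returns 4, and with five people it returns 10, which matches the number of lines in each run above.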
class People {
  public:
    vector <string> people;
    void GenPermutations(int index);
};

void People::GenPermutations(int index)
{
  int i;
  string tmp;

  /* Base case -- we're done - print out the vector */

  if (index == people.size()) {
    cout << people[0];
    for (i = 1; i < people.size(); i++) cout << " " << people[i];
    cout << endl;
    return;
  }

  /* Otherwise, for each element of the vector, swap it with the element
     in index, call GenPermutations() recursively on the remainder of the
     vector, and then swap it back. */

  for (i = index; i < people.size(); i++) {
    tmp = people[i];                 /* Swap people[index] with people[i] */
    people[i] = people[index];
    people[index] = tmp;

    GenPermutations(index+1);

    tmp = people[i];                 /* Swap back */
    people[i] = people[index];
    people[index] = tmp;
  }
}
How many lines are there if you are permuting n items? Well, there are n ways to choose the first item, then n-1 to choose the second, etc. So, there are (n!) ways to permute n items, which will lead to n! lines of output:
UNIX> echo Larry Curly Moe | gen-permutations
Larry Curly Moe
Larry Moe Curly
Curly Larry Moe
Curly Moe Larry
Moe Curly Larry
Moe Larry Curly
UNIX> echo 1 2 3 4 | gen-permutations
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 3 2
1 4 2 3
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 3 1
2 4 1 3
3 2 1 4
3 2 4 1
3 1 2 4
3 1 4 2
3 4 1 2
3 4 2 1
4 2 3 1
4 2 1 3
4 3 2 1
4 3 1 2
4 1 3 2
4 1 2 3
UNIX> echo 1 2 3 | gen-permutations | wc
6 18 36
UNIX> echo 1 2 3 4 | gen-permutations | wc
24 96 192
UNIX> echo 1 2 3 4 5 | gen-permutations | wc
120 600 1200
UNIX> echo 1 2 3 4 5 6 | gen-permutations | wc
720 4320 8640
UNIX> echo 1 2 3 4 5 6 7 | gen-permutations | wc
5040 35280 70560
UNIX>
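The line counts reported by wc above are 3!, 4!, 5!, 6! and 7!. As a cross-check, here is a sketch that counts permutations with std::next_permutation() from the standard library. Note that this is a different technique from the recursive swapping in gen-permutations, and the function names are mine:

```cpp
#include <algorithm>
#include <vector>

// n! computed directly.
long long factorial(int n)
{
  long long f = 1;
  for (int i = 2; i <= n; i++) f *= i;
  return f;
}

// Count the permutations of 1..n by stepping through them in sorted order.
long long count_permutations(int n)
{
  std::vector<int> v(n);
  for (int i = 0; i < n; i++) v[i] = i + 1;     // Start from the smallest ordering
  long long count = 0;
  do {
    count++;
  } while (std::next_permutation(v.begin(), v.end()));
  return count;
}
```

std::next_permutation() returns false once it wraps around to the sorted ordering, so the loop visits each permutation exactly once.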
For example, in the gen-3-teams problem, you have n people and you want to enumerate all three-person teams. That is equivalent to having n boxes, and balls with two colors. Call them black and white. There are three black balls and n-3 white balls. You want to enumerate all distinct ways to put the balls into the boxes. A configuration of balls and boxes is equivalent to a team -- associate each person with a box, and if the box has a black ball, then the person is on the team.
In the permutation example, b_i equals one for all i. Thus enumerating balls and boxes gives you permutations.
In my last example, I'll write a general balls-in-boxes program. As with the others, it is recursive. At each call, there is a vector (which represents the boxes) that has been partially filled in, and a map that contains the balls that have not been placed yet. For each color in the map, we place a ball into the next element of the vector, remove the ball from the map, and call the procedure recursively. After the recursive call, we replace the ball in the map.
Here it is, in balls-in-boxes-1.cpp:
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;

class BallsInBoxes {
  public:
    map <string,int> balls;
    vector <string> boxes;
    void GenInstances(int index);
};

void BallsInBoxes::GenInstances(int index)
{
  int i;
  map <string, int>::iterator bit;

  /* Base case -- every box is filled, so print the boxes. */

  if (index == boxes.size()) {
    cout << boxes[0];
    for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
    cout << endl;
    return;
  }

  /* Try each color that still has an unplaced ball. */

  for (bit = balls.begin(); bit != balls.end(); bit++) {
    if (bit->second > 0) {
      boxes[index] = bit->first;
      bit->second--;
      GenInstances(index+1);
      bit->second++;
    }
  }
}

int main()
{
  BallsInBoxes B;
  string s;
  int nb;

  nb = 0;
  while (cin >> s) {
    B.balls[s]++;
    nb++;
  }
  B.boxes.resize(nb);
  B.GenInstances(0);
}
Verify to yourself that it is giving you correct output in all of the examples below:
UNIX> echo a a b b | balls-in-boxes-1
a a b b
a b a b
a b b a
b a a b
b a b a
b b a a
UNIX> echo a a a a | balls-in-boxes-1
a a a a
UNIX> echo a b c | balls-in-boxes-1
a b c
a c b
b a c
b c a
c a b
c b a
UNIX> echo larry moe curly | balls-in-boxes-1
curly larry moe
curly moe larry
larry curly moe
larry moe curly
moe curly larry
moe larry curly
UNIX> echo B B B W | balls-in-boxes-1
B B B W
B B W B
B W B B
W B B B
UNIX>
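A quick way to verify the number of lines in each run: with n balls in total and b_i balls of color i, there are n!/(b_1! b_2! ...) distinct arrangements. The sketch below computes that count as a product of binomial coefficients; the function names are hypothetical, and it assumes the counts are small enough for a long long:

```cpp
#include <cstddef>
#include <vector>

// n choose k. After step i, r equals (n-k+i) choose i, so each division is exact.
long long choose(int n, int k)
{
  long long r = 1;
  for (int i = 1; i <= k; i++) r = r * (n - k + i) / i;
  return r;
}

// Number of distinct ways to put the balls into the boxes, where counts[i]
// is the number of balls of color i.
long long count_arrangements(const std::vector<int> &counts)
{
  int remaining = 0;
  for (std::size_t i = 0; i < counts.size(); i++) remaining += counts[i];

  long long total = 1;
  for (std::size_t i = 0; i < counts.size(); i++) {
    total *= choose(remaining, counts[i]);    // Pick the boxes for color i
    remaining -= counts[i];
  }
  return total;
}
```

For "a a b b" the counts are {2, 2}, which gives 6, and for "B B B W" they are {3, 1}, which gives 4.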
void BallsInBoxes::GenInstances(int index)
{
  int i;
  map <string, int>::iterator bit;

  if (index == boxes.size()) {
    cout << boxes[0];
    for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
    cout << endl;
    return;
  }

  for (bit = balls.begin(); bit != balls.end(); bit++) {
    boxes[index] = bit->first;
    bit->second--;
    if (bit->second == 0) {
      balls.erase(bit);
      bit = balls.end();
    }
    GenInstances(index+1);
    if (bit == balls.end()) {
      bit = balls.insert(make_pair(boxes[index], 1)).first;
    } else bit->second++;
  }
}
When a ball's second field goes to zero, I remove it from the map and then set bit to equal balls.end(). After the recursion, if bit equals balls.end(), then I re-insert the ball into the map. Yes, that balls.insert() call is ugly. On a map, the insert() method returns a pair -- an iterator to the element, and true if the element was actually inserted, or false if it was there already.
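To see what that pair looks like in isolation, here is a minimal sketch. The helper insert_was_new() is a hypothetical name of mine; the insert() call itself is the standard map method:

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: try to insert (key, value) into m and report whether
// a new element was created.
bool insert_was_new(std::map<std::string, int> &m, const std::string &key, int value)
{
  std::pair<std::map<std::string, int>::iterator, bool> result;

  result = m.insert(std::make_pair(key, value));
  // result.first is an iterator to the element with this key, whether or not
  // it was just inserted. result.second says whether the insertion happened.
  return result.second;
}
```

When the key is already present, insert() leaves the existing value alone, which is why the code above only calls it after the entry has been erased.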
Try this with six balls -- three colors, where b_i equals 2 for each of the three colors:
UNIX> echo a a b b c c | balls-in-boxes-2
a a b b c c
a a b c b c
a a b c c b
a a c b b c
a a c b c b
a a c c b b
....
You'll see that it runs forever, and eventually you see lines like "a b a c a b". Obviously, we have a bug, since there should be two a's and two c's. I'm not going to trace this bug in detail, because it's too hard. I'll simply tell you what's going on. Look at this snippet of the code:
  for (bit = balls.begin(); bit != balls.end(); bit++) {
    boxes[index] = bit->first;
    bit->second--;
    if (bit->second == 0) {
      balls.erase(bit);
      bit = balls.end();
    }
    GenInstances(index+1);
    if (bit == balls.end()) {
      bit = balls.insert(make_pair(boxes[index], 1)).first;
    } else bit->second++;
  }
The problem arises when you don't erase bit and then you make the recursive call. Suppose that bit points to the entry for "a" in the map. Later, in the recursion, you're going to delete the entry for "a", because its second field will be zero. When all the recursion returns, bit is an iterator to an erased element, and using it is undefined behavior. I wish we got a seg fault when we incremented bit->second. However, what typically happens is that the memory gets reused, and bit ends up pointing to a valid element, which may no longer be the one for "a".
This is a brutal bug, but one that can happen when you store iterators that may become deleted. It is similar to storing a pointer which is subsequently free()'d or deleted. I've fixed this in balls-in-boxes-3.cpp, although I'm not proud of it. Here's the relevant code snippet. After the recursive call, I make sure that bit points to the proper iterator:
  for (bit = balls.begin(); bit != balls.end(); bit++) {
    boxes[index] = bit->first;
    bit->second--;
    if (bit->second == 0) {
      balls.erase(bit);
      bit = balls.end();
    }
    GenInstances(index+1);
    if (bit == balls.end()) {
      bit = balls.insert(make_pair(boxes[index], 1)).first;
    } else {
      bit = balls.find(boxes[index]);     // Re-find the entry; the old iterator may be stale
      bit->second++;
    }
  }
I think the code for balls-in-boxes-1.cpp is far superior to the others. On the flip side, I don't think it's the best way to solve this. We should use a vector instead of a map; however, I don't think that it's worth the class or study time to go through that level of detail.
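For the curious, here is roughly what a vector-based recursive version might look like. This is only a sketch and not one of the numbered balls-in-boxes programs: the struct name VectorBalls is hypothetical, and I collect the arrangements in a results vector instead of printing them, so that they are easy to check. Since nothing is ever erased, there are no iterators to go stale:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct VectorBalls {                      // Hypothetical name
  std::vector<std::string> colors;        // The distinct colors
  std::vector<int> nballs;                // nballs[c] = unplaced balls of colors[c]
  std::vector<std::string> boxes;         // The boxes being filled in
  std::vector<std::string> results;       // One string per complete arrangement

  void GenInstances(std::size_t index)
  {
    if (index == boxes.size()) {          // Base case: record the arrangement
      std::string line;
      for (std::size_t i = 0; i < boxes.size(); i++) {
        if (i > 0) line += " ";
        line += boxes[i];
      }
      results.push_back(line);
      return;
    }
    for (std::size_t c = 0; c < colors.size(); c++) {
      if (nballs[c] == 0) continue;       // No balls of this color left
      boxes[index] = colors[c];
      nballs[c]--;                        // Place the ball
      GenInstances(index + 1);
      nballs[c]++;                        // Put it back
    }
  }
};
```

To use it, fill in colors and nballs, resize boxes to the total number of balls, and call GenInstances(0).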
#include <iostream>
#include <map>
#include <stack>
#include <string>
#include <vector>
using namespace std;

class BallsInBoxes {
  public:
    map <string,int> balls;
    vector <string> colors;
    vector <int> nballs;
    vector <string> boxes;
    void GenInstances();
};

void BallsInBoxes::GenInstances()
{
  stack <int> Stack;
  int index, color, i;

  Stack.push(-1);

  while (!Stack.empty()) {
    color = Stack.top();
    Stack.pop();
    index = Stack.size();

    // Base case -- if we're at the end of boxes, print it out and "return"

    if (index == boxes.size()) {
      cout << boxes[0];
      for (i = 1; i < boxes.size(); i++) cout << " " << boxes[i];
      cout << endl;
    } else {

      if (color != -1) {       // We have just finished enumerating with "color"
        nballs[color]++;
      }

      // Find the next color to enumerate.
      // Note how this works when color started at -1.

      for (color++; color < nballs.size() && nballs[color] == 0; color++) ;

      // If we still have a color to enumerate, put it into boxes, push
      // the color onto the stack, and push -1 on the stack to enumerate the next index.

      if (color < nballs.size()) {
        boxes[index] = colors[color];
        nballs[color]--;
        Stack.push(color);
        Stack.push(-1);
      }
    }
  }
}

int main()
{
  BallsInBoxes B;
  map <string, int>::iterator bit;
  string s;
  int nb;

  while (cin >> s) B.balls[s]++;

  nb = 0;
  for (bit = B.balls.begin(); bit != B.balls.end(); bit++) {
    B.colors.push_back(bit->first);
    B.nballs.push_back(bit->second);
    nb += bit->second;
  }
  B.boxes.resize(nb);
  B.GenInstances();
}
To make life easier, I've created the vectors colors and nballs. In this way, we can manage a stack of indices into the colors vector, which is less confusing than maintaining a stack of iterators of balls (I have that code in balls-in-boxes-5.cpp).
Concentrate on GenInstances(). It maintains a stack of indices of colors. Instead of making recursive calls, you push -1 on the stack. That's what we do initially. The stack contains indices of the colors that we're currently enumerating. Thus, the top of the stack is the color that we're enumerating in boxes[Stack.size()-1]. Our main processing loop therefore removes this element from the top of the stack and sets index to the new Stack.size().
If that element is -1, then we're starting a new enumeration. If that element is something else, then we've just finished enumerating that color, so we increment its entry in nballs to put the ball back before we enumerate a new color. In either case, we now need to find the next color to enumerate, which is done by the for loop:
      for (color++; color < nballs.size() && nballs[color] == 0; color++) ;
If, after executing this for loop, color is equal to nballs.size(), then we're done enumerating this index. Otherwise, we need to enumerate the next color, which we do with the following code:
        boxes[index] = colors[color];
        nballs[color]--;
        Stack.push(color);
        Stack.push(-1);
In this way, we mimic the recursive version of the program, only instead of actually performing recursion, we simply manage the stack. As always, if you are confused by this, copy it, put in some print statements, and trace it. Frankly, I think that the recursive version of this is much easier to both read and write, but this is a good exercise in seeing how a stack can replace recursion.
A final piece of code is in balls-in-boxes-5.cpp. This code also manages a stack, but instead of managing integers, it manages iterators into balls. Where -1 was in the code previously, we use balls.end(). Try to trace through it -- it is very similar to balls-in-boxes-4.cpp, except you need to be unafraid of iterators!