SRM 605, D2, 250-Pointer (AlienAndPassword)
James S. Plank
Sun Jul 21 09:39:21 EDT 2019
In case Topcoder's servers are down
(Please use the workflow in the problem Cryptography).
Here is a summary of the problem:
- You are given a string S.
- Your job will be to delete one character from S to obtain a potential password.
- Return how many potential passwords that you can generate in this way.
- Topcoder constrains S to be between 1 and 50 characters, but I am going
to constrain them to be between 1 and 100,000 characters.
The examples
Example |
S |
Answer |
0 |
"A" |
1 |
1 |
"ABA" |
3 |
2 |
AABACCCCABAA |
7 |
3 |
"AGAAGAHHHHFTQLLAPUURQQRRRUFJJSBSZVJZZZ" |
26 |
4 |
"ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ" |
1 |
Testing yourself
I have a shell script
tests.sh, that you can use to test your program. When you
run tests.sh, your answer should be identical to
answers.txt. The shell script should take roughly a second.
I have a second shell script called
time-tests.sh -- run this to see timings of each group
of 20 tests from tests.sh. The input size in of these groups
grows by a factor of 10 from the previous group.
Hints
Try this on your own before reading further. Even if you run this using topcoder, and
have it pass the system test, please compile and run it on your own machine, and see how
long tests.sh and
time-tests.sh take.
The topcoder constraints on this problem are so small that pretty much any solution will
work. I'm sure that's why the correct percentage here is so high.
In BadSolution.cpp, I wrote up about the
worst solution that I could think of (that wasn't really trying to be bad).
It passed Topcoder's system tests easily. Let's take a look -- the comments
explain the solution.
/* find() is a procedure that finds string s in vector v.
If s is in v, find() returns the index of s.
Otherwise, it returns -1 */
int find(vector <string> v, string s)
{
int i;
for (i = 0; i < v.size(); i++) if (v[i] == s) return i;
return -1;
}
/* Solve the problem by creating each potential password,
and then trying to find it in a list of potential
passwords. If it's not there, add it to the list, and
then return the list's size. */
int AlienAndPassword::getNumber(string S)
{
vector <string> pot;
string p;
int i;
for (i = 0; i < S.size(); i++) {
p = S.substr(0,i) + S.substr(i+1);
if (find(pot, p) == -1) pot.push_back(p);
}
return pot.size();
}
|
This solves the problem directly by generating each potential password, and then searching
for it in a list of potential passwords. Why is this bad? Let's analyze it, when S
has n characters, and all potential passwords are different:
- find() is potentially O(n2), because the loop iterates O(n)
times, and the string comparison is O(n). Because you don't use reference variables,
It's an additional O(n) to copy v. However,
O(n2)+O(n2) is still O(n2).
- In getNumber(), the loop iterates O(n) times. Creating p is
O(n) as well, but that's not really important, because as mentioned above,
calling find() is
O(n2)+O(n2).
- For that reason, this program is O(n3).
This took 9 minutes on tests 61-80 (no compiler optimization).
I didn't bother timing tests 81-100, as they project to take about 150 hours!
You can use a set instead of a vector, and get this down to O(n2 log(n))
(which took 12 seconds on tests 81-100).
However, ask yourself some questions to prove when two potential passwords will be the same.
Let's take as an example, S = "AABBCA".
- If you delete character S[i] and character S[j], and they are different,
will the two potential passwords be the same or different? For example, if you delete
an 'A', regardless of which one it is, the resulting password will always be different
from when you delete a 'B'.
- If you delete character S[i] and character S[j], and they are the
same, but there is a different character between them,
will the two potential passwords be the same or different? For example, deleting the
first 'A' in "AABBCA" will give you a different password than deleting the last one.
- This means that we can focus on characters that are the same, and there are no different
characters between them. In this case, deleting the characters will yield the same
password.
Now we have a strategy. If a character is the same as the previous character, then deleting
it will yield the same password as deleting the previous character. If it is different, then
it will yield a different password. So count the number of characters which are the same as
previous characters, subtract that total from S.size(), and you're done. O(n),
and 0.86 seconds for tests 81-100!