SRM 689, D2, 250-Pointer (SimilarUserDetection)
James S. Plank
Thu Mar 30 13:17:56 EDT 2017
Sat Oct 21 10:53:09 EDT 2023
In case Topcoder's servers are down
Here is a summary of the problem:
- Topcoder usernames, or "handles", are composed of upper-case letters, lower-case letters,
and numbers.
- Two handles are considered "similar" if they are the same, with the following exceptions:
- If one has an '0' where the other one has an 'O', then they are similar.
- If one has an 'l', '1' or 'I' where the other one has
an 'l', '1' or 'I', then they are similar.
- For example, "II11ll00OO" is similar to "I1lI1l0O0O".
- You are given a vector of strings called handles, which will be between 2 and 50
elements in size. Each of these is a string that is between 2 and 50 elements in size.
- If any pair of strings in handles is "similar" to each other, return the
string "Similar handles found".
- Otherwise, return "Similar handles not found".
Examples
# Input vector handles Return Value
- ----- ------ ------- ------ -----
0 {"top", "coder", "TOPCODER", "TOPC0DER"} "Similar handles found"
1 {"Topcoder", "topcoder", "t0pcoder", "topcOder"} "Similar handles not found"
2 {"same", "same", "same", "different"} "Similar handles found"
3 {"0123", "0I23", "0l23", "O123", "OI23", "Ol23"} "Similar handles found"
4 {"i23", "123", "456", "789", "000", "o", "O"} "Similar handles not found"
Testing yourself
I have a shell script
tests.sh
, that you can use to test your program. When you
run tests.sh, your answer should be identical to
answers.txt.
Hints
Try this on your own. If the maximum length of a string is s, and the number of
strings is n, then your solution should be O(sn).
I will give you more explanation below.
The simple way to attack this problem is to convert each string to a canonical form.
If two strings are "similar," then their canonical forms should be identical. That makes
it easy to test for similarity.
What I did was convert each character '0' to 'O', each '1' to 'l' and each 'I' to 'l'.
That does the trick -- after the conversion, two strings that were similar will be the same.
Simply insert each string into an unordered_set, and test the unordered_set's size.
If there were duplicates, then the unordered_set will be smaller than the vector handles.
Can you calculate the running time I gave above? You are doing n insertions into
an unordered_set. That will be O(n). Each comparison of two strings will take
potentially s comparisons of letters. That gives you the factor of s:
O(sn).