SRM 597, D2, 250-Pointer (LittleElephantAndDouble)
James S. Plank
Sun Aug 28 14:14:17 EDT 2016
In case Topcoder's servers are down
- You are given a vector of integers with 1 to 50 elements.
- Each integer is between 1 and 1,000,000,000.
- Your goal is to double each integer some number of times until all of the integers
equal each other.
- Return "YES" if that's possible, and "NO" if it's not.
Examples:
- Example 0: {1, 2} - "YES"
- Example 1: {1, 2, 3} - "NO"
- Example 2: {4, 8, 2, 1, 16} - "YES"
- Example 3: {94, 752, 94, 376, 1504} - "YES"
- Example 4: {148, 298, 1184} - "NO"
- Example 5: {7, 7, 7, 7} - "YES"
Testing yourself: The main.cpp file will generate random input with a vector size of
n if you call it with a command line argument of n that is > 5. As always,
there is a testing script in tests.sh, and the output should match answers.txt.
It should take under 5 seconds to run tests.sh
Three Solutions
I'm going to give you three solutions along with their running times.
All three will pass Topcoder's time limit, largely because the constraints are
so small. The second solution is the fastest in some circumstances, and the
third one is fastest in others, even though the Big-O complexity of the second is
the best.
Solution #1 -- Check each pair
(In Solution-1.cpp.)
One way to do this is to check each pair of elements and make sure that
you can create the larger one by doubling the smaller one successively.
I did that with a simple procedure:
bool check(int x, int y)
{
if (x > y) return check(y, x);
while (x < y) x *= 2;
return (x == y);
}
|
Ask yourself two questions about this procedure:
- Will it be slow? The answer is no, because you can double at most
30 times before you hit 109 (which roughly equals 230).
So this loop will execute for a maximum of 30 iterations.
- Should I be worried about integer overflow? As it turns out, no. The
maximum integer is (231-1), which is roughly 2,000,000,000. A little
larger, actually. The maximum value that you'll get to double is 999,999,999,
and that doubled is less than 2,000,000,000. You're good.
This solution is bad because
the loop that tests each pair does way too much work. When the answer is "YES,
it will check O(n2) pairs, which makes the running time
O(n2).
Solution #2 -- Remove the 2's and check each answer against A[0]
(In Solution-2.cpp.)
Rather then doubling elements to see if they equal each other, you can simply divide
out the two's -- while a number is even, divide it by two -- then all of the elements
of A should equal each other. The second solution does this, and then checks
each element against A[0]].
The running time complexity of this is O(n), because you are doing a loop
that executes at most 30 times (which is O(1)) for each element of the vector.
Solution #3 -- Sort and check adjacent elements
(In Solution-3.cpp.)
This solution starts by sorting the vector. Now each element is less than or equal to
the next element, so you can double it until it is greater than or equal to the next
element. If it is greater, then you return "NO". Otherwise, you move onto the next
element.
The running time complexity of this is O(n log n) for the sort, and then O(n)
for the testing. Since
O(n log n) is bigger than O(n), the complexity is O(n log n)
Interestingly, when you time solutions 2 and 3, there are times when each outperforms the
other:
UNIX> cp Solution-2.cpp LittleElephantAndDouble.cpp # This is how you compile and run Solution 2
UNIX> g++ main.cpp
UNIX> time ./a.out 2000000 # 0.93 seconds for 2,000,000
NO
0.093u 0.005s 0:00.10 90.0% 0+0k 0+0io 0pf+0w
UNIX> time ./a.out 3000000 # 2.47 seconds for 3,000,000
YES
0.247u 0.009s 0:00.26 92.3% 0+0k 0+0io 0pf+0w
UNIX> cp Solution-3.cpp LittleElephantAndDouble.cpp
UNIX> g++ main.cpp
UNIX> time ./a.out 2000000 # Slower for 2,000,000
NO
0.115u 0.005s 0:00.12 91.6% 0+0k 0+0io 0pf+0w
UNIX> time ./a.out 3000000 # Faster for 3,000,000
YES
0.172u 0.008s 0:00.18 94.4% 0+0k 0+0io 0pf+0w
UNIX>
Can you think of why this is so? Here are my answers:
When is Solution 2 faster? This is when the answer is "NO" and you can exit the
loop early. In this case, the solution is O(index of the first mismatch) rather
than O(n). If the mismatch occurs at the second element, then it exits pretty
much instantly. You'll note that Solution 3 starts by sorting the vector, so it always
does the O(n log n) work.
When is Solution 3 faster? When the answer is "YES". This goes in the face of the
big-O analysis, doesn't it? Let's think about the work performed by both solutions:
- Solution 2 has to divide out the twos for every element. The way the input is generated,
this is an is an average of 12 times per element.
- Solution 3 has to sort the vector, and that is O(n log n), but it is implemented
efficiently. Then, think about the data when the input is large and the answer is "YES" -- there
are at most 30 different values that can be in the vector. Since the vector is sorted, most
adjacent values are equal to each other -- there will be very little doubling. So the
loop that traverses the vector is probably 12 times faster than the loop for Solution 2.
That makes up for the extra time in sorting. Were we to keep scaling n, I would
anticipate that solution 2 would be faster in all cases.