double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2); |
Both arrays are sorted, and have values from -1,000,000 to 1,000,000. Neither is empty. Leetcode bounds the arrays at a total of 3000 elements, but my final test has that total be 1,500,000.
Your program should determine the median of all of the elements in both nums1 and nums2. If the total number of elements is odd, then this is just the median element. If the total number of elements is even, then you have two "medians." Return their average.
Let's look at two examples. The first is in the file ex0.txt:
-9 -6 -3 8 4 5 6 |
In these examples, num1 is in the first line, and num2 is in the second. Since the are 7 total values, there is one median: 4.
Now ex1.txt:
-7 -5 -3 6 -9 -8 -8 0 3 5 9 9 |
There are 12 values, so there are two medians: 0 and -3. You return their average: -1.5.
Your solution must be O(log n), where n is the total number of elements.
If you enter "RTEST" into the program instead of two lists of numbers, then it will create two random arrays, one with 1,000,000 and one with 500,000. It then calls the method and times it. It has to run in under 0.0001 seconds.
So here's the approach that I eventually took. You are going to do a binary search on the median. You'll start with low being a value smaller than the smallest value, and high being a value greater than the largest value. And you do a binary search to find the median. Let's just worry about situations where there is just one median. At each stage in the binary search, you have m a value that you will test to be the median. It will be in the middle of low and high. I tested with the following method:
int am_i_median(int m, vector<int>& nums1, vector<int>& nums2); |
I had this return one of 6 values:
void indices(int m, vector <int> &n, int &l, int &e, int &h); |
What this method does is set three values:
Finally, I wrote findMedianSortedArrays(). I started with odd totals, because that was easier, and it only requires one simple binary search.
With even totals, you have to deal with finding the low median and the high median. What I did was this -- if I found the high median first, then I set low back to its minimum value, and high to the median minus 2. And then I continued the binary search until I found the low median. You have to handle the case when am_i_median() returns 3 differently, depending on whether you are looking for the low median or the high median. Think about it.
When you get this working, you'll be happy, but you won't be done -- it's O(n log (max-min)), so it's too slow. So:
So the total speed is O((log n)(log (max-min))). I know that's not technically O(log n), but with each log capped at 21 (the log of 2,000,000 is 21), you're good.
I won't rob you the joy of figuring all of this out yourself, so there's no solution here. Have fun!!