Leetcode.com problem 4: "Median-Of-Two-Sorted-Arrays"

James S. Plank

Fri Aug 5 07:12:15 PM EDT 2022

In case Leetcode's servers are down

You are to write a method in the class Solution with the following prototype:

double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2);

Both arrays are sorted, and have values from -1,000,000 to 1,000,000. Neither is empty. Leetcode bounds the arrays at a total of 3000 elements, but my final test has that total be 1,500,000.

Your program should determine the median of all of the elements in both nums1 and nums2. If the total number of elements is odd, then this is just the median element. If the total number of elements is even, then you have two "medians." Return their average.

Let's look at two examples. The first is in the file ex0.txt:

-9 -6 -3 8
4 5 6

In these examples, num1 is in the first line, and num2 is in the second. Since the are 7 total values, there is one median: 4.

Now ex1.txt:

-7 -5 -3 6 
-9 -8 -8 0 3 5 9 9 

There are 12 values, so there are two medians: 0 and -3. You return their average: -1.5.

Your solution must be O(log n), where n is the total number of elements.


Testing, and the driver main()

In main(), you may enter two lines of numbers. It will sort them, make the call, and then verify that it is correct. It prints 1 if it is correct, and 0 if it is not correct.

If you enter "RTEST" into the program instead of two lists of numbers, then it will create two random arrays, one with 1,000,000 and one with 500,000. It then calls the method and times it. It has to run in under 0.0001 seconds.


An approach

Please feel free to think about this one for a while, and try to solve it on your own. I had a very difficult time coming up with a good solution. I tried do to something that involved looking at the medians of both arrays and then trimming them down. It never worked, and the code was icky.

So here's the approach that I eventually took. You are going to do a binary search on the median. You'll start with low being a value smaller than the smallest value, and high being a value greater than the largest value. And you do a binary search to find the median. Let's just worry about situations where there is just one median. At each stage in the binary search, you have m a value that you will test to be the median. It will be in the middle of low and high. I tested with the following method:

int am_i_median(int m, vector<int>& nums1, vector<int>& nums2);

I had this return one of 6 values:

To write this, I wrote yet another method:

void indices(int m, vector <int> &n, int &l, int &e, int &h);

What this method does is set three values:

I wrote this in a linear fashion to start -- with just a for() loop that counts. Then I wrote am_i_median() -- you can calculate each of the outcomes above from the l, e and h values that you get from calling indices twice -- once on nums1 and once on nums2. You may have to scratch your head once or twice figuring this out. Use examples.

Finally, I wrote findMedianSortedArrays(). I started with odd totals, because that was easier, and it only requires one simple binary search.

With even totals, you have to deal with finding the low median and the high median. What I did was this -- if I found the high median first, then I set low back to its minimum value, and high to the median minus 2. And then I continued the binary search until I found the low median. You have to handle the case when am_i_median() returns 3 differently, depending on whether you are looking for the low median or the high median. Think about it.

When you get this working, you'll be happy, but you won't be done -- it's O(n log (max-min)), so it's too slow. So:


Making indices() O(log n)

To make this fast enough, you need to make indices() O(log n) instead of O(n). What I did was two binary searches -- the first one finds l and the second one finds l+e. From those two values, you can calculate l, e and h.

So the total speed is O((log n)(log (max-min))). I know that's not technically O(log n), but with each log capped at 21 (the log of 2,000,000 is 21), you're good.

I won't rob you the joy of figuring all of this out yourself, so there's no solution here. Have fun!!