SRM 583, D2, 250-Pointer (SwappingDigits)

James S. Plank (with help from Allen McBride)

Sat Feb 15 10:45:08 EST 2014 (Revised Mon Sep 30 12:13:37 EDT 2019)

In case Topcoder's servers are down

Please use the workflow in the problem Cryptography.

Here is a summary of the problem:

The examples

Example  Input String             Answer
0        "596"                    "569"
1        "93561"                  "13569"
2        "5491727514"             "1491727554"
3        "10234"                  "10234"
4        "93218910471211292416"   "13218910471211292496"

I am going to present three ways to solve this problem. One is very easy to write, and will run easily within Topcoder's time limits (but not on much larger inputs). The other two require more thought, but end up with programs that are better. Go ahead and program up the easy one and submit/test it. Then, try one of the other ways. It's good programming and thought practice. You should be able to analyze all of them to determine that the second and third ways are better than the first.

The Easy and Inefficient Way

If you represent two numbers as strings with the same number of characters, then comparing the strings is equivalent to comparing the numbers. So, the easiest thing to do is to try all combinations of i and j, swapping their digits, and returning the smallest number, while discarding numbers that begin with zero.

Enumerating all combinations of i and j takes roughly n^2 operations, so this is not a good way to solve the problem in general. However, the Topcoder constraints limit the string to 50 characters, and 50*50 is a pretty small number, so it works easily.
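The easy approach can be sketched as follows. This is a minimal illustration in Python, not a reference solution; the function name smallest_by_brute_force is hypothetical, and the sketch assumes the input is a string of decimal digits with no leading zero and that leaving the string unswapped is allowed (as example 3 suggests).

```python
def smallest_by_brute_force(num: str) -> str:
    """Try every pair (i, j) with i < j and keep the smallest valid result.

    Hypothetical helper for illustration; assumes num has no leading zero.
    """
    best = num  # making no swap at all is one of the candidates
    digits = list(num)
    n = len(digits)
    for i in range(n):
        for j in range(i + 1, n):
            digits[i], digits[j] = digits[j], digits[i]
            # Discard candidates that begin with zero. The remaining ones are
            # compared as strings, which matches numeric order because every
            # candidate has the same length.
            if digits[0] != "0":
                candidate = "".join(digits)
                if candidate < best:
                    best = candidate
            digits[i], digits[j] = digits[j], digits[i]  # undo the swap
    return best
```

Note that building and comparing each candidate string costs time proportional to n as well, so the n^2 figure counts pairs tried rather than individual character operations.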

In, you can do the first 58 problems in under a second. The 59th, and last problem is roughly 9600 characters, and took my program roughly 9 seconds to do. You can test your solution on the first 58 problems with:

UNIX> head -n 58 | sh

The Harder and Efficient Way

To solve this problem more efficiently, think about conditions when you swap digits. In particular: If you're still stuck, I'll answer those questions:

This gives us a strategy for solving the problem, but if we program it in the most natural way, the program still runs in n^2 operations. The most "natural" way is to work from the definition: start i at zero and have it go to the end of the string. For each value of i, you look at each value to the right of i and find the minimum, rightmost value (when i equals zero, you exclude the digit zero). If this value is less than the digit in i, then you swap those digits and return. If the value is greater than or equal to the digit in i, then you increment i and try again.

If you think about it, when your input is a non-decreasing sequence of digits, this technique still uses n^2 operations. Why? Because at iteration i, you look at all of the digits whose indices are greater than i, and there are roughly n-i of those (where the number has n digits).
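To make the nested loops concrete, here is a minimal sketch of that "natural" version in Python. The function name smallest_by_scanning is hypothetical, and the sketch assumes a digit string with no leading zero.

```python
def smallest_by_scanning(num: str) -> str:
    """For each i, rescan everything to the right of i (the n^2 version).

    Hypothetical helper for illustration; assumes num has no leading zero.
    """
    n = len(num)
    for i in range(n):
        # Find the rightmost occurrence of the minimum digit to the right of i.
        # When i is zero, skip zeros so the result cannot start with '0'.
        best_j = -1
        for j in range(i + 1, n):
            if i == 0 and num[j] == "0":
                continue
            if best_j == -1 or num[j] <= num[best_j]:  # <= keeps the rightmost
                best_j = j
        if best_j != -1 and num[best_j] < num[i]:
            digits = list(num)
            digits[i], digits[best_j] = digits[best_j], digits[i]
            return "".join(digits)
    return num  # no swap makes the number smaller
```

On a non-decreasing input such as "1123456789", the inner loop runs to the end of the string for every i and no swap is ever made, which is the quadratic case described above.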

How can we fix this? By thinking clearly and organizing our code so that we are not doing unnecessary, nested for loops. Interestingly, I solved it in one way, and Allen McBride (who TA'd the class many years ago) solved it another. I'll describe both. Allen's solution is better.

The Plank Solution

The Plank solution relies on the fact that there are only ten digits in our number. So, let's keep a vector v with ten elements. v[i] holds the rightmost index of digit 'i' in the string. If digit 'i' doesn't occur, then v[i] is -1. In example 1, where the string is "93561", the vector is

i      0   1   2   3   4   5   6   7   8   9
v[i]  -1   4  -1   1  -1   2   3  -1  -1   0

Now, you run through the string as before, but finding the minimum value to the right of digit i simply requires you to run through the vector, whose size is limited to ten elements. Instead of requiring n^2 operations in the worst case, it is linear!

When I ran this solution on the test inputs, the entire thing ran in 0.16 seconds.
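The digit-vector idea can be sketched like this in Python. It is one possible reading of the description above rather than the original program; the names rightmost_index_table and smallest_with_digit_table are hypothetical, and the input is assumed to be a decimal digit string with no leading zero.

```python
def rightmost_index_table(num: str) -> list:
    """Return v, where v[d] is the rightmost index of digit d, or -1."""
    v = [-1] * 10
    for index, ch in enumerate(num):
        v[int(ch)] = index  # later occurrences overwrite earlier ones
    return v


def smallest_with_digit_table(num: str) -> str:
    """Use the ten-element table instead of rescanning the string.

    Hypothetical helper for illustration; assumes num has no leading zero.
    """
    v = rightmost_index_table(num)
    for i, ch in enumerate(num):
        # Try candidate digits from smallest to largest, stopping below num[i].
        # At index zero, start from digit 1 so no leading zero is created.
        first_digit = 1 if i == 0 else 0
        for d in range(first_digit, int(ch)):
            if v[d] > i:  # digit d occurs somewhere to the right of i
                digits = list(num)
                digits[i], digits[v[d]] = digits[v[d]], digits[i]
                return "".join(digits)
    return num
```

For "93561", rightmost_index_table returns the same vector as the table above, and the first iteration (i equals zero, digit '9') finds v[1] equal to 4, so indices 0 and 4 are swapped.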

The McBride Solution

In this solution, you work from the end of the string to the beginning of the string, and you maintain three variables:

Let's give an example that is similar to example 3: "10423". You should be able to see that the answer will be "10243". Here are the variables as you run through the string from right to left:

i      min_digit  min_nonzero  lpos
Start  '0'+10     '0'+10       -1
4      '3'        '3'          -1
3      '2'        '2'          -1
2      '2'        '2'           2
1      '0'        '2'           2
0      '0'        '1'           2

In that last iteration, we don't set lpos equal to zero because, at index zero, the digit must be greater than min_nonzero for lpos to change, and '1' is not greater than '2'.

Now, when you're done, you are going to swap the digit at index lpos. Obviously, if lpos equals -1, you simply return the original string. If lpos equals zero, you want to find the rightmost minimum, non-zero digit, and swap with that. Otherwise, you want to find the rightmost, minimum digit that is to the right of lpos, and swap with that. In the example above, we start at index 2 (whose digit is '4'), and find the rightmost, minimum digit to the right of it -- that's the '2', which is at index 3. Swap the '2' and the '4', and you have your answer.

This solution is better than the Plank solution, because it doesn't require you to traverse the entire string when the problem structure is good, and it doesn't depend on a vector of digits like the Plank solution. Very nice, Allen!

When I programmed this up, it also ran in 0.16 seconds.
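Here is a minimal Python sketch of the right-to-left approach, following the variable names in the table above (min_digit, min_nonzero, lpos). The function name smallest_right_to_left is hypothetical. This sketch always scans the whole string and does not attempt any early exit, so it illustrates the bookkeeping rather than the best-case running time.

```python
def smallest_right_to_left(num: str) -> str:
    """One right-to-left pass to find lpos, then one pass to find its partner.

    Hypothetical helper for illustration; assumes num has no leading zero.
    """
    sentinel = chr(ord("0") + 10)  # the '0'+10 start value: above every digit
    min_digit = sentinel    # smallest digit seen so far, to the right of i
    min_nonzero = sentinel  # smallest non-zero digit seen so far
    lpos = -1               # leftmost index so far whose digit should be swapped

    for i in range(len(num) - 1, -1, -1):
        ch = num[i]
        # Index zero may only be swapped with a non-zero digit.
        threshold = min_nonzero if i == 0 else min_digit
        if ch > threshold:
            lpos = i
        if ch < min_digit:
            min_digit = ch
        if ch != "0" and ch < min_nonzero:
            min_nonzero = ch

    if lpos == -1:
        return num  # nothing to the right is smaller anywhere

    # Find the rightmost minimum digit to the right of lpos
    # (skipping zeros when lpos is zero).
    best_j = -1
    for j in range(lpos + 1, len(num)):
        if lpos == 0 and num[j] == "0":
            continue
        if best_j == -1 or num[j] <= num[best_j]:
            best_j = j
    digits = list(num)
    digits[lpos], digits[best_j] = digits[best_j], digits[lpos]
    return "".join(digits)
```

On "10423" the first loop ends with lpos equal to 2, and the second loop picks index 3, which gives "10243".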

When you program up one of these two solutions, test it with

Running Times

Let's make this general. We'll define the following quantities:

The easy and inefficient solution: This is O(n^2), plain and simple.

The Plank solution: I traverse the entire string to create my digit vector v, whose size is d. So my solution is O(n + d). Why the distinction? Well, suppose I'm working in base 100, but my string has two digits. Then, the d term dominates.

The McBride solution: Determining min_digit and min_nonzero is O(m+z) -- that's the number of elements that you look at to find min_digit or min_nonzero, and as always, the addition operator in the big-O equation stands for "either-or, depending on which is bigger". Finding lpos is O(l). So the running time of the overall solution is O(m+z+l). You'll note, this can be O(n) in the worst case, but if I were to generate the input randomly, and n is large, then it would be O(d). Why? Because on average we would find the rightmost 0 and 1 within the last d characters, and lpos would be a very small number.