Example program #5: PageNumbers

Latest Revision: Wed Nov 16 09:39:48 EST 2016

This is from the 2009 Topcoder Algorithm Qualifier, Round 2, 500-pointer.

Problem Statement.

This was a tricky one: only 20 percent of the coders in the qualification tournament solved it. It shows the power of thinking recursively, and of dynamic programming.

The simplest solution would be to loop from 1 to *N*, call
**sprintf()** and add up the digits. The problem with that solution is that
it is linear in *N*, and *N* can be as big as 1,000,000,000. So that
won't do.
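
For reference, here is a minimal sketch of that brute-force approach. It peels digits off with `%` and `/` rather than **sprintf()**, which is a substitution made for this sketch, and the name `bruteForceCounts` is hypothetical. It is far too slow for large *N*, but it is handy for checking a faster solution on small inputs.

```cpp
#include <cassert>
#include <vector>
using namespace std;

/* Brute-force reference (hypothetical helper): tally every digit of
   every page from 1 to N.  The work grows linearly with N, so this is
   only practical for small N. */
vector<long long> bruteForceCounts(int N)
{
  assert(N >= 0);
  vector<long long> counts(10, 0);

  for (int page = 1; page <= N; page++) {
    /* Peel off the digits of page, least significant first. */
    for (int p = page; p > 0; p /= 10) counts[p % 10]++;
  }
  return counts;
}
```

For example, `bruteForceCounts(11)` tallies one 0, four 1s, and one each of 2 through 9.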

What we're going to do is structure our recursion on the first digit of the number.
Our base case will be when *N* is a one-digit number. We can solve that case
directly. In fact, let's do that first.

As with the other problems in these lecture notes, I solve this incrementally with a number
of programs. The first is
**page-numbers-1.cpp**. I have a **main()** in this
file which reads *N* from the command line, and then uses the Topcoder structure to
call **PageNumbers::getCounts()**:

    #include <string>
    #include <vector>
    #include <iostream>
    #include <cstdio>
    #include <cstdlib>
    using namespace std;

    class PageNumbers {
      public:
        vector <int> getCounts(int N);
    };

    vector <int> PageNumbers::getCounts(int N)
    {
      vector <int> rv;
      int i;

      /* We're only solving the base case -- when N is a one-digit number. */

      if (N < 10) {
        rv.resize(10, 0);
        for (i = 1; i <= N; i++) rv[i] = 1;
        return rv;
      }

      printf("We haven't solved the problem for N >= 10 yet.\n");
      return rv;
    }

    /* Our main() reads N from the command line, and calls getCounts().
       It prints the return vector. */

    int main(int argc, char **argv)
    {
      int i;
      PageNumbers c;
      int N, d;
      vector <int> retval;

      if (argc != 2) { fprintf(stderr, "usage: PageNumbers N\n"); exit(1); }

      N = atoi(argv[1]);
      retval = c.getCounts(N);
      if (retval.size() == 0) exit(0);
      printf("Answer:");
      for (i = 0; i < retval.size(); i++) printf(" %d", retval[i]);
      cout << endl;
      exit(0);
    }

We test it out, and it looks good. That's good for our self-esteem:

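    UNIX> make page-numbers-1
    g++ -o page-numbers-1 page-numbers-1.cpp
    UNIX> page-numbers-1 0
    Answer: 0 0 0 0 0 0 0 0 0 0
    UNIX> page-numbers-1 3
    Answer: 0 1 1 1 0 0 0 0 0 0
    UNIX> page-numbers-1 9
    Answer: 0 1 1 1 1 1 1 1 1 1
    UNIX> page-numbers-1 10
    We haven't solved the problem for N >= 10 yet.
    UNIX>

Now, suppose *N* is 10 or greater. We'll work with four values derived from *N*: **first_digit** (the first digit of *N*), **digits** (the number of digits in *N*), **middle_number** (the first digit of *N* followed by zeros), and **remainder** (*N* minus **middle_number**).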

Let's give an example. Suppose *N* is 3659. Then **first_digit** is 3, **digits** is 4, **middle_number** is 3000, and **remainder** is 659.

Let's write the code to set these variables. That is in
**page-numbers-2.cpp**. Here's **getCounts()**.

    vector <int> PageNumbers::getCounts(int N)
    {
      vector <int> rv;
      int i;
      char buf[20];
      string n_str;
      int first_digit;      /* The first digit of N. */
      int digits;           /* The number of digits in N. */
      int middle_number;    /* This number has the same first digit of N, followed by zeros. */
      int remainder;        /* This is (N-middle_number). */

      /* Base case -- when N is a single-digit number. */

      if (N < 10) {
        rv.resize(10, 0);
        for (i = 1; i <= N; i++) rv[i] = 1;
        return rv;
      }

      /* Convert N to a string using sprintf(). */

      sprintf(buf, "%d", N);
      n_str = buf;

      /* Now calculate first_digit, digits, middle_number and remainder. */

      first_digit = n_str[0] - '0';
      digits = n_str.size();
      for (i = 1; i < digits; i++) n_str[i] = '0';
      middle_number = atoi(n_str.c_str());
      remainder = N - middle_number;

      /* Print them out and exit. */

      printf("First digit = %10d\n", first_digit);
      printf("Digits = %10d\n", digits);
      printf("Middle number = %10d\n", middle_number);
      printf("Remainder = %10d\n", remainder);

      return rv;
    }

As you can see, I used **sprintf()** to convert *N* to a string, and then
**atoi()** to create **middle_number** from the string. You could use stringstreams
to do this, or you could use div and mod. It's up to you.
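
Here is a sketch of the div-and-mod alternative. The `Split` struct and the `splitNumber()` function are hypothetical names used only for illustration; they are not part of the lecture's files.

```cpp
#include <cassert>

/* Hypothetical container for the four values described in the text. */
struct Split {
  int first_digit;
  int digits;
  int middle_number;
  int remainder;
};

/* Compute the four values with integer division instead of strings. */
Split splitNumber(int N)
{
  assert(N >= 1);
  Split s;
  int power = 1;              /* Becomes the largest power of 10 that is <= N. */

  s.digits = 1;
  while (N / power >= 10) {   /* Dividing N avoids overflowing power. */
    power *= 10;
    s.digits++;
  }
  s.first_digit = N / power;
  s.middle_number = s.first_digit * power;
  s.remainder = N - s.middle_number;    /* The same value as N % power. */
  return s;
}
```

For *N* = 3659, this yields a first digit of 3, 4 digits, a middle number of 3000 and a remainder of 659, the same values as the string-based code.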

Again, we test it and see that all is as it should be:

    UNIX> make page-numbers-2
    g++ -o page-numbers-2 page-numbers-2.cpp
    UNIX> page-numbers-2 3659
    First digit = 3
    Digits = 4
    Middle number = 3000
    Remainder = 659
    Answer:
    UNIX> page-numbers-2 987654321
    First digit = 9
    Digits = 9
    Middle number = 900000000
    Remainder = 87654321
    Answer:
    UNIX> page-numbers-2 10
    First digit = 1
    Digits = 2
    Middle number = 10
    Remainder = 0
    Answer:
    UNIX>

Now, we're going to split our problem into three cases:

- Calculate the page numbers for pages from 1 to (**middle_number**-1).
- Calculate the page numbers for **middle_number**.
- Calculate the page numbers for pages (**middle_number**+1) to *N*.

The third one is a little trickier, so let's solve the second one first and test it.
That code is in **page-numbers-3.cpp**:

      /* Calculate the answer for middle_number and return it. */

      rv.resize(10, 0);
      rv[first_digit]++;
      for (i = 0; i < digits-1; i++) rv[0]++;
      return rv;
    }

We test it, and all looks good:

    UNIX> make page-numbers-3
    g++ -o page-numbers-3 page-numbers-3.cpp
    UNIX> page-numbers-3 3659
    First digit = 3
    Digits = 4
    Middle number = 3000
    Remainder = 659
    Answer: 3 0 0 1 0 0 0 0 0 0
    UNIX> page-numbers-3 987654321
    First digit = 9
    Digits = 9
    Middle number = 900000000
    Remainder = 87654321
    Answer: 8 0 0 0 0 0 0 0 0 1
    UNIX> page-numbers-3 10
    First digit = 1
    Digits = 2
    Middle number = 10
    Remainder = 0
    Answer: 1 1 0 0 0 0 0 0 0 0
    UNIX>

OK, let's do the hard case: solving the problem for the pages from (**middle_number**+1) to *N*.

These pages make up a subproblem that is almost like **getCounts()**. You
want to count the digits for all of the pages from 1 to **remainder**, but
you need to include leading zeros. Think about the case where *N* is 1002.
Then **remainder** is 2, and you want to solve the subproblem for
pages 1001 to 1002. You'll do that by adding two '1' digits, and then you'd like
to call **getCounts(2)**. However, you also need the four zeros in 1001 and 1002,
and **getCounts(2)** is not going to count them.

What you do is use the following observation:
You know exactly how many digits are going to be in pages
(**middle_number**+1) to *N*: (**remainder** * **digits**).
Each of those pages starts with **first_digit**, so **remainder** of these digits are equal to
**first_digit**. To calculate the rest, we can call **getCounts(remainder)**.
The return vector of that call will have all of the digits except for the leading
zeros. Since you know how many total digits there should be, you know that the
ones not counted by the recursive **getCounts(remainder)** call must be zeros.
That lets you solve the problem.
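
The observation can be sketched in isolation. In the sketch below, `countsUpTo()` is a brute-force stand-in for the recursive **getCounts()** call, and `tailCounts()` is a hypothetical helper that counts only the digits on pages (**middle_number**+1) to *N*. Both names are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>
using namespace std;

/* Stand-in for getCounts(): digit counts for pages 1 to N, with no
   leading zeros.  Brute force, so only suitable for small N. */
vector<long long> countsUpTo(int N)
{
  vector<long long> counts(10, 0);
  for (int page = 1; page <= N; page++) {
    for (int p = page; p > 0; p /= 10) counts[p % 10]++;
  }
  return counts;
}

/* Digit counts for pages (middle_number+1) through N only. */
vector<long long> tailCounts(int first_digit, int digits, int remainder)
{
  assert(digits >= 2 && remainder >= 0);
  vector<long long> counts = countsUpTo(remainder);

  /* How many digits did countsUpTo() actually see? */
  long long seen = 0;
  for (int i = 0; i < 10; i++) seen += counts[i];

  /* Each page has (digits-1) digits after its first digit.  Whatever
     countsUpTo() did not see must be leading zeros. */
  counts[0] += (long long) (digits - 1) * remainder - seen;

  /* Every one of these pages starts with first_digit. */
  counts[first_digit] += remainder;
  return counts;
}
```

With *N* = 1002, `tailCounts(1, 4, 2)` reports four 0s, three 1s and one 2, which matches the digits on pages 1001 and 1002.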

Let's use 3659 as an example. We're going to solve the three subproblems as follows:

- We'll call **getCounts(2999)** to get all of the page numbers from 1 to (**middle_number**-1).
- We'll add one to **rv[3]** and three zeros to **rv[0]** to account for **middle_number**.
- We'll add 659 to **rv[3]**, and then we'll call **getCounts(659)** recursively. We'll add up the digits in that return vector and subtract that number from (3*659). That is the number of extra zeros that we add to **rv[0]**.
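
To make the last step concrete, here is the leading-zero arithmetic for 3659 worked by hand. This is a check of the reasoning above, not output from any of the programs. Pages 1 to 659 contain 9 one-digit, 90 two-digit and 560 three-digit numbers:

```latex
\begin{align*}
\text{digits counted by getCounts(659)} &= 9 \cdot 1 + 90 \cdot 2 + 560 \cdot 3 = 1869 \\
\text{digits after the first on pages 3001 to 3659} &= 3 \cdot 659 = 1977 \\
\text{extra zeros added to rv[0]} &= 1977 - 1869 = 108
\end{align*}
```

As a sanity check, padding 1 through 9 to three digits takes two zeros each (18), and padding 10 through 99 takes one zero each (90), which also totals 108.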

Here is the relevant part of **getCounts()** in **page-numbers-4.cpp**:

    vector <int> PageNumbers::getCounts(int N)
    {
      vector <int> rv, rv2;      /* I've added rv2 for the recursion. */
      ...

      /* Make the first recursive call to middle_number-1 */

      rv = getCounts(middle_number-1);

      /* Add in the answer for middle_number. */

      rv[first_digit]++;
      for (i = 0; i < digits-1; i++) rv[0]++;

      /* Add the first digit of (middle_number+1) to N: */

      rv[first_digit] += remainder;

      /* Now, call this recursively on remainder, and count up how many digits that is.
         Subtract this from (digits-1)*remainder to get the number of leading zeros
         that you're missing.  Then add everything to the final return value. */

      rv2 = getCounts(remainder);
      d = 0;
      for (i = 0; i < rv2.size(); i++) d += rv2[i];
      rv[0] += ((digits-1)*remainder - d);
      for (i = 0; i < rv2.size(); i++) rv[i] += rv2[i];

      return rv;
    }

We'll test it on examples 1-3 from Topcoder. Example 3, where *N* equals 999, makes
a ton of recursive calls, so I just print out the last line, to confirm that we have
the right answer:

    UNIX> make page-numbers-4
    g++ -o page-numbers-4 page-numbers-4.cpp
    UNIX> page-numbers-4 11
    First digit = 1
    Digits = 2
    Middle number = 10
    Remainder = 1
    Answer: 1 4 1 1 1 1 1 1 1 1
    UNIX> page-numbers-4 19
    First digit = 1
    Digits = 2
    Middle number = 10
    Remainder = 9
    Answer: 1 12 2 2 2 2 2 2 2 2
    UNIX> page-numbers-4 999 | tail -n 1
    Answer: 189 300 300 300 300 300 300 300 300 300
    UNIX> page-numbers-4 999 | wc
         397    1397   10740
    UNIX>

Looks like we have to memoize. This turns out to be really easy:

    /* Add a cache to PageNumbers */

    class PageNumbers {
      public:
        vector <int> getCounts(int N);
        map < int, vector <int> > Cache;
    };

    vector <int> PageNumbers::getCounts(int N)
    {
      [... Variable declarations]

      /* Base case -- when N is a single-digit number. */

      if (N < 10) {
        rv.resize(10, 0);
        for (i = 1; i <= N; i++) rv[i] = 1;
        return rv;
      }

      /* Get the answer from the Cache if it's there. */

      if (Cache.find(N) != Cache.end()) return Cache[N];

      [... The rest of the code]

      /* Insert the answer into the cache before returning. */

      Cache[N] = rv;
      return rv;
    }

    UNIX> make page-numbers-5
    g++ -o page-numbers-5 page-numbers-5.cpp
    UNIX> page-numbers-5 11
    First digit = 1
    Digits = 2
    Middle number = 10
    Remainder = 1
    Answer: 1 4 1 1 1 1 1 1 1 1
    UNIX> page-numbers-5 19
    First digit = 1
    Digits = 2
    Middle number = 10
    Remainder = 9
    Answer: 1 12 2 2 2 2 2 2 2 2
    UNIX> page-numbers-5 999 | tail -n 1
    Answer: 189 300 300 300 300 300 300 300 300 300
    UNIX> page-numbers-5 999 | wc
          73     263    1992
    UNIX> page-numbers-5 543212345 | tail -n 1
    Answer: 429904664 541008121 540917467 540117067 533117017 473117011 429904664 429904664 429904664 429904664
    UNIX> page-numbers-5 543212345 | wc
         301    1061    8208
    UNIX>
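
Putting the pieces together, here is one self-contained sketch of the memoized recursion. It is assembled for illustration and is not the contents of **page-numbers-5.cpp**: the class name `PageCounter` is hypothetical, it uses div and mod instead of **sprintf()**, it stores counts as `long long`, and it omits the debugging prints.

```cpp
#include <cassert>
#include <map>
#include <vector>
using namespace std;

/* Hypothetical class: a consolidated sketch of the memoized recursion. */
class PageCounter {
  public:
    vector<long long> getCounts(int N);
    map<int, vector<long long> > Cache;
};

vector<long long> PageCounter::getCounts(int N)
{
  assert(N >= 0);
  vector<long long> rv(10, 0);

  /* Base case: N is a one-digit number. */
  if (N < 10) {
    for (int i = 1; i <= N; i++) rv[i] = 1;
    return rv;
  }

  /* Return the cached answer if there is one. */
  map<int, vector<long long> >::iterator it = Cache.find(N);
  if (it != Cache.end()) return it->second;

  /* Split N using div and mod (a substitution for sprintf/atoi). */
  int power = 1, digits = 1;
  while (N / power >= 10) { power *= 10; digits++; }
  int first_digit = N / power;
  int middle_number = first_digit * power;
  int remainder = N - middle_number;

  /* Case 1: pages 1 to middle_number-1. */
  rv = getCounts(middle_number - 1);

  /* Case 2: middle_number itself. */
  rv[first_digit]++;
  rv[0] += digits - 1;

  /* Case 3: pages middle_number+1 to N.  The digits that the recursive
     call does not account for are the leading zeros. */
  rv[first_digit] += remainder;
  vector<long long> rv2 = getCounts(remainder);
  long long d = 0;
  for (int i = 0; i < 10; i++) d += rv2[i];
  rv[0] += (long long) (digits - 1) * remainder - d;
  for (int i = 0; i < 10; i++) rv[i] += rv2[i];

  /* Cache the answer before returning it. */
  Cache[N] = rv;
  return rv;
}
```

Calling `getCounts(999)` and `getCounts(543212345)` on a `PageCounter` should agree with the answers in the session above. Keying the cache on *N* alone is enough here because the answer depends on nothing else.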