CS494 Lab 6

The lab follows the topcoder writeup:

SRM 720, D2, 1000-Pointer (RainbowGraph)

James S. Plank

Mon Dec 3 15:25:44 EST 2018

Grad students Clara Nguyen and Natalie Bogda wrote up a very nice presentation of this problem, with some interesting commentary. The web link is http://utk.claranguyen.me/talks.php?id=bitdp. That's not how I recommend solving the problem, but it makes for some interesting reading!

In case topcoder's servers are down

Here is a summary of the problem:

Examples:

Example | color | a | b | Answer
0 | {0,0,0,1,1,1,2,2,2} | {0,1,2,3,4,5,6,7,8,0,3,6} | {1,2,0,4,5,3,7,8,6,3,6,0} | 0
1 | {0,0,0,1,1,1,2,2,2} | {0,1,2,3,4,5,6,7,8,0,4,8} | {1,2,0,4,5,3,7,8,6,3,7,2} | 24
2 | {0,3,9,8,6,4} | {0,0,0,0,0,1,1,1,1,2,2,2,3,3,4} | {1,2,3,4,5,2,3,4,5,3,4,5,4,5,5} | 720
3 | {0,0,0,0,3,3,3,6,6,9} | {9,9,9,9,9,9,9,9,9,7,7,7,7,7,7,7,4,4,4,4,0,1,2,4,5,8} | {0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6,0,1,2,3,1,2,3,5,6,7} | 64
4 | {3,1,4,1,5,9,2,6,5,3,5} | {1} | {2} | 0
5 | Too big. See main.cpp | | | 983979105


This is a wonderful problem -- a mixture of DFS and dynamic programming. You have to program it carefully, or you won't get it in under the time limit!

Obviously, you are going to focus on the connected components, which are denoted by the node colors. Once a path reaches a node of one color, it must travel through every node in that color's connected component, before it can go to a node with another color. The constraints help you: There is a maximum of ten components, and each component has a maximum of ten nodes.
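To make the component bookkeeping concrete, here is a minimal sketch of how the input might be split up before any searching happens. The struct and function names (Split, SplitByColor) are hypothetical; the field names simply mirror the members that appear in the lab's RG.h. It assumes edges are undirected, which is what the DFS trace later in this writeup suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical helper struct; field names mirror the members in the lab's RG.h.
struct Split {
    vector< vector<int> > CNodes;  // CNodes[c] = all nodes whose color is c
    vector< vector<int> > Same;    // Same[n]   = neighbors of n with n's color
    vector< vector<int> > Diff;    // Diff[n]   = neighbors of n with another color
};

// Hypothetical function: groups nodes by color and sorts each edge into
// the intracomponent (Same) or intercomponent (Diff) adjacency lists.
Split SplitByColor(const vector<int> &color, const vector<int> &a, const vector<int> &b)
{
    assert(!color.empty() && a.size() == b.size());
    Split s;
    s.CNodes.resize(*max_element(color.begin(), color.end()) + 1);
    s.Same.resize(color.size());
    s.Diff.resize(color.size());

    for (size_t n = 0; n < color.size(); n++) s.CNodes[color[n]].push_back((int) n);

    for (size_t e = 0; e < a.size(); e++) {
        // Assumption: edges are undirected, so each one is stored in both directions.
        vector< vector<int> > &adj = (color[a[e]] == color[b[e]]) ? s.Same : s.Diff;
        adj[a[e]].push_back(b[e]);
        adj[b[e]].push_back(a[e]);
    }
    return s;
}
```

Keeping the two kinds of edges in separate lists means the path counting inside a component and the hopping between components never have to test colors again.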

An Example

I put this into main.cpp as example 7:

color = { 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2 }
a =     { 0, 1, 2, 3, 3, 6, 6, 4, 7, 7, 10, 10, 8, 2, 6 }
b =     { 1, 2, 0, 4, 5, 4, 5, 5, 8, 9,  8,  9, 9, 3, 7 }
I've colored the inter-component edges black, and the intra-component edges the same color as the component nodes.

Let's logic our way through the answer. It should be clear that the only paths are going to go through the green-red-blue components in that order, or blue-red-green. Let's focus on the paths through the components, when they are going through green-red-blue.

That means that there are 2*2*4 = 16 paths that go through green-red-blue. If you think about it, the paths that go through blue-red-green are the exact same paths as green-red-blue, just in reverse. So the answer is 32 paths.

The solution, part 1

I solved this in two parts. In the first part, I created a two-dimensional array NP. NP[i][j] is non-zero only if nodes i and j are in the same connected component. It contains the number of paths from node i to node j, where each path must contain every node in the component.

In our example above, we'll have the following:

For a given node i, you can calculate NP[i][j] for all j using an enhanced DFS which travels every intracomponent path from node i. While you're traveling each path, you keep track of the path length, and if you reach a node j and the path contains every node in the component, you increment NP[i][j].
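The paragraph above can be sketched as a backtracking DFS. This is only one way to write it, not necessarily how the lab solution does: PathCounter and CountFrom are hypothetical names, and the state that RG.h keeps as member variables (V, NIP, Source, Target) is bundled into a small struct so the block stands alone.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical stand-alone version of the path-counting DFS.
struct PathCounter {
    vector< vector<int> > Same;   // Intracomponent adjacency lists
    vector< vector<int> > NP;     // NP[i][j], filled in by CountFrom(i, ...)
    vector<int> V;                // Visited flags, all zero between calls
    int NIP, Source, Target;      // Nodes in path, start node, component size

    explicit PathCounter(const vector< vector<int> > &same)
        : Same(same), NP(same.size(), vector<int>(same.size(), 0)),
          V(same.size(), 0), NIP(0), Source(-1), Target(0) {}

    void DFS(int n) {
        V[n] = 1;
        NIP++;
        if (NIP == Target) {
            NP[Source][n]++;      // The path now holds every node in the component.
        } else {
            for (size_t i = 0; i < Same[n].size(); i++) {
                if (!V[Same[n][i]]) DFS(Same[n][i]);
            }
        }
        NIP--;                    // Undo both changes so that n can
        V[n] = 0;                 // participate in other paths.
    }

    void CountFrom(int source, int component_size) {
        assert(source >= 0 && source < (int) Same.size());
        Source = source;
        Target = component_size;
        NIP = 0;
        DFS(source);
    }
};
```

With this sketch, a component containing a single node ends up with NP[i][i] = 1, since the path consisting of just that node already covers the component.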

Let me run through an example, where the starting node is node 3. When I did this, I had a V (visited) field for each node, and a variable NIP (nodes in the path). I set them all to zero. Here is what happens when I call DFS(3). I call it an "enhanced" DFS, because when you're done with a node, you set V back to zero, so it can participate in more paths.

DFS(3):        NIP:0 -- Begin.  Increment NIP and Set V[3] to 1.
DFS(3):        NIP:1 -- Will call DFS on: 4 5
DFS(3):        NIP:1 -- Calling DFS(4)
  DFS(4):      NIP:1 -- Begin.  Increment NIP and Set V[4] to 1.
  DFS(4):      NIP:2 -- Will call DFS on: 6 5
  DFS(4):      NIP:2 -- Calling DFS(6)
    DFS(6):    NIP:2 -- Begin.  Increment NIP and Set V[6] to 1.
    DFS(6):    NIP:3 -- Will call DFS on: 5
    DFS(6):    NIP:3 -- Calling DFS(5)
      DFS(5):  NIP:3 -- Begin.  Increment NIP and Set V[5] to 1.
      DFS(5):  NIP:4 -- Setting NP[3][5] to 1.
      DFS(5):  NIP:3 -- Done.  Setting V[5] = 0
    DFS(6):    NIP:2 -- Done.  Setting V[6] = 0
  DFS(4):      NIP:2 -- Calling DFS(5)
    DFS(5):    NIP:2 -- Begin.  Increment NIP and Set V[5] to 1.
    DFS(5):    NIP:3 -- Will call DFS on: 6
    DFS(5):    NIP:3 -- Calling DFS(6)
      DFS(6):  NIP:3 -- Begin.  Increment NIP and Set V[6] to 1.
      DFS(6):  NIP:4 -- Setting NP[3][6] to 1.
      DFS(6):  NIP:3 -- Done.  Setting V[6] = 0
    DFS(5):    NIP:2 -- Done.  Setting V[5] = 0
  DFS(4):      NIP:1 -- Done.  Setting V[4] = 0
DFS(3):        NIP:1 -- Calling DFS(5)
  DFS(5):      NIP:1 -- Begin.  Increment NIP and Set V[5] to 1.
  DFS(5):      NIP:2 -- Will call DFS on: 6 4
  DFS(5):      NIP:2 -- Calling DFS(6)
    DFS(6):    NIP:2 -- Begin.  Increment NIP and Set V[6] to 1.
    DFS(6):    NIP:3 -- Will call DFS on: 4
    DFS(6):    NIP:3 -- Calling DFS(4)
      DFS(4):  NIP:3 -- Begin.  Increment NIP and Set V[4] to 1.
      DFS(4):  NIP:4 -- Setting NP[3][4] to 1.
      DFS(4):  NIP:3 -- Done.  Setting V[4] = 0
    DFS(6):    NIP:2 -- Done.  Setting V[6] = 0
  DFS(5):      NIP:2 -- Calling DFS(4)
    DFS(4):    NIP:2 -- Begin.  Increment NIP and Set V[4] to 1.
    DFS(4):    NIP:3 -- Will call DFS on: 6
    DFS(4):    NIP:3 -- Calling DFS(6)
      DFS(6):  NIP:3 -- Begin.  Increment NIP and Set V[6] to 1.
      DFS(6):  NIP:4 -- Setting NP[3][6] to 2.
      DFS(6):  NIP:3 -- Done.  Setting V[6] = 0
    DFS(4):    NIP:2 -- Done.  Setting V[4] = 0
  DFS(5):      NIP:1 -- Done.  Setting V[5] = 0
DFS(3):        NIP:0 -- Done.  Setting V[3] = 0
This is most definitely an expensive algorithm. Think about it -- if there are ten nodes in a component, and the nodes are completely connected, then NP[i][j] will equal 8! for each pair of distinct nodes i and j. That is because there is a path for every permutation of the other eight nodes. Of course, 8! equals 40,320, which is definitely doable in the universe of topcoder.

The solution, part 2.

The second part uses dynamic programming. Define the following procedure:

long long NumWalks(int starting_node, int remaining_components);

This is going to return the number of legal paths that start with starting_node, go through starting_node's component, and then go through remaining_components. remaining_components is an integer that stores a set of components using bit arithmetic. It should not include starting_node's component. You are going to sum up NumWalks(n, s) for every node n, and every set s composed of all of the components except for n's component.
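Storing a set of components "using bit arithmetic" might look like the sketch below, where component c corresponds to bit (1 << c). The helper names (InSet, RemoveFromSet, AllExcept) are hypothetical and are not part of RG.h.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical helpers: component c is represented by bit (1 << c) of an int.
bool InSet(int set, int c)
{
    return ((set >> c) & 1) != 0;
}

int RemoveFromSet(int set, int c)
{
    assert(InSet(set, c));        // Only remove components that are present.
    return set & ~(1 << c);
}

// The set of every color that appears in the graph, minus the color "own".
int AllExcept(const vector<int> &color, int own)
{
    int set = 0;
    for (size_t i = 0; i < color.size(); i++) set |= (1 << color[i]);
    return set & ~(1 << own);
}
```

For a graph with colors 0, 1 and 2, a node of color 0 starts with the set 0x6, and removing color 1 from that set leaves 0x4.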

To implement NumWalks(n, s), what you do is look at every node m in node n's component. If NP[n][m] is greater than zero, then you look at every node l which is connected to m and in a component in s. You then call NumWalks(l, s - l's component). You will add the product of that and NP[n][m] to the return value for NumWalks(n, s).

You need a base case for this -- if s = {}, then NumWalks(n, s) is the sum of all NP[n][m].

Let's do an example -- we'll calculate NumWalks(0, { red, blue } ). There are only two values of NP[0][j] that are greater than zero -- NP[0][1] and NP[0][2] both equal one. First, consider NP[0][1]. There is no edge from 1 to the red or blue components. So, there are no paths involving NP[0][1].

Next, consider NP[0][2]. There is an edge from 2 to 3, so the return value for NumWalks(0, { red, blue } ) is going to equal NP[0][2] (which is one) times NumWalks(3, { blue } ). So let's focus on NumWalks(3, { blue } ):

Although NP[3][4] and NP[3][5] equal one, neither four nor five are connected to the blue component. So, the only value of NP that matters is NP[3][6], which equals 2. NumWalks(3, { blue } ) is going to equal NP[3][6] times NumWalks(7, { } ).

So, we focus on NumWalks(7, { } ). This is the base case of the recursion -- it equals the sum of all NP[7][m]. This is 4. So NumWalks(3, { blue } ) equals 4*2 = 8. And NumWalks(0, { red, blue } ) equals 8*1 = 8.

This is dynamic programming, so you cache the return values of NumWalks(). With at most ten components of at most ten nodes each, the nodes are numbers from 0 to 99, and the component sets are numbers from 0 to 1023. So, your cache isn't that big -- best to make it a two-dimensional vector.
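Here is a hedged sketch of the memoized recursion on example 7. Walker, CountAll and MakeExample7 are hypothetical names, the NP values are typed in by hand rather than computed, and using -1 to mark "not computed yet" is an assumption. The sketch also does no modular reduction, which the full problem may require for large inputs.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical stand-alone solver (the lab's version is a RainbowGraph method).
struct Walker {
    vector<int> Color;                       // Color[n] = component of node n
    vector< vector<int> > CNodes, Diff, NP;  // Same meaning as in the text
    vector< vector<long long> > Cache;       // -1 = not computed yet (an assumption)

    long long NumWalks(int n, int s) {
        assert(s >= 0 && s < (int) Cache[n].size());
        if (Cache[n][s] != -1) return Cache[n][s];
        long long rv = 0;
        const vector<int> &comp = CNodes[Color[n]];
        for (size_t i = 0; i < comp.size(); i++) {
            int m = comp[i];
            if (NP[n][m] == 0) continue;
            if (s == 0) { rv += NP[n][m]; continue; }   // Base case: sum of NP[n][m]
            for (size_t j = 0; j < Diff[m].size(); j++) {
                int l = Diff[m][j];
                int bit = 1 << Color[l];
                if (s & bit) rv += NP[n][m] * NumWalks(l, s & ~bit);
            }
        }
        return Cache[n][s] = rv;
    }

    long long CountAll() {        // Sum of NumWalks(n, everything but n's component)
        int all = 0;
        long long total = 0;
        for (size_t n = 0; n < Color.size(); n++) all |= 1 << Color[n];
        for (size_t n = 0; n < Color.size(); n++) {
            total += NumWalks((int) n, all & ~(1 << Color[n]));
        }
        return total;
    }
};

// Example 7, using the NP values and the two intercomponent edges (2-3 and 6-7).
Walker MakeExample7() {
    static const int color[11] = { 0,0,0, 1,1,1,1, 2,2,2,2 };
    static const int np[13][3] = {           // { i, j, NP[i][j] }; symmetric here
        {0,1,1}, {0,2,1}, {1,2,1},
        {3,4,1}, {3,5,1}, {3,6,2}, {4,6,1}, {5,6,1},
        {7,8,1}, {7,9,1}, {7,10,2}, {8,10,1}, {9,10,1} };
    Walker w;
    w.Color.assign(color, color + 11);
    w.CNodes.resize(3);
    for (int n = 0; n < 11; n++) w.CNodes[color[n]].push_back(n);
    w.Diff.resize(11);
    w.Diff[2].push_back(3); w.Diff[3].push_back(2);
    w.Diff[6].push_back(7); w.Diff[7].push_back(6);
    w.NP.assign(11, vector<int>(11, 0));
    for (int k = 0; k < 13; k++) {
        w.NP[np[k][0]][np[k][1]] = np[k][2];
        w.NP[np[k][1]][np[k][0]] = np[k][2];
    }
    w.Cache.assign(11, vector<long long>(1 << 3, -1));
    return w;
}
```

Calling MakeExample7().CountAll() adds up NumWalks() over all eleven starting nodes and should give the 32 paths worked out by hand earlier: 16 that start in green and 16 that start in blue.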

Below is the dynamic programming cache for example 7. Go ahead and walk through it.

n | remaining_components (int) | remaining_components (set) | NumWalks(n, remaining_components)

CS494 Lab 6

You are only to hand in RG.cpp. You may not modify any of the other files in this lab.

Your job is to implement the RainbowGraph class in the file RG.cpp. The RainbowGraph class is defined in RG.h. You are not allowed to modify this file.

#include <string>
#include <vector>
#include <iostream>
#include <cstdio>
#include <cstdlib>
using namespace std;

class RainbowGraph {
  public:
    int countWays(vector <int> color, vector <int> a, vector <int> b);
    string Verbose;

    vector <int> Color;              // This is a copy of the input parameter "color".
    vector < vector <int> > Same;    // Adjacency lists of intracomponent edges.
    vector < vector <int> > Diff;    // Adjacency lists of intercomponent edges.
    vector < vector <int> > CNodes;  // CNodes[i] contains all nodes whose color is i.
    vector < vector <int> > NP;      // NP[i][j] = number of paths from i to j that go
                                     // through all of the nodes in the component.

    vector <int> V;                  // The visited vector for the DFS.
    int NIP;                         // During the DFS, this is the number of nodes in the current path.
    int Source;                      // This is the initial node for each DFS call.
    int Target;                      // The size of Source's component: CNodes[Color[Source]].size()

    vector < vector <long long > > Cache;  // The DP cache.

    void CountPaths(int n);          // This is the extended DFS.  Set Source, Target, V and NIP
                                     // before you call CountPaths(Source) to set NP[Source][j].

    long long NumWalks(int node, int setid);   // Number of walks starting at node node that 
                                               // still need to go through the nodes in setid.
};

You should have RG.cpp include "RG.h", and then implement countWays() as I have described above. Besides countWays() and Verbose, you don't have to implement or use any of the member variables or methods in this class. They are the ones that I used, though, and they are all that you need. You are not allowed to add things to this class.

Besides implementing countWays() so that it works correctly, you should also implement the following inside countWays():

You can implement other functionality for debugging, but I'll only test you on those two, and that you return the proper answer. My code prints out the DFS if the verbose string contains 'D'. You don't have to do that (and you may not want to, because it may slow your code down too much).

Testing with RG-Tester.cpp, and with Grade-Timer.sh

The makefile will make two executables:

For grading, we're only going to use your RG-Tester. The gradescript tests the 'A', 'N' and 'C' verbose flags of RG-Tester. Additionally, there is a shell script in the lab directory called Grade-Timer.sh. It times all of the gradescript examples whose numbers are congruent to one, mod three. Your RG-Tester needs to complete each example in under 2 seconds (mine works in under 0.75 seconds for each). Don't try this on a heavily loaded machine.

I have examples 1-7 and example 11 as files in the lab directory (example 11 is exb.txt):

UNIX> RG-Tester < ex7.txt
UNIX> RG-Tester N < ex7.txt
NP[0][1] = 1
NP[0][2] = 1
NP[1][0] = 1
NP[1][2] = 1
NP[2][0] = 1
NP[2][1] = 1
NP[3][4] = 1
NP[3][5] = 1
NP[3][6] = 2
NP[4][3] = 1
NP[4][6] = 1
NP[5][3] = 1
NP[5][6] = 1
NP[6][3] = 2
NP[6][4] = 1
NP[6][5] = 1
NP[7][8] = 1
NP[7][9] = 1
NP[7][10] = 2
NP[8][7] = 1
NP[8][10] = 1
NP[9][7] = 1
NP[9][10] = 1
NP[10][7] = 2
NP[10][8] = 1
NP[10][9] = 1
UNIX> RG-Tester C < ex7.txt
Cache[0][0x6] = 8
Cache[1][0x6] = 8
Cache[2][0x0] = 2
Cache[2][0x4] = 0
Cache[2][0x6] = 0
Cache[3][0x4] = 8
Cache[3][0x5] = 0
Cache[4][0x5] = 0
Cache[5][0x5] = 0
Cache[6][0x1] = 4
Cache[6][0x5] = 0
Cache[7][0x0] = 4
Cache[7][0x1] = 0
Cache[7][0x3] = 0
Cache[8][0x3] = 4
Cache[9][0x3] = 4
Cache[10][0x3] = 8
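If you want your own debugging output to look like the cache listing above, a small formatter along these lines could work. This is only a guess modeled on the visible output: FormatCache is a hypothetical name, and it assumes uncomputed entries hold -1 and are skipped, while computed zeros are still printed.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: one line per computed cache entry, ordered by node and
// then by set id. Assumes -1 marks an entry that was never computed.
vector<string> FormatCache(const vector< vector<long long> > &cache)
{
    vector<string> lines;
    char buf[100];

    for (size_t n = 0; n < cache.size(); n++) {
        for (size_t s = 0; s < cache[n].size(); s++) {
            if (cache[n][s] == -1) continue;
            snprintf(buf, sizeof(buf), "Cache[%d][0x%x] = %lld",
                     (int) n, (unsigned int) s, cache[n][s]);
            lines.push_back(buf);
        }
    }
    return lines;
}
```

Building the lines first and printing them afterwards keeps the formatting separate from the DP, which makes it easy to switch off when timing matters.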

Four bullets of advice