CS302 Lecture Notes - The Connected Component Program with Pointers


In the main set of lecture notes, I show how do to a DFS to determine connected components of an undirected graph. In that implementation, I use a Graph class, which has a vector of nodes as a member variable. The DFS is a member function, and for that reason, it can access the Nodes vector. This simplifies our implementation somewhat, because the node adjacency lists can contain integers -- the indices of the nodes in the Nodes vector.

When I asked in class (this is 2016), whether I should implement the program with integers in the adjacency lists, or pointers, the class chose pointers. Nice. So, I have implemented this version in src/concomp-pointer.cpp. I include the implementation below with extensive inline comments to help you. In some respects, this is a simpler implementation, because it doesn't require a Graph class. This is because the pointers point directly to nodes, and don't require that the nodes be accessible as part of the Graph class.

/* The Node class:  

   Each node's adjacency list contains pointers to the nodes that are
   adjacent to them.  This is unlike the code in the main set of
   lecture notes, where the adjacency lists contains integers, and
   the integers are used as indices to a vector that contains all of
   the nodes. */

class Node {
  public:
    int id;
    int component;
    vector <Node *> Adj;
};

/* Because our adjacency list contains pointers, we don't need a Graph
   class.  We don't need to access the vector of nodes, because
   we access each node directly in the adjacency list with a pointer.
   Contrast this with the other code in the lecture notes, where
   DFS() is a method that is part of a Graph class, which is
   necessary because you need to use the indices to the vector of
   nodes in the graph class.

   Also, in this code, I print out the beginning of each DFS call, indenting
   it with two spaces for every nested level of recursion.  You should take a look
   at that printf call.  I include it here:

   printf("%*sDFS(%d,%d)\n", indent, "", n->id, cn);

   The asterisk says to read the field width from the next argument -- that means
   that that part of the output will be padded to "indent" spaces.  Following the 
   asterisk is an "s", which means to print out a string (C style).  
   I give it the empty string.  So, if, for example, "indent" is equal to 4, this
   will print out four spaces.
*/

void DFS(Node *n, int cn, int indent)
{
  size_t i;

  printf("%*sDFS(%d,%d)\n", indent, "", n->id, cn);
  if (n->component != -1) return;
  n->component = cn;
  for (i = 0; i < n->Adj.size(); i++) DFS(n->Adj[i], cn, indent+2);  
}

/* The main() routine -- this reads in the graph, and then calls the DFS
   on each node whose "component" field is -1.  At the end, it prints each
   node. */

int main()
{
  vector <Node *> Nodes;     /* These are the nodes.  Note that they are pointers. */
  int nn;
  int f, t;
  size_t i, j;
  string s;
  Node *nf, *nt;
  int cn;

  getline(cin, s);
  sscanf(s.c_str(), "NNODES %d", &nn);

  /* Because "Nodes" is a vector of pointers, for each of the i nodes, we need
     to allocate it with new.  We then set its id, and set its component number 
     to be negative one. */

  Nodes.resize(nn);
  for (i = 0; i < Nodes.size(); i++) {
    Nodes[i] = new Node;
    Nodes[i]->id = i;
    Nodes[i]->component = -1;
  }

  /* Read the edges.  We then put each node on the other's adjacency list.  Once
     again, the nodes are pointers, so it's cleaner to use the temporary variables
     "nf" and "nt", which are also pointers. */

  while (getline(cin, s)) {
    sscanf(s.c_str(), "EDGE %d %d", &f, &t);
    nf = Nodes[f];
    nt = Nodes[t];
    nf->Adj.push_back(nt);
    nt->Adj.push_back(nf);
  }

  /* Do the DFS()'s */

  cn = 0;
  for (i = 0; i < Nodes.size(); i++) {
    if (Nodes[i]->component == -1) {
      DFS(Nodes[i], cn, 0);
      cn++;
    }
  }

  /* Print out each node. */

  for (i = 0; i < Nodes.size(); i++) {
    nf = Nodes[i];
    printf("%2d: Component:%2d Edges:", nf->id, nf->component);
    for (j = 0; j < nf->Adj.size(); j++) {
      nt = nf->Adj[j];
      printf(" %d", nt->id);
    }
    printf("\n");
  }
  
  return 0;
}

This code is also a little different, because it prints out the beginning of each DFS() call, with indentation to illustrate the level of nesting. Read the inline comments to show how I did that using printf().

Here's example output with our two graph files:

UNIX> concomp-pointer < g1.txt
DFS(0,0)
DFS(1,1)
  DFS(3,1)
    DFS(5,1)
      DFS(3,1)
    DFS(1,1)
DFS(2,2)
DFS(4,3)
  DFS(9,3)
    DFS(4,3)
  DFS(6,3)
    DFS(4,3)
    DFS(8,3)
      DFS(6,3)
  DFS(7,3)
    DFS(4,3)
 0: Component: 0 Edges:
 1: Component: 1 Edges: 3
 2: Component: 2 Edges:
 3: Component: 1 Edges: 5 1
 4: Component: 3 Edges: 9 6 7
 5: Component: 1 Edges: 3
 6: Component: 3 Edges: 4 8
 7: Component: 3 Edges: 4
 8: Component: 3 Edges: 6
 9: Component: 3 Edges: 4
UNIX> concomp-pointer < g2.txt
DFS(0,0)
  DFS(3,0)
    DFS(7,0)
      DFS(3,0)
      DFS(2,0)
        DFS(1,0)
          DFS(2,0)
        DFS(7,0)
        DFS(9,0)
          DFS(5,0)
            DFS(9,0)
            DFS(8,0)
              DFS(5,0)
              DFS(6,0)
                DFS(8,0)
            DFS(7,0)
          DFS(2,0)
      DFS(5,0)
    DFS(0,0)
DFS(4,1)
 0: Component: 0 Edges: 3
 1: Component: 0 Edges: 2
 2: Component: 0 Edges: 1 7 9
 3: Component: 0 Edges: 7 0
 4: Component: 1 Edges:
 5: Component: 0 Edges: 9 8 7
 6: Component: 0 Edges: 8
 7: Component: 0 Edges: 3 2 5
 8: Component: 0 Edges: 5 6
 9: Component: 0 Edges: 5 2
UNIX>