SRM 621, D1, 500-Pointer (TreesAnalysis)
James S. Plank
Tue Nov 6 00:00:04 EST 2018
This is a problem that leverages topological sort very nicely. It would be an easier
problem were it not for the constraints. With 4,000 nodes, you can barely get away
with a quadratic solution. I know, because I tried and failed. This was my
solution:
- For each edge, calculate the two sets of nodes using DFS. That's already O(n2),
but it's fast enough.
- Store the sets using bits -- with 4000 nodes, you can store each node's sets in
4000/64 = 62.5, or 63 long long's.
- For each pair of nodes, calculate the intersections (using AND's and NOT's), then
sum up their cardinalities. Unfortunately, that's O(n3).
- Square the biggest cardinality, and add that to the total.
Too slow.
So, I read the topcoder analysis, and their solution is really nice.
Step 1 - Turn the trees into DAG's, where each node has one edge, to its parent
This can be done with a DFS, the result of which is a vector P of integers, where
P[i] is the parent of node i. You can do anything you want with the parent
of the root. In these examples, I'll have it be -1.
We'll use Example 1 to illustrate. Here is how my DFS defined the two trees (edges
going from the lower nodes to the higher nodes):
|
|
|
Parent Pointers
Node 0 1 2 3 4
T1 -1 2 0 0 1
T2 -1 0 4 4 1
|
Step 2 - Derive a topological ordering of the nodes
You can do this using the classic algorithm for topological sort, or you can just use the
same DFS as above. A postorder traversal of the nodes yields a topological ordering,
where a node appears after all of its children in the ordering. Here are orderings
for the two trees above:
T1: 4 1 2 3 0
T2: 2 3 4 1 0
Step 3 - Determine the two sets for each edge of the first graph, using topological sort
I represented each set with a vector of long long's. If an element's bit was zero, it was in
set A, and if was 1, it was in set B.
You can do this by running through the nodes in
toplogical order. Here's how it works. First set all nodes' vectors to zeros. Then,
run through the topological order. For each node i, you set bit i in the vector.
The resulting set will define the two sets when you remove the edge from node i
to its parent.
Next, you OR node i's vector with its parents.
That's how you get the sets right for
the parents.
Below, I'll go through the whole process for T1. For each node that we process (which
defines an edge that separates the tree into two components), I'll draw the trees, and identify the
two sets as yellow (set A) and green (set B).
I'll include the bit representation of the sets, both
as bits and as hex.
I'm also coloring the sets when they change -- red means that that is the final set specification
for the node, and blue means that this was the parent of a node that changed.
Node |
Edge |
Graph |
Set for 0 |
Set for 1 |
Set for 2 |
Set for 3 |
Set for 4 |
Start |
Start |
|
00000 0x00 |
00000 0x00 |
00000 0x00 |
00000 0x00 |
00000 0x00 |
4 |
4->1 |
|
00000 0x00 |
10000 0x10 |
00000 0x00 |
00000 0x00 |
10000 0x10 |
1 |
1->2 |
|
00000 0x00 |
10010 0x12 |
10010 0x12 |
00000 0x00 |
10000 0x10 |
2 |
2->0 |
|
10110 0x16 |
10010 0x12 |
10110 0x16 |
00000 0x00 |
10000 0x10 |
3 |
3->0 |
|
11110 0x1e |
10010 0x12 |
10110 0x16 |
01000 0x8 |
10000 0x10 |
You'll note, we don't process node 0, because it's the root, and has no parent pointers.
Step 4 - For each node/edge in T1, do a topological sort on T2 to count up the similarities
Ok, now what we're going to do is once again traverse each node/edge of T1.
We know the two sets that result when you remove the edge, because we determined them in
Step 3 above. Call the yellow set A and the green set B.
Now, we're going to traverse each node/edge of T2, also in topological order, and we're
going to partition the nodes into two sets, C and D. Before we begin, all
of the nodes are in C.
We store two values during the traversal:
- |A| is the number of nodes in set A.
- |B| is the number of nodes in set B.
Now, for each node/edge in T2, we are going to maintain two values:
- AD[n] -- This is the number of nodes that belong in sets A and D,
when the edge from node n is removed from T2.
- BD[n] -- This is the number of nodes that belong in sets B and D,
when the edge from node n is removed from T2.
We'll start with all of these being zero. However, when we process an node/edge, then we will:
T1's Node |
T1's Edge |
T1's Graph |
The set |
AC |
BC |
T2's Node |
T2's Edge |
T2's Graph |
AD[n] |
BD[n] |
|AIC| |
|AID| |
|BIC| |
|BID| |
Similarity |
4 |
4->1 |
|
10000 0x10 |
4 |
1 |
2 |
2->4 |
|
1 |
0 |
3 |
1 |
1 |
0 |
9 |
4 |
4->1 |
|
10000 0x10 |
4 |
1 |
3 |
3->4 |
|
1 |
0 |
3 |
1 |
1 |
0 |
9 |
4 |
4->1 |
|
10000 0x10 |
4 |
1 |
4 |
4->1 |
|
2 |
1 |
2 |
2 |
0 |
1 |
4 |
4 |
4->1 |
|
10000 0x10 |
4 |
1 |
1 |
1->0 |
|
3 |
1 |
1 |
3 |
0 |
1 |
9 |
1 |
1->2 |
|
10010 0x12 |
3 |
2 |
2 |
2->4 |
|
1 |
0 |
2 |
1 |
2 |
0 |
4 |
1 |
1->2 |
|
10010 0x12 |
3 |
2 |
3 |
3->4 |
|
1 |
0 |
2 |
1 |
2 |
0 |
4 |
1 |
1->2 |
|
10010 0x12 |
3 |
2 |
4 |
4->1 |
|
2 |
1 |
1 |
2 |
1 |
1 |
4 |
1 |
1->2 |
|
10010 0x12 |
3 |
2 |
1 |
1->0 |
|
2 |
2 |
1 |
2 |
0 |
2 |
4 |
And so on.