As indicated by values in Tables 8 through 10, , , and are very large for the hypertext matrices in their natural ordering. Values in Table 8 show that is substantially reduced for all orderings with the largest envelope reduction obtained by the Fiedler vector approach discussed in Section 3.4. With respect to , however, the RCM ordering achieves the greatest bandwidth reduction (see Table 9). Notice that for some matrices ( CCE-A, NHSE), the value of shown in Table 10 is significantly lower than that of . This is due to the clustering of nonzeros in a narrow band but the band itself is significantly displaced from the diagonal. Also, the orderings using Correspondence Analysis without distances (see Section 3.5) tend to produce 's more similar to those of RCM than the Fiedler approach.
Table 11 illustrates the effects of choosing different pairs of principal axes for Correspondence Analysis with distances (CACS). Since the 8-largest generalized singular values (see Equation 9) for these matrices were all approximately equal to 1 (i.e., form a cluster of generalized singular values near 1), it is not clear which pairs of principal axes best explains the variation in Equation (11). By selecting the 10-th pair (or 10-th largest) of principal axes, a reduction in of 43% and 17% can be obtained for MAN1 and NHSE, respectively. CACS(10) also achieves an average reduction of 25% for and 29% for for these matrices.
Figures 3 through 5 illustrate the reorderings obtained for a few of the sample hypertext matrices discussed in Section 3.2. Of particular interest is the similarity of reorderings produced by the pairs (RCM, CANC(1)) and (Fiedler, CACS(2)) for the MAN1 and MAN2 matrices. For the other two matrices ( CCE-A, NHSE) whose average number of documents per link (see from Table 3) is smaller, the similarities were not as prominent. As shown in Figure 4, the near-diagonal clusterings obtained are quite different. In particular, CANC(1) produces a definite block-diagonal pattern of nonzeros but at the expense of the largest bandwidths ( and ) among all reorderings of the four test matrices (see Tables 9 and 10). The reduction in bandwidth achieved by CACS(10), i.e., using the 10-th largest pair of principal axes or left and right generalized singular vectors of A in Equation (9), is illustrated in Figure 5 for the MAN1 matrix.
The proper selection of principal axes for Correspondence Analysis with distances when the hypertext matrix A has clustered generalized singular values ('s from Equation (11)) is problematic. Nevertheless, an iterative procedure which cycles through a subset of the largest generalized singular vectors of A as computed by a Lanczos or block Lanczos SVD method (see [B+93]) is plausible. Future research in the use of principal axes for bandwidth reduction in the presence of clustered spectra is warranted.