As indicated by values in Tables 8 through
10,
, , and are very large for the
hypertext matrices in their natural ordering.
Values in Table 8 show that
is
substantially reduced for all orderings with the largest
envelope
reduction obtained by the Fiedler vector approach discussed in
Section 3.4. With respect to , however,
the RCM ordering achieves the greatest bandwidth reduction
(see Table 9). Notice that for some matrices
(` CCE-A`, ` NHSE`), the
value of shown in Table 10 is
significantly lower than that of . This is due to the
clustering of nonzeros in a * narrow* band but the band itself
is significantly displaced from the * diagonal*. Also,
the orderings using Correspondence Analysis without distances
(see Section 3.5) tend to produce 's more similar to those
of RCM than the Fiedler approach.

Table 11 illustrates the effects of choosing
different pairs of principal axes for Correspondence
Analysis with distances (CACS). Since
the **8**-largest generalized singular values
(see Equation 9) for these matrices
were all approximately equal to **1** (i.e., form
a cluster of generalized singular values near **1**),
it is not clear which pairs
of principal axes best explains the variation in Equation
(11). By selecting the **10**-th pair (or **10**-th
largest) of principal axes, a reduction in
of
**43**% and **17**% can be obtained for ` MAN1` and
` NHSE`, respectively. CACS(10) also achieves an average
reduction of **25**% for and **29**% for for
these matrices.

Figures 3 through 5
illustrate the reorderings obtained for a few of the sample
hypertext matrices discussed in Section 3.2. Of particular
interest is the similarity of reorderings produced by the pairs
(RCM, CANC(1)) and (Fiedler, CACS(2)) for the ` MAN1`
and ` MAN2` matrices. For the other two matrices
(` CCE-A`, ` NHSE`) whose average number of documents
per link (see from Table 3) is smaller,
the similarities were not as prominent. As shown
in Figure 4, the * near-diagonal* clusterings
obtained are quite different. In particular, CANC(1) produces
a definite block-diagonal pattern of nonzeros but at the
expense of the largest bandwidths ( and )
among all reorderings of the four test matrices (see Tables
9 and 10). The reduction in bandwidth
achieved by CACS(10), i.e., using the **10**-th largest pair of
principal axes or left and right generalized singular vectors of
**A** in Equation (9), is illustrated in Figure
5 for the ` MAN1` matrix.

The proper selection of principal axes for Correspondence
Analysis with distances when the hypertext matrix
**A** has clustered generalized singular values ('s
from Equation (11)) is problematic. Nevertheless,
an iterative procedure which cycles through a subset of the largest
generalized singular vectors of **A** as computed by a Lanczos
or block Lanczos SVD method (see [B+93]) is
plausible. Future research in the use of principal axes for bandwidth
reduction in the presence of clustered spectra is warranted.

Michael W. Berry (berry@cs.utk.edu)

Mon Jan 29 14:30:24 EST 1996