We now consider the use of symbolic and spectral methods
to permute the term-document matrix defined in Equation (1).
The goal of such permutations is to make the detection of
document (or hypertext) clusters more immediate without having
to consider high-dimensional representations such as those
used in LSI. One desirable
form for the detection of such clusters is a *banded*
or nearly diagonal matrix in which all the nonzero values
(weighted term frequencies) fall within a band in each row and column.
Specifically, the nonzero values should all fall near the
line from the upper left to the lower right of the matrix. Such
a nonzero structure (or pattern) facilitates the identification
of term or document clusters having similar meaning and context,
as demonstrated in Section 4.3.
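
As a minimal illustration of what a banded form buys, the sketch below permutes a small, made-up 4 x 4 term-document matrix and measures how far its nonzeros stray from the upper-left to lower-right line. The matrix values, the helper names `bandwidth` and `permute`, and the chosen ordering are all assumptions for illustration; the measure used here (largest |i - j| over the nonzeros) is a simple stand-in and is not necessarily one of the metrics of Section 3.1.

```python
def bandwidth(matrix):
    """Largest |i - j| over the nonzero entries of a list-of-rows matrix."""
    return max(
        (abs(i - j)
         for i, row in enumerate(matrix)
         for j, value in enumerate(row)
         if value != 0),
        default=0,
    )


def permute(matrix, row_order, col_order):
    """Return the matrix with rows and columns taken in the given orders."""
    return [[matrix[i][j] for j in col_order] for i in row_order]


# Hypothetical weighted term frequencies (rows = terms, columns = documents).
# Terms 0 and 2 occur only in documents 0 and 2; terms 1 and 3 only in
# documents 1 and 3, so the two clusters are interleaved.
A = [
    [1, 0, 2, 0],
    [0, 3, 0, 1],
    [2, 0, 1, 0],
    [0, 1, 0, 2],
]

# Assumed ordering that places the members of each cluster next to each other.
order = [0, 2, 1, 3]
B = permute(A, order, order)

print(bandwidth(A), bandwidth(B))
for row in B:
    print(row)
```

In the permuted matrix the nonzeros form two 2 x 2 blocks along the diagonal, so each cluster can be read directly from the nonzero pattern. Here the ordering was chosen by hand; the methods in this section aim to produce such orderings automatically.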

- 3.1. Metrics for Evaluating Term-Document Matrix Reorderings
- 3.2. Sample Hypertext Matrices
- 3.3. Symbolic Reordering Methods
- 3.4. Fiedler Ordering
- 3.5. Correspondence Analysis

Michael W. Berry (berry@cs.utk.edu)

Mon Jan 29 14:30:24 EST 1996