next up previous
Next: Sample Term-by-Document Matrix Up: Background Previous: Background

2.1. Singular Value Decomposition

LSI [BDO95] exploits the factorization of the matrix A into the product of 3 matrices using the singular value decomposition (SVD). Given an matrix A, where and rank(A) = r, the singular value decomposition of A is defined as


where and . The first r columns of the orthogonal matrices U and V define the orthonormal eigenvectors associated with the r nonzero eigenvalues of and , respectively. The columns of U and V are referred to as the left and right singular vectors, respectively, and the singular values of A are the diagonal elements of or the nonnegative square roots of the n eigenvalues of [GL89].

As defined by Equation (3), the SVD is used to represent the original relationships among terms and documents as sets of linearly-independent vectors or factor values. Using k factors or the k-largest singular values and corresponding singular vectors one can encode (see [BDO95]) the original term-by-document matrix as a smaller (and more reliable) collection of vectors in k-space for conceptual query processing.

Michael W. Berry (
Mon Jan 29 14:30:24 EST 1996