Inexact Matrix Transformations

The analysis is given for the Cayley transform only, though shift-and-invert could be used as well. The reasons are threefold. First, the shift-and-invert Arnoldi method and the Cayley transform Arnoldi method produce the same Ritz vectors. Second, it makes a link with the (Jacobi) Davidson methods easier to establish. Third, the Cayley transform leads to a more natural approach to the problem.

In the Arnoldi method applied to the Cayley transform,
\[
T_C = (A - \sigma B)^{-1} (A - \mu B) ,
\]
we must solve a sequence of linear systems
\[
(A - \sigma B) w_j = (A - \mu B) v_j \tag{11.2}
\]
for $w_j$, where $v_j$ is the last Krylov basis vector. The result $w_j$ is orthonormalized against $v_1, \ldots, v_j$ into $v_{j+1}$. One could also apply the Cayley transform to another vector $t_j$ in the Krylov space rather than to $v_j$. In this case, we solve the linear system
\[
(A - \sigma B) w_j = (A - \mu B) t_j ,
\]
where $t_j$ is chosen as follows: it may be a Ritz vector (Jacobi-Davidson), the last basis vector $v_j$ (Arnoldi), or the continuation vector (rational Krylov). The residual of this system is $s_j = (A - \mu B) t_j - (A - \sigma B) w_j$. The residual norm $\|s_j\|$ is a measure of the backward error, since we can rewrite (11.2) as
\[
\Bigl( A - \sigma B + \frac{s_j w_j^*}{\|w_j\|^2} \Bigr) w_j = (A - \mu B) t_j .
\]
When a direct method is used, it typically happens that $\|s_j\| / (\|A - \sigma B\| \, \|w_j\|)$ is a modest multiple of the machine precision. We say that the direct method computes a backward stable solution of the linear system. Since $\|s_j\|$ is very small, we talk about an exact eigenvalue solver. Obtaining a small residual norm with an iterative method is also possible but is usually very expensive. In this section, we study the situation where $\|s_j\|$ is large. In order to give an indication of what we mean by large, a few words about iterative linear system solvers are needed. A linear system is said to be solved with a relative residual tolerance $\tau$ when $\|s_j\| \leq \tau \, \|(A - \mu B) t_j\|$.
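To make the inner-outer structure concrete, the following is a minimal numpy sketch of a few Arnoldi steps with the Cayley transform in which the inner solve is only approximate. The matrices, the pole $\sigma = 1$, the zero $\mu = -1$, and the tolerance `tau` are illustrative assumptions; the inexact solver is mimicked by an exact solve plus a controlled perturbation of relative size `tau`, not by an actual iterative method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))   # hypothetical problem matrices
B = np.eye(n)
sigma, mu = 1.0, -1.0             # pole and zero (illustrative values)
tau = 1e-4                        # relative residual tolerance of the inner solver

v = rng.standard_normal(n)
v /= np.linalg.norm(v)
V = v[:, None]                    # Krylov basis, one column per vector

for j in range(5):
    rhs = (A - mu * B) @ V[:, -1]
    # "inexact" solve of (A - sigma*B) w = rhs: an exact solve plus a
    # controlled perturbation, standing in for an iterative solver that is
    # stopped once the relative residual drops below tau
    w = np.linalg.solve(A - sigma * B, rhs)
    w += tau * np.linalg.norm(w) * rng.standard_normal(n) / np.sqrt(n)
    s = rhs - (A - sigma * B) @ w          # residual s_j of the linear system
    # orthonormalize w against the current basis (modified Gram-Schmidt)
    for i in range(V.shape[1]):
        w -= (V[:, i] @ w) * V[:, i]
    w /= np.linalg.norm(w)
    V = np.hstack([V, w[:, None]])
```

Note that the outer recurrence is unchanged: the basis stays orthonormal; only the operator that generated it is perturbed.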

By putting all the residuals $s_j$ for $j = 1, \ldots, k$
together in $S_k = [s_1, \ldots, s_k]$,
the $w_j$ in $W_k$ and the $t_j$ in $T_k$, we obtain
\[
(A - \sigma B) W_k = (A - \mu B) T_k - S_k .
\]
This is the relation that we should obtain after $k$ steps with the ``inexact'' Cayley transform
\[
\widetilde{T}_C = (A - \sigma B + E_k)^{-1} (A - \mu B) ,
\qquad E_k W_k = S_k .
\]
Note that for fixed $\sigma$ and $\mu$, $E_k$ depends on $k$ and the way each of the linear systems (11.2) is solved. When (11.2) is solved by a preconditioner $M$ or a stationary linear system solver, i.e., $w_j = M^{-1} (A - \mu B) t_j$ for $j = 1, \ldots, k$, then $E_k = M - (A - \sigma B)$ is independent of $j$ and $k$ and so is $\widetilde{T}_C = M^{-1} (A - \mu B)$.
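The preconditioned case can be checked numerically: when every system is handled by one application of a fixed matrix $M$, each step is exactly a multiplication by the fixed operator $M^{-1}(A - \mu B)$. In the sketch below, the Jacobi-style diagonal preconditioner and all matrix values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
A = rng.standard_normal((n, n))   # hypothetical problem matrices
B = np.eye(n)
sigma, mu = 1.0, -1.0
# a fixed (Jacobi-style) preconditioner for A - sigma*B, chosen for illustration
M = np.diag(np.diag(A - sigma * B))

# the resulting "inexact Cayley transform" is one fixed matrix
T_tilde = np.linalg.solve(M, A - mu * B)

# one preconditioned "solve" per step: w_j = M^{-1} (A - mu*B) t_j ...
t = rng.standard_normal(n)
w_step = np.linalg.solve(M, (A - mu * B) @ t)
# ... which is the same as applying the fixed operator T_tilde
assert np.allclose(w_step, T_tilde @ t)
```

Because the operator does not change with $j$ or $k$, the generated space is an ordinary Krylov space for $\widetilde{T}_C$.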

Eigenpairs are computed from the subspace spanned by $v_1, \ldots, v_{k+1}$. In Arnoldi's method, this happens by the Hessenberg matrix $H_k$ that arises from the orthogonalization of the $w_j$; in the Davidson method, one uses the projection $H_k = V_k^* \widetilde{T}_C V_k$ with $V_k = [v_1, \ldots, v_k]$ and $V_k^* V_k = I$.
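As a generic illustration of such a projection step (a Rayleigh-Ritz sketch, not the book's exact procedure), the following extracts approximate eigenpairs from an orthonormal basis of a subspace; the symmetric test matrix and the random subspace are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 40, 8
A = rng.standard_normal((n, n))
A = A + A.T                       # symmetric, so the Ritz values are real

# orthonormal basis V of a k-dimensional subspace (random, for illustration)
V, _ = np.linalg.qr(rng.standard_normal((n, k)))

H = V.T @ A @ V                   # Rayleigh-Ritz projection of A on the subspace
theta, Y = np.linalg.eigh(H)      # Ritz values and primitive Ritz vectors
X = V @ Y                         # Ritz vectors lifted to the full space
res = np.linalg.norm(A @ X - X * theta, axis=0)   # residual norm per Ritz pair
```

The residual norms `res` indicate how well each Ritz pair approximates an eigenpair of `A`; for a random subspace they are large, and they shrink as the subspace captures the wanted eigenvectors.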

The Krylov space contains the eigenvectors associated with the well-separated extreme eigenvalues of $\widetilde{T}_C$. This is a perturbed $T_C$, so the relation with $T_C$ is partially lost, and accurately computing eigenpairs of $(A, B)$ may be difficult. To see which parameters play a role in this perturbation, from Theorems 4.3 and 4.4 in [323], it can be shown that if $T_C$ has distinct eigenvalues, then for each eigenpair $(\theta, z)$ of $T_C$ there is an eigenpair $(\tilde{\theta}, \tilde{z})$ of $\widetilde{T}_C$ such that
\[
|\theta - \tilde{\theta}| \leq c_1 \epsilon + \mathcal{O}(\epsilon^2) ,
\qquad
\sin \angle (z, \tilde{z}) \leq c_2 \epsilon + \mathcal{O}(\epsilon^2) ,
\]
where $\epsilon = \|E_k\|$ and the constants $c_1$ and $c_2$ depend on the conditioning of the eigenpair $(\theta, z)$. These inequalities show that if $\epsilon$ and/or the constants $c_1$ and $c_2$ are small, $(\theta, z)$ and $(\tilde{\theta}, \tilde{z})$ correspond very well. In this case, $(\theta, z)$ can be computed via eigenpairs of $\widetilde{T}_C$. Since $\theta = (\lambda - \mu)/(\lambda - \sigma)$ becomes large when $\lambda$ tends to the pole $\sigma$, the relative error $|\theta - \tilde{\theta}|/|\theta|$ is reduced to $\mathcal{O}(\epsilon/|\theta|)$.
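The effect of the pole can be checked numerically. Under the map $\theta = (\lambda - \mu)/(\lambda - \sigma)$, eigenvalues close to $\sigma$ receive large $|\theta|$, so a fixed relative perturbation of $\theta$ maps back to an ever smaller error in $\lambda$. The pole, zero, perturbation size, and sample eigenvalues below are illustrative assumptions.

```python
sigma, mu = 1.0, -1.0

def theta(lam):
    # Cayley eigenvalue map: lambda -> theta
    return (lam - mu) / (lam - sigma)

def lam_of(th):
    # inverse map: theta -> lambda
    return (sigma * th - mu) / (th - 1.0)

eps = 1e-3   # relative perturbation of the transformed eigenvalue
errs = []
for lam in [2.0, 1.1, 1.01, 1.001]:   # eigenvalues approaching the pole sigma
    th = theta(lam)
    # back-mapped error in lambda after perturbing theta by a factor (1 + eps)
    errs.append(abs(lam_of(th * (1 + eps)) - lam))

# the closer lam is to the pole, the smaller the back-mapped error
assert errs[0] > errs[1] > errs[2] > errs[3]
```

This is the mechanism behind the claim above: only eigenvalues near the pole survive the perturbation with high accuracy.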

This section can be concluded as follows.

- The inexact Cayley transform behaves like the exact one.
This implies that eigenvalues near the pole $\sigma$ are still targeted.
- However, the eigenvalues are perturbed.
The level of perturbation depends on $\|E_k\|$.
For any $\tau$, the transformation is able to compute eigenvalues near $\sigma$
accurately.
In practice, this means that only one (simple) eigenvalue can be computed
very accurately.
It also implies that $\sigma$ needs to be updated from time to time, since
the wanted eigenvalue $\lambda$ is unknown.
The other eigenvalues are only computed approximately, depending on the
error level $\tau$.
- The sequence (11.2) for $t_j = v_j$ defines a Krylov subspace $\mathcal{K}_k(\widetilde{T}_C, v_1)$.

The remaining question is now how to compute the eigenpairs of $\widetilde{T}_C$ or how to exploit them for computing eigenpairs of $(A, B)$. In §11.2.3 and §11.2.4, we discuss the Rayleigh-Ritz technique; i.e., eigenpairs are computed from the orthogonal projection of the pencil on the Krylov subspace. In §11.2.7, the eigenpairs are computed from the eigenpairs of $\widetilde{T}_C$ directly by use of the rational Krylov recurrence relation. §11.2.6 presents a Lanczos algorithm that uses the recurrence relation for the eigenvectors and the Rayleigh-Ritz projection for the eigenvalues.

Note that $\mu$ and $\sigma$ cannot be chosen too far away from each other. Suppose the eigenvalue $\lambda_1$ is wanted. From the conclusions given above, the convergence is faster when $\sigma$ is chosen close to $\lambda_1$, and the computed eigenvalue can only be accurate when $\sigma$ is close to $\lambda_1$. In theory, one could select $\sigma = \mu$, which is usually used in the Davidson method. Since $T_C = I$ in that case, there is a high risk of stagnation in the latter method. A more robust selection is used in the Jacobi-Davidson method, which we discuss in §11.2.5.
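The stagnation risk for $\sigma = \mu$ can be seen in a toy computation: the exact transform is then the identity, so the transformed vector is already in the search space and orthogonalization leaves nothing to expand it with. The dimension and starting vector below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

T = np.eye(n)      # exact Cayley transform when sigma == mu
w = T @ v          # transformed vector equals v itself
w = w - (v @ w) * v   # Gram-Schmidt against the current (one-vector) basis
# nothing is left: the Krylov space stagnates at dimension 1
```

Only the inexactness (the preconditioner error $E_k$) keeps the Davidson iteration moving in this regime, which is why a more careful choice of correction equation, as in Jacobi-Davidson, is preferred.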