> [!tldr] GMRES Algorithm
> The **GMRES algorithm** is an [[Iterative Algorithms|iterative algorithm]] for solving the linear system $Ax=b$. It approximates the solution $x$ with some $\hat{x} \in \mathcal{K}_{k}(A,b)$, the [[Krylov Subspace]] of order $k$ (where $k$ is the number of iterations, ideally small).
GMRES (generalized minimal residual) solves the problem $\hat{x}=\underset{x \in \mathcal{K}_{k}(A,b)}{\arg\min}~\| Ax-b \|_{2}$ (instead of the exact problem $Ax=b$). If we have a basis $\{ q_{1},\dots,q_{k} \}$ of the Krylov subspace provided by the [[Krylov Subspace#Arnoldi Iterations|Arnoldi iterations]], we can define $y$ by $x= Q_{k}y$, the coordinates of $x$ under that basis, so that $\| Ax-b \|=\| AQ_{k}y-b \| =\| Q_{k+1}\tilde{H}_{k}y -b\|,$ where $Q_{k},\tilde{H}_{k}$ are defined as in the Arnoldi iterations, and we used the Arnoldi relation $AQ_{k}=Q_{k+1}\tilde{H}_{k}$.
Lastly, we emulate the trick [[QR Factorisation#Least Squares Problems|using QR for least squares problems]]: extend $Q_{k+1}$ to a full orthogonal matrix $[Q_{k+1}, Q_{k+1}^{\perp}]$ and pad $\tilde{H}_{k}$ with zeros to get $\left\| [Q_{k+1}, Q_{k+1}^{\perp}]\begin{bmatrix}\tilde{H}_{k} \\ \mathbf{0} \end{bmatrix} y-b\right\|^{2} =\left\| \begin{bmatrix}\tilde{H}_{k} \\ \mathbf{0} \end{bmatrix} y- [Q_{k+1}, Q_{k+1}^{\perp}]^{T}b \right\|^{2}=\| \tilde{H}_{k}y-Q_{k+1}^{T}b \|^{2}+\| (Q_{k+1}^{\perp})^{T}b \|^{2},$which is minimized by minimizing the first term, i.e. solving for the coefficients $y$ of the linear least-squares problem $\bbox[teal, 4px]{\underset{y}{\min}~\| \tilde{H}_{k}y-\| b \|e_{1} \|},$ using that $Q_{k+1}^{T}b=\| b \|e_{1}$ (since $q_{1}=b / \| b \|$ and $q_{2},\dots,q_{k+1}\perp b$).
*This reduces the problem to a Hessenberg least-squares system, which can be efficiently solved using [[QR Factorisation#QR on Hessenberg Matrices|QR factorisation]]* ($k$ Givens rotations, then an $O(k^{2})$ triangular solve). Note that this is desirable because $\tilde{H}_{k}$ is Hessenberg and the feasible space is all of $y\in \mathbb{R}^{k}$ (an unconstrained problem) -- an arbitrary QR factorisation $[b, Ab, \dots, A^{k}b]=QR$ would instead produce an awkward optimisation of $\| Rx-b \|$ over $x \in \mathcal{K}_{k}$.
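As a concrete sketch (the function name and interface here are illustrative, not from the note), the Hessenberg least-squares step $\min_{y}\| \tilde{H}_{k}y-\| b \|e_{1} \|$ can be solved in NumPy with one Givens rotation per column:

```python
import numpy as np

def solve_hessenberg_lstsq(H, beta):
    """Minimise ||H y - beta * e1|| for a (k+1) x k upper-Hessenberg H.

    One Givens rotation per column zeroes the subdiagonal; the last
    entry of the rotated right-hand side is the residual norm.
    """
    H = H.astype(float).copy()
    k = H.shape[1]
    g = np.zeros(k + 1)
    g[0] = beta                              # right-hand side ||b|| e_1
    for j in range(k):
        a, c = H[j, j], H[j + 1, j]
        r = np.hypot(a, c)
        G = np.array([[a, c], [-c, a]]) / r  # rotation zeroing H[j+1, j]
        H[j:j + 2, j:] = G @ H[j:j + 2, j:]
        g[j:j + 2] = G @ g[j:j + 2]
    y = np.linalg.solve(H[:k, :k], g[:k])    # H[:k, :k] is now triangular
    return y, abs(g[k])                      # coefficients, residual norm
```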
Therefore, the GMRES algorithm is:
> [!algorithm] GMRES
> Start with the basis $q_{1}\propto b$.
>
> For $k=1,\dots,n$:
> - Compute the next Arnoldi iteration to get the $(k+1)$-element orthonormal basis $Q_{k+1}$ of $\mathcal{K}_{k+1}(A,b)$. This produces the Hessenberg matrix $\tilde{H}_{k}$ satisfying $AQ_{k}=Q_{k+1}\tilde{H}_{k}$.
> - Solve the (unconstrained) least-squares problem $y_{k}:=\underset{y}{\arg\min}~\| \tilde{H}_{k}y-\| b \|e_{1} \|.$
> - The current iteration's optimal solution is $x_{k}:=Q_{k}y_{k}$.
> - If the residual norm $\| Ax_{k}-b \|$ is small enough, terminate early.
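Putting the pieces together, here is a minimal dense-NumPy sketch of the loop above (the name `gmres_dense` and the `tol` parameter are illustrative; a practical implementation would update the QR factorisation of $\tilde{H}_{k}$ incrementally with the Givens sketch above instead of calling `lstsq` from scratch each step):

```python
import numpy as np

def gmres_dense(A, b, tol=1e-10):
    """Minimal GMRES sketch: no restarts, no preconditioning."""
    n = b.size
    Q = np.zeros((n, n + 1))
    H = np.zeros((n + 1, n))
    beta = np.linalg.norm(b)
    Q[:, 0] = b / beta                          # q_1 proportional to b
    for k in range(1, n + 1):
        # Arnoldi step: extend the orthonormal basis of K_{k+1}(A, b)
        v = A @ Q[:, k - 1]
        for j in range(k):                      # modified Gram-Schmidt
            H[j, k - 1] = Q[:, j] @ v
            v -= H[j, k - 1] * Q[:, j]
        H[k, k - 1] = np.linalg.norm(v)
        if H[k, k - 1] > 1e-14:                 # guard against breakdown
            Q[:, k] = v / H[k, k - 1]
        # Least squares: y_k = argmin_y || H~_k y - ||b|| e_1 ||
        rhs = np.zeros(k + 1)
        rhs[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], rhs, rcond=None)
        x = Q[:, :k] @ y
        if np.linalg.norm(A @ x - b) < tol:     # early termination
            return x, k
    return x, n
```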
## GMRES Convergence
If we let the algorithm run indefinitely, we may reach $k=n$, where $\mathcal{K}_{n}(A,b)=\mathbb{R}^{n}$ contains the exact solution $x^{\ast}$, so GMRES terminates in at most $n$ iterations. The Arnoldi iteration can also end early:
> [!embed]
> ![[Krylov Subspace#^ae9bca]]
The residual is $\| A\hat{x}_{k}-b \| =\min_{x \in \mathcal{K}_{k}}\| Ax-b \|=\min_{p \in \Pi_{k-1}}\| Ap(A)b-b \|=\min_{p \in \Pi_{k-1}} \| (Ap(A)-I) b\|,$ where we used that every $x \in \mathcal{K}_{k}$ can be written as $x=p(A)b$ for some $p \in \Pi_{k-1}$. This corresponds to the polynomial $x \mapsto xp(x)-1$, which is of degree $k$ and maps $0$ to $-1$. Flipping its sign (which does not change the norm), we can WLOG restrict the search to $\tilde{\Pi}_{k}:=\{ p \in \Pi_{k} ~|~ p(0)=1 \}$. Writing $A=X\Lambda X^{-1}$ (assuming $A$ is diagonalisable):
$\begin{align*}
\| A\hat{x}_{k}-b \|=\min_{p \in \tilde{\Pi}_{k}}\| p(A)b \|&= \min_{p} \| Xp(\Lambda)X^{-1}b \| \\
&\le \| X \| \| X^{-1} \| \min_{p\in \tilde{\Pi}_{k}} \| p(\Lambda) \|_{2} \| b \| \\
&\le \kappa_{2}(X)\cdot\min_{p\in \tilde{\Pi}_{k}} \max_{\lambda} | p(\lambda) |\| b \|.
\end{align*}$
Therefore, the error is bounded by finding a polynomial $p \in \tilde{\Pi}_{k}$ that stays small at every eigenvalue $\lambda$ of $A$.
This bound is useful when:
- All $\lambda$ are clustered around some value $z$, say within radius $r$: then $p: \lambda \mapsto C(z-\lambda)^{k}$ (where $C:= z^{-k}$ is the normalising constant making $p(0)=1$) guarantees an error bound of $\approx (r / |z|)^{k} \| b \|$ (a numerical illustration follows this list).
- $\lambda$ taking a few discrete values (say $z_{1},\dots,z_{m}$ for some $m \ll n$): then $p: \lambda \mapsto C\prod_{i=1}^{m}(z_{i}-\lambda)$ maps all eigenvalues to $0$, i.e. GMRES finds the exact solution in $m$ iterations.
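Below is a small experiment for the clustered case, using SciPy's `scipy.sparse.linalg.gmres` (keyword names such as `callback_type` are from recent SciPy versions and may vary): the spectrum is placed in a radius-$0.1$ disc around $z=2$, so the residual should shrink by roughly $r/|z| = 0.05$ per iteration.

```python
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n = 200
# Eigenvalues clustered within radius r = 0.1 around z = 2
lam = 2.0 + 0.1 * (2.0 * rng.random(n) - 1.0)
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # keep kappa_2(X) modest
A = X @ np.diag(lam) @ np.linalg.inv(X)
b = rng.standard_normal(n)

res = []
x, info = gmres(A, b, restart=n, maxiter=n,
                callback=lambda r: res.append(r),
                callback_type='pr_norm')           # record residual norms
print(res[:6])  # roughly geometric decay at rate ~ r/|z| = 0.05
```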
### Preconditioning
To make this bound favourable, we can do **preconditioning** by pre-applying some matrix $M$ to solve $MAx=Mb$ instead -- hopefully this alters the eigenvalues enough to improve the bound.
Common choices include:
- An incomplete [[LU Factorisation]] of $A \approx LU$, so $M:= U^{-1}L^{-1}$ will give $MA \approx I$, i.e. eigenvalues clustered around $1$ (see the sketch after this list).
- In general, the "best" preconditioner is just $M= A^{-1}$ (which is not attainable), so preconditioning is essentially approximating $A^{-1}$.
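A minimal sketch of the incomplete-LU choice, using SciPy's `spilu` and `LinearOperator` (the test matrix and `drop_tol` here are arbitrary illustrative choices; SciPy's `M` argument is the approximation of $A^{-1}$):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, spilu, LinearOperator

n = 500
# Illustrative sparse system: a banded matrix plus random off-band noise
A = sp.diags([-1.0, 2.5, -1.0], [-1, 0, 1], shape=(n, n), format='csc')
A = (A + sp.random(n, n, density=0.01, random_state=0)).tocsc()
b = np.ones(n)

# Incomplete LU with limited fill-in: A ~ LU, so M = U^{-1} L^{-1} ~ A^{-1}
ilu = spilu(A, drop_tol=1e-4)
M = LinearOperator((n, n), matvec=ilu.solve)  # x -> U^{-1} L^{-1} x

x0, info0 = gmres(A, b)           # unpreconditioned
x1, info1 = gmres(A, b, M=M)      # ILU-preconditioned: converges much faster
print(info0, info1)               # info == 0 signals convergence
```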