> [!tldr] Krylov Subspace
> The order-$k$ **Krylov subspace** of a matrix $A \in \mathbb{R}^{n\times n}$ and vector $b \in \mathbb{R}^{n}$ is $\mathcal{K}_{k}(A,b):=\mathrm{span}\{ b,Ab, \dots, A^{k-1}b \}.$
>
> Equivalently, it is $\mathcal{K}_{k}(A,b)=\{ p(A)b ~|~p \in \Pi_{k-1} \},$ where $\Pi_{k-1}$ is the set of polynomials of degree at most $k-1$.
>
> It is widely used in iterative methods that solve $\min_{x\in \mathcal{K}_{k}}f(x)$ as an approximation to $\min_{x}f(x)$ for some objective $f$, e.g. $f(x)=\| Ax-b \|$.

Suppose we wish to find an orthogonal basis for $\mathcal{K}_{k}(A,b)$. The naive idea of forming the matrix $X_{k}:=[b, Ab, \dots, A^{k-1}b]$ and then running [[QR Factorization]] (generate, then orthogonalize) does not work, because $X_{k}$ is very ill-conditioned: its columns are the iterates of a [[Power Iterations|power iteration]], which converge to the dominant eigenvector of $A$.

> [!exposition]- The condition number of $X_{k}$
>
> Since $\| X_{k}e_{k} \| =\| A^{k-1}b \|=O(| \lambda_{1} |^{k})$, the largest singular value $\sigma_{1}(X_{k})$ grows as fast as $| \lambda_{1} |^{k}$.
>
> On the other hand, using the convergence of $A^{k}b$ to the dominant eigenvector, the test vector $v:=(0,\dots,0,-\lambda_{1},1)^{T}$ cancels the dominant $\lambda_{1}$-component: $\| X_{k}v \|=\| A^{k-1}b-\lambda_{1} A^{k-2}b \| =O(| \lambda_{2} |^{k}).$ Since $\| v \|$ is a constant, $\sigma_{k}(X_{k}) \leq \| X_{k}v \| / \| v \|$ is bounded above by $O(| \lambda_{2} |^{k})$.
>
> As a result, the [[Numerical Stability#Conditioning|condition number]] is $\kappa_{2}(X_{k})= \sigma_{1}(X_{k}) / \sigma_{k}(X_{k}) =O(| \lambda_{1} / \lambda_{2} |^{k}) \to \infty$, and this naive basis becomes extremely ill-conditioned.

### Arnoldi Iterations

However, the similar idea of *orthogonalizing while generating* gives the **Arnoldi iterations**:

> [!algorithm] Arnoldi Iterations
> Initialize a matrix $\tilde{H}_{k}=(h_{ij}) \in \mathbb{R}^{(k+1)\times k}$ for storing the G-S coefficients.
> Let $q_{1}:= b / \| b \|$.
>
> For $[j=1,\dots, k]$:
> - Compute $v:=Aq_{j}$, the vector to be orthogonalized.
> - Orthogonalize it wrt. the previously computed $q_{i}$, for $i=1,\dots,j$:
> 	- Record $h_{ij}:= v^{T}q_{i}$ as the component in the direction of $q_{i}$.
> 	- Orthogonalize wrt. $q_{i}$ with $v=v-h_{ij}q_{i}$.
> - Let $h_{(j+1), j}:=\| v \|$, and normalize $q_{j+1}:= v / \| v \|$.
>
> Return the basis $\{ q_{1},\dots,q_{k+1} \}$ and the Hessenberg matrix $\tilde{H}_{k}= (h_{ij})$.

In each iteration, we compute $q_{j+1}:= \frac{1}{h_{(j+1), j}}(Aq_{j}-h_{1j}q_{1}-\dots-h_{jj}q_{j}).~~~~~(\dagger)$

> [!exposition] Matrix Form of the Arnoldi Iterations
> Rearranging $(\dagger)$ gives $Aq_{j}=h_{1j}q_{1}+\dots+h_{(j+1),j}q_{j+1}=Q_{j+1}\begin{pmatrix}
> h_{1j} \\ \vdots \\ h_{(j+1), j}
> \end{pmatrix}.$
>
> So collecting each $j$ into matrix form, $\bbox[teal, 4px]{AQ_{k}=Q_{k+1}\tilde{H}_{k}},~~\tilde{H}_{k}:= \begin{bmatrix}
> H_{k} \\ h_{(k+1),k}\,e_{k}^{T}
> \end{bmatrix}=\begin{bmatrix}
> h_{11} & h_{12} & \dots & h_{1k} \\
> h_{21} & h_{22} & \dots & h_{2k} \\
> & \ddots & \ddots & \vdots \\
> & & h_{k,(k-1)} & h_{kk} \\
> & & & h_{(k+1),k}
> \end{bmatrix},$ where $H_{k}\in \mathbb{R}^{k\times k}$ is the square upper-Hessenberg part.
> We can also write $AQ_{k}=Q_{k}H_{k}+h_{(k+1),k}\,q_{k+1}e_{k}^{T}$, so left-multiplying by $Q_{k}^{T}$ (using $Q_{k}^{T}q_{k+1}=0$) gives $H_{k}=Q_{k}^{T}AQ_{k}.$
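To make this concrete, here is a minimal NumPy sketch of the iteration above (the function name `arnoldi`, the breakdown tolerance, and the random test problem are illustrative choices, not part of the algorithm statement), which also numerically checks the boxed identity $AQ_{k}=Q_{k+1}\tilde{H}_{k}$ and the orthonormality of the basis:

```python
import numpy as np

def arnoldi(A, b, k):
    """Build an orthonormal basis Q = [q_1, ..., q_{k+1}] of K_{k+1}(A, b) and
    the (k+1) x k extended Hessenberg matrix H with A @ Q[:, :k] == Q @ H."""
    n = len(b)
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        v = A @ Q[:, j]                      # candidate vector A q_j
        for i in range(j + 1):               # Gram-Schmidt against q_1, ..., q_j
            H[i, j] = Q[:, i] @ v
            v -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] < 1e-12:              # breakdown: the subspace is A-invariant
            return Q[:, : j + 1], H[: j + 1, : j + 1]  # here A Q_j = Q_j H_j exactly
        Q[:, j + 1] = v / H[j + 1, j]
    return Q, H

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
b = rng.standard_normal(50)
Q, H = arnoldi(A, b, 10)
print(np.allclose(A @ Q[:, :-1], Q @ H))         # A Q_k = Q_{k+1} H_tilde
print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])))  # orthonormal basis
```

Note that `v` is updated in place between the dot products, i.e. this is the modified Gram-Schmidt variant, which is the numerically more stable way to run the orthogonalization.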
From $(\dagger)$ we can prove by induction that each $q_{j+1}$ equals $p_{j}(A)b$ for some polynomial $p_{j}$ of degree exactly $j$, and hence $\mathrm{span}\{ q_{1},\dots,q_{j+1} \}=\mathrm{span}\{ b, Ab, \dots, A^{j}b \}$.
- If $h_{(j+1),j}=0$, the iteration breaks down because $Aq_{j} \in \mathrm{span}\{ q_{1},\dots,q_{j} \}$, i.e. $A^{j}b \in \mathrm{span}\{ b, \dots, A^{j-1}b \}$. Writing $A^{j}b=\sum_{i=0}^{j-1}\alpha_{i}A^{i}b$ and rearranging (assuming $\alpha_{0}\neq 0$) gives $b= \frac{1}{\alpha_{0}}\left( A^{j}b-\sum_{i=1}^{j-1} \alpha_{i}A^{i}b\right)=A\left[ \frac{1}{\alpha_{0}}\left( A^{j-1}b-\sum_{i=1}^{j-1} \alpha_{i}A^{i-1}b\right) \right],$ so $b$ must be in the column space of $A$. For common applications of Krylov subspaces this is actually good news: the bracketed vector is an $x \in \mathcal{K}_{j}(A,b)$ with $Ax=b$, so the breakdown hands us an exact solution (a "lucky breakdown").

### Lanczos for Symmetric $A$

If $A$ is symmetric, the Arnoldi iteration is known as the **Lanczos iteration**, which produces $AQ_{k}=Q_{k+1}\tilde{T}_{k},~~\tilde{T}_{k}= \begin{bmatrix}
h_{11} & h_{12} \\
h_{21} & h_{22} & h_{23} \\
 & \ddots & \ddots & \ddots \\
 & & h_{k,(k-1)} & h_{kk} \\
 & & & h_{(k+1),k}
\end{bmatrix},$
since the Hessenberg matrix ${H}_{k}=Q_{k}^{T}A Q_{k}$ is now itself symmetric, and a symmetric upper-Hessenberg matrix is tridiagonal; this tridiagonal matrix $T_{k}$ makes up all but the last row of $\tilde{T}_{k}$.

Notably, $(\dagger)$ reduces to $Aq_{j}=h_{(j-1),j}q_{j-1}+h_{jj}q_{j}+h_{(j+1),j}q_{j+1},$ a three-term recurrence (rather than Arnoldi's $O(j)$-term recurrence): each step only needs the previous two basis vectors.
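A matching NumPy sketch of Lanczos (again, the function name `lanczos` and the test setup are illustrative) makes the three-term recurrence explicit and checks $T_{k}=Q_{k}^{T}AQ_{k}$:

```python
import numpy as np

def lanczos(A, b, k):
    """Arnoldi specialized to symmetric A: each step orthogonalizes only
    against the previous two basis vectors (the three-term recurrence)."""
    n = len(b)
    Q = np.zeros((n, k + 1))
    alpha = np.zeros(k)                      # diagonal entries h_{jj}
    beta = np.zeros(k)                       # off-diagonals h_{(j+1),j} = h_{j,(j+1)}
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        v = A @ Q[:, j]
        if j > 0:
            v -= beta[j - 1] * Q[:, j - 1]   # subtract h_{(j-1),j} q_{j-1}
        alpha[j] = Q[:, j] @ v
        v -= alpha[j] * Q[:, j]              # subtract h_{jj} q_j
        beta[j] = np.linalg.norm(v)
        if beta[j] < 1e-12:                  # (lucky) breakdown
            break
        Q[:, j + 1] = v / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M + M.T                                  # symmetric test matrix
b = rng.standard_normal(50)
Q, T = lanczos(A, b, 10)
print(np.allclose(Q[:, :-1].T @ A @ Q[:, :-1], T))   # T_k = Q_k^T A Q_k
```

In floating point, the short recurrence slowly loses orthogonality among the $q_{j}$ (the classic weakness of Lanczos), which is why practical implementations often reorthogonalize against earlier basis vectors.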