## Vector Spaces and Rings
* A **field** is a set $\mathbb F$ equipped with two binary operations $+$ and $\times$ where:
* $(\mathbb F,+,0)$ and $(\mathbb F\setminus \{0\},\times,1)$ form two Abelian groups;
* $\times$ distributes over $+$.
* The **characteristic** of a field is $\min \{p:\underbrace{1+\cdots+1}_{p\text{ times}}=0\}$, if such a $p$ exists, and $0$ otherwise.
* In $\mathbb{Z}_p,\,p$ prime, the characteristic is $p$.
* In $\mathbb Q$, the characteristic is $0$.
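
*A quick illustration (not from the notes): the characteristic of $\mathbb Z_n$ can be found by brute force. The helper below is hypothetical, in plain Python:*

```python
# Sketch: find the characteristic of Z_n by adding 1 to itself until the
# running sum wraps around to 0 (mod n). Hypothetical helper, plain Python.
def characteristic_of_Zn(n: int) -> int:
    total, count = 0, 0
    while True:
        total = (total + 1) % n
        count += 1
        if total == 0:
            return count

assert characteristic_of_Zn(7) == 7   # Z_7 (p prime) has characteristic 7
assert characteristic_of_Zn(2) == 2   # Z_2 has characteristic 2
```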
* $\text{Hom}(V,W)$ is the vector space of linear maps from $V$ to $W$.
* A linear map $\phi \in\mathrm{Hom}(V,W)$ is a **(vector space) isomorphism** if it is bijective.
### Rings
* A **ring** is a set $R$ equipped with two binary operations $+$ and $\times$, where $(R,+,0)$ is an Abelian group, and $\times$ is associative and distributive.
* Commutativity, identity, and inverses are not required for multiplication.
* A **subring** is a subset of a ring that is also a ring.
* In a ring $R$, an **ideal** $I \lhd R$ is a non-empty subset that is closed and absorbing: $\begin{align*}\text{closed: }\, & \forall s,t\in I,&&s-t\in I\\ \text{absorbing: }\, & \forall r\in R, &&rI,Ir \subseteq I\end{align*}$
* For example, $2\mathbb{Z}\lhd\mathbb{Z}$ is an ideal.
* *Absorbing means that "nothing goes out of $I$ via multiplication."*
* *Ideals are to rings what normal subgroups are to groups.*
* Ideals are always subrings with the definition of rings we use in this course (i.e. not requiring a multiplicative identity).
### Ring Homomorphisms
* A **ring homomorphism** is a map between rings that commutes with the ring operations.
* **First Isomorphism Theorem**: for any ring homomorphism $\phi: R\to S$, $\begin{align*}&\ker\phi\lhd R,\\&\,\text{Im}\,\phi\text{ is a subring of}\ S,\\& R/\ker \phi\cong \text{Im}\, \phi.\end{align*}$
> Proof: the first two statements can be directly verified from the definitions.
> The isomorphism is given by $r+\ker\phi\mapsto\phi(r)$; verify that it is indeed a homomorphism and a bijection.
## The Polynomial Ring
* Given a field $\mathbb F$, $\mathbb F[x]$ denotes the **ring of polynomials** over it, i.e. the set of polynomials with coefficients taken from $\mathbb F$.
* *Note that $x$ is just a placeholder, and we might as well represent polynomials by a list of their coefficients.*
### Division Algorithm and Corollaries
* The **division algorithm** states that given polynomials $f(x), g(x)\in \mathbb F[x]$ with $g\ne 0$, there exist a quotient and a remainder, $q(x),r(x)\in \mathbb F[x]$, such that: $f=qg+r,\ \ \deg r<\deg g$ (with the convention $\deg 0=-\infty$). A computational sketch follows the proof below.
> Proof: induct on the degree of $f$; reduce the power in $h\equiv f-q_{1}g$ with a suitable $q_1$ that cancels out the highest power.
> Then the induction hypothesis gives $h=q_{2}g+\tilde r$, and substituting for $h$ gives the result.
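
*The computational sketch promised above, assuming `sympy` is available; the particular $f,g$ are arbitrary choices:*

```python
# Sketch: polynomial division over Q with sympy; div returns q, r with
# f = q*g + r and deg r < deg g.
from sympy import symbols, div, Poly

x = symbols('x')
f = Poly(x**3 + 2*x + 1, x)
g = Poly(x**2 - 1, x)

q, r = div(f, g)
assert f == q * g + r           # f = q*g + r
assert r.degree() < g.degree()  # deg r < deg g
print(q, r)                     # expect q = x, r = 3*x + 1
```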
* If $f\in \mathbb{F}[x],a\in \mathbb{F},$ then $\Big[f(a)=0\Big]\Rightarrow \Big[(x-a)|f\Big]$.
> Proof: apply the division algorithm to get $q,r:f=(x-a)q+r$.
> The remainder $r$ has degree less than $\deg(x-a)=1$, hence is constant; evaluating at $x=a$ gives $r=[(x-a)q(x)+r]_{x=a}=f(a)=0$.
* **Bezout's Lemma** for polynomials: given non-zero $a,b\in \mathbb{F}[x]$, and $c=\gcd(a,b),$ $\exists s,t\in \mathbb{F}[x]:\ as+bt=c$
> Proof: by dividing the equation by $c$, we only need to find $s,t:as+bt=1$ when $\gcd(a,b)=1$.
> Then apply the division algorithm to get $a=qb+r$; inducting on $\deg a+\deg b$, apply Bezout's lemma to $b,r$ and rearrange the terms; in the case $r=0$, $b$ must be constant to have $\gcd(a,b)=1$.
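
*A sketch of Bezout's lemma in action, assuming `sympy` (its `gcdex` runs the extended Euclidean algorithm on polynomials); the inputs are an arbitrary coprime pair:*

```python
# Sketch: extended Euclidean algorithm for polynomials; gcdex returns
# s, t, h with a*s + b*t = h = gcd(a, b).
from sympy import symbols, gcdex, simplify

x = symbols('x')
a = x**2 + 1
b = x + 1          # gcd(a, b) = 1 over Q

s, t, h = gcdex(a, b, x)
assert simplify(a*s + b*t - h) == 0   # the Bezout identity
print(s, t, h)                        # expect h = 1 since a, b are coprime
```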
* Every ideal of the polynomial ring, $I\lhd\mathbb F[x]$, is generated by one element; that is, $\exists k(x)\in \mathbb{F}[x]:I=k\mathbb{F}[x]$
> Proof: let $k$ be a non-zero polynomial of least degree in $I$; then every element of $I$ must be a multiple of it; otherwise, the division algorithm would give a non-zero remainder $r$.
> $r \in I$ by closure, and $\deg r<\deg k$; this is contrary to the choice of $k$.
### Matrix Polynomials
* Polynomials can be evaluated on matrices by replacing $x$ with a matrix $A$: $f(A)=a_kA^k+\cdots+a_1A+a_0I$, and this gives a ring homomorphism called the **evaluation map**: $E_A:\mathbb F[x]\to \mathbb{F}^{n \times n}, \, f \mapsto f(A)$
* **Eliminating polynomials** of $A \in \mathbb{F}^{n \times n}$ are polynomials that evaluate to $0$ at $A$.
* Every matrix $A\in\mathbb{F}^{n\times n}$ has an eliminating polynomial, as the infinite set $\{A,A^2,\cdots\}$ cannot be linearly independent in the finite-dimensional vector space of matrices.
* The **minimal polynomial** $m_A(x)$ of $A$ is a monic eliminating polynomial of the least degree.
* <u>The minimal polynomial divides all eliminating polynomials</u> (see the proof below).
* Hence <u>the minimal polynomial is unique</u>: any two minimal polynomials divide each other, so they must be equal because they are monic.
> Proof: the first isomorphism theorem guarantees that $\{ p:p(A)=0_{n\times n} \}=\ker E_A$ is an ideal of $\mathbb{F}[x]$. Hence it is generated by its lowest-degree monic element, namely the minimal polynomial, which therefore divides every eliminating polynomial.
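
*The existence argument above can be turned into an algorithm for $m_A$: stack the flattened powers of $A$ and look for the first linear dependence. A sketch assuming `sympy`; `minimal_polynomial` is a hypothetical helper, not a library call:*

```python
# Sketch: compute m_A by finding the first linear dependence among
# I, A, A^2, ... (flattened to vectors), then normalizing it to be monic.
from sympy import Matrix, Poly, symbols, eye

x = symbols('x')

def minimal_polynomial(A: Matrix) -> Poly:
    n = A.rows
    powers = [eye(n)]                  # I, A, A^2, ...
    while True:
        powers.append(powers[-1] * A)
        M = Matrix.hstack(*[P.reshape(n*n, 1) for P in powers])
        null = M.nullspace()
        if null:                       # first dependence found
            c = null[0] / null[0][-1]  # normalize: top coefficient 1 (monic)
            return Poly(sum(c[i] * x**i for i in range(len(c))), x)

A = Matrix([[2, 1], [0, 2]])
print(minimal_polynomial(A))           # expect (x - 2)^2 = x**2 - 4*x + 4
```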
### Eigenvalues and Polynomials
* The eigenvalues and minimal/characteristic polynomials of a linear map are those of its matrix (under any basis).
* Minimal and characteristic polynomials of linear map $T$ are invariant under basis changes, so they are well-defined.
> Proof: fix a change-of-basis matrix $P \in GL_{n}(\mathbb{F})$, where under the new basis, $T$ has matrix $C=P^{-1}AP$.
>
> <u>Minimal</u>: Any polynomial $f$ has $f(A)=0\iff f(C)=P^{-1}f(A)P=0$ by invertibility of $P$. Then $A$ and $C$ have the same eliminating, hence minimal, polynomials.
>
> <u>Characteristic</u>: note that $\det (P^{-1}AP)=\det (P^{-1})\det (A)\det (P)=\det (A)$, so the characteristic polynomials $\chi_{A}$ and ${\chi}_{C}$ are equal: $\chi_{A}=\det(A-\lambda I)=\det(P^{-1}(A-\lambda I)P)=\det(P^{-1}AP-\lambda I)={\chi}_{C}$
* Eigenvalues of $A$ are the same as roots of $m_A(x)$.
> Proof: ($\Rightarrow$) eigenvalue $\lambda$ satisfies $Av=\lambda v$ for some eigenvector $v\ne \bf 0$,
> so $m_{A}(\lambda)v=m_A(A)v=\bf 0$ by linearity; since $v \ne \mathbf{0}$, this forces $m_A(\lambda)=0$.
>
> ($\Leftarrow$) root $\lambda$ of $m_A(x)$ gives $m_A=(x-\lambda)g(x)$ for some $g:g(A)\ne0$ (by minimality of $m_A$),
> Take $w:g(A)w\ne \bf 0$, then $g(A)w$ is an eigenvector of $A$ with eigenvalue $\lambda$: $(A-\lambda I)(g(A)w)=m_{A}(A)w=\mathbf{0}$
## Quotient Spaces
* Given vector space $V$ and subspace $U\le V$, the **quotient space** is the vector space formed by the set of the cosets of $U$: $\begin{align*}& V / U \equiv\{v+U:v\in V\},\text{ where }v+U\equiv\{v+u:u\in U\}\\ \\&(v_1+U)+(v_2+U)\equiv(v_1+v_2)+U\\ &k(v+U)\equiv(kv)+U\end{align*}$
### Quotients and Direct Sums
*Big idea: the quotient space is isomorphic to a direct summand: if $W\oplus U=V$, then $V/U\cong W$.*
* The quotient space is "generated" by the basis vectors of $V$ that are not in $U$: take a basis $\mathcal E$ of $U$ and extend it to a basis $\mathcal F$ of $V$; then
$\bar{\mathcal B}\equiv \{b+U:b\in\mathcal F\setminus\mathcal E\}\text{ is a basis for } V/U$
* Conversely, a basis $\bar{\mathcal B}=\{b_k+U\}$ of $V/U$ can join a basis $\mathcal E\subset U$ to become a basis of $V$: $\{b_k\}\cup \mathcal E \text{ is a basis of }V$
* Hence $W=\langle\{b_k\}\rangle$ forms a direct sum with $U$: $\begin{align*}&W\oplus U=V\\ &W\cong V/U\end{align*}$
### First Isomorphism Theorem (for vector spaces)
* **First isomorphism theorem**: given a linear map $T:V\to W$, the following is a vector space isomorphism: $\begin{align*}\bar T:\ &V / \ker T\to \text{Im }T\\&v+\ker T \mapsto T(v)\end{align*}$
> Proof: the FIT for groups gives that $\bar T$ is an isomorphism of groups, and verifying that $\bar T$ is compatible with scalar multiplication completes the proof.
* The **rank-nullity theorem** is an immediate corollary: in a finite-dimensional vector space, $\bar T$ being an isomorphism guarantees that $\dim V-\dim\ker T=\dim(V/\ker T)=\dim\text{Im }T$
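
*A minimal numerical check of rank-nullity, assuming `sympy`; the matrix is an arbitrary example:*

```python
# Sketch: rank-nullity for the map Q^4 -> Q^3 given by T below.
from sympy import Matrix

T = Matrix([[1, 2, 0, 1],
            [0, 1, 1, 0],
            [1, 3, 1, 1]])            # third row = first + second, so rank 2

dim_V   = T.cols                      # dim of the domain
rank    = T.rank()                    # dim Im T
nullity = len(T.nullspace())          # dim ker T
assert dim_V - nullity == rank        # dim V - dim ker T = dim Im T
```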
### Maps between Quotients
* Given a linear map $T:V\to W$ and subspaces $A\le V, B\le W$, we can construct $\begin{align*}\bar T: \ &V/A\to W/B\\ & v+A\mapsto T(v)+B\end{align*}$
* $\bar{T}$ is linear and well-defined $\iff$ $T(A)\subseteq B$.
>Proof: ($\Rightarrow$) given well-defined $\bar T, a\in A$, since $\bar{T}$ is linear, $\bar T(a+A)=\bar T(0_{V/A})=0_{W/B}=B,$
>But by definition of $\bar{T}$, $\bar{T}(a+A)=T(a)+B$. So $T(a)\in B$.
>
>($\Leftarrow$) Suppose $T(A)\subseteq B$, and $v,w\in V$ represent the same coset $v+A=w+A \in V/A$, so $v-w \in A$. Then they are mapped to the same coset $T(v)+B=T(w)+B\in W/B$; this is because $T(v)-T(w)=T(v-w) \in T(A) \subseteq B$.
### Decomposing Maps with Quotients
* Outline: given the definitions in the section above, we can break down $T$ into $(1)\ T|_{A}$ mapping $A\to B$, and $(2)\ \bar{T}$ mapping $V/A\to W/B$. $\begin{align*}
&\,A \hphantom{^*}\xRightarrow{\,\,\,T|_{A}\,\,}B\\
(1)&\uparrow \hphantom{^*\xRightarrow{\,\,\,\,T\,\,\,\,}} \uparrow\\
&\,V \hphantom{^*}\xRightarrow{\,\,\,\,T\,\,\,\,}W\\
(2)&\downarrow \hphantom{^*\xRightarrow{\,\,\,\,T\,\,\,\,}} \downarrow\\ &\,A^{*}\hphantom{^*\xRightarrow{\,\,\,T\,\,\,\,}}B^{*} \\
&\cong \hphantom{^*\xRightarrow{\,\,\,\,\,\,\,\,}} \cong\\
V& / A \,\xRightarrow{\,\,\bar{T}\,\,\,\,} W / B\\
\end{align*}$
* Defining the bases:
> Get bases $\mathcal{B}_{A,B}$ of $A,B$.
> Extend them to bases $\mathcal{B}_{V,W}$ of $V,W$.
> $A^*$ has basis $\mathcal{B}_{A^*}=\mathcal{B}_{V}\setminus \mathcal{B}_{A}$, and similarly for $B^*$.
> The induced basis of $V / A$ is $\mathcal{\bar{B}}_{A^*}=\{b+A:b\in \mathcal{B}_{A^*}\}$, and similarly for $W / B$. This gives the isomorphisms $V / A \cong A^{*}, W / B \cong B^*$.
* $(1)$ The component of $T$ mapping $A\to B$:
> Define $T|_A:A\to B$ to be $T$ restricted to $A$, with matrix ${}_{\mathcal B_B}[T|_A]_{\mathcal B_A}$.
> As a map $T|_{A}:A \to W$, its matrix is ${}_{\mathcal{B}_{W}}[T|_{A}]_{\mathcal{B}_{A}}=\begin{pmatrix}
{}_{\mathcal B_B}[T|_A]_{\mathcal B_A} \\ 0
\end{pmatrix}$ where the lower half is $0$: those entries would be the components of $T|_{A}(\mathcal{B}_{A})=T(\mathcal{B}_{A}) \subseteq B$ along $\mathcal B_{B^*}$, and there are none.
- $(2)$ The component mapping $V/A\to W/B$:
> We define $\bar T$ as in the previous section, with the matrix $_{\mathcal{\bar{B}}_{B^{*}}}[\bar T]_{\mathcal {\bar{B}}_{A^{*}}}$.
> Under $\bar{T}$, each basis vector $(a^{*}+A) \in \bar{\mathcal{B}}_{A^{*}}$ is mapped to $(b^{*}+B)$ for some $b^{*} \in B^{*}$. By definition of $\bar{T}$, $T|_{A^*}(a^{*})$ must have the form $b^{*} +b$ for some $b \in B$.
> Hence $T|_{A^{*}}$ has the matrix $M_{T|_{A^{*}}}=\begin{pmatrix}
\text{$T|_{A^*}(\mathcal{B}_{A^{*}})$ in terms of }\mathcal{B}_{B} \\
\text{$T|_{A^*}(\mathcal{B}_{A^{*}})$ in terms of }\mathcal{B}_{B^{*}}
\end{pmatrix} = \begin{pmatrix} * \\
\ _{\mathcal{\bar{B}}_{B^{*}}}[\bar T]_{\mathcal {\bar{B}}_{A^{*}}}
\end{pmatrix}$
* The whole map put back together:
>The matrix of $T$ under the bases $\mathcal B_V,\mathcal B_W$ is the concatenation of the two components: $_{\mathcal B_W}[T]_{\mathcal B_V}=\begin{pmatrix}
[T|_{A}] & [T|_{A^{*}}]
\end{pmatrix}
=\begin{pmatrix}{}_{\mathcal B_B}[T|_A]_{\mathcal B_A} & \ast \\ 0 & _{\mathcal{\bar{B}}_{B^{*}}}[\bar T]_{\mathcal {\bar{B}}_{A^{*}}}
\end{pmatrix}$ which is block upper triangular.
## Triangular Forms and Cayley-Hamilton Theorem
### Invariant Subspaces
* Given $U\le V, T:V\to V$, then $U$ is called **$\mathbf{T}$-invariant** if $T(U)\subseteq U$.
* Furthermore, by linearity $U$ is also invariant under any polynomial of $T$.
* Hence $U$ induces the map $\bar{T}_{U}:V / U\to V / U;\,\,v+U\mapsto T(v)+U$.
* For $T: V\to V$, and $U\le V$ invariant under $T$, we have $\chi_{T}=\chi_{T|_U}\cdot\chi_{\bar{T}}$ (a computational check follows the proof).
> Proof: apply the result from the previous section to $T$ and $U$ to get a matrix $M_{T}=\begin{pmatrix}M_{T|_{U}} & \ast \\ 0 & M_{\bar{T}}\end{pmatrix}$. Evaluating the LHS as $\det(M_{T}-\lambda I)$ gives the equality.
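
*The computational check promised above, assuming `sympy`; the matrix $M$ and invariant subspace $U$ are arbitrary choices:*

```python
# Sketch: change to a basis whose first vector spans a T-invariant subspace;
# the matrix becomes block upper triangular, and chi factors accordingly.
from sympy import Matrix, symbols, factor

lam = symbols('lamda')
M = Matrix([[2, 1],
            [1, 2]])
u = Matrix([1, 1])                    # eigenvector for 3, so U = <u> is invariant
P = Matrix.hstack(u, Matrix([1, 0]))  # extend u to a basis of Q^2

print(P.inv() * M * P)                          # expect Matrix([[3, 1], [0, 1]])
print(factor((M - lam * Matrix.eye(2)).det()))  # expect (lamda - 1)*(lamda - 3)
```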
### Upper-Triangular Matrices
* Given $T: V\to V$, if the characteristic polynomial of $T$ factors into linear factors, there is a basis under which its matrix is upper-triangular.
> Proof: induct on the dimension of $V$; for $1\times 1$ matrices, the result is trivial.
>
> Inductive step: take any eigenvalue $\lambda$ of $T$ (one exists, since $\chi_{T}$ has a linear factor) and a corresponding eigenvector $v$; then the subspace $U=\langle v \rangle$ is invariant under $T$.
> So we can find a basis where $M_{T}=\begin{pmatrix}M_{T|_{U}}=(\lambda) & \ast \\ 0 & M_{\bar{T}}\end{pmatrix}$. Since $\chi_{T}=\chi_{T|_U}\cdot\chi_{\bar{T}}$, $\chi_{\bar{T}}$ must also be a product of linear factors, so the induction hypothesis makes the bottom right matrix upper-triangular, so the entire $M_{T}$ is upper-triangular. This completes the induction.
* For an upper-triangular matrix $A$ with diagonal elements $\lambda_{1},\dots ,\lambda_{n}$, its characteristic polynomial is eliminating; that is, $\chi_{A}(A)=\pm\prod^{n}_{i=1}(A-\lambda_{i} I)=0_{n \times n}$
> Proof: take a basis $\{v_1,\dots,v_{n}\}$ of $\mathbb{F}^n$, so $A$ represents a linear map $\mathbb{F}^{n}\to \mathbb{F}^n$ under that basis.
> Then by looking at the entries in $(A-\lambda_{k}I)$, note that the first $k$ columns are non-$0$ only in the first $(k-1)$ entries.
> Hence inductively, $\begin{align*}
\mathrm{Im}(A-\lambda_{n}I) &\subseteq \langle v_{1}, \dots, v_{n-1} \rangle\\
\mathrm{Im}[(A-\lambda_{n-1}I)(A-\lambda_{n}I)]& \subseteq\langle v_{1}, \dots, v_{n-2} \rangle\\
&\;\;\vdots\\
\mathrm{Im}[\underbrace{(A-\lambda_{1}I)\cdots (A-\lambda_{n-1}I)(A-\lambda_{n}I)}_{\pm\chi_{A}(A)}]&\subseteq\langle \, \rangle=\{ \mathbf{0} \}
\end{align*}$
> So $\chi_{A}(A)$ must be the zero matrix.
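
*The image-shrinking argument can be watched numerically; a sketch assuming `sympy`, with an arbitrary upper-triangular $A$:*

```python
# Sketch: for upper-triangular A, prod_i (A - lambda_i I) = 0, where the
# lambda_i are the diagonal entries.
from sympy import Matrix, eye

A = Matrix([[1, 2, 3],
            [0, 4, 5],
            [0, 0, 6]])
prod = eye(3)
for i in range(3):
    prod = prod * (A - A[i, i] * eye(3))   # multiply in (A - lambda_i I)
assert prod == Matrix.zeros(3, 3)          # chi_A(A) = 0 (up to sign)
```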
### Cayley-Hamilton Theorem
* **Cayley-Hamilton theorem**: <u>the characteristic polynomial is eliminating</u> for any linear map $T$ with matrix $A\in \mathbb{F}^{n \times n}$: $\chi_{T}(T)=\chi_{A}(A)=0$, and so $m_{A}|\chi_{A}$.
> Proof: we reduce $A$ to an upper triangular matrix, and use the uniqueness of the characteristic polynomial.
>
> In the algebraic closure $\mathbb{\bar{F}}$ of $\mathbb{F}$, we can always factor $\chi_{A}$ into linear factors, so $A$ can be made upper-triangular in $\mathbb{\bar{F}}^{n\times n}$; say there is $P\in GL_{n}(\mathbb{\bar{F}})$ such that $P^{-1}AP$ is upper-triangular.
>
> Then by the previous result, $\chi_{P^{-1}AP}(P^{-1}AP)=0$. Since the characteristic polynomial is invariant under different bases, this is *the* CP of $T$, and it eliminates $T$.
>
> *The fact that we worked in $\mathbb{\bar{F}}[x]$ does not change the polynomial; it must have been in $\mathbb{F}[x]$ in the first place. Think $(x-i)(x+i)$ and $x^2+1$: say we found the first as $\chi_{P^{-1}AP}$ in $\mathbb{C}[x]$, but it equals the second, which is in $\mathbb{R}[x]$.*
* Corollary: if the minimal polynomial is written in distinct, monic, and irreducible factors, then the characteristic polynomial must have the same factors (to an equal-or-higher power): $[m_{T}=f_{1}^{k_{1}}\dots f_{n}^{k_{n}}] \Longrightarrow [\chi_{T}=\pm f_{1}^{l_{1}}\dots f_{n}^{l_{n}}] \,\,\,\,(l_{i} \ge k_{i})$ and in particular, $\chi_{T}$ has a linear factorization $\iff$ $m_{T}$ has a linear factorization.
> Proof: the factors must be the same (with no additional ones) since the minimal and characteristic polynomials have the same roots in $\mathbb{\bar{F}}$.
> The $\pm$ comes from $\chi_{T}$ having $(-1)^{\dim V}$ as the leading coefficient.
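
*Cayley-Hamilton is easy to test on any concrete matrix, assuming `sympy`; the matrix is arbitrary, and Horner evaluation is just one way to compute $\chi_{A}(A)$:*

```python
# Sketch: substitute A into its own characteristic polynomial (via Horner
# evaluation on matrices) and check the result is the zero matrix.
from sympy import Matrix, symbols

x = symbols('x')
A = Matrix([[1, -2, 0],
            [3,  1, 4],
            [0,  2, 2]])

coeffs = A.charpoly(x).all_coeffs()   # leading coefficient first
result = Matrix.zeros(3, 3)
for c in coeffs:
    result = result * A + c * Matrix.eye(3)
assert result == Matrix.zeros(3, 3)   # chi_A(A) = 0
```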
## The Primary Decomposition Theorem
### Proof of the PDT
* Let $V$ be a finite-dimensional vector space with a linear map $T:V\to V$, and let $f \in \mathbb{F}[x]$ satisfy $f(T)=0$. If $f$ can be factorized as $f(x)=a(x)b(x),\,\gcd(a,b)=1$, then $V=\ker a(T)\,\oplus\,\ker b(T)$
> Proof:
> <u>Spanning</u>: Bezout's lemma for polynomials gives $s,t\in \mathbb{F}[x]: as+bt=1$.
> Write $A=a(T)$, $B=b(T)$, $S=s(T)$, $T'=t(T)$, $F=f(T)$; as polynomials in $T$, these all commute.
> Then $\forall v\in V:\ ASv+BT'v=v \ \ \ (*)$ Since $F=BA=AB=0$, we have $B(ASv)=(BA)(Sv)=0$, so $ASv\in \ker B$; similarly $BT'v\in \ker A$.
> Now $(*)$ is a decomposition of $v$ in terms of $\ker A +\ker B$.
>
> <u>Disjointness</u>: for any $v\in \ker A\cap \ker B$, commuting gives $ASv=SAv=\mathbf{0}$ and $BT'v=T'Bv=\mathbf{0}$, so the $\mathrm{LHS}$ of $(*)$ is $\mathbf{0}$, and $\mathrm{RHS}=v=\mathbf{0}$ as well.
* In particular, if $f=m_{T}$ and $a,b$ are monic, then $a=m_{T|_{\ker a(T)}}, \ b=m_{T|_{\ker b(T)}}$.
> Proof: $a$ and $b$ eliminate $T|_{\ker A}$ and $T|_{\ker B}$ respectively, so each is divisible by the corresponding minimal polynomial.
>
> Write $m_{1}=m_{T|_{\ker A}}, m_{2}=m_{T|_{\ker B}}$.
> Then at the same time, the product $m_{1}m_{2}$ eliminates $T$:
> for any $v\in V$ decomposed into $w_{1}+w_{2},w_{1}\in \ker A, w_{2}\in \ker B$, $\begin{align*}
[m_{1}(T)m_{2}(T)]v&=m_{2}(T)[m_{1}(T)w_{1}] + m_{1}(T)[m_{2}(T)w_{2}]\\
&=0+0 = 0
\end{align*}$ So $f=ab$, being the minimal polynomial of $T$, divides $m_{1}m_{2}$.
> Combined with $m_{1}|a$ and $m_{2}|b$, comparing degrees (and monicity) forces $a=m_{1}, b=m_{2}$.
* **Primary Decomposition Theorem**: given a linear map $T: V\to V$ and a decomposition of its minimal polynomial $m_{T}=f_{1}^{k_{1}}\dots f_{n}^{k_{n}},$ with the $f_{i}$ distinct, monic, and irreducible, then writing $W_{i}=\ker (f_{i}^{k_{i}}(T))$, we have (a computational check follows the proof):
* $V=W_{1}\,\oplus\,\cdots\oplus\, W_{n};$
* $W_i$ is $T$-invariant: $T(W_{i})\subset W_{i};$
* $m_{T|_{W_{i}}}=f_{i}^{k_{i}}.$
> Proof: induct on $n$ (the number of terms) using the previous two results.
>
> For invariance, take any $v \in W_{i}$. Since $T$ commutes with any polynomial of itself, $f_{i}^{k_{i}}(T)(Tv)=T(f_{{i} }^{k_{i}}(T)v)=T(0)=0$, so $Tv \in \ker(f_{i}^{k_{i}}(T))=W_{i}$.
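
*The computational check promised above, assuming `sympy`: for an $A$ with $m_{A}=(x-1)(x-2)$, the two kernels should form a direct sum equal to the whole space:*

```python
# Sketch: primary decomposition for A with m_A = (x - 1)(x - 2):
# ker(A - I) and ker(A - 2I) together span Q^2 and meet trivially.
from sympy import Matrix, eye

A = Matrix([[1, 1],
            [0, 2]])
W1 = (A - eye(2)).nullspace()        # ker(A - I)
W2 = (A - 2*eye(2)).nullspace()      # ker(A - 2I)

basis = Matrix.hstack(*(W1 + W2))    # union of the two kernel bases
assert basis.rank() == 2             # dims add up: the sum is direct and is V
```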
### Reducibility Corollaries of the PDT
* Using this direct sum $\bigoplus_{i}W_{i}$, take bases $B_{i}$ of the $W_{i}$ to define a basis $B=\cup_{i} B_{i}$ of $V$; then $T$ has a block-diagonal matrix under $B$: $_{B}[T]_{B}=\mathrm{diag}\big({}_{B_{i}}[T|_{W_{i}}]_{B_{i}}\big)$
* $m_{T}$ factorizes into linear, monic factors $f_{i}=(x-\lambda_{i})$, i.e. $m_{T}=\prod_{i} f_{i}^{k_{i}}$, $\iff$ the map $T$ is triangularizable.
> Proof: triangularizability $\iff$ linear factorization of $\chi_{T}$ $\iff$ linear factorization of $m_{T}$. For the second equivalence, see the corollary of Cayley-Hamilton.
* All the factors of $m_{T}$ are distinct and linear $\iff$ the map is diagonalizable.
> Proof:
> ($\Leftarrow$) if $\lambda_{1},\dots,\lambda_{r}$ ($r \le n$) are the distinct entries in the diagonal matrix representation of $T$, then $\prod_{i}(x-\lambda_{i})$ eliminates $T$, so $m_{T}$ divides it and has distinct linear factors.
>
> ($\Rightarrow$) if $f_{i}=(x-\lambda_{i})$ are all distinct, then the PDT gives a direct sum $V=\bigoplus_{i}E_{\lambda_{i}}$, where $E_{\lambda_{i}}=\ker (T-\lambda_{i} I)$ is the eigenspace of $\lambda_{i}$.
> Any basis $B_{i}\subset E_{\lambda_{i}}$ gives the diagonal block $[T|_{E_{\lambda_{i}}}]=\lambda_{i}I$, so their union $B=\cup_{i} B_{i}$ gives a diagonal ${}_{B}[T]_{B}$.
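
*A sanity check assuming `sympy`, whose `is_diagonalizable` agrees with the minimal-polynomial criterion; both matrices are arbitrary examples:*

```python
# Sketch: distinct linear factors of m_T <=> diagonalizable.
from sympy import Matrix

D = Matrix([[2, 0], [0, 3]])   # m_D = (x-2)(x-3): distinct linear factors
N = Matrix([[2, 1], [0, 2]])   # m_N = (x-2)^2: a repeated factor

assert D.is_diagonalizable()
assert not N.is_diagonalizable()
```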
## Jordan Normal Form
*I confess that I have no idea what the 2022 lecture notes are talking about. I used [the webnotes from Northwestern University](https://sites.math.northwestern.edu/~scanez/courses/334/notes/jordan-form.pdf).*
*If you are reading this solely for exam-prep, concepts like Jordan chains are NOT mentioned in the lecture notes, so don't quote them in the exams without giving a definition.*
*Big idea: JNFs describe linear transformations by showing what they do to their eigenspaces.*
### Nilpotent Maps
* A linear map $T:V\to V$ is **nilpotent** if $\exists n:T^n=0$, where $0$ is the zero map.
* All nilpotent maps have minimal polynomial $m_{T}(x)=x^m$ for some $m\le n$.
> Proof: $x^n$ eliminates the map, so the minimal polynomial must divide it.
* For nilpotent map $T$ with $m_{T}(x)=x^m$, there must be strict inclusions $\ker(T)\subsetneq\ker(T^{2})\subsetneq \cdots\subsetneq \ker(T^m)=V$
> Proof: if there is an equality $\ker T^{k}=\ker T^{k+1}$, then $\begin{align*}
T^{k+s}v = 0 &\iff T^{k+1}(T^{s-1}v)=0\\
&\iff T^{s-1}v\in \ker T^{k+1}=\ker T^k\\
&\iff T^{k+s-1}v=0\\
&\iff \cdots\\
&\iff T^k v=0,
\end{align*}$ but then taking $k+s=m$ gives $\ker T^k=V$ with $k<m$, so $x^{k}$ eliminates $T$, contradicting the minimality of $m_{T}(x)=x^m$.
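
*The strict chain can be seen directly on a small nilpotent example, assuming `sympy`:*

```python
# Sketch: for the nilpotent N below (N^3 = 0, m_N = x^3), the kernels of
# N, N^2, N^3 grow strictly until they fill the space.
from sympy import Matrix

N = Matrix([[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]])

dims = [len((N**k).nullspace()) for k in (1, 2, 3)]
assert dims == [1, 2, 3]       # ker N < ker N^2 < ker N^3 = V, strictly
```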
* If $T$ has eigenvalue $\lambda$, the **generalized eigenspace** of $\lambda$ is defined to be $F_{\lambda}=\{ v\in V :(T-\lambda I)^{k}v=0\,\,\text{ for some }k \ge 1\}$. Also, a non-zero $v\in F_{\lambda}$ is called a **generalized eigenvector**. For contrast, "ordinary eigenvectors/values" refer to the usual $v,\lambda: \lambda v=Tv$.
### Jordan Blocks and Jordan Normal Forms
* A **Jordan block** of size $k$ for (eigenvalue) $\lambda$ is the $k \times k$ matrix of the form $J_{k}(\lambda)=
\begin{pmatrix}
\lambda & 1 & & & \\
& \lambda & 1 & & \\
&& \ddots & \ddots & \\
&&& \lambda & 1 \\
&&&&\lambda
\end{pmatrix}_{k \times k}$
* The **Jordan Normal Form** (JNF) of a linear map $T$ is a matrix representation of it under some basis $\mathcal{B}$, which is block diagonal, with each block being a Jordan block: $_{\mathcal B}[T]_{\mathcal B}=\begin{pmatrix}J_{k_{1}}(\lambda_{1}) & &0 \\ & \ddots \\
0&&J_{k_{s}}(\lambda_{s})
\end{pmatrix}$where $\lambda_{1 \dots s}$ are the (not necessarily distinct) eigenvalues of $T$.
* Goal: a map $T$ can always be put into JNF, if its minimal polynomial can be factorized into linear factors (not necessarily distinct).
* In particular, $T$ on a complex vector space always has a JNF, since every polynomial factors into linear factors over $\mathbb{C}$.
### Proof: JNF on Generalized Eigenspaces
*Big idea: $T$ restricted to a single generalized eigenspace can be represented with a JNF. We would then glue them together into a larger JNF that represents the unrestricted map $T$.*
* For a map $T$ with eigenvalue $\lambda$, a generalized eigenvector $v\in F_{\lambda}$, and $S=T-\lambda I$, taking $k=\min\{j:S^{j}v=0\}$ gives the **Jordan chain** $c(v)=\{S^{k-1}v,\dots,S^{2}v,Sv,v\}$ and we will treat it as an ordered set when used as a basis.
* A Jordan chain $c(v)$ is linearly independent and hence is a basis of $C(v)=\langle c(v) \rangle$.
> Proof: if $\sum_{i=0}^{k-1}a_{i}S^{i}v=0$, then left-applying $S^{k-1}$ gives $a_{0}S^{k-1}v=0$, so $a_{0}=0$.
> Again, left-applying $S^{k-2}$ now gives $a_{1}=0$, etc. So all of $a_{i}$ must be $0$, and the chain is linearly independent.
* The generalized eigenspace $F_{\lambda}$ has a basis $B_{\lambda}=\cup_{i}c(v_{i})$, where $c(v_{i})$ are Jordan chains in $F_{\lambda}$.
> Proof: take the longest Jordan chain $c(v_{1})$ in $F_{\lambda}$, and if its span does not cover the entire generalized eigenspace, i.e. $C(v_{1})=\langle c(v_{1}) \rangle \ne F_{\lambda}$, use more chains to span a complement $O^{*}$ with $F_{\lambda}=C(v_{1})\,\oplus O^*$.
> Repeat the process for $O^*$, finding the longest Jordan chain $c(v_{2})$ in it, taking complements, etc.
>
> Eventually the chains will exhaust the dimensions of $F_{\lambda}$ and give a basis of it, namely $B_{\lambda}=\cup_{i}\ c(v_{i})$. It is indeed a basis, as $B_{\lambda}$ is linearly independent because each chain is linearly independent within, and different chains are from subspaces that form a direct sum.
* $T|_{C(v)}$ under the basis $c(v)$ is represented by the Jordan block $J_{k}(\lambda)$.
> Proof: note that the $i$th column represents the image of the $i$th basis vector, which is $S^{k-i}v$.
> Since it is mapped to $\lambda S^{k-i} v+S^{k-i+1}v$, the column is $(\dots,0,1,\underset{i\text{th}}{\lambda},0,\dots)^t$.
> Of course, if $i=1$, the image is just $\lambda S^{k-1}v$ (as $S^{k}v=0$), and the column is $(\lambda,0,\dots,0)^{t}$.
> Either way, this is also the $i$th column of the Jordan block $J_{k}(\lambda)$.
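
*A concrete Jordan chain computation, assuming `sympy`; the matrix $T$ is an arbitrary example with $\chi_{T}=(x-2)^{2}$:*

```python
# Sketch: build the chain basis {S v, v} for S = T - 2I and check that T,
# written in that basis, is the Jordan block J_2(2).
from sympy import Matrix, eye

T = Matrix([[ 3, 1],
            [-1, 1]])              # chi_T = (x - 2)^2
S = T - 2*eye(2)                   # nilpotent: S^2 = 0
v = Matrix([1, 0])                 # generalized eigenvector: S*v != 0

P = Matrix.hstack(S*v, v)          # ordered chain basis {S v, v}
assert P.inv() * T * P == Matrix([[2, 1],
                                  [0, 2]])
```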
- Hence in general, $T|_{F_{\lambda}}$ can be made into a JNF of Jordan blocks $J_{k_{i}}(\lambda)$.
> Proof: apply the previous result to the ordered set-basis $B_{\lambda}=\cup_{i}c(v_{i})$ of $F_{\lambda}$.
### Proof: JNF on the Entire Vector Space
*Big idea: we decompose the vector space into generalized eigenspaces, where the map has a JNF, and piece them together.*
* Say the minimal polynomial is factorized into $m_{A}(x)=(x-\lambda _{1})^{k_{1}}\dots(x-\lambda_{r})^{k_{r}}$, where $\lambda_{1 \dots r}$ are distinct eigenvalues.
* Since we assume a linear factorization of the minimal polynomial, the Primary Decomposition Theorem guarantees the block-diagonal form $\begin{align*}
_{B}[T]_{B}=\underset{i=1,\dots,r}{\mathrm{diag}}\big({}_{B_{\lambda_{i}}}[T|_{F_{\lambda_{i}}}]_{B_{\lambda_{i}}}\big) & &(*)
\end{align*}$ for some to-be-determined basis $B_{\lambda_{i}}$ of $F_{\lambda_{i}}=\ker(A-\lambda_{i} I)^{k_{i}}$.
* The previous section gives a basis $B_{\lambda_{i}}$ composed of Jordan chains, which makes the $i$th block a JNF. Now under their union $B=\cup_{i=1}^{r}B_{\lambda_{i}}$, all the diagonal blocks are Jordan blocks, hence the map is made into the JNF $_{B}[T]_{B}=\begin{pmatrix}J_{k_{1}}(\lambda_{1}) & &0 \\ & \ddots \\
0&&J_{k_{s}}(\lambda_{s})
\end{pmatrix}$ (where the $\lambda_{j}$ need not be distinct, as one eigenvalue can contribute several chains), and hence the JNF always exists, given a linear factorization of the minimal polynomial.
### Solving the Jordan Basis and JNF
* The **Jordan basis** of a map is the basis under which the map is represented by its JNF. As seen earlier, it is composed of Jordan chains.
* <u>The number of Jordan blocks of eigenvalue</u> $\lambda$ <u>equals its geometric multiplicity,</u> i.e. the dimension of its ordinary eigenspace.
* For example, if $A$ has characteristic $\chi_{A}(x)=(x-1)^4$, and the eigenvalue $1$ has geometric multiplicity $2$, then there must be two Jordan blocks of diagonal $1$ in the JNF, so $\mathrm{JNF}(A)=\left(\begin{array}{c|c} \begin{array}{c c}1 & 1 \\ & 1\end{array} \\ \hline & \begin{array}{c c}1 & 1 \\ & 1\end{array} \end{array}\right)\text{ or } \left(\begin{array}{c|c} 1 \\ \hline & \begin{array}{c c}1 & 1 & \\ & 1 & 1 \\
& & 1\end{array} \end{array}\right)$
> Proof: let $J$ be the JNF; then $J-\lambda I$ has one all-$0$ row for every Jordan block of eigenvalue $\lambda$ (the last row of that block), and the dimension of $\ker (J-\lambda I)$ is exactly the number of such blocks.
* In the minimal polynomial, the degree $k$ of $(x-\lambda)^k$ equals the size of the largest Jordan block with diagonal $\lambda$ in the JNF.
* For example, if the same $A$ has a minimal polynomial of $m_{A}(x)=(x-1)^2$, then its largest Jordan block is $(2\times2)$. Hence it can't have $(1\times 1)+(3\times 3)$ blocks in the JNF, which must be $\mathrm{JNF}(A)=\left(\begin{array}{c|c} \begin{array}{c c}1 & 1 \\ & 1\end{array} \\ \hline & \begin{array}{c c}1 & 1 \\ & 1\end{array} \end{array}\right)$
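
*The example above can be realized concretely, assuming `sympy` (whose `jordan_form` returns both the Jordan basis and the JNF); the change-of-basis matrix $Q$ is an arbitrary invertible choice:*

```python
# Sketch: a matrix with chi = (x-1)^4, geometric multiplicity 2 and
# m = (x-1)^2 must have two 2x2 Jordan blocks.
from sympy import Matrix, eye

J = Matrix([[1, 1, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 1],
            [0, 0, 0, 1]])         # two J_2(1) blocks
Q = Matrix([[1, 0, 1, 0],          # invertible (det = 2): a change of basis
            [0, 1, 0, 1],
            [0, 0, 1, 1],
            [1, 0, 0, 1]])
A = Q * J * Q.inv()                # hide J behind the change of basis

assert len((A - eye(4)).nullspace()) == 2   # geometric multiplicity of 1 is 2
P, JNF = A.jordan_form()
assert JNF == J                    # sympy recovers the two 2x2 blocks
```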
* Example: solve for the Jordan basis of $A=\begin{pmatrix}1 & -1 & 0 \\-1 & 4 & -1 \\-4 & 13 & -3\end{pmatrix}$.
> The characteristic polynomial is $\chi_{A}(x)=-x(x-1)^2$, giving eigenvalues $0,1$. Since basis changes preserve the characteristic, this is also the characteristic of the JNF matrix.
>
> So we will be looking for a JNF with one $0$ and two $1$'s on the diagonal.
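
*As a check on this example, `sympy`'s `jordan_form` finishes the computation, returning a Jordan basis as the columns of $P$:*

```python
# Sketch: compute the Jordan basis and JNF of the example matrix directly.
from sympy import Matrix

A = Matrix([[ 1, -1,  0],
            [-1,  4, -1],
            [-4, 13, -3]])

P, J = A.jordan_form()
print(J)                     # expect a 1x1 block for 0 and a J_2(1) block
assert A == P * J * P.inv()  # P's columns form the Jordan basis
```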