## Vector Spaces and Rings
* A **field** is a set $\mathbb F$ equipped with two binary operations $+$ and $\times$ where:
* $(\mathbb F,+,0)$ and $(\mathbb F\setminus \{0\},\times,1)$ form two Abelian groups;
* $\times$ distributes over $+$.
* The **characteristic** of a field is $\min \{p:\underbrace{1+\cdots+1}_{p\text{ times}}=0\}$, if such a $p$ exists, and $0$ otherwise.
* In $\mathbb{Z}_p,\,p$ prime, the characteristic is $p$.
* In $\mathbb Q$, the characteristic is $0$.
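
*A quick illustration (not from the notes): the characteristic of $\mathbb Z_n$ can be found by brute force. The helper below is hypothetical, in plain Python:*

```python
# Sketch: find the characteristic of Z_n by adding 1 to itself until the
# running sum wraps around to 0 (mod n). Hypothetical helper, plain Python.
def characteristic_of_Zn(n: int) -> int:
    total, count = 0, 0
    while True:
        total = (total + 1) % n
        count += 1
        if total == 0:
            return count

assert characteristic_of_Zn(7) == 7   # Z_7 (p prime) has characteristic 7
assert characteristic_of_Zn(2) == 2   # Z_2 has characteristic 2
```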
* $\text{Hom}(V,W)$ is the vector space of linear maps from $V$ to $W$.
* A linear map $\phi \in\mathrm{Hom}(V,W)$ is a **(vector space) isomorphism** if it is bijective.
### Rings
* A **ring** is a set $R$ equipped with two binary operations $+$ and $\times$, where $(R,+,0)$ is an Abelian group, and $\times$ is associative and distributive.
* Commutativity, identity, and inverses are not required for multiplication.
* A **subring** is a subset of a ring that is also a ring.
* In a ring $R$, an **ideal** $I \lhd R$ is a non-empty subset that is closed and absorbing: $\begin{align*}\text{closed: }\, & \forall s,t\in I,&&s-t\in I\\ \text{absorbing: }\, & \forall r\in R, &&rI,Ir \subseteq I\end{align*}$
* For example, $2\mathbb{Z}\lhd\mathbb{Z}$ is an ideal.
* *Absorbing means that "nothing goes out of $I$ via multiplication."*
* *Ideals are to rings what normal subgroups are to groups.*
* Ideals are always subrings with the definition of rings we use in this course (i.e. not requiring a multiplicative identity).
### Ring Homomorphisms
* A **ring homomorphism** is a map between rings that commutes with the ring operations.
* **First Isomorphism Theorem**: for any ring homomorphism $\phi: R\to S$, $\begin{align*}&\ker\phi\lhd R,\\&\,\text{Im}\,\phi\text{ is a subring of}\ S,\\& R/\ker \phi\cong \text{Im}\, \phi.\end{align*}$
> Proof: the first two statements can be directly verified from the definitions.
> The isomorphism is given by $r+\ker\phi\mapsto\phi(r)$; verify that it is indeed a homomorphism and a bijection.
## The Polynomial Ring
* Given a field $\mathbb F$, $\mathbb F[x]$ denotes the **ring of polynomials** over it, i.e. the set of polynomials with coefficients taken from $\mathbb F$.
* *Note that $x$ is just a placeholder, and we might as well represent polynomials by a list of their coefficients.*
### Division Algorithm and Corollaries
* The **division algorithm** states that given polynomials $f(x), g(x)\in \mathbb F[x]$ with $g\ne 0$, there exist a quotient and a remainder, $q(x),r(x)\in \mathbb F[x]$, such that: $f=qg+r,\ \ \deg r<\deg g$ (with the convention $\deg 0=-\infty$). A computational sketch follows the proof below.
> Proof: induct on the degree of $f$; reduce the power in $h\equiv f-q_{1}g$ with a suitable $q_1$ that cancels out the highest power.
> Then the induction hypothesis gives $h=q_{2}g+\tilde r$, and substituting for $h$ gives the result.
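
*The computational sketch promised above, assuming `sympy` is available; the particular $f,g$ are arbitrary choices:*

```python
# Sketch: polynomial division over Q with sympy; div returns q, r with
# f = q*g + r and deg r < deg g.
from sympy import symbols, div, Poly

x = symbols('x')
f = Poly(x**3 + 2*x + 1, x)
g = Poly(x**2 - 1, x)

q, r = div(f, g)
assert f == q * g + r           # f = q*g + r
assert r.degree() < g.degree()  # deg r < deg g
print(q, r)                     # expect q = x, r = 3*x + 1
```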
* If $f\in \mathbb{F}[x],a\in \mathbb{F},$ then $\Big[f(a)=0\Big]\Rightarrow \Big[(x-a)|f\Big]$.
> Proof: apply the division algorithm to get $q,r:f=(x-a)q+r$.
> The remainder $r$ has degree less than $\deg(x-a)=1$, hence is constant; evaluating at $x=a$ gives $r=[(x-a)q(x)+r]_{x=a}=f(a)=0$.
* **Bezout's Lemma** for polynomials: given non-zero $a,b\in \mathbb{F}[x]$, and $c=\gcd(a,b),$ $\exists s,t\in \mathbb{F}[x]:\ as+bt=c$
> Proof: by dividing the equation by $c$, we only need to find $s,t:as+bt=1$ when $\gcd(a,b)=1$.
> Then apply the division algorithm to get $a=qb+r$; inducting on $\deg a+\deg b$, apply Bezout's lemma to $b,r$ and rearrange the terms; in the case $r=0$, $b$ must be constant to have $\gcd(a,b)=1$.
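
*A sketch of Bezout's lemma in action, assuming `sympy` (its `gcdex` runs the extended Euclidean algorithm on polynomials); the inputs are an arbitrary coprime pair:*

```python
# Sketch: extended Euclidean algorithm for polynomials; gcdex returns
# s, t, h with a*s + b*t = h = gcd(a, b).
from sympy import symbols, gcdex, simplify

x = symbols('x')
a = x**2 + 1
b = x + 1          # gcd(a, b) = 1 over Q

s, t, h = gcdex(a, b, x)
assert simplify(a*s + b*t - h) == 0   # the Bezout identity
print(s, t, h)                        # expect h = 1 since a, b are coprime
```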
* Every ideal of the polynomial ring, $I\lhd\mathbb F[x]$, is generated by one element; that is, $\exists k(x)\in \mathbb{F}[x]:I=k\mathbb{F}[x]$
> Proof: let $k$ be a non-zero polynomial of least degree in $I$; then every element of $I$ must be a multiple of it; otherwise, the division algorithm would give a non-zero remainder $r$.
> $r \in I$ by closure, and $\deg r<\deg k$; this is contrary to the choice of $k$.
### Matrix Polynomials
* Polynomials can be evaluated on matrices by replacing $x$ with a matrix $A$: $f(A)=a_kA^k+\cdots+a_1A+a_0I$, and this gives a ring homomorphism called the **evaluation map**: $E_A:\mathbb F[x]\to \mathbb{F}^{n \times n}, \, f \mapsto f(A)$
* **Eliminating polynomials** of $A \in \mathbb{F}^{n \times n}$ are polynomials that evaluate to $0$ at $A$.
* Every matrix $A\in\mathbb{F}^{n\times n}$ has an eliminating polynomial, as the infinite set $\{A,A^2,\cdots\}$ cannot be linearly independent in the finite-dimensional vector space of matrices.
* The **minimal polynomial** $m_A(x)$ of $A$ is a monic eliminating polynomial of the least degree.
* <u>The minimal polynomial divides all eliminating polynomials</u> (see the proof below).
* Hence <u>the minimal polynomial is unique</u>: any two minimal polynomials divide each other, so they must be equal because they are monic.
> Proof: the first isomorphism theorem guarantees that $\{ p:p(A)=0_{n\times n} \}=\ker E_A$ is an ideal of $\mathbb{F}[x]$. Hence it is generated by its lowest-degree monic element, namely the minimal polynomial, which therefore divides every eliminating polynomial.
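
*The existence argument above can be turned into an algorithm for $m_A$: stack the flattened powers of $A$ and look for the first linear dependence. A sketch assuming `sympy`; `minimal_polynomial` is a hypothetical helper, not a library call:*

```python
# Sketch: compute m_A by finding the first linear dependence among
# I, A, A^2, ... (flattened to vectors), then normalizing it to be monic.
from sympy import Matrix, Poly, symbols, eye

x = symbols('x')

def minimal_polynomial(A: Matrix) -> Poly:
    n = A.rows
    powers = [eye(n)]                  # I, A, A^2, ...
    while True:
        powers.append(powers[-1] * A)
        M = Matrix.hstack(*[P.reshape(n*n, 1) for P in powers])
        null = M.nullspace()
        if null:                       # first dependence found
            c = null[0] / null[0][-1]  # normalize: top coefficient 1 (monic)
            return Poly(sum(c[i] * x**i for i in range(len(c))), x)

A = Matrix([[2, 1], [0, 2]])
print(minimal_polynomial(A))           # expect (x - 2)^2 = x**2 - 4*x + 4
```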
### Eigenvalues and Polynomials
* The eigenvalues and minimal/characteristic polynomials of a linear map are those of its matrix (under any basis).
* Minimal and characteristic polynomials of linear map $T$ are invariant under basis changes, so they are well-defined.
> Proof: fix a change-of-basis matrix $P \in GL_{n}(\mathbb{F})$, where under the new basis, $T$ has matrix $C=P^{-1}AP$.
>
> <u>Minimal</u>: Any polynomial $f$ has $f(A)=0\iff f(C)=P^{-1}f(A)P=0$ by invertibility of $P$. Then $A$ and $C$ have the same eliminating, hence minimal, polynomials.
>
> <u>Characteristic</u>: note that $\det (P^{-1}AP)=\det (P^{-1})\det (A)\det (P)=\det (A)$, so the characteristic polynomials $\chi_{A}$ and ${\chi}_{C}$ are equal: $\chi_{A}=\det(A-\lambda I)=\det(P^{-1}(A-\lambda I)P)=\det(P^{-1}AP-\lambda I)={\chi}_{C}$
* Eigenvalues of $A$ are the same as roots of $m_A(x)$.
> Proof: ($\Rightarrow$) eigenvalue $\lambda$ satisfies $Av=\lambda v$ for some eigenvector $v\ne \bf 0$,
> so $m_{A}(\lambda)v=m_A(A)v=\bf 0$ by linearity; since $v \ne \mathbf{0}$, this forces $m_A(\lambda)=0$.
>
> ($\Leftarrow$) root $\lambda$ of $m_A(x)$ gives $m_A=(x-\lambda)g(x)$ for some $g:g(A)\ne0$ (by minimality of $m_A$),
> Take $w:g(A)w\ne \bf 0$, then $g(A)w$ is an eigenvector of $A$ with eigenvalue $\lambda$: $(A-\lambda I)(g(A)w)=m_{A}(A)w=\mathbf{0}$
## Quotient Spaces
* Given vector space $V$ and subspace $U\le V$, the **quotient space** is the vector space formed by the set of the cosets of $U$: $\begin{align*}& V / U \equiv\{v+U:v\in V\},\text{ where }v+U\equiv\{v+u:u\in U\}\\ \\&(v_1+U)+(v_2+U)\equiv(v_1+v_2)+U\\ &k(v+U)\equiv(kv)+U\end{align*}$
### Quotients and Direct Sums
*Big idea: the quotient space is isomorphic to a direct summand: if $W\oplus U=V$, then $V/U\cong W$.*
* The quotient space is "generated" by the basis vectors of $V$ that are not in $U$: take a basis $\mathcal E$ of $U$ and extend it to a basis $\mathcal F$ of $V$; then
$\bar{\mathcal B}\equiv \{b+U:b\in\mathcal F\setminus\mathcal E\}\text{ is a basis for } V/U$
* Conversely, a basis $\bar{\mathcal B}=\{b_k+U\}$ of $V/U$ can join a basis $\mathcal E\subset U$ to become a basis of $V$: $\{b_k\}\cup \mathcal E \text{ is a basis of }V$
* Hence $W=\langle\{b_k\}\rangle$ forms a direct sum with $U$: $\begin{align*}&W\oplus U=V\\ &W\cong V/U\end{align*}$
### First Isomorphism Theorem (for vector spaces)
* **First isomorphism theorem**: given a linear map $T:V\to W$, the following is a vector space isomorphism: $\begin{align*}\bar T:\ &V / \ker T\to \text{Im }T\\&v+\ker T \mapsto T(v)\end{align*}$
> Proof: the FIT for groups gives that $\bar T$ is an isomorphism of groups, and verifying that $\bar T$ is compatible with scalar multiplication completes the proof.
* The **rank-nullity theorem** is an immediate corollary: in a finite-dimensional vector space, $\bar T$ being an isomorphism guarantees that $\dim V-\dim\ker T=\dim(V/\ker T)=\dim\text{Im }T$
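
*A minimal numerical check of rank-nullity, assuming `sympy`; the matrix is an arbitrary example:*

```python
# Sketch: rank-nullity for the map Q^4 -> Q^3 given by T below.
from sympy import Matrix

T = Matrix([[1, 2, 0, 1],
            [0, 1, 1, 0],
            [1, 3, 1, 1]])            # third row = first + second, so rank 2

dim_V   = T.cols                      # dim of the domain
rank    = T.rank()                    # dim Im T
nullity = len(T.nullspace())          # dim ker T
assert dim_V - nullity == rank        # dim V - dim ker T = dim Im T
```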
### Maps between Quotients
* Given a linear map $T:V\to W$ and subspaces $A\le V, B\le W$, we can construct $\begin{align*}\bar T: \ &V/A\to W/B\\ & v+A\mapsto T(v)+B\end{align*}$
* $\bar{T}$ is linear and well-defined $\iff$ $T(A)\subseteq B$.
>Proof: ($\Rightarrow$) given well-defined $\bar T, a\in A$, since $\bar{T}$ is linear, $\bar T(a+A)=\bar T(0_{V/A})=0_{W/B}=B,$
>But by definition of $\bar{T}$, $\bar{T}(a+A)=T(a)+B$. So $T(a)\in B$.
>
>($\Leftarrow$) Suppose $T(A)\subseteq B$, and $v,w\in V$ represent the same coset $v+A=w+A \in V/A$, so $v-w \in A$. Then they are mapped to the same coset $T(v)+B=T(w)+B\in W/B$; this is because $T(v)-T(w)=T(v-w) \in T(A) \subseteq B$.
### Decomposing Maps with Quotients
* Outline: given the definitions in the section above, we can break down $T$ into $(1)\ T|_{A}$ mapping $A\to B$, and $(2)\ \bar{T}$ mapping $V/A\to W/B$. $\begin{align*}
&\,A \hphantom{^*}\xRightarrow{\,\,\,T|_{A}\,\,}B\\
(1)&\uparrow \hphantom{^*\xRightarrow{\,\,\,\,T\,\,\,\,}} \uparrow\\
&\,V \hphantom{^*}\xRightarrow{\,\,\,\,T\,\,\,\,}W\\
(2)&\downarrow \hphantom{^*\xRightarrow{\,\,\,\,T\,\,\,\,}} \downarrow\\ &\,A^{*}\hphantom{^*\xRightarrow{\,\,\,T\,\,\,\,}}B^{*} \\
&\cong \hphantom{^*\xRightarrow{\,\,\,\,\,\,\,\,}} \cong\\
V& / A \,\xRightarrow{\,\,\bar{T}\,\,\,\,} W / B\\
\end{align*}$
* Defining the bases:
> Get bases $\mathcal{B}_{A,B}$ of $A,B$.
> Extend them to bases $\mathcal{B}_{V,W}$ of $V,W$.
> $A^*$ has basis $\mathcal{B}_{A^*}=\mathcal{B}_{V}\setminus \mathcal{B}_{A}$, and similarly for $B^*$.
> The induced basis of $V / A$ is $\mathcal{\bar{B}}_{A^*}=\{b+A:b\in \mathcal{B}_{A^*}\}$, and similarly for $W / B$. This gives the isomorphisms $V / A \cong A^{*}, W / B \cong B^*$.
* $(1)$ The component of $T$ mapping $A\to B$:
> Define $T|_A:A\to B$ to be $T$ restricted to $A$, with matrix ${}_{\mathcal B_B}[T|_A]_{\mathcal B_A}$.
> As a map $T|_{A}:A \to W$, its matrix is ${}_{\mathcal{B}_{W}}[T|_{A}]_{\mathcal{B}_{A}}=\begin{pmatrix}
{}_{\mathcal B_B}[T|_A]_{\mathcal B_A} \\ 0
\end{pmatrix}$ where the lower half is $0$: those entries would be the components of $T|_{A}(\mathcal{B}_{A})=T(\mathcal{B}_{A}) \subseteq B$ along $\mathcal B_{B^*}$, and there are none.
- $(2)$ The component mapping $V/A\to W/B$:
> We define $\bar T$ as in the previous section, with the matrix $_{\mathcal{\bar{B}}_{B^{*}}}[\bar T]_{\mathcal {\bar{B}}_{A^{*}}}$.
> Under $\bar{T}$, each basis vector $(a^{*}+A) \in \bar{\mathcal{B}}_{A^{*}}$ is mapped to $(b^{*}+B)$ for some $b^{*} \in B^{*}$. By definition of $\bar{T}$, $T|_{A^*}(a^{*})$ must have the form $b^{*} +b$ for some $b \in B$.
> Hence $T|_{A^{*}}$ has the matrix $M_{T|_{A^{*}}}=\begin{pmatrix}
\text{$T|_{A^*}(\mathcal{B}_{A^{*}})$ in terms of }\mathcal{B}_{B} \\
\text{$T|_{A^*}(\mathcal{B}_{A^{*}})$ in terms of }\mathcal{B}_{B^{*}}
\end{pmatrix} = \begin{pmatrix} * \\
\ _{\mathcal{\bar{B}}_{B^{*}}}[\bar T]_{\mathcal {\bar{B}}_{A^{*}}}
\end{pmatrix}$
* The whole map put back together:
>The matrix of $T$ under the bases $\mathcal B_V,\mathcal B_W$ is the concatenation of the two components: $_{\mathcal B_W}[T]_{\mathcal B_V}=\begin{pmatrix}
[T|_{A}] & [T|_{A^{*}}]
\end{pmatrix}
=\begin{pmatrix}{}_{\mathcal B_B}[T|_A]_{\mathcal B_A} & \ast \\ 0 & _{\mathcal{\bar{B}}_{B^{*}}}[\bar T]_{\mathcal {\bar{B}}_{A^{*}}}
\end{pmatrix}$ which is block upper triangular.
## Triangular Forms and Cayley-Hamilton Theorem
### Invariant Subspaces
* Given $U\le V, T:V\to V$, then $U$ is called **$\mathbf{T}$-invariant** if $T(U)\subseteq U$.
* Furthermore, by linearity $U$ is also invariant under any polynomial of $T$.
* Hence $U$ induces the map $\bar{T}_{U}:V / U\to V / U;\,\,v+U\mapsto T(v)+U$.
* For $T: V\to V$, and $U\le V$ invariant under $T$, we have $\chi_{T}=\chi_{T|_U}\cdot\chi_{\bar{T}}$ (a computational check follows the proof).
> Proof: apply the result from the previous section to $T$ and $U$ to get a matrix $M_{T}=\begin{pmatrix}M_{T|_{U}} & \ast \\ 0 & M_{\bar{T}}\end{pmatrix}$. Evaluating the LHS as $\det(M_{T}-\lambda I)$ gives the equality.
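
*The computational check promised above, assuming `sympy`; the matrix $M$ and invariant subspace $U$ are arbitrary choices:*

```python
# Sketch: change to a basis whose first vector spans a T-invariant subspace;
# the matrix becomes block upper triangular, and chi factors accordingly.
from sympy import Matrix, symbols, factor

lam = symbols('lamda')
M = Matrix([[2, 1],
            [1, 2]])
u = Matrix([1, 1])                    # eigenvector for 3, so U = <u> is invariant
P = Matrix.hstack(u, Matrix([1, 0]))  # extend u to a basis of Q^2

print(P.inv() * M * P)                          # expect Matrix([[3, 1], [0, 1]])
print(factor((M - lam * Matrix.eye(2)).det()))  # expect (lamda - 1)*(lamda - 3)
```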
### Upper-Triangular Matrices
* Given $T: V\to V$, if the characteristic polynomial of $T$ factors into linear factors, there is a basis under which its matrix is upper-triangular.
> Proof: induct on the dimension of $V$; for $1\times 1$ matrices, the result is trivial.
>
> Inductive step: take any eigenvalue $\lambda$ of $T$ (one exists, since $\chi_{T}$ has a linear factor) and a corresponding eigenvector $v$; then the subspace $U=\langle v \rangle$ is invariant under $T$.
> So we can find a basis where $M_{T}=\begin{pmatrix}M_{T|_{U}}=(\lambda) & \ast \\ 0 & M_{\bar{T}}\end{pmatrix}$. Since $\chi_{T}=\chi_{T|_U}\cdot\chi_{\bar{T}}$, $\chi_{\bar{T}}$ must also be a product of linear factors, so the induction hypothesis makes the bottom right matrix upper-triangular, so the entire $M_{T}$ is upper-triangular. This completes the induction.
* For an upper-triangular matrix $A$ with diagonal elements $\lambda_{1},\dots ,\lambda_{n}$, its characteristic polynomial is eliminating; that is, $\chi_{A}(A)=\pm\prod^{n}_{i=1}(A-\lambda_{i} I)=0_{n \times n}$
> Proof: take a basis $\{v_1,\dots,v_{n}\}$ of $\mathbb{F}^n$, so $A$ represents a linear map $\mathbb{F}^{n}\to \mathbb{F}^n$ under that basis.
> Then by looking at the entries in $(A-\lambda_{k}I)$, note that the first $k$ columns are non-$0$ only in the first $(k-1)$ entries.
> Hence inductively, $\begin{align*}
\mathrm{Im}(A-\lambda_{n}I) &\subseteq \langle v_{1}, \dots, v_{n-1} \rangle\\
\mathrm{Im}[(A-\lambda_{n-1}I)(A-\lambda_{n}I)]& \subseteq\langle v_{1}, \dots, v_{n-2} \rangle\\
&\;\;\vdots\\
\mathrm{Im}[\underbrace{(A-\lambda_{1}I)\cdots (A-\lambda_{n-1}I)(A-\lambda_{n}I)}_{\pm\chi_{A}(A)}]&\subseteq\langle \, \rangle=\{ \mathbf{0} \}
\end{align*}$
> So $\chi_{A}(A)$ must be the zero matrix.
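
*The image-shrinking argument can be watched numerically; a sketch assuming `sympy`, with an arbitrary upper-triangular $A$:*

```python
# Sketch: for upper-triangular A, prod_i (A - lambda_i I) = 0, where the
# lambda_i are the diagonal entries.
from sympy import Matrix, eye

A = Matrix([[1, 2, 3],
            [0, 4, 5],
            [0, 0, 6]])
prod = eye(3)
for i in range(3):
    prod = prod * (A - A[i, i] * eye(3))   # multiply in (A - lambda_i I)
assert prod == Matrix.zeros(3, 3)          # chi_A(A) = 0 (up to sign)
```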
### Cayley-Hamilton Theorem
* **Cayley-Hamilton theorem**: <u>the characteristic polynomial is eliminating</u> for any linear map $T$ with matrix $A\in \mathbb{F}^{n \times n}$: $\chi_{T}(T)=\chi_{A}(A)=0$, and so $m_{A}|\chi_{A}$.
> Proof: we reduce $A$ to an upper triangular matrix, and use the uniqueness of the characteristic polynomial.
>
> In the algebraic closure $\mathbb{\bar{F}}$ of $\mathbb{F}$, we can always factor $\chi_{A}$ into linear factors, so $A$ can be made upper-triangular in $\mathbb{\bar{F}}^{n\times n}$; say there is $P\in GL_{n}(\mathbb{\bar{F}})$ such that $P^{-1}AP$ is upper-triangular.
>
> Then by the previous result, $\chi_{P^{-1}AP}(P^{-1}AP)=0$. Since the characteristic polynomial is invariant under different bases, this is *the* CP of $T$, and it eliminates $T$.
>
> *The fact that we worked in $\mathbb{\bar{F}}[x]$ does not change the polynomial; it must have been in $\mathbb{F}[x]$ in the first place. Think $(x-i)(x+i)$ and $x^2+1$: say we found the first as $\chi_{P^{-1}AP}$ in $\mathbb{C}[x]$, but it equals the second, which is in $\mathbb{R}[x]$.*
* Corollary: if the minimal polynomial is written in distinct, monic, and irreducible factors, then the characteristic polynomial must have the same factors (to an equal-or-higher power): $[m_{T}=f_{1}^{k_{1}}\dots f_{n}^{k_{n}}] \Longrightarrow [\chi_{T}=\pm f_{1}^{l_{1}}\dots f_{n}^{l_{n}}] \,\,\,\,(l_{i} \ge k_{i})$ and in particular, $\chi_{T}$ has a linear factorization $\iff$ $m_{T}$ has a linear factorization.
> Proof: the factors must be the same (with no additional ones) since the minimal and characteristic polynomials have the same roots in $\mathbb{\bar{F}}$.
> The $\pm$ comes from $\chi_{T}$ having $(-1)^{\dim V}$ as the leading coefficient.
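
*Cayley-Hamilton is easy to test on any concrete matrix, assuming `sympy`; the matrix is arbitrary, and Horner evaluation is just one way to compute $\chi_{A}(A)$:*

```python
# Sketch: substitute A into its own characteristic polynomial (via Horner
# evaluation on matrices) and check the result is the zero matrix.
from sympy import Matrix, symbols

x = symbols('x')
A = Matrix([[1, -2, 0],
            [3,  1, 4],
            [0,  2, 2]])

coeffs = A.charpoly(x).all_coeffs()   # leading coefficient first
result = Matrix.zeros(3, 3)
for c in coeffs:
    result = result * A + c * Matrix.eye(3)
assert result == Matrix.zeros(3, 3)   # chi_A(A) = 0
```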
## The Primary Decomposition Theorem
### Proof of the PDT
* Let $V$ be a finite-dimensional vector space with a linear map $T:V\to V$, and let $f \in \mathbb{F}[x]$ satisfy $f(T)=0$. If $f$ can be factorized as $f(x)=a(x)b(x),\,\gcd(a,b)=1$, then $V=\ker a(T)\,\oplus\,\ker b(T)$
> Proof:
> <u>Spanning</u>: Bezout's lemma for polynomials gives $s,t\in \mathbb{F}[x]: as+bt=1$.
> Write $A=a(T)$, $B=b(T)$, $S=s(T)$, $T'=t(T)$, $F=f(T)$; as polynomials in $T$, these all commute.
> Then $\forall v\in V:\ ASv+BT'v=v \ \ \ (*)$ Since $F=BA=AB=0$, we have $B(ASv)=(BA)(Sv)=0$, so $ASv\in \ker B$; similarly $BT'v\in \ker A$.
> Now $(*)$ is a decomposition of $v$ in terms of $\ker A +\ker B$.
>
> <u>Disjointness</u>: for any $v\in \ker A\cap \ker B$, commuting gives $ASv=SAv=\mathbf{0}$ and $BT'v=T'Bv=\mathbf{0}$, so the $\mathrm{LHS}$ of $(*)$ is $\mathbf{0}$, and $\mathrm{RHS}=v=\mathbf{0}$ as well.
* In particular, if $f=m_{T}$ and $a,b$ are monic, then $a=m_{T|_{\ker a(T)}}, \ b=m_{T|_{\ker b(T)}}$.
> Proof: $a$ and $b$ eliminate $T|_{\ker A}$ and $T|_{\ker B}$ respectively, so each is divisible by the corresponding minimal polynomial.
>
> Write $m_{1}=m_{T|_{\ker A}}, m_{2}=m_{T|_{\ker B}}$.
> Then at the same time, the product $m_{1}m_{2}$ eliminates $T$:
> for any $v\in V$ decomposed into $w_{1}+w_{2},w_{1}\in \ker A, w_{2}\in \ker B$, $\begin{align*}
[m_{1}(T)m_{2}(T)]v&=m_{2}(T)[m_{1}(T)w_{1}] + m_{1}(T)[m_{2}(T)w_{2}]\\
&=0+0 = 0
\end{align*}$ So $f=ab$, being the minimal polynomial of $T$, divides $m_{1}m_{2}$.
> Combined with $m_{1}|a$ and $m_{2}|b$, comparing degrees (and monicity) forces $a=m_{1}, b=m_{2}$.
* **Primary Decomposition Theorem**: given a linear map $T: V\to V$ and a decomposition of its minimal polynomial $m_{T}=f_{1}^{k_{1}}\dots f_{n}^{k_{n}},$ with the $f_{i}$ distinct, monic, and irreducible, then writing $W_{i}=\ker (f_{i}^{k_{i}}(T))$, we have (a computational check follows the proof):
* $V=W_{1}\,\oplus\,\cdots\oplus\, W_{n};$
* $W_i$ is $T$-invariant: $T(W_{i})\subset W_{i};$
* $m_{T|_{W_{i}}}=f_{i}^{k_{i}}.$
> Proof: induct on $n$ (the number of terms) using the previous two results.
>
> For invariance, take any $v \in W_{i}$. Since $T$ commutes with any polynomial of itself, $f_{i}^{k_{i}}(T)(Tv)=T(f_{{i} }^{k_{i}}(T)v)=T(0)=0$, so $Tv \in \ker(f_{i}^{k_{i}}(T))=W_{i}$.
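
*The computational check promised above, assuming `sympy`: for an $A$ with $m_{A}=(x-1)(x-2)$, the two kernels should form a direct sum equal to the whole space:*

```python
# Sketch: primary decomposition for A with m_A = (x - 1)(x - 2):
# ker(A - I) and ker(A - 2I) together span Q^2 and meet trivially.
from sympy import Matrix, eye

A = Matrix([[1, 1],
            [0, 2]])
W1 = (A - eye(2)).nullspace()        # ker(A - I)
W2 = (A - 2*eye(2)).nullspace()      # ker(A - 2I)

basis = Matrix.hstack(*(W1 + W2))    # union of the two kernel bases
assert basis.rank() == 2             # dims add up: the sum is direct and is V
```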
### Reducibility Corollaries of the PDT
* Using this direct sum $\bigoplus_{i}W_{i}$, take bases $B_{i}$ of the $W_{i}$ to define a basis $B=\cup_{i} B_{i}$ of $V$; then $T$ has a block-diagonal matrix under $B$: $_{B}[T]_{B}=\mathrm{diag}\big({}_{B_{i}}[T|_{W_{i}}]_{B_{i}}\big)$
* $m_{T}$ factorizes into linear, monic factors $f_{i}=(x-\lambda_{i})$, i.e. $m_{T}=\prod_{i} f_{i}^{k_{i}}$, $\iff$ the map $T$ is triangularizable.
> Proof: triangularizability $\iff$ linear factorization of $\chi_{T}$ $\iff$ linear factorization of $m_{T}$. For the second equivalence, see the corollary of Cayley-Hamilton.
* All the factors of $m_{T}$ are distinct and linear $\iff$ the map is diagonalizable.
> Proof:
> ($\Leftarrow$) if $\lambda_{1},\dots,\lambda_{r}$ ($r \le n$) are the distinct entries in the diagonal matrix representation of $T$, then $\prod_{i}(x-\lambda_{i})$ eliminates $T$, so $m_{T}$ divides it and has distinct linear factors.
>
> ($\Rightarrow$) if $f_{i}=(x-\lambda_{i})$ are all distinct, then the PDT gives a direct sum $V=\bigoplus_{i}E_{\lambda_{i}}$, where $E_{\lambda_{i}}=\ker (T-\lambda_{i} I)$ is the eigenspace of $\lambda_{i}$.
> Any basis $B_{i}\subset E_{\lambda_{i}}$ gives the diagonal block $[T|_{E_{\lambda_{i}}}]=\lambda_{i}I$, so their union $B=\cup_{i} B_{i}$ gives a diagonal ${}_{B}[T]_{B}$.
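
*A sanity check assuming `sympy`, whose `is_diagonalizable` agrees with the minimal-polynomial criterion; both matrices are arbitrary examples:*

```python
# Sketch: distinct linear factors of m_T <=> diagonalizable.
from sympy import Matrix

D = Matrix([[2, 0], [0, 3]])   # m_D = (x-2)(x-3): distinct linear factors
N = Matrix([[2, 1], [0, 2]])   # m_N = (x-2)^2: a repeated factor

assert D.is_diagonalizable()
assert not N.is_diagonalizable()
```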
## Jordan Normal Form
*I confess that I have no idea what the 2022 lecture notes are talking about. I used [the webnotes from Northwestern University](https://sites.math.northwestern.edu/~scanez/courses/334/notes/jordan-form.pdf).*
*If you are reading this solely for exam-prep, concepts like Jordan chains are NOT mentioned in the lecture notes, so don't quote them in the exams without giving a definition.*
*Big idea: JNFs describe linear transformations by showing what they do to their eigenspaces.*
### Nilpotent Maps
* A linear map $T:V\to V$ is **nilpotent** if $\exists n:T^n=0$, where $0$ is the zero map.
* All nilpotent maps have minimal polynomial $m_{T}(x)=x^m$ for some $m\le n$.
> Proof: $x^n$ eliminates the map, so the minimal polynomial must divide it.
* For nilpotent map $T$ with $m_{T}(x)=x^m$, there must be strict inclusions $\ker(T)\subsetneq\ker(T^{2})\subsetneq \cdots\subsetneq \ker(T^m)=V$
> Proof: if there is an equality $\ker T^{k}=\ker T^{k+1}$, then $\begin{align*}
T^{k+s}v = 0 &\iff T^{k+1}(T^{s-1}v)=0\\
&\iff T^{s-1}v\in \ker T^{k+1}=\ker T^k\\
&\iff T^{k+s-1}v=0\\
&\iff \cdots\\
&\iff T^k v=0,
\end{align*}$ but then taking $k+s=m$ gives $\ker T^k=V$ with $k<m$, so $x^{k}$ eliminates $T$, contradicting the minimality of $m_{T}(x)=x^m$.
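
*The strict chain can be seen directly on a small nilpotent example, assuming `sympy`:*

```python
# Sketch: for the nilpotent N below (N^3 = 0, m_N = x^3), the kernels of
# N, N^2, N^3 grow strictly until they fill the space.
from sympy import Matrix

N = Matrix([[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]])

dims = [len((N**k).nullspace()) for k in (1, 2, 3)]
assert dims == [1, 2, 3]       # ker N < ker N^2 < ker N^3 = V, strictly
```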
* If $T$ has eigenvalue $\lambda$, the **generalized eigenspace** of $\lambda$ is defined to be $F_{\lambda}=\{ v\in V :(T-\lambda I)^{k}v=0\,\,\text{ for some }k \ge 1\}$. Also, a non-zero $v\in F_{\lambda}$ is called a **generalized eigenvector**. For contrast, "ordinary eigenvectors/values" refer to the usual $v,\lambda: \lambda v=Tv$.
### Jordan Blocks and Jordan Normal Forms
* A **Jordan block** of size $k$ for (eigenvalue) $\lambda$ is the $k \times k$ matrix of the form $J_{k}(\lambda)=
\begin{pmatrix}
\lambda & 1 & & & \\
& \lambda & 1 & & \\
&& \ddots & \ddots & \\
&&& \lambda & 1 \\
&&&&\lambda
\end{pmatrix}_{k \times k}$
* The **Jordan Normal Form** (JNF) of a linear map $T$ is a matrix representation of it under some basis $\mathcal{B}$, which is block diagonal, with each block being a Jordan block: $_{\mathcal B}[T]_{\mathcal B}=\begin{pmatrix}J_{k_{1}}(\lambda_{1}) & &0 \\ & \ddots \\
0&&J_{k_{s}}(\lambda_{s})
\end{pmatrix}$where $\lambda_{1 \dots s}$ are the (not necessarily distinct) eigenvalues of $T$.
* Goal: a map $T$ can always be put into JNF, if its minimal polynomial can be factorized into linear factors (not necessarily distinct).
* In particular, $T$ on a complex vector space always has a JNF, since every polynomial factors into linear factors over $\mathbb{C}$.
### Proof: JNF on Generalized Eigenspaces
*Big idea: $T$ restricted to a single generalized eigenspace can be represented with a JNF. We would then glue them together into a larger JNF that represents the unrestricted map $T$.*
* For a map $T$ with eigenvalue $\lambda$, a generalized eigenvector $v\in F_{\lambda}$, and $S=T-\lambda I$, taking $k=\min\{j:S^{j}v=0\}$ gives the **Jordan chain** $c(v)=\{S^{k-1}v,\dots,S^{2}v,Sv,v\}$ and we will treat it as an ordered set when used as a basis.
* A Jordan chain $c(v)$ is linearly independent and hence is a basis of $C(v)=\langle c(v) \rangle$.
> Proof: if $\sum_{i=0}^{k-1}a_{i}S^{i}v=0$, then left-applying $S^{k-1}$ gives $a_{0}S^{k-1}v=0$, so $a_{0}=0$.
> Again, left-applying $S^{k-2}$ now gives $a_{1}=0$, etc. So all of $a_{i}$ must be $0$, and the chain is linearly independent.
* The generalized eigenspace $F_{\lambda}$ has a basis $B_{\lambda}=\cup_{i}c(v_{i})$, where $c(v_{i})$ are Jordan chains in $F_{\lambda}$.
> Proof: take the longest Jordan chain $c(v_{1})$ in $F_{\lambda}$, and if its span does not cover the entire generalized eigenspace, i.e. $C(v_{1})=\langle c(v_{1}) \rangle \ne F_{\lambda}$, use more chains to span a complement $O^{*}$ with $F_{\lambda}=C(v_{1})\,\oplus O^*$.
> Repeat the process for $O^*$, finding the longest Jordan chain $c(v_{2})$ in it, taking complements, etc.
>
> Eventually the chains will exhaust the dimensions of $F_{\lambda}$ and give a basis of it, namely $B_{\lambda}=\cup_{i}\ c(v_{i})$. It is indeed a basis, as $B_{\lambda}$ is linearly independent because each chain is linearly independent within, and different chains are from subspaces that form a direct sum.
* $T|_{C(v)}$ under the basis $c(v)$ is represented by the Jordan block $J_{k}(\lambda)$.
> Proof: note that the $i$th column represents the image of the $i$th basis vector, which is $S^{k-i}v$.
> Since it is mapped to $\lambda S^{k-i} v+S^{k-i+1}v$, the column is $(\dots,0,1,\underset{i\text{th}}{\lambda},0,\dots)^t$.
> Of course, if $i=1$, the image is just $\lambda S^{k-1}v$ (as $S^{k}v=0$), and the column is $(\lambda,0,\dots,0)^{t}$.
> Either way, this is also the $i$th column of the Jordan block $J_{k}(\lambda)$.
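
*A concrete Jordan chain computation, assuming `sympy`; the matrix $T$ is an arbitrary example with $\chi_{T}=(x-2)^{2}$:*

```python
# Sketch: build the chain basis {S v, v} for S = T - 2I and check that T,
# written in that basis, is the Jordan block J_2(2).
from sympy import Matrix, eye

T = Matrix([[ 3, 1],
            [-1, 1]])              # chi_T = (x - 2)^2
S = T - 2*eye(2)                   # nilpotent: S^2 = 0
v = Matrix([1, 0])                 # generalized eigenvector: S*v != 0

P = Matrix.hstack(S*v, v)          # ordered chain basis {S v, v}
assert P.inv() * T * P == Matrix([[2, 1],
                                  [0, 2]])
```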
- Hence in general, $T|_{F_{\lambda}}$ can be made into a JNF of Jordan blocks $J_{k_{i}}(\lambda)$.
> Proof: apply the previous result to the ordered set-basis $B_{\lambda}=\cup_{i}c(v_{i})$ of $F_{\lambda}$.
### Proof: JNF on the Entire Vector Space
*Big idea: we decompose the vector space into generalized eigenspaces, where the map has a JNF, and piece them together.*
* Say the minimal polynomial is factorized into $m_{A}(x)=(x-\lambda _{1})^{k_{1}}\dots(x-\lambda_{r})^{k_{r}}$, where $\lambda_{1 \dots r}$ are distinct eigenvalues.
* Since we assume a linear factorization of the minimal polynomial, the Primary Decomposition Theorem guarantees the block-diagonal form $\begin{align*}
_{B}[T]_{B}=\underset{i=1,\dots,r}{\mathrm{diag}}\big({}_{B_{\lambda_{i}}}[T|_{F_{\lambda_{i}}}]_{B_{\lambda_{i}}}\big) & &(*)
\end{align*}$ for some to-be-determined basis $B_{\lambda_{i}}$ of $F_{\lambda_{i}}=\ker(A-\lambda_{i} I)^{k_{i}}$.
* The previous section gives a basis $B_{\lambda_{i}}$ composed of Jordan chains, which makes the $i$th block a JNF. Now under their union $B=\cup_{i=1}^{r}B_{\lambda_{i}}$, all the diagonal blocks are Jordan blocks, hence the map is made into the JNF $_{B}[T]_{B}=\begin{pmatrix}J_{k_{1}}(\lambda_{1}) & &0 \\ & \ddots \\
0&&J_{k_{s}}(\lambda_{s})
\end{pmatrix}$ (where the $\lambda_{j}$ need not be distinct, as one eigenvalue can contribute several chains), and hence the JNF always exists, given a linear factorization of the minimal polynomial.
### Solving the Jordan Basis and JNF
* The **Jordan basis** of a map is the basis under which the map is represented by its JNF. As seen earlier, it is composed of Jordan chains.
* <u>The number of Jordan blocks of eigenvalue</u> $\lambda$ <u>equals its geometric multiplicity,</u> i.e. the dimension of its ordinary eigenspace.
* For example, if $A$ has characteristic $\chi_{A}(x)=(x-1)^4$, and the eigenvalue $1$ has geometric multiplicity $2$, then there must be two Jordan blocks of diagonal $1$ in the JNF, so $\mathrm{JNF}(A)=\left(\begin{array}{c|c} \begin{array}{c c}1 & 1 \\ & 1\end{array} \\ \hline & \begin{array}{c c}1 & 1 \\ & 1\end{array} \end{array}\right)\text{ or } \left(\begin{array}{c|c} 1 \\ \hline & \begin{array}{c c}1 & 1 & \\ & 1 & 1 \\
& & 1\end{array} \end{array}\right)$
> Proof: let $J$ be the JNF; then $J-\lambda I$ has one all-$0$ row for every Jordan block of eigenvalue $\lambda$ (the last row of that block), and the dimension of $\ker (J-\lambda I)$ is exactly the number of such blocks.
* In the minimal polynomial, the degree $k$ of $(x-\lambda)^k$ equals the size of the largest Jordan block with diagonal $\lambda$ in the JNF.
* For example, if the same $A$ has a minimal polynomial of $m_{A}(x)=(x-1)^2$, then its largest Jordan block is $(2\times2)$. Hence it can't have $(1\times 1)+(3\times 3)$ blocks in the JNF, which must be $\mathrm{JNF}(A)=\left(\begin{array}{c|c} \begin{array}{c c}1 & 1 \\ & 1\end{array} \\ \hline & \begin{array}{c c}1 & 1 \\ & 1\end{array} \end{array}\right)$
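
*The example above can be realized concretely, assuming `sympy` (whose `jordan_form` returns both the Jordan basis and the JNF); the change-of-basis matrix $Q$ is an arbitrary invertible choice:*

```python
# Sketch: a matrix with chi = (x-1)^4, geometric multiplicity 2 and
# m = (x-1)^2 must have two 2x2 Jordan blocks.
from sympy import Matrix, eye

J = Matrix([[1, 1, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 1],
            [0, 0, 0, 1]])         # two J_2(1) blocks
Q = Matrix([[1, 0, 1, 0],          # invertible (det = 2): a change of basis
            [0, 1, 0, 1],
            [0, 0, 1, 1],
            [1, 0, 0, 1]])
A = Q * J * Q.inv()                # hide J behind the change of basis

assert len((A - eye(4)).nullspace()) == 2   # geometric multiplicity of 1 is 2
P, JNF = A.jordan_form()
assert JNF == J                    # sympy recovers the two 2x2 blocks
```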
* Example: solve for the Jordan basis of $A=\begin{pmatrix}1 & -1 & 0 \\-1 & 4 & -1 \\-4 & 13 & -3\end{pmatrix}$.
> The characteristic polynomial is $\chi_{A}(x)=-x(x-1)^2$, giving eigenvalues $0,1$. Since basis changes preserve the characteristic, this is also the characteristic of the JNF matrix.
>
> So we will be looking for a JNF with one $0$ and two $1$'s on the diagonal.
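
*As a check on this example, `sympy`'s `jordan_form` finishes the computation, returning a Jordan basis as the columns of $P$:*

```python
# Sketch: compute the Jordan basis and JNF of the example matrix directly.
from sympy import Matrix

A = Matrix([[ 1, -1,  0],
            [-1,  4, -1],
            [-4, 13, -3]])

P, J = A.jordan_form()
print(J)                     # expect a 1x1 block for 0 and a J_2(1) block
assert A == P * J * P.inv()  # P's columns form the Jordan basis
```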