> [!definition|*] Modes of Convergence
> A sequence of random variables $(X_n)$ **converges almost surely** to $X$, written as $X_n\to X\text{ a.s.}$, if $\mathbb P(X_n\to X):=\mathbb P\Big\{\omega:\big(X_n(\omega)\big)\to X(\omega)\Big\}=1.$
>
> $(X_n)$ **converges in probability** to $X$, written as $X_n\xrightarrow{P}X$, if $\mathbb P(|X_n-X|<\epsilon)\to 1,\ \forall \epsilon>0.$
>
> $(X_n)$ **converges in distribution** to $X$, written as $X_n\xrightarrow{d}X$, if $F_n(x)\to F(x)$ at every point $x$ where $F$ is continuous. There is no requirement at points where $F$ is discontinuous.
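As a simple illustration (an example added here for concreteness, not from the lecture notes): if $X_n \sim \mathrm{Unif}[0, \tfrac{1}{n}]$, then $0\le X_n\le \tfrac{1}{n}$ surely, so $X_n \to 0$ almost surely, hence also in probability and in distribution. Note that $F_n(0)=0$ for every $n$ while the limit CDF has $F(0)=1$; this is allowed precisely because $x=0$ is a discontinuity point of $F$.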
### Strength of Convergences
> [!theorem|*] Modes of Convergence
> Convergence a.s. $\,\xRightarrow{(1)}\,$ convergence in probability $\,\xRightarrow{(2)}\,$ convergence in distribution.
> The converses are not true in general.
>
> > [!proof]- Proofs of $(1)$
> > > [!proof]- Basic proof in lecture notes.
> > > Fix $\epsilon > 0$ and define $A_N:=\{\omega \,|\, \forall n \ge N,~|X_n(\omega)-X(\omega)|<\epsilon\}$.
> > > Almost-sure convergence means that for almost every $\omega$ some $A_N$ occurs, so $\mathbb P(\bigcup_N A_N)=1$.
> > > Since $(A_N)$ is an increasing sequence of events, continuity of measure (MCT for events) gives $\mathbb P(A_N)\to 1$.
> > > Since $A_n\subseteq\{|X_{n}-X|<\epsilon\}$, we get $\mathbb P(|X_n-X|<\epsilon)\ge\mathbb P(A_n)\to 1$, which is convergence in probability.
> >
> > > [!proof]- Alternative proof of $(1)$ (requires B8.1)
> > > This is essentially Fatou's lemma. Fix $\epsilon>0$ and consider the failure events $B_{n}:= \{ | X_{n}-X | > \epsilon \}$. If $B_n$ occurs infinitely often then $X_n(\omega) \not\to X(\omega)$, so a.s. convergence forces $\mathbb{P}[\underset{n}{\lim\sup}\,B_{n}]=\mathbb{P}[B_{n}~\mathrm{i.o.}]=0$. Then (reverse) Fatou gives $0 =\mathbb{P}[\underset{n}{\lim\sup}\,B_{n}] \ge \underset{n}{\lim\sup}~\mathbb{P}[B_{n}],$ hence $\mathbb{P}[B_{n}] \to 0$, and $X_{n} \to X$ in probability.
>
> > [!proof]- Counterexample of $(1)$ converse
> > Consider independent random variables $X_n$ with distribution $\mathbb{P}(X_{n}=0)=\frac{n-1}{n},\ \mathbb P(X_n=1)=\frac{1}{n}.$ Then $X_n\xrightarrow{P}0$; but since $\sum_n \frac{1}{n}=\infty$, [[Measure Theory#Borel-Cantelli Lemmas|Borel-Cantelli II]] gives $\mathbb P(\forall n\ge N: X_n=0)=0$ for any $N$, so almost-sure convergence fails.
>
> > [!proof]- Proof of $(2)$
> > Since $X_n\le x$ implies $X<x+\epsilon$ or $|X-X_n|\ge \epsilon$ for any positive $\epsilon,$
> > $\begin{align*} F_n(x)&=\mathbb P(X_n\le x)\\ &\le\mathbb P\big[(X<x+\epsilon)\cup(|X-X_n|\geq\epsilon)\big]\\ &\le \mathbb P(X<x+\epsilon)+\mathbb P(|X-X_n|\geq\epsilon), \end{align*}$
> > and the last term $\to 0$, so $\limsup_n F_n(x)\le \mathbb P(X<x+\epsilon)\le F(x+\epsilon)$.
> > Similar logic applied to $1-F_n(x)=\mathbb P(X_n> x)$ gives $\liminf_n F_{n}(x) \ge F(x-\epsilon)$.
> > Continuity of $F$ at $x$ gives $F(x \pm \epsilon) \to F(x)$ as $\epsilon \to 0$, so these sandwich $F_n(x)\to F(x)$.
>
> > [!proof]- Counterexample of $(2)$ converse
> > Let $Y,X_1, X_2,\dots$ be i.i.d. (for example $N(0,1)$). Then $X_n\xrightarrow{d} Y$, but $X_n\not\xrightarrow{P} Y$: with $N(0,1)$ variables, $X_n-Y\sim N(0,2)$ for every $n$, so $\mathbb P(|X_n-Y|\ge\epsilon)$ does not tend to $0$.
>
- That is, a sequence $X_n$ converging in probability is, for each large $n$, likely to be within a small neighbourhood of $X$, but not necessarily likely enough to converge (it could pop outside every now and then); see the simulation sketch below.
- For convergence in distribution, $X_n$ will have a similar distribution to $X$, but they could be completely independent.
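The following is a minimal Python sketch (not from the notes; the parameters `N`, `m`, and `paths` are arbitrary illustrative choices) of the counterexample to the converse of $(1)$: independent $X_n$ with $\mathbb P(X_n=1)=1/n$ converge to $0$ in probability, yet a late $1$ still appears on most sample paths.

```python
import numpy as np

# Independent X_n with P(X_n = 1) = 1/n, P(X_n = 0) = (n-1)/n.
# Illustrative sketch only; names and parameters are arbitrary.
rng = np.random.default_rng(0)

N = 100_000          # horizon per sample path
m = 10_000           # a "large" index after which we look for failures
paths = 200          # number of independent sample paths

n = np.arange(1, N + 1)
late_failure = 0
for _ in range(paths):
    X = rng.random(N) < 1.0 / n        # X_n = 1 with probability 1/n
    late_failure += X[m:].any()        # did X_n = 1 occur for some n > m?

# Convergence in probability: P(|X_n - 0| >= eps) = 1/n is already tiny at n = m.
print("P(X_n = 1) at n =", m, "is", 1 / m)

# But a '1' still shows up after n = m on roughly 1 - m/N = 90% of paths,
# and this fraction tends to 1 as N grows (Borel-Cantelli II).
print("fraction of paths with X_n = 1 for some n >", m, ":", late_failure / paths)
```

The exact value of the second quantity is $1 - \prod_{n=m+1}^{N}\big(1-\tfrac{1}{n}\big) = 1 - \tfrac{m}{N}$, so for fixed $m$ it tends to $1$ as the horizon grows: almost every path leaves $0$ infinitely often.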
Nevertheless, the converse of $(2)$ holds when the limit is a constant: if $X_{n} \xrightarrow{d} c$ for a constant $c$, then $X_{n} \xrightarrow{P}c$.
> [!proof]-
> Fix any $\epsilon > 0$. Then $\begin{align*} \liminf_{ n \to \infty } \mathbb{P}[|X_{n}-c| \le \epsilon]&\ge \liminf_{ n \to \infty } \Big[F_{n}(c+\epsilon)-F_{n}(c-\epsilon)\Big]\\ &= F(c+\epsilon) - F(c-\epsilon)\\ &= 1-0=1, \end{align*}$ where the second step uses $F_{n}(c \pm \epsilon) \to F(c \pm \epsilon)$; this holds because $F=\mathbf{1}_{[c,\infty)}$ is continuous everywhere except at $x=c$, in particular at $c \pm \epsilon$. Since $\epsilon$ was arbitrary, $X_{n}\xrightarrow{P}c$.
## Convergence in $\mathcal{L}^{p}$ Spaces
> [!tldr]
> - $\mathcal{L}^{p}$ convergence $\Longrightarrow$ convergence in probability; for the reverse, we need uniform integrability.
> - Neither $\mathcal{L}^{p}$ nor $\mathrm{a.s.}$ convergence implies the other.

> [!definition|*] Convergence in $\mathcal{L}^{p}$ Spaces
> The sequence $(X_{n}) \to X$ in $\mathcal{L}^p$ if $X$ and every $X_{n}$ are $\mathcal{L}^{p}$-integrable, and $\mathbb{E}[| X_{n}-X |^{p}] \to 0.$

> [!theorem|*] $\mathcal{L}^1$ Convergence Implies Convergence in Probability
> If $(X_{n}) \xrightarrow{\mathcal{L}^{1}}X$, then $(X_{n}) \xrightarrow{p}X$.
> > [!proof]-
> > Fix $\epsilon$ and split the expectation according to the event $A_{n}:=\{ | X_{n}-X | \ge \epsilon \}$.
> > $\begin{align*}
\mathbb{E}[| X_{n}-X |]&= \mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}}]+\mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}^{c}}]\\
&\ge \epsilon \cdot \mathbb{P}[A_{n}]+0.
\end{align*}$
> Since the left-hand side tends to $0$ by $\mathcal{L}^{1}$ convergence, $\mathbb{P}[A_{n}] \to 0$ follows.
>
> > [!warning] Reverse is not true
> >
For example, for independent variables $X_{n} \sim F_{n}=\frac{n-1}{n}\delta_{0}+\frac{1}{n}\delta_{n^{2}},$ we have $X_{n}\xrightarrow{P}0$, but $\mathbb{E}[|X_{n}-0|]=n \to \infty$.
> [!theorem|*] $\mathcal{L}^{1}$ convergence for conditional expectations
> If $(X_{n}) \overset{\mathcal{L}^{1}}{\to}X$, then for any sub-$\sigma$-algebra $\mathcal{G}$, $(\mathbb{E}[X_{n} ~|~\mathcal{G}]) \overset{\mathcal{L}^{1}}{\to}\mathbb{E}[X ~|~ \mathcal{G}].$
> > [!proof]-
> > Immediate from the fact that [[Conditional Expectations#^8d034f|conditional expectations are contractions in $\mathcal{L}^1$]].
Moreover, $\mathcal{L}^{p}$ and $\mathrm{a.s.}$ convergence do not imply one another:
- $\mathcal{L}^{p} \not \Rightarrow \mathrm{a.s.}$: consider independent $X_{n} \sim F_{n}=\frac{n-1}{n}\delta_{0}+\frac{1}{n}\delta_{1}$; then $\mathbb{E}[|X_{n}-0|]=\frac{1}{n} \to 0$, but by Borel-Cantelli II, $X_{n}=1$ infinitely often almost surely, so $X_{n}\not\to 0$ a.s.
- $\mathrm{a.s.} \not \Rightarrow \mathcal{L}^{p}$: consider $X_{n} \sim F_{n} =\frac{n^{2}-1}{n^{2}}\delta_{0}+\frac{1}{n^{2}}\delta_{n^{2}}$; since $\sum_n \frac{1}{n^{2}}<\infty$, Borel-Cantelli I gives $X_{n} \to 0 ~\mathrm{a.s.}$, but $\mathbb{E}[X_{n}]=1 \not \to 0$ (see the simulation sketch below).
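A minimal simulation sketch (not from the notes; the horizon and path counts are arbitrary choices) of the second counterexample, showing sample paths that are eventually $0$ while the expectation stays fixed at $1$:

```python
import numpy as np

# Independent X_n equal to n^2 with probability 1/n^2, and 0 otherwise.
# Illustrative sketch only.
rng = np.random.default_rng(1)

N = 2_000       # horizon per path
paths = 5_000   # number of independent sample paths

n = np.arange(1, N + 1)
nonzero = rng.random((paths, N)) < 1.0 / n**2   # where X_n = n^2 on a path

# Almost-sure convergence: sum_n P(X_n != 0) = sum_n 1/n^2 < oo, so by
# Borel-Cantelli I each path has only finitely many nonzero terms.
print("average number of nonzero X_n per path (~ pi^2/6):", nonzero.sum() / paths)
print("fraction of paths with X_n = 0 for all n > 100 (~ 0.99):",
      (~nonzero[:, 100:].any(axis=1)).mean())

# But E[X_n] = n^2 * (1/n^2) = 1 for every n, so E[|X_n - 0|] does not
# vanish: no L^1 convergence despite a.s. convergence.
print("exact E[X_n] for every n:", 1.0)
```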
### Uniform Integrability
> [!definition|*] Uniform Integrability
> A family of random variables $\mathcal{C} \subseteq \mathcal{L}^{1}$ is **uniformly integrable** (UI) if $\sup_{X \in \mathcal{C}}\mathbb{E}[|X| \mathbf{1}_{|X| \ge K}] \to 0$ as $K \to \infty$.
Uniform integrability is what is required to pass from convergence in probability to $\mathcal{L}^{p}$ convergence: the counterexample above has $\mathbb{P}[| X_{n}-X | \ge \epsilon] \to 0$, but on that rare event $X_{n}$ takes a value ($n^{2}$) growing too fast, so the expectation diverges.
> [!theorem|*] Vitali's Convergence Theorem
> Suppose $(X_{n}) \subseteq \mathcal{L}^{1}$ and $(X_{n}) \to X$ in probability. Then the following are equivalent:
> - $[1]$ $(X_{n})$ is UI.
> - $[2]$ $X \in \mathcal{L}^{1}$, and $(X_{n})\to X$ in $\mathcal{L}^{1}$.
> - $[3]$ $X \in \mathcal{L}^{1}$, and $\mathbb{E}[| X_{n} |] \to \mathbb{E}[| X |] < \infty$.
>
> > [!proof]
> > *$[1] \Rightarrow [2]$* Firstly, since $(X_{n}) \xrightarrow{P}X$, there is a subsequence $(X_{n_{k}}) \to X ~\mathrm{a.s.}$, and for that subsequence, by Fatou, $\mathbb{E}[| X |]=\mathbb{E}[\liminf_k | X_{n_{k}} |] \le \liminf_k\mathbb{E}[| X_{n_{k}} |]\le \sup_n \mathbb{E}[| X_{n} |],$ the last of which is finite since UI families are uniformly bounded in $\mathcal{L}^{1}$ (see below). Therefore $X$ is integrable.
> >
> > Now for convergence, again partition based on $A_{n}:= \{ | X_{n}-X | \ge \epsilon \}$: $\begin{align*} \mathbb{E}[| X_{n}-X |] &= \underbrace{\mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}}]}_{\to 0}+\underbrace{\mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}^{c}}]}_{\le \epsilon}, \end{align*}$ where the first term vanishes because $(| X_{n}-X |)$ is UI (as $(X_{n})$ is UI and $X \in \mathcal{L}^{1}$) and $\mathbb{P}[A_{n}] \to 0$. Hence $\limsup_n\mathbb{E}[| X_{n}-X |] \le \epsilon$ for every $\epsilon$, so $\mathbb{E}[| X_{n}-X |] \to 0$.
> > *$[2] \Rightarrow [3]$* Because $-| X_{n}-X | \le | X_{n} |-| X | \le | X_{n}-X |$, so $\big|\mathbb{E}[| X_{n} |]-\mathbb{E}[| X |]\big| \le \mathbb{E}[| X_{n}-X |] \to 0$.
> >
### Uniform Integrability and $\mathcal{L}^{p}$ Boundedness
> [!tldr]
> For a family of variables $\mathcal{C}$, $[\mathcal{C} \text{ unif. bounded in }\mathcal{L}^{p}, ~p > 1] \Longrightarrow [\mathcal{C} \text{ is UI}] \Longrightarrow [\mathcal{C} \text{ unif. bounded in }\mathcal{L}^{1}],$ where "uniformly bounded" in $\mathcal{L}^{p}$ means that a single constant bounds $\mathbb{E}[| X |^{p}]$ for all $X \in \mathcal{C}$.
*Uniform integrability implies uniform boundedness* in $\mathcal{L}^{1}$: if $\mathcal{C}$ is uniformly integrable, then there is a bound $L$ where $\mathbb{E}[| X |] \le L$ for any $X \in \mathcal{C}$.
> [!proof]-
> Fix $\epsilon=1$ and take the corresponding $K$ from the UI condition (one $K$ works for every member of $\mathcal{C}$); then for any $X \in \mathcal{C}$, $\begin{align*}
\mathbb{E}[| X |]&= \mathbb{E}[| X | \mathbf{1}_{| X | \ge K}]+\mathbb{E}[| X | \mathbf{1}_{| X | < K}]\\
&\le \epsilon + K.
\end{align*}$So we can take $L := \epsilon + K$.
However, the converse is not true: for example the family $X_{n}:= n \mathbf{1}_{[0,1 / n]}$ in $([0,1], \mathcal{B}_{[0,1]}, \mathrm{Leb})$ is uniformly bounded in $\mathcal{L}^{1}$ by $1$, but is not uniformly integrable.
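Spelling out the failure of UI: for any threshold $K$ and any $n \ge K$, $X_n$ equals $n \ge K$ on a set of measure $\tfrac{1}{n}$, so $\mathbb{E}[X_n\mathbf{1}_{X_n\ge K}] = n\cdot\tfrac{1}{n} = 1$. Hence $\sup_{n}\mathbb{E}[|X_n|\mathbf{1}_{|X_n|\ge K}] = 1$ for every $K$, which does not tend to $0$, even though $\mathbb{E}[|X_n|]=1$ for all $n$.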
*Nevertheless, uniform boundedness in $\mathcal{L}^{p}$ for $p> 1$ implies UI*:
> [!proof]-
> Suppose $\mathbb{E}[| X |^{p}] \le L^{p}$ for all $X \in \mathcal{C}$.
> Consider the inequality $v^{1-p} \le K^{1-p} \iff v \le K^{1-p}v^{p}$ when $v \ge K, p > 1$, and apply it to $v=| X |$ to get $\mathbb{E}[| X |\mathbf{1}_{| X | \ge K}] \le \mathbb{E}[K^{1-p}| X |^{p} \mathbf{1}_{| X | \ge K}] \le K^{1-p}L^{p}, $so there is $K$ large enough (say $K \ge (L^{-p}\epsilon)^{1 / (1-p)}$) for UI.
## Convergence Laws
> [!theorem|*] Strong Law of Large Numbers
> The **Strong Law of Large Numbers** states that for i.i.d. random variables $X_{1}, X_{2},\dots$ with finite mean $\mu$, the sample average satisfies $\bar{X}_n=\frac{1}{n}\sum_{i=1}^{n} X_i\to\mu\text{ a.s.}$

> [!theorem|*] Central Limit Theorem
> The **Central Limit Theorem** states that under the same conditions, if additionally the variance $\sigma^{2}$ is finite, then $\frac{\bar{X}_n-\mu}{\sigma/\sqrt{n}}=\frac{\sum_{i} X_{i}-n\mu}{\sigma\sqrt{n}}\xrightarrow{d}N(0,1).$
- So roughly, $\bar{X}_n \approx N\big(\mu, \tfrac{\sigma^{2}}{n}\big)$ for large $n$ (i.e. standard deviation $\sigma/\sqrt{n}$); a quick simulation check of both laws follows below.
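As a quick empirical check (a sketch added here, not part of the notes; the distribution and sample sizes are arbitrary choices), a short simulation of both laws for i.i.d. $\mathrm{Exp}(1)$ variables, where $\mu=\sigma=1$:

```python
import numpy as np

# Illustrative sketch: LLN and CLT for i.i.d. Exponential(1) samples,
# so mu = 1 and sigma = 1.  Sample sizes are arbitrary.
rng = np.random.default_rng(2)

n, reps = 1_000, 10_000
X = rng.exponential(scale=1.0, size=(reps, n))
means = X.mean(axis=1)                      # bar X_n over many replications

# SLLN: the sample means concentrate near mu = 1.
print("average of bar X_n (~1):", means.mean())

# CLT: sqrt(n) * (bar X_n - mu) / sigma is approximately N(0, 1).
Z = np.sqrt(n) * (means - 1.0) / 1.0
print("std of Z (~1):", Z.std())
print("P(Z <= 1.96) (~0.975):", (Z <= 1.96).mean())
```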