> [!definition|*] Modes of Convergence
> A sequence of random variables $(X_n)$ **converges almost surely** to $X$, written as $X_n\to X\text{ a.s.}$, if $\mathbb P(X_n\to X):=\mathbb P\Big\{\omega:\big(X_n(\omega)\big)\to X(\omega)\Big\}=1.$
>
> $(X_n)$ **converges in probability** to $X$, written as $X_n\xrightarrow{P}X$, if $\mathbb P(|X_n-X|<\epsilon)\to 1,\ \forall \epsilon>0.$
>
> $(X_n)$ **converges in distribution** to $X$, written as $X_n\xrightarrow{d}X$, if $F_n(x)\to F(x)$ at every point $x$ where $F$ is continuous. There is no requirement at points where $F$ is discontinuous.
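As a simple illustration (an example added here for concreteness, not from the lecture notes): if $X_n \sim \mathrm{Unif}[0, \tfrac{1}{n}]$, then $0\le X_n\le \tfrac{1}{n}$ surely, so $X_n \to 0$ almost surely, hence also in probability and in distribution. Note that $F_n(0)=0$ for every $n$ while the limit CDF has $F(0)=1$; this is allowed precisely because $x=0$ is a discontinuity point of $F$.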
### Strength of Convergences
> [!theorem|*] Modes of Convergence
> Convergence a.s. $\,\xRightarrow{(1)}\,$ convergence in probability $\,\xRightarrow{(2)}\,$ convergence in distribution.
> The converses are not true in general.
>
> > [!proof]- Proofs of $(1)$
> > > [!proof]- Basic proof in lecture notes.
> > > Fix $\epsilon > 0$ and define $A_N:=\{\omega \,|\, \forall n \ge N,~|X_n(\omega)-X(\omega)|<\epsilon\}$.
> > > Almost-sure convergence means that for almost every $\omega$ some $A_N$ occurs, so $\mathbb P(\bigcup_N A_N)=1$.
> > > Since $(A_N)$ is an increasing sequence of events, continuity of measure (MCT for events) gives $\mathbb P(A_N)\to 1$.
> > > Since $A_n\subseteq\{|X_{n}-X|<\epsilon\}$, we get $\mathbb P(|X_n-X|<\epsilon)\ge\mathbb P(A_n)\to 1$, which is convergence in probability.
> >
> > > [!proof]- Alternative proof of $(1)$ (requires B8.1)
> > > This is essentially Fatou's lemma. Fix $\epsilon>0$ and consider the failure events $B_{n}:= \{ | X_{n}-X | > \epsilon \}$. If $B_n$ occurs infinitely often then $X_n(\omega) \not\to X(\omega)$, so a.s. convergence forces $\mathbb{P}[\underset{n}{\lim\sup}\,B_{n}]=\mathbb{P}[B_{n}~\mathrm{i.o.}]=0$. Then (reverse) Fatou gives $0 =\mathbb{P}[\underset{n}{\lim\sup}\,B_{n}] \ge \underset{n}{\lim\sup}~\mathbb{P}[B_{n}],$ hence $\mathbb{P}[B_{n}] \to 0$, and $X_{n} \to X$ in probability.
>
> > [!proof]- Counterexample of $(1)$ converse
> > Consider independent random variables $X_n$ with distribution $\mathbb{P}(X_{n}=0)=\frac{n-1}{n},\ \mathbb P(X_n=1)=\frac{1}{n}.$ Then $X_n\xrightarrow{P}0$; but since $\sum_n \frac{1}{n}=\infty$, [[Measure Theory#Borel-Cantelli Lemmas|Borel-Cantelli II]] gives $\mathbb P(\forall n\ge N: X_n=0)=0$ for any $N$, so almost-sure convergence fails.
>
> > [!proof]- Proof of $(2)$
> > Since $X_n\le x$ implies $X<x+\epsilon$ or $|X-X_n|\ge \epsilon$ for any positive $\epsilon,$
> > $\begin{align*} F_n(x)&=\mathbb P(X_n\le x)\\ &\le\mathbb P\big[(X<x+\epsilon)\cup(|X-X_n|\geq\epsilon)\big]\\ &\le \mathbb P(X<x+\epsilon)+\mathbb P(|X-X_n|\geq\epsilon), \end{align*}$
> > and the last term $\to 0$, so $\limsup_n F_n(x)\le \mathbb P(X<x+\epsilon)\le F(x+\epsilon)$.
> > Similar logic applied to $1-F_n(x)=\mathbb P(X_n> x)$ gives $\liminf_n F_{n}(x) \ge F(x-\epsilon)$.
> > Continuity of $F$ at $x$ gives $F(x \pm \epsilon) \to F(x)$ as $\epsilon \to 0$, so these sandwich $F_n(x)\to F(x)$.
>
> > [!proof]- Counterexample of $(2)$ converse
> > Let $Y,X_1, X_2,\dots$ be i.i.d. (for example $N(0,1)$). Then $X_n\xrightarrow{d} Y$, but $X_n\not\xrightarrow{P} Y$: with $N(0,1)$ variables, $X_n-Y\sim N(0,2)$ for every $n$, so $\mathbb P(|X_n-Y|\ge\epsilon)$ does not tend to $0$.
>
- That is, a sequence $X_n$ converging in probability is, for each large $n$, likely to be within a small neighbourhood of $X$, but not necessarily likely enough to converge (it could pop outside every now and then); see the simulation sketch below.
- For convergence in distribution, $X_n$ will have a similar distribution to $X$, but they could be completely independent.
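The following is a minimal Python sketch (not from the notes; the parameters `N`, `m`, and `paths` are arbitrary illustrative choices) of the counterexample to the converse of $(1)$: independent $X_n$ with $\mathbb P(X_n=1)=1/n$ converge to $0$ in probability, yet a late $1$ still appears on most sample paths.

```python
import numpy as np

# Independent X_n with P(X_n = 1) = 1/n, P(X_n = 0) = (n-1)/n.
# Illustrative sketch only; names and parameters are arbitrary.
rng = np.random.default_rng(0)

N = 100_000          # horizon per sample path
m = 10_000           # a "large" index after which we look for failures
paths = 200          # number of independent sample paths

n = np.arange(1, N + 1)
late_failure = 0
for _ in range(paths):
    X = rng.random(N) < 1.0 / n        # X_n = 1 with probability 1/n
    late_failure += X[m:].any()        # did X_n = 1 occur for some n > m?

# Convergence in probability: P(|X_n - 0| >= eps) = 1/n is already tiny at n = m.
print("P(X_n = 1) at n =", m, "is", 1 / m)

# But a '1' still shows up after n = m on roughly 1 - m/N = 90% of paths,
# and this fraction tends to 1 as N grows (Borel-Cantelli II).
print("fraction of paths with X_n = 1 for some n >", m, ":", late_failure / paths)
```

The exact value of the second quantity is $1 - \prod_{n=m+1}^{N}\big(1-\tfrac{1}{n}\big) = 1 - \tfrac{m}{N}$, so for fixed $m$ it tends to $1$ as the horizon grows: almost every path leaves $0$ infinitely often.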
Nevertheless, the converse of $(2)$ holds when the limit is a constant: if $X_{n} \xrightarrow{d} c$ for a constant $c$, then $X_{n} \xrightarrow{P}c$.
> [!proof]-
> Fix any $\epsilon > 0$. Then $\begin{align*} \liminf_{ n \to \infty } \mathbb{P}[|X_{n}-c| \le \epsilon]&\ge \liminf_{ n \to \infty } \Big[F_{n}(c+\epsilon)-F_{n}(c-\epsilon)\Big]\\ &= F(c+\epsilon) - F(c-\epsilon)\\ &= 1-0=1, \end{align*}$ where the second step uses $F_{n}(c \pm \epsilon) \to F(c \pm \epsilon)$; this holds because $F=\mathbf{1}_{[c,\infty)}$ is continuous everywhere except at $x=c$, in particular at $c \pm \epsilon$. Since $\epsilon$ was arbitrary, $X_{n}\xrightarrow{P}c$.
## Convergence in $\mathcal{L}^{p}$ Spaces
> [!tldr]
> - $\mathcal{L}^{p}$ convergence $\Longrightarrow$ convergence in probability; for the reverse, we need uniform integrability.
> - Neither $\mathcal{L}^{p}$ nor $\mathrm{a.s.}$ convergence implies the other.

> [!definition|*] Convergence in $\mathcal{L}^{p}$ Spaces
> The sequence $(X_{n}) \to X$ in $\mathcal{L}^p$ if $X$ and every $X_{n}$ are $\mathcal{L}^{p}$-integrable, and $\mathbb{E}[| X_{n}-X |^{p}] \to 0.$

> [!theorem|*] $\mathcal{L}^1$ Convergence Implies Convergence in Probability
> If $(X_{n}) \xrightarrow{\mathcal{L}^{1}}X$, then $(X_{n}) \xrightarrow{p}X$.
> > [!proof]-
> > Fix $\epsilon$ and split the expectation according to the event $A_{n}:=\{ | X_{n}-X | \ge \epsilon \}$.
> > $\begin{align*}
\mathbb{E}[| X_{n}-X |]&= \mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}}]+\mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}^{c}}]\\
&\ge \epsilon \cdot \mathbb{P}[A_{n}]+0.
\end{align*}$
> Since the left-hand side tends to $0$ by $\mathcal{L}^{1}$ convergence, $\mathbb{P}[A_{n}] \to 0$ follows.
>
> > [!warning] Reverse is not true
> >
For example, for independent variables $X_{n} \sim F_{n}=\frac{n-1}{n}\delta_{0}+\frac{1}{n}\delta_{n^{2}},$ we have $X_{n}\xrightarrow{P}0$, but $\mathbb{E}[|X_{n}-0|]=n \to \infty$.
> [!theorem|*] $\mathcal{L}^{1}$ convergence for conditional expectations
> If $(X_{n}) \overset{\mathcal{L}^{1}}{\to}X$, then for any sub-$\sigma$-algebra $\mathcal{G}$, $(\mathbb{E}[X_{n} ~|~\mathcal{G}]) \overset{\mathcal{L}^{1}}{\to}\mathbb{E}[X ~|~ \mathcal{G}].$
> > [!proof]-
> > Immediate from the fact that [[Conditional Expectations#^8d034f|conditional expectations are contractions in $\mathcal{L}^1$]].
Moreover, $\mathcal{L}^{p}$ and $\mathrm{a.s.}$ convergence do not imply one another:
- $\mathcal{L}^{p} \not \Rightarrow \mathrm{a.s.}$: consider independent $X_{n} \sim F_{n}=\frac{n-1}{n}\delta_{0}+\frac{1}{n}\delta_{1}$; then $\mathbb{E}[|X_{n}-0|]=\frac{1}{n} \to 0$, but by Borel-Cantelli II, $X_{n}=1$ infinitely often almost surely, so $X_{n}\not\to 0$ a.s.
- $\mathrm{a.s.} \not \Rightarrow \mathcal{L}^{p}$: consider $X_{n} \sim F_{n} =\frac{n^{2}-1}{n^{2}}\delta_{0}+\frac{1}{n^{2}}\delta_{n^{2}}$; since $\sum_n \frac{1}{n^{2}}<\infty$, Borel-Cantelli I gives $X_{n} \to 0 ~\mathrm{a.s.}$, but $\mathbb{E}[X_{n}]=1 \not \to 0$ (see the simulation sketch below).
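A minimal simulation sketch (not from the notes; the horizon and path counts are arbitrary choices) of the second counterexample, showing sample paths that are eventually $0$ while the expectation stays fixed at $1$:

```python
import numpy as np

# Independent X_n equal to n^2 with probability 1/n^2, and 0 otherwise.
# Illustrative sketch only.
rng = np.random.default_rng(1)

N = 2_000       # horizon per path
paths = 5_000   # number of independent sample paths

n = np.arange(1, N + 1)
nonzero = rng.random((paths, N)) < 1.0 / n**2   # where X_n = n^2 on a path

# Almost-sure convergence: sum_n P(X_n != 0) = sum_n 1/n^2 < oo, so by
# Borel-Cantelli I each path has only finitely many nonzero terms.
print("average number of nonzero X_n per path (~ pi^2/6):", nonzero.sum() / paths)
print("fraction of paths with X_n = 0 for all n > 100 (~ 0.99):",
      (~nonzero[:, 100:].any(axis=1)).mean())

# But E[X_n] = n^2 * (1/n^2) = 1 for every n, so E[|X_n - 0|] does not
# vanish: no L^1 convergence despite a.s. convergence.
print("exact E[X_n] for every n:", 1.0)
```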
### Uniform Integrability
> [!definition|*] Uniform Integrability
> A family of random variables $\mathcal{C} \subseteq \mathcal{L}^{1}$ is **uniformly integrable** (UI) if $\sup_{X \in \mathcal{C}}\mathbb{E}[|X| \mathbf{1}_{|X| \ge K}] \to 0$ as $K \to \infty$.
Uniform integrability is what is required to pass from convergence in probability to $\mathcal{L}^{p}$ convergence: the counterexample above has $\mathbb{P}[| X_{n}-X | \ge \epsilon] \to 0$, but on that rare event $X_{n}$ takes a value ($n^{2}$) growing too fast, so the expectation diverges.
> [!theorem|*] Vitali's Convergence Theorem
> Suppose $(X_{n}) \subseteq \mathcal{L}^{1}$ and $(X_{n}) \to X$ in probability. Then the following are equivalent:
> - $[1]$ $(X_{n})$ is UI.
> - $[2]$ $X \in \mathcal{L}^{1}$, and $(X_{n})\to X$ in $\mathcal{L}^{1}$.
> - $[3]$ $X \in \mathcal{L}^{1}$, and $\mathbb{E}[| X_{n} |] \to \mathbb{E}[| X |] < \infty$.
>
> > [!proof]
> > *$[1] \Rightarrow [2]$* Firstly, since $(X_{n}) \xrightarrow{P}X$, there is a subsequence $(X_{n_{k}}) \to X ~\mathrm{a.s.}$, and for that subsequence, by Fatou, $\mathbb{E}[| X |]=\mathbb{E}[\liminf_k | X_{n_{k}} |] \le \liminf_k\mathbb{E}[| X_{n_{k}} |]\le \sup_n \mathbb{E}[| X_{n} |],$ the last of which is finite since UI families are uniformly bounded in $\mathcal{L}^{1}$ (see below). Therefore $X$ is integrable.
> >
> > Now for convergence, again partition based on $A_{n}:= \{ | X_{n}-X | \ge \epsilon \}$: $\begin{align*} \mathbb{E}[| X_{n}-X |] &= \underbrace{\mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}}]}_{\to 0}+\underbrace{\mathbb{E}[| X_{n}-X |\mathbf{1}_{A_{n}^{c}}]}_{\le \epsilon}, \end{align*}$ where the first term vanishes because $(| X_{n}-X |)$ is UI (as $(X_{n})$ is UI and $X \in \mathcal{L}^{1}$) and $\mathbb{P}[A_{n}] \to 0$. Hence $\limsup_n\mathbb{E}[| X_{n}-X |] \le \epsilon$ for every $\epsilon$, so $\mathbb{E}[| X_{n}-X |] \to 0$.
> > *$[2] \Rightarrow [3]$* Because $-| X_{n}-X | \le | X_{n} |-| X | \le | X_{n}-X |$, so $\big|\mathbb{E}[| X_{n} |]-\mathbb{E}[| X |]\big| \le \mathbb{E}[| X_{n}-X |] \to 0$.
> >
### Uniform Integrability and $\mathcal{L}^{p}$ Boundedness
> [!tldr]
> For a family of variables $\mathcal{C}$, $[\mathcal{C} \text{ unif. bounded in }\mathcal{L}^{p}, ~p > 1] \Longrightarrow [\mathcal{C} \text{ is UI}] \Longrightarrow [\mathcal{C} \text{ unif. bounded in }\mathcal{L}^{1}],$ where "uniformly bounded" in $\mathcal{L}^{p}$ means that a single constant bounds $\mathbb{E}[| X |^{p}]$ for all $X \in \mathcal{C}$.
*Uniform integrability implies uniform boundedness* in $\mathcal{L}^{1}$: if $\mathcal{C}$ is uniformly integrable, then there is a bound $L$ where $\mathbb{E}[| X |] \le L$ for any $X \in \mathcal{C}$.
> [!proof]-
> Fix $\epsilon=1$ and take the corresponding $K$ from the UI condition (one $K$ works for every member of $\mathcal{C}$); then for any $X \in \mathcal{C}$, $\begin{align*}
\mathbb{E}[| X |]&= \mathbb{E}[| X | \mathbf{1}_{| X | \ge K}]+\mathbb{E}[| X | \mathbf{1}_{| X | < K}]\\
&\le \epsilon + K.
\end{align*}$So we can take $L := \epsilon + K$.
However, the converse is not true: for example the family $X_{n}:= n \mathbf{1}_{[0,1 / n]}$ in $([0,1], \mathcal{B}_{[0,1]}, \mathrm{Leb})$ is uniformly bounded in $\mathcal{L}^{1}$ by $1$, but is not uniformly integrable.
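Spelling out the failure of UI: for any threshold $K$ and any $n \ge K$, $X_n$ equals $n \ge K$ on a set of measure $\tfrac{1}{n}$, so $\mathbb{E}[X_n\mathbf{1}_{X_n\ge K}] = n\cdot\tfrac{1}{n} = 1$. Hence $\sup_{n}\mathbb{E}[|X_n|\mathbf{1}_{|X_n|\ge K}] = 1$ for every $K$, which does not tend to $0$, even though $\mathbb{E}[|X_n|]=1$ for all $n$.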
*Nevertheless, uniform boundedness in $\mathcal{L}^{p}$ for $p> 1$ implies UI*:
> [!proof]-
> Suppose $\mathbb{E}[| X |^{p}] \le L^{p}$ for all $X \in \mathcal{C}$.
> Consider the inequality $v^{1-p} \le K^{1-p} \iff v \le K^{1-p}v^{p}$ when $v \ge K, p > 1$, and apply it to $v=| X |$ to get $\mathbb{E}[| X |\mathbf{1}_{| X | \ge K}] \le \mathbb{E}[K^{1-p}| X |^{p} \mathbf{1}_{| X | \ge K}] \le K^{1-p}L^{p}, $so there is $K$ large enough (say $K \ge (L^{-p}\epsilon)^{1 / (1-p)}$) for UI.
## Convergence Laws
> [!theorem|*] Strong Law of Large Numbers
> The **Strong Law of Large Numbers** states that for i.i.d. random variables $X_{1}, X_{2},\dots$ with finite mean $\mu$, the sample average satisfies $\bar{X}_n=\frac{1}{n}\sum_{i=1}^{n} X_i\to\mu\text{ a.s.}$

> [!theorem|*] Central Limit Theorem
> The **Central Limit Theorem** states that under the same conditions, if additionally the variance $\sigma^{2}$ is finite, then $\frac{\bar{X}_n-\mu}{\sigma/\sqrt{n}}=\frac{\sum_{i} X_{i}-n\mu}{\sigma\sqrt{n}}\xrightarrow{d}N(0,1).$
- So roughly, $\bar{X}_n \approx N\big(\mu, \tfrac{\sigma^{2}}{n}\big)$ for large $n$ (i.e. standard deviation $\sigma/\sqrt{n}$); a quick simulation check of both laws follows below.
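As a quick empirical check (a sketch added here, not part of the notes; the distribution and sample sizes are arbitrary choices), a short simulation of both laws for i.i.d. $\mathrm{Exp}(1)$ variables, where $\mu=\sigma=1$:

```python
import numpy as np

# Illustrative sketch: LLN and CLT for i.i.d. Exponential(1) samples,
# so mu = 1 and sigma = 1.  Sample sizes are arbitrary.
rng = np.random.default_rng(2)

n, reps = 1_000, 10_000
X = rng.exponential(scale=1.0, size=(reps, n))
means = X.mean(axis=1)                      # bar X_n over many replications

# SLLN: the sample means concentrate near mu = 1.
print("average of bar X_n (~1):", means.mean())

# CLT: sqrt(n) * (bar X_n - mu) / sigma is approximately N(0, 1).
Z = np.sqrt(n) * (means - 1.0) / 1.0
print("std of Z (~1):", Z.std())
print("P(Z <= 1.96) (~0.975):", (Z <= 1.96).mean())
```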