### Topics
![[NotesWithDomain.base]]
### Snippets
![[NotesWithDomain.base]]
## Measure Spaces
> [!definition|*] Algebras
>
> On a set $\Omega$, $\mathcal{F} \subseteq \mathcal{P}(\Omega)$ is an **algebra** if:
> - $\Omega \in\mathcal{F}$,
> - $\mathcal{F}$ is stable under complements and finite unions.
>
> It is a **σ-algebra** if it is also stable under countable unions.
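For a concrete gap between the two notions: on $\Omega = (0,1]$, finite unions of half-open intervals $(a,b] \subseteq (0,1]$ form an algebra but not a $\sigma$-algebra, since e.g. the countable union $\bigcup_{n}\left( 0, 1-\frac{1}{n} \right]=(0,1)$ is not a finite union of half-open intervals.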
A measurable space $(\Omega , \mathcal{F})$ (where $\mathcal{F}$ is a σ-algebra on $\Omega$) becomes a **measure space** when equipped with a [[Set Functions|σ-additive set function]] $\mu: \mathcal{F} \to [0,\infty]$; in this context $\mu$ is a **[[Measures|measure]]**.
### Measure Extension from Algebras and $\pi$-Systems
$\sigma$-algebras are usually hard to characterise directly, so [[π-Systems|π-systems]] and algebras provide workable interfaces when proving results for $\sigma$-algebras.
To do so, we need to justify that:
- Uniqueness: that agreement on a generating $\pi$-system is enough to uniquely determine the extension, if one exists.
- Existence: that such an extension actually exists.
> [!theorem|*] Uniqueness of extension from $\pi$-systems
> If *finite* measures $\mu_{0},\mu_{1}$ on $(\Omega,\mathcal{F})$ have the same total mass $\mu_{0}(\Omega)=\mu_{1}(\Omega) < \infty$ and agree on a generating $\pi$-system $\mathcal{D} \subseteq \mathcal{F}$ (so $\sigma(\mathcal{D})=\mathcal{F}$), then $\mu_{0},\mu_{1}$ must agree on $\mathcal{F}$.
> - Therefore, if a $\sigma$-additive set function $\mu_{0}$ on an algebra $\mathcal{A}$ has a finite extension to $\sigma(\mathcal{A})$, this extension must be unique (an algebra is in particular a $\pi$-system).
>
> > [!warning] Counterexample for infinite measures
> > The Lebesgue measure and the counting measure agree (both are infinite) on the $\pi$-system $\{ (-\infty,a) ~|~ a \in \mathbb{R} \}$, but they disagree on the Borel sets of $\mathbb{R}$ (e.g. on $\{ 0 \}$).
>
> > [!proof]-
> > Consider the collection $\mathcal{E}:=\{ A \in \mathcal{F} ~|~\mu_{0}(A)=\mu_{1}(A) \}$ on which the measures agree. Clearly $\mathcal{D} \subseteq \mathcal{E}$, so if we can show $\mathcal{E}$ is a $\lambda$-system, the $\pi$-$\lambda$ lemma gives $\mathcal{E} \supseteq \sigma(\mathcal{D})=\mathcal{F}$, hence $\mathcal{E}=\mathcal{F}$.
> >
> > Now check the definition of a $\lambda$-system:
> > - By assumption $\Omega \in \mathcal{E}$.
> > - Set differences $B-A$ (for $A,B \in \mathcal{E}$ with $A \subseteq B$) are also in $\mathcal{E}$: by additivity and finiteness, $\mu_{i}(B-A)=\mu_{i}(B)-\mu_{i}(A)$, and the right-hand sides agree for $i=0,1$.
> > - Countable rising unions $\cup_{i}A_{i}$ where $A_{i} \subseteq A_{i+1}$: set $A_{0}:=\emptyset$ and consider the disjoint differences $B_{i}:= A_{i+1}-A_{i} \in \mathcal{E}$ (by the previous point). $\mu_{0},\mu_{1}$ agree on those, so countable additivity gives $\mu_{0}(\cup_{i}B_{i})=\sum_{i}\mu_{0}(B_{i})=\sum_{i}\mu_{1}(B_{i})=\mu_{1}(\cup_{i}B_{i}),$ and $\cup_{i}B_{i}=\cup_{i}A_{i}$.
Counterexample without the finiteness assumption: the counting measure $\mu_{1}$ and $\mu_{2}=\mathrm{Leb}$ agree on the $\pi$-system $\{ (-\infty, a) ~|~a \in \mathbb{R} \}$ (both give each such set infinite mass), but they don't agree on the Borel sets $\mathcal{B}(\mathbb{R})$ generated by that system.
^33cbad
> [!theorem|*] Carathéodory's Theorem: existence of extensions from algebras
>
> On a set $\Omega$, a $\sigma$-additive set function $\mu_{0}$ on an *algebra* $\mathcal{F}_{0}$ can be extended to a measure $\mu$ on $\sigma(\mathcal{F}_{0})$.
> - As a corollary, if $\mu_{0}(\Omega) < \infty$, then the extension $\mu$ is unique: an algebra is in particular a $\pi$-system, so the uniqueness theorem above applies.
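The canonical application is Lebesgue measure: on $\Omega=(0,1]$, finite disjoint unions of half-open intervals $(a,b]$ form an algebra $\mathcal{F}_{0}$ with $\sigma(\mathcal{F}_{0})=\mathcal{B}((0,1])$, and $\mu_{0}\left( \bigsqcup_{i=1}^{n}(a_{i},b_{i}] \right):=\sum_{i=1}^{n}(b_{i}-a_{i})$ is $\sigma$-additive on $\mathcal{F}_{0}$ (checking $\sigma$-additivity is the nontrivial step), so Carathéodory extends it to $\mathrm{Leb}$ on $\mathcal{B}((0,1])$, uniquely since $\mu_{0}(\Omega)=1<\infty$.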
## Measurability
A function $f:(\Omega, \mathcal{F}) \to (E,\mathcal{E})$ is $\mathcal{F}$-**measurable** if $\forall B \in \mathcal{E},~ f^{-1}(B) \in \mathcal{F}.$ That is, every set in $\mathcal{E}$ has its preimage in $\mathcal{F}$.
The $\sigma$**-algebra generated by a family of functions** $\{ f_{i}:\Omega \to E~|~ i \in I \}$ is denoted $\sigma(f_{i} ~|~ i \in I)$: the smallest $\sigma$-algebra on $\Omega$ making all of the $f_{i}$ measurable. In particular, $\sigma(f_{i} ~|~ i \in I)=\sigma(\{ f_{i}^{-1}(B)~|~ i \in I,B \in \mathcal{E} \}),$ the $\sigma$-algebra generated by preimages from $\mathcal{E}$.
### Criteria for Measurability
A function $f: (\Omega,\mathcal{F}) \to (E,\mathcal{E})$ is $\mathcal{F}$-measurable if some generating collection $\mathcal{C} \subseteq \mathcal{E}$ (i.e. with $\sigma(\mathcal{C})=\mathcal{E}$) has all its preimages in $\mathcal{F}$: $\forall B \in \mathcal{C},~ f^{-1}(B) \in \mathcal{F}$
> [!proof]-
> The set $\mathcal{D}:=\{ B \in \mathcal{E}~|~ f^{-1}(B) \in \mathcal{F} \}$ is a $\sigma$-algebra, as preimages preserve set operations. Then since $\mathcal{C} \subseteq \mathcal{D}$, we have $\mathcal{E} = \sigma(\mathcal{C}) \subseteq \mathcal{D}$; the reverse inclusion $\mathcal{D} \subseteq \mathcal{E}$ holds by definition of $\mathcal{D}$.
- As a corollary, a function $f:(\Omega, \mathcal{F}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is measurable if $\forall x \in \mathbb{R},~ f^{-1}((-\infty, x)) \in \mathcal{F}$, since the rays $(-\infty,x)$ generate $\mathcal{B}(\mathbb{R})$.
If the functions of a sequence $(f_{n})$ are all measurable, then the following are measurable as well (as functions into $\bar{\mathbb{R}}$): $\sup_{n \in \mathbb{N}}f_{n},~\inf_{n \in \mathbb{N}} f_{n},~ \limsup_{n \to \infty} f_{n},~ \liminf_{n \to \infty} f_{n}$
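This follows from the criterion above: for each $x \in \mathbb{R}$, $\left\{ \sup_{n}f_{n} \le x \right\}=\bigcap_{n}\{ f_{n}\le x \} \in \mathcal{F} \quad\text{and}\quad \left\{ \inf_{n}f_{n} \ge x \right\}=\bigcap_{n}\{ f_{n}\ge x \} \in \mathcal{F},$ and then $\limsup_{n}f_{n}=\inf_{n}\sup_{m \ge n}f_{m}$ and $\liminf_{n}f_{n}=\sup_{n}\inf_{m \ge n}f_{m}$ combine the two.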
### Random Variables
In a probabilistic context, measurable functions $f:(\Omega, \mathcal{F}) \to (\mathbb{R},\mathcal{B}(\mathbb{R}))$ are called **random variables**.
- For example, coin-tossing can be modeled with $\Omega=\{ H,T \}^\mathbb{N}$, and $\mathcal{F}=\sigma(\{ \{ \omega:\omega_{n}=C \}~|~ C \in \{ H,T \},~ n \in \mathbb{N} \}),$where $\{ \omega:\omega_{n}=C \}$ is the set of all sequences whose $n$th toss is $C$. Taking the first toss, $X: \omega \mapsto \omega_{1}$ (with $H,T$ coded as $1,0$, say), gives a random variable.
If $X$ is a random variable $(\Omega, \mathcal{F}, \mathbb{P}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, its **law** or **image measure** is $\Lambda_{X}: \mathcal{B}(\mathbb{R}) \to [0,1],~ B \mapsto \mathbb{P}[X^{-1}(B)]= \mathbb{P}[ X \in B],$ which is a probability measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.
The law $\Lambda_{X}$ defines the **distribution function** $F_{X}$ of $X$: $F_{X}: \mathbb{R} \to [0,1], ~ c \mapsto \Lambda_{X}((-\infty, c])=\mathbb{P}[X \le c].$
- Distribution functions have the usual expected properties: non-decreasing, tending to $0$ at $-\infty$ and to $1$ at $+\infty$, and right-continuous (which follows from continuity of measures, in this case $\mathbb{P}$, from above).
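- For instance, a Bernoulli($p$) variable has $F_{X}=0$ on $(-\infty,0)$, $F_{X}=1-p$ on $[0,1)$, and $F_{X}=1$ on $[1,\infty)$: right-continuous at the jumps $c=0,1$, though not left-continuous there.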
Conversely, given a function $F$ with the three properties of a distribution function, it's possible to define variables $X_{\pm}:((0,1), \mathcal{B}((0,1)),\mathrm{Leb}) \to \mathbb{R}$ that have this distribution: $\begin{align*}
X_{+}(w):&= \sup \{ x \in \mathbb{R} ~|~ F(x) \le w \}\\
X_{-}(w):&= \sup \{ x \in \mathbb{R} ~|~ F(x) \lneq w \}
\end{align*}$
- Note that $X_{+}$ can also be written as $X_{+}(w)=\inf \{ x ~|~ F(x) > w \}=:F^{-1}(w)$, the **quantile function** of $F$.
> [!info]- Deriving the distribution of $X_{-}$
> $X_{-}$ has distribution $F$ since $\begin{align*}
w &\in X_{-}^{-1}(-\infty, a]\\
&\iff X_{-}(w) \le a\\&\iff w\le F(a)
\end{align*}.$The last equivalence follows from:
> - $w \le F(a) \Rightarrow X_{-}(w) \le a$ since $F$ is non-decreasing,
> - For the other direction, note that $z > X_{-}(w)$ implies $F(z) \ge w$ by definition of $X_{-}$, and letting $z \downarrow X_{-}(w)$ gives $F(X_{-}(w)) \ge w$ by right continuity. Now $F(a) \ge F(X_{-}(w)) \ge w$ gives the other direction.
>
> Hence $X_{-}^{-1}(-\infty,a]=(0,F(a)]$ (as a subset of $(0,1)$), and $\mathbb{P}[X_{-} \le a]=\mathrm{Leb}\,(0,F(a)]=F(a)$.
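This quantile construction is exactly inverse transform sampling. Below is a minimal sketch, assuming for illustration the exponential CDF $F(x)=1-e^{-\lambda x}$ (continuous and strictly increasing on $[0,\infty)$, so $X_{+}=X_{-}=F^{-1}$); the name `quantile_exp` is just for this example.

```python
import math
import random

def quantile_exp(w: float, rate: float = 1.0) -> float:
    """F^{-1}(w) = inf{x : F(x) > w} for F(x) = 1 - exp(-rate * x)."""
    return -math.log(1.0 - w) / rate

# Drawing w ~ Uniform(0,1) is exactly using Leb on (0,1) as the underlying
# probability space; X = F^{-1}(w) then has distribution F.
samples = [quantile_exp(random.random()) for _ in range(100_000)]
print(sum(samples) / len(samples))  # sample mean ≈ 1/rate = 1.0
```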
> [!math|{"type":"lemma","number":"","setAsNoteMathLink":false,"title":"$\\sigma$-algebra of Random Variables","label":"sigma-algebra-of-random-variables"}] Lemma ($\sigma$-algebra of Random Variables).
> If $X_{1},\dots, X_{n}$ are random variables $(\Omega, \mathcal{F},\mathbb{P}) \to (\mathbb{R},\mathcal{B}(\mathbb{R}))$, then $\sigma(X_{1},\dots,X_{n})$ is generated by the $\pi$-system $\begin{align*}
\mathcal{A}&:= \{ \{ X_{1}\le x_{1},\dots,X_{n} \le x_{n} \}\,|\, x_{1},\dots,x_{n} \in \bar{\mathbb{R}} \}\\
&= \left\{ \bigcap_{i=1}^{n}X_{i}^{-1}((-\infty, x_{i}]) \,|\, x_{1},\dots,x_{n} \in \bar{\mathbb{R}} \right\}
\end{align*}$
## Independence
> [!definition|*] Independence of $\sigma$-algebras
> $\sigma$-algebras $\mathcal{G}_{1},\mathcal{G}_{2},\dots \subseteq \mathcal{F}$ are **independent** if for any finite number of events $E_{i_{1}},\dots,E_{i_{n}}$ drawn from distinct $\mathcal{G}_{i_{1}},\dots,\mathcal{G}_{i_{n}}$, we have $\mathbb{P}[E_{i_{1}} \cap \dots \cap E_{i_{n}}]= \prod_{j=1}^{n}\mathbb{P}[E_{i_{j}}].$
- Hence an infinite collection $(\mathcal{G}_{i})$ is independent if and only if all of its finite sub-collections are independent.
Random variables $X_{1},\dots$ are independent if $\sigma(X_{1}),\dots$ are independent. Events $E_{1},\dots$ are independent if $\mathcal{E_{1}},\dots$ are independent, where $\mathcal{E}_{i}:=\{ \emptyset, E_{i}, E_{i}^{c}, \Omega \}$.
For simplicity, for $\mathcal{S}_{1},\dots \subseteq \mathcal{F}$, write $\mathrm{Indep}\{ \mathcal{S}_{1},\dots \}$ if the collections satisfy the independence requirement (even if they are not $\sigma$-algebras), that is $\forall S_{i_{1}} \in \mathcal{S}_{i_{1}},\dots, S_{i_{n}} \in \mathcal{S_{i_{n}}},\,\, \mathbb{P}\left[ \bigcap_{j=1}^{n}S_{i_{j}} \right]=\prod_{j=1}^{n}\mathbb{P}[S_{i_{j}}].$ *This is not standard notation.*
### Deducing Independence
> [!theorem|*] Independence from Generating $\pi$-systems
> Independence of $\sigma$-algebras can be extended from independence between generating $\pi$-systems:
> - If $\mathcal{G},\mathcal{H} \subseteq \mathcal{F}$ are $\sigma$-algebras, and $\mathcal{I},\mathcal{J}$ are $\pi$-systems generating them, then $\mathrm{Indep}\{ \mathcal{G}, \mathcal{H} \} \iff \mathrm{Indep}\{ \mathcal{I}, \mathcal{J} \}.$
>
> - Similarly, if $\mathcal{G}_{1},\dots,\mathcal{G_{n}}$ are generated by $\pi$-systems $\mathcal{A}_{1},\dots,\mathcal{A}_{n}$, then $\mathrm{Indep}\{ \mathcal{G}_{1}, \dots,\mathcal{G}_{n} \} \iff \mathrm{Indep}\{ \mathcal{A}_{1},\dots, \mathcal{A}_{n} \}.$
>
> > [!proof]-
> > $(\Rightarrow)$ is trivial; $(\Leftarrow)$ is as follows.
> > Fix $A_{2}\in \mathcal{A}_{2},\dots,A_{n} \in \mathcal{A}_{n}$. Then the two measures on $\mathcal{G}_{1}$ $\begin{align*}
G &\mapsto \mathbb{P}[G \cap A_{2} \cap\dots \cap A_{n}] ~\text{ and }\\
G &\mapsto \mathbb{P}[G] \cdot \mathbb{P}[A_{2}]\cdots \mathbb{P}[A_{n}]
\end{align*}$agree on $\mathcal{A}_{1}$ (by $\mathrm{Indep}\{ \mathcal{A}_{1},\dots,\mathcal{A}_{n} \}$) and have the same total mass $\mathbb{P}[A_{2} \cap \dots \cap A_{n}]=\prod_{j=2}^{n}\mathbb{P}[A_{j}] < \infty$, so by the uniqueness theorem they agree on all of $\mathcal{G}_{1}=\sigma(\mathcal{A}_{1})$, and we have $\mathrm{Indep}\{ \mathcal{G}_{1}, \mathcal{A}_{2},\dots,\mathcal{A}_{n} \}$.
> > - Alternatively, consider the collection $\mathcal{M}_{1} \subseteq \mathcal{G}_{1}$ on which $\mathbb{P}[\ast \cap A_{2} \cap\dots \cap A_{n}] = \mathbb{P}[\ast] \cdot \mathbb{P}[A_{2}]\cdots \mathbb{P}[A_{n}]$ holds, show it is a $\lambda$-system containing $\mathcal{A}_{1}$, and conclude $\mathcal{M}_{1}=\mathcal{G}_{1}$ by the $\pi$-$\lambda$ lemma.
> >
> > Now do the same for $\mathcal{A}_{2}$ by fixing $G_{1} \in \mathcal{G}_{1},A_{3} \in \mathcal{A}_{3},\dots, A_{n} \in \mathcal{A}_{n}$. It gives $\mathrm{Indep}\{ \mathcal{G}_{1},\mathcal{G}_{2},\mathcal{A}_{3},\dots,\mathcal{A}_{n} \}$.
> >
> > Continuing the induction will give $\mathrm{Indep}\{ \mathcal{G}_{1}, \dots,\mathcal{G}_{n} \}$ as desired.
An important corollary is that *random variables $X_{1},\dots,X_{n}:(\Omega, \mathcal{F}, \mathbb{P}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ are independent if their joint distribution function factors into the individual ones*, i.e. for all $x_{1},\dots, x_{n} \in \mathbb{R}$, $\mathbb{P}[X_{1}\le x_{1};\dots; X_{n} \le x_{n}]=\prod_{i=1}^{n}\mathbb{P}[X_{i} \le x_{i}].$
> [!proof]-
> Essentially, it is because the events $\{ X_{i} \le x \}$ form the $\pi$-systems $\mathcal{A}_{i}:= \{ X_{i}^{-1}((-\infty, x]) ~|~ x \in \mathbb{R} \}$, and $\{ (-\infty, x] ~|~ x \in \mathbb{R}\}$ is a $\pi$-system generating $\mathcal{B}(\mathbb{R})$.
>
> Now $\sigma(X_{i})=\sigma(\mathcal{A}_{i})$, so assuming independence between the latter (as in the theorem) is sufficient for independence between the former.
Furthermore, functions $Y_{i}=f_{i}(X_{i})$ of independent RVs $X_{i}$ are also independent (assuming the $f_{i}$ are all measurable): since $Y_{i}^{-1}(B)=X_{i}^{-1}(f_{i}^{-1}(B))$, we have $\sigma(Y_{i})\subseteq \sigma(X_{i}),$ so independence between the latter implies that of the former.
### Tail Events
Suppose we want to study the long-term behavior of a sequence of variables $(X_{n})_{n \ge 1}$. A $\sigma$-algebra containing information about such behavior should make the tail measurable, while the "head" (i.e. the first finitely many variables) should not matter:
> [!definition|*] Tail Algebra
>
> Define $\mathcal{T}_{n}:=\sigma(X_{n+1},X_{n+2},\dots).$
> The **tail σ-algebra** of a sequence $(X_{n})$ is then $\mathcal{T}=\cap_{n}\mathcal{T}_{n}.$ Events in this algebra are called **tail events**.
> [!examples] Examples (and counterexamples) of tail events
> As a rule of thumb, tail events should not care about what happens in the first $k$ terms, where $k <\infty$ can be arbitrarily large. Suppose $(X_{n})$ is a sequence of random variables; then
> - The event $\left[(X_{n}) \text{ converges} \right]$ is a tail event.
> - So is the event $\left[ \sum_{n} X_{n} \text{ converges} \right]$.
>
> But if the event "remembers" what happens in the head, it is not a tail event: if $S_{n}:=\sum_{i=1}^{n}X_{i}$,
> - $[\limsup_{n}S_{n}>0]$ is not a tail event, since the value $X_{1}$ will always be "remembered" in $S_{n}$.
> - However, $\left[ \frac{S_{n}}{n} \to 0\right]$ is a tail event, as the impact of earlier terms becomes negligible when $n \to \infty$.
> [!theorem|*] Kolmogorov's 0-1 Law
>
> If $(X_{n})_{n \ge 1}$ are independent, and they have tail $\sigma$-algebra $\mathcal{T}$, then:
> - Any event $E \in \mathcal{T}$ has $\mathbb{P}[E]\in \{ 0,1 \}$.
> - Any $\mathcal{T}$-measurable function/RV $f$ is $\mathrm{a.s.}$ constant.
>
> Hence if we can show $E \in \mathcal{T}$ has $\mathbb{P}[E] >0$, then $E$ must happen $\mathrm{a.s.}$
> > [!proof]-
> > The key to the proof is to show that $\mathcal{T}$ is independent of itself: then any $E \in \mathcal{T}$ has $\mathbb{P}[E]=\mathbb{P}[E \cap E]=\mathbb{P}[E]^{2}$, giving the result.
> >
> > Define the "head" $\sigma$-algebra $\mathcal{H}_{k}:=\sigma(X_{1},\dots,X_{k})$.
> > Then $\forall k \ge 0,~\mathrm{Indep}\{ \mathcal{T}_{k}, \mathcal{H}_{k} \}$ since they are generated by disjoint (hence independent) sets of $(X_{n})$.
> > Now $\mathcal{T} \subseteq \mathcal{T}_{k}$, so $\begin{align*}
\forall k \ge 0,~&\mathrm{Indep}\{ \mathcal{T}, \mathcal{H}_{k} \}\\[0.2em]
\Longrightarrow ~ ~&\mathrm{Indep}\left\{ \mathcal{T}, \bigcup_{k}\mathcal{H_{k}} \right\}
\end{align*}$But $\cup_{k} \mathcal{H}_{k}$ is a $\pi$-system, and $\mathcal{T} \subset \sigma(X_{1},X_{2},\dots) = \sigma(\cup_{k}\mathcal{H}_{k})$, so $\mathcal{T}$ is independent of $\sigma(\cup_{k}\mathcal{H}_{k})$, and hence itself.
>> ---
> > For any function $f$ that is $\mathcal{T}$-measurable, consider its distribution $\mathbb{P}[f \le z]$.
> > Since $\{ f \le z \} \in \mathcal{T}$ by assumption, it happens with probability $0$ or $1$. Therefore consider $f^{\ast}:= \inf \{ z ~|~ \mathbb{P}[f \le z]=1 \},$ and it must be the $\mathrm{a.s.}$ value of $f$ because $\begin{cases}
y < f^{\ast} &\Rightarrow &\mathbb{P}[f \le y] = 0; \\[0.4em]
y > f^{\ast} &\Rightarrow &\mathbb{P}[f \ge y] \le \mathbb{P}[f \gneq f^{\ast}] = 0.
\end{cases}$
- Therefore, $\limsup_{n \to \infty}S_{n} / n$ from the examples is $\mathcal{T}$-measurable, so it must be $\mathrm{a.s.}$ constant (and so is the limit, when it exists).
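A quick empirical illustration (a sketch only; a finite simulation can suggest, not prove, $\mathrm{a.s.}$ behavior): for iid fair $\pm 1$ flips, independent runs of $S_{n}/n$ at large $n$ all cluster near the same constant, $0$.

```python
import random

def s_over_n(n=10**6, seed=None):
    """One sample of S_n / n for n iid uniform ±1 flips."""
    rng = random.Random(seed)
    s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
    return s / n

# Independent sample paths: every run lands near 0, consistent with
# the tail-measurable limit being a.s. constant.
print([round(s_over_n(seed=k), 4) for k in range(5)])
```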
### Borel-Cantelli Lemmas
^64b057
> [!tldr]
> Borel-Cantelli lemmas decide whether events will keep occurring, or will die out eventually.
> [!definition|*] Infinitely Often, Eventually
> Let $(A_{n})_{n \ge 1}$ be a sequence of events.
> $\begin{align*} [A_{n}~\mathrm{i.o.}] &:= \overset{(\text{for all }n,\text{ there is }m)}{\bigcap_{n=1}^{\infty}\bigcup_{m \ge n} A_{m}} =: \limsup_{n} A_{n} \\[0.2em] [A_{n} ~\mathrm{eventually}] &:= \underset{(\text{for some }n,\text{ all }m \ge n)}{\bigcup_{n=1}^{\infty}\bigcap_{m \ge n} A_{m}} =: \liminf_{n} A_{n}.
\end{align*}$
> Note that $[A_{n}\,\mathrm{i.o.}]=[A_{n}^{c}\,\text{eventually}]^{c}$.
> [!theorem|*] First Borel-Cantelli Lemma
> If $(A_{n})_{n \ge 1}$ have $\sum_{n}\mathbb{P}[A_{n}] < \infty,$ then they almost never occur infinitely often. That is, $\mathbb{P}[A_{n}~\mathrm{i.o.}]=0.$ This makes no assumption about the independence of the $A_{n}$.
> > [!proof]-
> > Define $f_{n}:= \mathbf{1}_{A_{n}}$, then by the MCT, $\mathbb{E}\left[ \sum_{n}f_{n} \right]=\sum_{n}\mathbb{E}[f_{n}]=\sum_{n}\mathbb{P}[A_{n}] < \infty.$
> > This forces the function/variable $\sum_{n}f_{n}$ to be $\mathrm{a.s.}$ finite; since $\left\{ \sum_{n}f_{n}=\infty \right\}=[A_{n}~\mathrm{i.o.}]$, that event is null.
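A throwaway illustration of the summable case (a sketch; the probabilities $1/n^{2}$ are an assumed example):

```python
import random

def count_occurrences(n_max=10**6, seed=0):
    """Count how many of the independent events A_n, with P[A_n] = 1/n^2
    (a summable series), occur along one sample path."""
    rng = random.Random(seed)
    return sum(1 for n in range(1, n_max + 1) if rng.random() < 1.0 / n**2)

# Expected total is sum 1/n^2 = pi^2/6 ≈ 1.64, so a run sees only a few
# occurrences before the events die out, as the lemma predicts.
print(count_occurrences())
```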
> [!theorem|*] Second Borel-Cantelli Lemma
> If $(A_{n})_{n \ge 1}$ are independent and have $\sum_{n}\mathbb{P}[A_{n}] =\infty$, then they will occur infinitely often almost surely. That is, $\mathbb{P}[A_{n}\,\mathrm{i.o.}]=1.$
> > [!warning]- Independence is necessary
> > For example, if the $A_{n}$ are all the same event $[\text{it rains today}]$ (assuming it occurs with probability $p \in (0,1)$), then $\sum_{n}\mathbb{P}[A_{n}]=\sum_{n}p = \infty$, but obviously $\mathbb{P}[A_{n}\,\mathrm{i.o.}]=p < 1$.
>
>
> > [!proof]-
> > Denote $p_{n}:= \mathbb{P}[A_{n}],$ then $\begin{align*}
\mathbb{P}[A_{n}^{c}\, \text{eventually}]&= \mathbb{P}\left[ \bigcup_{n} \bigcap_{m \ge n}A_{m}^{c} \right]\\
&\le \sum_{n}\mathbb{P}\left[ \bigcap_{m \ge n}A_{m}^{c} \right],
\end{align*}$but each term in the sum is $\begin{align*}
\mathbb{P}\left[ \bigcap_{m \ge n}A_{m}^{c} \right]&= \prod_{m \ge n}(1-p_{m}) &\text{[independence]}\\
&\le \exp\left( -\sum_{m \ge n}p_{m} \right) &[1-x \le e^{-x}]\\
&= 0 & \left[ \text{since } \sum_{m \ge n}p_{m} = \infty \right]
\end{align*}$
\end{align*}$So $[A_{n}^{c}\,\text{eventually}]=[A_{n}\,\mathrm{i.o.}]^{c}$ is almost impossible, and so $[A_{n}\,\mathrm{i.o.}]$ is $\mathrm{a.s.}$
>
> > [!examples]- Monke!
> > If a monkey types one of the $26$ letters each second, independently and uniformly at random, then the events $A_{k}:= \{ \text{``BRUH" is typed between times } 4k \text{ and } 4k+3 \}$ (inclusive, indexing from $0$) happen independently, each with probability $26^{-4}$, so $\sum_{k}\mathbb{P}[A_{k}] = \infty \Longrightarrow [A_{k} ~\mathrm{i.o.}] \text{ holds } \mathrm{a.s.}$
^db0190
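A simulation sketch of the monkey, scanning the disjoint 4-letter blocks that define the events $A_{k}$ above (`first_bruh_block` is an illustrative name):

```python
import random
import string

def first_bruh_block(max_blocks=2_000_000, seed=0):
    """Scan disjoint 4-letter blocks until one spells 'BRUH'.

    Each block matches independently with probability 26**-4, so the
    waiting time is geometric with mean 26**4 ≈ 457,000 blocks.
    """
    rng = random.Random(seed)
    for k in range(max_blocks):
        if "".join(rng.choices(string.ascii_uppercase, k=4)) == "BRUH":
            return k
    return None  # unlucky run; raise max_blocks

print(first_bruh_block())
```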