T Distribution - Random Notes Go Brrrrrrr

> [!tldr] T Distribution
>
> The t-distribution is a bell-shaped distribution with heavier tails than the normal. The t-statistic follows the t-distribution.

To estimate the mean of $N(\mu,\sigma^{2})$ from an iid sample $X_{1},\dots,X_{n}$ with known variance, normalizing the sample mean with $\sigma$ gives the **z-statistic**:
$$Z=\frac{\bar{X}-\mu}{\sigma / \sqrt{ n }}$$
But if $\sigma^{2}$ is unknown, we need a statistic involving $\mu$ but not $\sigma^{2}$ to derive a confidence interval of $\mu$. Normalizing with the sample standard deviation $S$ instead of $\sigma$ gives the **t-statistic**:
$$T=\frac{\bar{X}-\mu }{S / \sqrt{ n }} , \text{ where }S^{2}=\frac{\sum_{i}(X_{i}-\bar{X})^2}{n-1}$$
Note that $S^{2}$ looks proportional to some $\chi^{2}$ distribution, and $(\bar{X}-\mu) \sim N\left(0,\frac{\sigma^{2}}{n} \right)$. What's the distribution of their ratio?

> [!definition|*] T-distribution
> Independent random variables $Z \sim N(0,1)$ and $Y \sim \chi^{2}_{r}$ produce the **(Student) t-distribution** with $r$ degrees of freedom:
> $$\frac{Z}{\sqrt{ Y / r }} \sim t_{r}$$
> and we will see that *the t-statistic has the t-distribution*.
>
> The pdfs of t-distributions with various degrees of freedom $\nu$ are:
>
> ![[TPDF.png#invert|w60|center]]

> [!theorem|*] T-statistic follows T-distribution
> The t-statistic constructed above follows the distribution $t_{n-1}$. In particular,
> - The variance estimate is $S^{2}\sim \frac{\sigma^{2}}{n-1}\chi^{2}_{n-1}$,
> - The sample mean is $\bar{X}\sim N(\mu, \sigma^{2} / n)$,
> - The two are independent.
>
> Hence $T=\frac{(\bar{X}-\mu) / (\sigma / \sqrt{ n })}{\sqrt{ S^{2} / \sigma^{2} }}$ has the form $\frac{Z}{\sqrt{ Y / (n-1) }}$ with $Z \sim N(0,1)$ and $Y=\frac{(n-1)S^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1}$ independent.
>
> > [!proof] Proof:
> > It is a corollary of [[Inference in OLS#^038a42|this theorem from OLS theory]], as the sample variance is just (proportional to) the RSS from fitting the sample with a constant (the intercept).

As the degrees of freedom $r \to \infty$, the t-distribution $t_{r}$ converges in distribution to the standard normal $N(0,1)$.
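
A quick Monte Carlo sanity check of the claims above, as a minimal sketch rather than a proof: it draws normal samples, forms the t-statistic with the $n-1$ denominator, and estimates how often $|T|$ exceeds $1.96$ (the two-sided 5% cutoff of $N(0,1)$). The helper names `t_statistic` and `tail_frequency`, the values $\mu=3$, $\sigma=2$, the trial count and the seed are all illustrative assumptions, not part of the notes.

```python
import math
import random


def t_statistic(sample, mu):
    """T = (X_bar - mu) / (S / sqrt(n)), with S^2 using the n - 1 denominator."""
    n = len(sample)
    x_bar = sum(sample) / n
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    return (x_bar - mu) / math.sqrt(s2 / n)


def tail_frequency(n, threshold=1.96, trials=5000, mu=3.0, sigma=2.0, seed=0):
    """Monte Carlo estimate of P(|T| > threshold) for N(mu, sigma^2) samples of size n."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu)) > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (3, 10, 100):
        print(f"n = {n:3d}: estimated P(|T| > 1.96) = {tail_frequency(n):.3f}")
```

If the theory holds, the estimated frequency should sit well above $0.05$ for $n=3$ (heavy tails of $t_{2}$) and drift toward $0.05$ as $n$ grows.
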
> [!proof]- Sketch proof
> Write $Y_{n} \sim \chi^{2}_{n}$ as $Y_{n}=\sum_{i=1}^{n}X_{i}^{2}$ with $X_{i} \overset{\text{iid}}{\sim} N(0,1)$. As $n \to \infty$, by the strong law of large numbers,
> $$\frac{Y_{n}}{n} = \frac{1}{n}\sum_{i=1}^{n}X_{i}^{2} \to \mathbb{E}[X_{i}^{2}]=\mathrm{Var}(X_{i})+\mathbb{E}[X_{i}]^{2}=1\,\,\text{a.s.}$$
> Hence, by Slutsky's theorem, $\frac{Z}{\sqrt{ Y_{n} / n }} \overset{D}{\to} N(0,1)$, i.e. $t_{n}$ converges in distribution to $N(0,1)$.

> [!examples]- Confidence interval with t-distribution
> The t-statistic $T$ of $X_{1},\dots,X_{n} \overset{\text{iid}}{\sim} N(\mu,\sigma^{2})$ has the $t_{n-1}$ distribution, so:
> $$\begin{align*} &\mathbb{P}\left[ T= \frac{\bar{X}-\mu }{S / \sqrt{ n }} \in (-\tau_{\alpha},\tau_{\alpha}) \right]\\ &= \mathbb{P}\left[ \mu \in \left( \bar{X}\pm\frac{S}{\sqrt{ n }}\tau_{\alpha} \right) \right]\\ &= 1-\alpha \end{align*}$$
> gives the $(1-\alpha)$ CI of the mean $\mu$, whose endpoints do not involve the unknown $\sigma$. Here $\tau_{\alpha}$ is the upper $\frac{\alpha}{2}$ quantile of $t_{n-1}$, i.e. $\mathbb{P}[T>\tau_{\alpha}]=\alpha / 2$.
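
The interval $\bar{X}\pm\frac{S}{\sqrt{ n }}\tau_{\alpha}$ can be computed along the following lines. This is a hedged, standard-library-only sketch: the quantile is obtained by numerically integrating the t pdf (Simpson's rule) and bisecting, which is a stand-in chosen here for self-containedness, not a method from the notes. All function names (`t_pdf`, `t_cdf`, `t_quantile`, `t_confidence_interval`) and the sample data are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=1000):
    """CDF via Simpson's rule on [0, x], using F(0) = 1/2 by symmetry."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Solve t_cdf(q, df) = p for q by bisection, for 0 < p < 1."""
    if p < 0.5:
        return -t_quantile(1 - p, df)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # grow the bracket until it contains the quantile
        lo, hi = hi, 2 * hi
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(sample, alpha=0.05):
    """(1 - alpha) CI for the mean: X_bar +/- tau_alpha * S / sqrt(n)."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    tau = t_quantile(1 - alpha / 2, n - 1)  # upper alpha/2 quantile of t_{n-1}
    half_width = tau * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


if __name__ == "__main__":
    data = [4.9, 5.1, 5.0, 4.8, 5.2]  # made-up measurements for illustration
    low, high = t_confidence_interval(data)
    print(f"95% CI for the mean: ({low:.4f}, {high:.4f})")
```

In practice a statistics library's t quantile function would replace `t_quantile`; the point of the sketch is only that the interval uses $S$ and $\tau_{\alpha}$ and never needs $\sigma$.
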