> [!tldr]
> **Link functions** $g$ are bijective (generally non-linear) functions that model the relationship between the expected response $\mu_{Y}$ and the predictors. They appear in [[Generalized Linear Models|GLMs]] and GAMs.
>
> For example, in generalized linear models, $g(\mu_{Y}(X))=\alpha + \sum_{i}\beta_{i}X_{i}$, where $g$ adds a layer of non-linearity between the expected response and the linear predictor.
### Canonical Link Function
In GLMs, the choice of link function is not entirely arbitrary: the mean $\mu$ of a distribution $f(\,\cdot\,;\theta)$ from an [[Exponential Dispersion Family|exponential dispersion family]] (EDF) is the derivative of the function $\kappa$ (see the definition of the families):
$\mu=\kappa'(\theta).$
Choosing the link function $g:=(\kappa')^{-1}$ allows the GLM to directly model the parameter $\theta$:
$\beta^{T} X =g(\hat{\mu})=\left((\kappa')^{-1}\circ\kappa'\right)(\hat{\theta})=\hat{\theta}.$
This $g$ is the **canonical link function** of the EDF.
Canonical links are usually chosen for their convenient statistical properties, not because there is an a priori reason to expect the relationship to be linear on that scale.
- For example, if $Y$ follows an EDF, and $V$ is its variance function $V(\mu)=\frac{\mathrm{Var}(Y)}{\phi}$, then the canonical link $g$ satisfies $g'=1 / V$, simplifying computations in estimation (e.g. in [[Iteratively Reweighted Least Squares|IRLS]]).
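
The identity $g'=1/V$ follows from $g=(\kappa')^{-1}$ together with the standard EDF fact $V(\mu)=\kappa''(\theta)$. It can be checked numerically with a minimal sketch like the one below, which assumes NumPy is available. The `FAMILIES` table and the helper `numerical_derivative` are illustrative names, not part of any library. The gamma link is written with its sign, $-1/\mu$, because the identity holds only for the signed form.

```python
import numpy as np

# Canonical link g and variance function V for four common EDFs.
# Illustrative table only; the gamma link keeps its sign, g(mu) = -1/mu.
FAMILIES = {
    "gaussian": (lambda mu: mu, lambda mu: np.ones_like(mu)),
    "poisson": (lambda mu: np.log(mu), lambda mu: mu),
    "bernoulli": (lambda mu: np.log(mu / (1 - mu)), lambda mu: mu * (1 - mu)),
    "gamma": (lambda mu: -1.0 / mu, lambda mu: mu**2),
}


def numerical_derivative(g, mu, h=1e-6):
    """Central finite-difference approximation of g'(mu)."""
    return (g(mu + h) - g(mu - h)) / (2 * h)


# Means in (0, 1) are valid for every family in the table.
mu = np.linspace(0.1, 0.9, 9)

# Largest gap between the numerical g'(mu) and 1 / V(mu), per family.
max_abs_error = {
    name: float(np.max(np.abs(numerical_derivative(g, mu) - 1.0 / V(mu))))
    for name, (g, V) in FAMILIES.items()
}

for name, err in max_abs_error.items():
    print(f"{name:10s} max |g' - 1/V| = {err:.2e}")
```

Each printed gap should be at the level of finite-difference error, which is consistent with $g'(\mu)=1/V(\mu)$ on the grid.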
### Canonical Link Functions of Common Distributions
*Gaussians* use the identity function: the parameter $\theta=\mu$ is also the mean. By extension, OLS with additive Gaussian errors is a simple special case of a GLM.
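*Exponential/gamma distributions* have $\mu \mapsto -\mu^{-1}$ as the canonical link; in practice the sign is often dropped and the inverse link $\mu \mapsto \mu^{-1}$ is used instead.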
*Poisson distributions* have $\mu \mapsto \log(\mu)$.
*Bernoulli and binomial distributions* use the canonical link $\mu \mapsto \log\left( \frac{\mu}{1-\mu} \right)$ (the **logit**, aka **log-odds** if $n=1$ and $\mu=p$ as in a Bernoulli distribution).
- Hence a GLM with this link is also called **logistic regression**.
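
To make the logistic case concrete, here is a minimal sketch of fitting a Bernoulli GLM with the logit link by IRLS, assuming NumPy is available. The function names (`sigmoid`, `fit_logistic_irls`), the synthetic data, and the coefficients in `true_beta` are all hypothetical choices for illustration. Because the link is canonical, $g'=1/V$, so the IRLS weights reduce to $V(\mu)=\mu(1-\mu)$.

```python
import numpy as np


def sigmoid(eta):
    """Inverse of the logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))


def fit_logistic_irls(X, y, n_iter=25):
    """Fit a Bernoulli GLM with the logit link by IRLS.

    X is an (n, p) design matrix (include a column of ones for an intercept);
    y is an (n,) vector of 0/1 responses. No safeguards against separable
    data or saturated fitted means are included in this sketch.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta              # linear predictor (equals theta here)
        mu = sigmoid(eta)           # fitted means
        w = mu * (1.0 - mu)         # weights V(mu), since g' = 1/V
        z = eta + (y - mu) / w      # working response: eta + (y - mu) g'(mu)
        XtW = X.T * w               # X^T W with W = diag(w)
        beta = np.linalg.solve(XtW @ X, XtW @ z)  # weighted least squares
    return beta


# Synthetic data with hypothetical true coefficients.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.5])
y = rng.binomial(1, sigmoid(X @ true_beta))

beta_hat = fit_logistic_irls(X, y)

# At the maximum likelihood estimate the score X^T (y - mu) vanishes.
score = X.T @ (y - sigmoid(X @ beta_hat))
print("estimate:", beta_hat)
print("score:", score)
```

The estimate should land near `true_beta`, and the score should be zero up to floating-point error, which is the first-order condition that IRLS solves.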