> [!tldr]
> **Uninformative priors** are prior distributions in [[Bayesian Inference]] that represent a lack of information.
When we have no prior information, an **uninformative/objective prior** can be used to represent this ignorance.
> [!examples]
> For example, we may choose $p \sim U[0,1]$ for the probability in $\text{Bernoulli}(p)$.
>
> Following the interpretations in the previous section, "no previous experiments" corresponds to $B(1,1)=U[0,1]$.
However, for parameters without bounds, e.g. the mean $\mu$ in $N(\mu,\sigma^{2})$, an ignorant prior would be $U(\mathbb{R})$, which cannot be normalized to have mass $1$. Such priors are called **improper priors**.
- Improper priors can still produce proper posteriors, so they are not an issue per se.
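As a concrete check, a flat improper prior $\pi(\mu)\propto 1$ on the mean of $N(\mu,\sigma^{2})$ (with $\sigma$ known) yields the proper posterior $N(\bar{x},\sigma^{2}/n)$. Below is a minimal numerical sketch of this, assuming NumPy/SciPy and an arbitrary simulated dataset:

```python
# Sketch: flat improper prior pi(mu) ∝ 1 on the Normal mean, sigma known.
# The unnormalised posterior (prior x likelihood) still has finite mass and
# matches the closed form N(x_bar, sigma^2 / n).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sigma, n = 2.0, 25
x = rng.normal(loc=1.5, scale=sigma, size=n)       # arbitrary simulated data

mu_grid = np.linspace(-10.0, 10.0, 4001)
log_post = np.array([stats.norm.logpdf(x, loc=m, scale=sigma).sum() for m in mu_grid])
post = np.exp(log_post - log_post.max())
post /= post.sum() * (mu_grid[1] - mu_grid[0])     # finite mass: normalise on the grid

exact = stats.norm.pdf(mu_grid, loc=x.mean(), scale=sigma / np.sqrt(n))
print(np.abs(post - exact).max())                  # close to 0: proper, Gaussian posterior
```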
Another issue is that *in general, reparametrization does not preserve uniformity*: given a prior $\pi(\theta)$ and an invertible transformation $\phi=\phi(\theta)$, the reparametrized prior is $p(\phi)=\pi(\theta(\phi)) \left| \frac{d\theta}{d\phi}\right|,$ which is not necessarily uniform.
- For example, suppose $\phi \sim U[0,1]$ (left); then under the log-odds map $\phi \mapsto \psi=\log(\phi / (1-\phi))$,
![[Logodds.png#invert]]
the log-odds (right) is bell-shaped, concentrated within $[-3,3]$. It's no longer "ignorant".
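A quick simulation makes this concrete (a sketch assuming NumPy; the sample size and seed are arbitrary): pushing $U[0,1]$ draws through the log-odds map gives the standard logistic distribution, whose mass concentrates around $0$.

```python
# Sketch: U[0,1] samples are no longer "uniform-looking" after the log-odds map.
import numpy as np

rng = np.random.default_rng(0)
phi = rng.uniform(0.0, 1.0, size=100_000)
psi = np.log(phi / (1.0 - phi))          # log-odds reparametrisation

# The induced density is the standard logistic, e^psi / (1 + e^psi)^2,
# with standard deviation pi / sqrt(3) ≈ 1.81.
print(psi.std())                         # ≈ 1.81
print(np.mean(np.abs(psi) <= 3.0))       # ≈ 0.9 of the mass lies in [-3, 3]
```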
### Jeffreys Prior
> [!definition|*] Jeffreys Prior
>
> Given a likelihood $L(X;\theta)$, its **Jeffreys prior** is a prior that is invariant under reparametrization, defined by $\pi(\theta)\propto \sqrt{ I_{X}(\theta) },$ where $I_{X}(\theta)$ is the [[Information and Bounding Errors#Fisher and Observed Information|expected/Fisher information]] of $\theta$ from that likelihood.
> >[!info]- In higher dimensions
> >The Jeffreys prior in higher dimensions is $\pi(\theta)\propto | I_{X}(\theta) |^{1/2}$, i.e. the square root of the determinant of the Fisher information matrix.
Jeffreys priors are invariant in the sense that *the prior is always proportional to the root-information*: if $\pi(\theta)$ is a Jeffreys prior, then reparametrizing with $\phi:=\phi(\theta)$ gives the new prior $p(\phi)=\pi(\theta(\phi))\left|\frac{ d \theta }{ d \phi } \right| \propto \sqrt{ I_{X}(\theta)\left( \frac{ d \theta }{ d \phi } \right)^{2} } =\sqrt{ I^{\ast}_{X}(\phi) }\,,$ which is the Jeffreys prior of $\phi$. Note that $I^{\ast}_{X}(\phi)$ is the Fisher information of $\phi$, not $\theta$.
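As a worked instance (a symbolic sketch using SymPy; the Bernoulli model and the log-odds map are illustrative choices), the Jeffreys prior of $\text{Bernoulli}(p)$ is $\pi(p)\propto 1/\sqrt{ p(1-p) }$, and transporting it to the log-odds $\psi$ reproduces $\sqrt{ I^{\ast}_{X}(\psi) }$:

```python
# Sketch: Jeffreys prior of Bernoulli(p) and its invariance under the
# log-odds reparametrisation psi = log(p / (1 - p)).
import sympy as sp

x = sp.symbols('x', real=True)
p = sp.symbols('p', positive=True)
psi = sp.symbols('psi', real=True)

loglik = x * sp.log(p) + (1 - x) * sp.log(1 - p)        # Bernoulli log-likelihood

# Fisher information via I(p) = -E[d^2/dp^2 log L], using E[x] = p; equals 1/(p(1-p)).
I_p = sp.simplify(-sp.diff(loglik, p, 2).subs(x, p))
jeffreys_p = sp.sqrt(I_p)

# Transport the prior to psi: pi(p(psi)) * dp/dpsi.
p_of_psi = sp.exp(psi) / (1 + sp.exp(psi))
transported = jeffreys_p.subs(p, p_of_psi) * sp.diff(p_of_psi, psi)

# Jeffreys prior computed directly in psi from the reparametrised likelihood.
I_psi = -sp.diff(loglik.subs(p, p_of_psi), psi, 2).subs(x, p_of_psi)

# Both quantities are positive, so comparing squares suffices
# (this avoids relying on sqrt simplification).
print(sp.simplify(transported**2 - I_psi))               # 0: the transported prior is sqrt(I*_X(psi))
```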
Common Jeffreys priors include:
- $\pi(\theta)\propto1$, where $\theta$ is a **location parameter**, e.g. the mean of $N(\theta,\sigma^{2})$, or the lower bound of $U[\theta,\theta+1]$.
- $\pi(\sigma)\propto \frac{1}{\sigma}$, where $\sigma$ is a **scale parameter**, e.g. the standard deviation $\sigma$ of $N(\theta,\sigma^{2})$.
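Both bullet points can be verified from the definition. Here is a SymPy sketch for the Normal model, with the expectation $\mathbb{E}[(x-\theta)^{2}]=\sigma^{2}$ substituted by hand:

```python
# Sketch: Jeffreys priors for the location and scale of N(theta, sigma^2).
import sympy as sp

x, theta = sp.symbols('x theta', real=True)
sigma = sp.symbols('sigma', positive=True)
loglik = -(x - theta)**2 / (2 * sigma**2) - sp.log(sigma) - sp.log(2 * sp.pi) / 2

# Location: I(theta) = -E[d^2/dtheta^2 log f] does not depend on theta,
# so pi(theta) ∝ sqrt(I(theta)) ∝ 1 (flat, improper).
I_theta = -sp.diff(loglik, theta, 2)
print(sp.simplify(I_theta))                                  # 1/sigma**2, constant in theta

# Scale: substitute E[(x - theta)^2] = sigma^2, giving I(sigma) = 2/sigma^2,
# so pi(sigma) ∝ sqrt(2)/sigma ∝ 1/sigma.
I_sigma = -(sp.diff(loglik, sigma, 2).subs((x - theta)**2, sigma**2))
print(sp.simplify(sp.sqrt(I_sigma)))                         # sqrt(2)/sigma
```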
### Maximum Entropy Prior
The **maximum entropy prior** has the largest **entropy** $\mathrm{Ent}[\pi]:=-\int_{\Theta} \pi \log \pi \, d\theta$ among all admissible priors (e.g. all priors satisfying a given set of constraints).
- The larger the entropy, the less information a distribution contains.
- If the entropy is unbounded over the admissible set, a maximum entropy prior need not exist.
> [!examples] Maximum entropy priors
> If a prior $\pi(\theta)$ is constrained to have $\mathbb{E}[\theta]=\mu_{\theta}$ and $\mathrm{Var}(\theta)=\sigma^{2}_{\theta}$, then the maximum entropy prior is $\theta \sim N(\mu_{\theta},\sigma^{2}_{\theta})$.
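A quick numerical comparison illustrates this (a sketch using SciPy; the Laplace and uniform alternatives are arbitrary choices matched to mean $0$ and variance $1$):

```python
# Sketch: among distributions with the same mean and variance, the Normal
# has the largest differential entropy (in nats).
import numpy as np
from scipy import stats

sigma = 1.0
dists = {
    "normal":  stats.norm(loc=0.0, scale=sigma),
    "laplace": stats.laplace(loc=0.0, scale=sigma / np.sqrt(2)),                      # var = 2b^2 = sigma^2
    "uniform": stats.uniform(loc=-sigma * np.sqrt(3), scale=2 * sigma * np.sqrt(3)),  # var = sigma^2
}
for name, d in dists.items():
    print(f"{name:8s} var={float(d.var()):.3f} entropy={float(d.entropy()):.4f}")
# normal: ~1.419, laplace: ~1.347, uniform: ~1.242
```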
> [!theorem|*] Exponential Family Priors have Maximum Entropy
> If there are functions $T_{1},\dots,T_{k}:\Theta \to \mathbb{R}$ that define the constraints $\forall i,~\mathbb{E}_{\pi}[T_{i}(\theta)]=t_{i}$ for constants $t_{1},\dots,t_{k}$, then the entropy is uniquely maximized by a member of the exponential family of priors $\{ \pi(\theta):= \exp[\mathbf{T}(\theta)\cdot \lambda-B(\lambda)] \},$ where the (vector-valued) hyperparameter $\lambda$ is chosen so that the constraints hold.
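As a simple instance of the theorem (a sketch using SciPy; the support $[0,\infty)$, the single constraint $T(\theta)=\theta$ with $\mathbb{E}_{\pi}[\theta]=2$, and the comparison distributions are all illustrative choices): the exponential-family solution is $\pi(\theta)\propto e^{\lambda\theta}$ with $\lambda=-1/2$, i.e. an Exponential prior with mean $2$, and it has larger entropy than other mean-$2$ priors on the same support.

```python
# Sketch: with support [0, inf), T(theta) = theta and E_pi[theta] = 2, the
# maximum entropy prior is Exponential with mean 2 (natural parameter -1/2).
from math import pi, sqrt
from scipy import stats

t = 2.0
candidates = {
    "exponential (max-ent)": stats.expon(scale=t),                    # mean t
    "half-normal":           stats.halfnorm(scale=t * sqrt(pi / 2)),  # mean t
    "gamma(k=2)":            stats.gamma(a=2, scale=t / 2),           # mean t
}
for name, d in candidates.items():
    print(f"{name:22s} mean={float(d.mean()):.3f} entropy={float(d.entropy()):.4f}")
# exponential: 1 + ln 2 ≈ 1.693 nats, the largest of the three
```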