Basic Concepts
When we don’t have any useful prior information, we prefer a prior that lets the data (via the likelihood) dominate the posterior probability. In other words, we want the prior to have the minimum influence on the posterior. Such a prior is called a non-informative prior.
Often, it turns out that such a prior is not really a probability function. Strictly speaking, this violates Bayes’ Theorem, which deals only with probability functions: a probability function returns non-negative values, and its values sum to one (as a discrete sum or via integration). Such a non-probability function, called an improper prior, can nevertheless be used, provided the resulting posterior turns out to be a probability function.
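As a minimal numerical sketch (not from the source; the sample values and known standard deviation below are hypothetical), we can check that an improper flat prior f(μ) = 1, combined with a normal likelihood with known σ, yields a proper posterior, namely N(x̄, σ²/n):

```python
import math

# Hypothetical sample and assumed-known standard deviation
data = [4.8, 5.1, 5.3, 4.9, 5.6]
sigma = 0.5
n = len(data)
xbar = sum(data) / n

def posterior_kernel(mu):
    """With a flat (improper) prior, the posterior kernel is just the likelihood in mu."""
    return math.exp(-sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2))

# Numerically integrate the unnormalized posterior over a wide grid around xbar.
step = 0.001
grid = [xbar - 3 + i * step for i in range(int(6 / step) + 1)]
norm_const = sum(posterior_kernel(mu) for mu in grid) * step

# The integral is finite, so the posterior is proper; its mean matches the
# theoretical posterior mean xbar of N(xbar, sigma^2/n).
post_mean = sum(mu * posterior_kernel(mu) for mu in grid) * step / norm_const
print(round(post_mean, 3), round(xbar, 3))  # the two should agree
```

The key point is that even though the prior does not integrate to one, the normalizing constant of the posterior is finite, so the posterior is a legitimate probability density.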
In particular, there is a useful class of non-informative priors called Jeffreys’ priors. We give some examples here; more details, using calculus, can be found at Jeffreys’ Priors.
Jeffreys Priors
The Jeffreys’ prior for binomial data is f(p) = pdf for Beta(1/2, 1/2)
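By conjugacy, combining the Beta(1/2, 1/2) prior with x successes in n binomial trials gives a Beta(x + 1/2, n − x + 1/2) posterior. A short sketch (the data values below are hypothetical, not from the source) verifies this relationship numerically:

```python
import math

def beta_pdf(p, a, b):
    """Density of the Beta(a, b) distribution at p, via the gamma function."""
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * p ** (a - 1) * (1 - p) ** (b - 1)

n, x = 20, 7                          # 7 successes in 20 trials (hypothetical)
a_post, b_post = x + 0.5, n - x + 0.5  # Beta(7.5, 13.5) posterior

# Posterior mean of p under the Jeffreys prior: (x + 1/2) / (n + 1)
post_mean = a_post / (a_post + b_post)
print(post_mean)  # 7.5 / 21

# Check that the posterior density equals prior * likelihood up to a constant:
# the ratio should not depend on p.
p1, p2 = 0.2, 0.6
r1 = beta_pdf(p1, a_post, b_post) / (beta_pdf(p1, 0.5, 0.5) * p1**x * (1 - p1)**(n - x))
r2 = beta_pdf(p2, a_post, b_post) / (beta_pdf(p2, 0.5, 0.5) * p2**x * (1 - p2)**(n - x))
```

Note that, unlike the flat improper priors below, Beta(1/2, 1/2) is a proper (if unusual, U-shaped) distribution.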
Also, the Jeffreys’ prior for normally distributed data with a known variance is f(μ) ∝ 1
The Jeffreys’ prior for normally distributed data with unknown mean and variance is f(μ, σ²) ∝ σ⁻³
Theory
Click here to get more details about Jeffreys’ priors and how to derive the above priors.
References
Reich, B. J., Ghosh, S. K. (2019) Bayesian statistical methods. CRC Press
Lee, P. M. (2012) Bayesian statistics: an introduction. 4th Ed. Wiley
https://www.wiley.com/en-us/Bayesian+Statistics%3A+An+Introduction%2C+4th+Edition-p-9781118332573
Jordan, M. (2010) Bayesian modeling and inference. Lecture 1. Course notes
https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/lectures/lecture1.pdf
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., Rubin, D. B. (2014) Bayesian data analysis, 3rd Ed. CRC Press
https://statisticalsupportandresearch.files.wordpress.com/2017/11/bayesian_data_analysis.pdf