Exponential Families

Robert L. Wolpert
Department of Statistical Science
Duke University, Durham, NC, USA

Surprisingly many of the distributions we use in statistics for random variables $X$ taking values in some space $\mathcal{X}$ (often $\mathbb{R}$ or $\mathbb{N}_0$ but sometimes $\mathbb{R}^n$, $\mathbb{Z}$, or some other space), indexed by a parameter $\theta$ from some parameter set $\Theta$, can be written in exponential family form, with pdf or pmf

$$f(x \mid \theta) = \exp\big[\eta(\theta)\,t(x) - B(\theta)\big]\,h(x)$$

for some statistic $t : \mathcal{X} \to \mathbb{R}$, natural parameter $\eta : \Theta \to \mathbb{R}$, and functions $B : \Theta \to \mathbb{R}$ and $h : \mathcal{X} \to \mathbb{R}_+$. The likelihood function for a random sample of size $n$ from the exponential family is

$$f_n(\mathbf{x} \mid \theta) = \exp\Big[\eta(\theta)\sum_{j=1}^{n} t(x_j) - nB(\theta)\Big]\prod_{j=1}^{n} h(x_j),$$

which is of the same form with the same natural parameter $\eta$, but now with statistic $T_n(\mathbf{x}) = \sum_j t(x_j)$ and functions $B_n(\theta) = nB(\theta)$ and $h_n(\mathbf{x}) = \prod_j h(x_j)$.

Examples

For example, the pmf for the binomial distribution $\mathsf{Bi}(m,p)$ can be written as

$$\binom{m}{x} p^x (1-p)^{m-x} = \exp\Big[\Big(\log\frac{p}{1-p}\Big)x + m\log(1-p)\Big]\binom{m}{x}$$

of Exponential Family form with $\eta(p) = \log\frac{p}{1-p}$ and natural sufficient statistic $t(x) = x$, and the Poisson

$$\frac{\theta^x}{x!}\,e^{-\theta} = \exp\big[(\log\theta)\,x - \theta\big]\,\frac{1}{x!}$$
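with $\eta = \log\theta$ and again $t(x) = x$. The Beta distribution $\mathsf{Be}(\alpha,\beta)$ with either one of its two parameters unknown can be written in EF form too:

$$\begin{aligned}
\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}
&= \exp\Big[\alpha\log x - \log\frac{\Gamma(\alpha)}{\Gamma(\alpha+\beta)}\Big]\,\frac{(1-x)^{\beta}}{x(1-x)\,\Gamma(\beta)} \\
&= \exp\Big[\beta\log(1-x) - \log\frac{\Gamma(\beta)}{\Gamma(\alpha+\beta)}\Big]\,\frac{x^{\alpha}}{x(1-x)\,\Gamma(\alpha)}
\end{aligned}$$

with $t(x) = \log x$ or $\log(1-x)$ when $\eta = \alpha$ or $\eta = \beta$ is unknown, respectively. With both parameters unknown the beta distribution can be written as a bivariate Exponential Family with parameter $\theta = (\alpha,\beta) \in \mathbb{R}_+^2$:

$$f(x \mid \theta) = \exp\big[\eta(\theta)\cdot t(x) - B(\theta)\big]\,h(x) \qquad (1)$$

with vector parameter $\eta = (\alpha,\beta)$ and statistic $t(x) = (\log x, \log(1-x))$ and scalar (one-dimensional) functions $B(\theta) = \log\Gamma(\alpha) + \log\Gamma(\beta) - \log\Gamma(\alpha+\beta)$ and $h(x) = 1/[x(1-x)]$. Since this comes up often, we'll let $\eta$ and $T$ be $q$-dimensional below; usually in this course $q = 1$ or $2$.

Natural Exponential Families

It is often convenient to reparametrize exponential families to the natural parameter $\eta = \eta(\theta) \in \mathbb{R}^q$, leading (with $A(\eta(\theta)) \equiv B(\theta)$) to

$$f(x \mid \eta) = e^{\eta\cdot t(x) - A(\eta)}\,h(x). \qquad (2)$$

Since any pdf integrates to unity we have

$$e^{A(\eta)} = \int e^{\eta\cdot t(x)}\,h(x)\,dx$$

and hence can calculate the moment generating function (MGF) for the natural sufficient statistic $t(x) = \{t_1(x),\dots,t_q(x)\}$ as

$$\begin{aligned}
M_t(s) = \mathsf{E}\big[e^{s\cdot t(X)}\big]
&= \int e^{s\cdot t(x)}\,e^{\eta\cdot t(x) - A(\eta)}\,h(x)\,dx \\
&= e^{-A(\eta)}\int e^{(\eta+s)\cdot t(x)}\,h(x)\,dx \\
&= e^{A(\eta+s) - A(\eta)},
\end{aligned}$$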
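so $\log M_t(s) = A(\eta+s) - A(\eta)$ and we can find moments for the natural sufficient statistic by

$$\mathsf{E}[t] = \nabla \log M_t(0) = \nabla A(\eta), \qquad \mathsf{V}[t] = \nabla^2 \log M_t(0) = \nabla^2 A(\eta),$$

provided that $\eta$ is an interior point of the natural parameter space

$$\mathcal{E} \equiv \Big\{\eta \in \mathbb{R}^q : 0 < \int e^{\eta\cdot t(x)}\,h(x)\,dx < \infty\Big\}$$

and that $A(\cdot)$ is twice-differentiable near $\eta$. For samples of size $n \in \mathbb{N}$ the sufficient statistic $T_n(\mathbf{x}) = \sum_j t(x_j)$ is a sum of independent random variables, so by the Central Limit Theorem we have approximately $T_n \sim \mathsf{No}\big(n\nabla A(\eta),\, n\nabla^2 A(\eta)\big)$. Note that $n\nabla^2 A(\eta) = -\nabla^2_\eta \log f_n(\mathbf{x} \mid \eta)$ does not depend on the data, so it is both the observed and Fisher (expected) information (matrix) $I_n(\eta)$ for natural exponential families, and that the score statistic is $Z := \nabla_\eta \log f_n(\mathbf{x} \mid \eta) = T_n(\mathbf{x}) - n\nabla A(\eta)$.

Conjugate Priors

For hyper-parameters $\alpha \in \mathbb{R}^q$ and $\beta \in \mathbb{R}$ such that

$$c_{\alpha,\beta} := \int_\Theta e^{\eta(\theta)\cdot\alpha - \beta B(\theta)}\,d\theta < \infty,$$

we can define a prior density for $\theta$ by

$$\pi(\theta \mid \alpha,\beta) = c_{\alpha,\beta}^{-1}\,e^{\eta(\theta)\cdot\alpha - \beta B(\theta)}.$$

With this prior and with data $\{X_i\} \overset{\text{iid}}{\sim} f(x \mid \theta)$ from the exponential family, the posterior is

$$\pi(\theta \mid \mathbf{x}) \propto e^{\eta(\theta)\cdot\alpha - \beta B(\theta)}\,e^{\eta(\theta)\cdot T_n(\mathbf{x}) - nB(\theta)} \propto \pi\big(\theta \mid \alpha^* = \alpha + T_n(\mathbf{x}),\ \beta^* = \beta + n\big),$$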
again within the same family but now with parameters $\alpha^* = \alpha + T_n$ and $\beta^* = \beta + n$. For example, in the binomial example above (taking $m = 1$, so that $B(p) = -\log(1-p)$) this conjugate prior family is

$$\pi(\theta \mid \alpha,\beta) \propto \exp\Big\{\alpha\log\frac{p}{1-p} + \beta\log(1-p)\Big\} = p^{\alpha}(1-p)^{\beta-\alpha},$$

the Beta family, while for the Poisson example it is

$$\pi(\theta \mid \alpha,\beta) \propto \exp\{\alpha\log\theta - \beta\theta\} = \theta^{\alpha}e^{-\beta\theta},$$

the Gamma family. Conjugate families for every exponential family are available in the same way.

Note that not every distribution we consider is from an exponential family. From the form (2), $f(x \mid \eta) = e^{\eta\cdot t(x) - A(\eta)}h(x)$, for example, it is clear that the support (the set of points where the pdf or pmf is nonzero, i.e. the possible values a random variable $X$ can take) is just

$$\{x : f(x \mid \theta) > 0\} = \{x : h(x) > 0\},$$

which does not depend on the parameter $\theta$; thus any family of distributions where the support depends on the parameter (uniform distributions are important examples) can't be an exponential family.

The tables that follow show several familiar distributions (and some less familiar ones, like the Inverse Gaussian $\mathsf{IG}(\mu,\lambda)$ and Pareto $\mathsf{Pa}(\alpha,\beta)$) in exponential family form. Some of the formulas involve the log gamma function $\gamma(z) = \log\Gamma(z)$ and its first and second derivatives, the digamma $\psi(z) = (d/dz)\,\gamma(z)$ and trigamma $\psi'(z) = (d^2/dz^2)\,\gamma(z)$, which are built into R, Mathematica, Maple, the gsl library in C, and such, but aren't on pocket calculators or most spreadsheets. In each case $\nabla^2 A(\eta)$ is the Information matrix in the natural parametrization, and $I(\theta)$ is the Information matrix in the usual parameterization.
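The conjugate update described above, $\alpha^* = \alpha + T_n$ and $\beta^* = \beta + n$, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ConjugateHyper`, `conjugate_update`, and `poisson_posterior_mean` are hypothetical (they are not from the notes), the statistic is scalar ($q = 1$), and the Poisson posterior mean uses the fact that the kernel $\theta^{\alpha}e^{-\beta\theta}$ is a Gamma density with shape $\alpha + 1$ and rate $\beta$.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ConjugateHyper:
    """Hyper-parameters of the prior kernel exp{eta(theta)*alpha - beta*B(theta)}."""
    alpha: float  # multiplies the natural parameter eta(theta)
    beta: float   # multiplies B(theta); plays the role of a prior sample size


def conjugate_update(
    prior: ConjugateHyper,
    data: Sequence[float],
    t: Callable[[float], float] = lambda x: x,
) -> ConjugateHyper:
    """Return (alpha + T_n, beta + n), where T_n = sum of t(x_j) over the sample."""
    t_n = sum(t(x) for x in data)
    return ConjugateHyper(prior.alpha + t_n, prior.beta + len(data))


def poisson_posterior_mean(h: ConjugateHyper) -> float:
    """Mean of the density proportional to theta**alpha * exp(-beta*theta).

    That kernel is Gamma(shape=alpha+1, rate=beta), which needs
    alpha > -1 and beta > 0 to be a proper density.
    """
    if h.alpha <= -1.0 or h.beta <= 0.0:
        raise ValueError("improper Gamma kernel: need alpha > -1 and beta > 0")
    return (h.alpha + 1.0) / h.beta


if __name__ == "__main__":
    # Poisson counts with t(x) = x; the prior values are arbitrary illustrations.
    prior = ConjugateHyper(alpha=1.0, beta=1.0)
    post = conjugate_update(prior, [2, 0, 3, 1])
    print(post, poisson_posterior_mean(post))
```

Keeping the hyper-parameters in the $(\alpha,\beta)$ form of the notes, rather than converting to the usual Gamma shape and rate, means the same `conjugate_update` applies to any one-parameter family once its statistic `t` is supplied.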
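Exponential Family Examples

$\mathsf{Be}(\alpha,\beta)$
  $f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}$, $x \in (0,1)$
  $T = (\log x,\ \log(1-x))$
  $B(\alpha,\beta) = \gamma(\alpha) + \gamma(\beta) - \gamma(\alpha+\beta)$
  $\eta = (\alpha,\beta)$
  $A(\eta) = \gamma(\eta_1) + \gamma(\eta_2) - \gamma(\eta_1+\eta_2)$
  $\nabla A(\eta) = \begin{pmatrix}\psi(\eta_1) - \psi(\eta_1+\eta_2)\\ \psi(\eta_2) - \psi(\eta_1+\eta_2)\end{pmatrix} = \begin{pmatrix}\psi(\alpha) - \psi(\alpha+\beta)\\ \psi(\beta) - \psi(\alpha+\beta)\end{pmatrix}$
  $\nabla^2 A(\eta) = \begin{pmatrix}\psi'(\eta_1) - c & -c\\ -c & \psi'(\eta_2) - c\end{pmatrix}$, with $c = \psi'(\eta_1+\eta_2)$

$\mathsf{Bi}(m,p)$
  $f(x) = \binom{m}{x}\,p^x q^{m-x}$, $x = 0,\dots,m$
  $T = x$
  $B(p) = -m\log q$
  $\eta = \log(p/q)$
  $A(\eta) = m\log(1+e^{\eta})$, with $p = e^{\eta}/(1+e^{\eta})$
  $\nabla A(\eta) = \frac{m e^{\eta}}{1+e^{\eta}} = mp$
  $\nabla^2 A(\eta) = \frac{m e^{\eta}}{(1+e^{\eta})^2} = mpq$
  $I(p) = m/(pq)$

$\mathsf{Ex}(\lambda)$
  $f(x) = \lambda e^{-\lambda x}$, $x > 0$
  $T = x$
  $B(\lambda) = -\log\lambda$
  $\eta = -\lambda$
  $A(\eta) = -\log(-\eta)$
  $\nabla A(\eta) = -1/\eta = 1/\lambda$
  $\nabla^2 A(\eta) = \eta^{-2}$
  $I(\lambda) = 1/\lambda^2$

$\mathsf{Ga}(\alpha,\lambda)$
  $f(x) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1}e^{-\lambda x}$, $x > 0$
  $T = (\log x,\ x)$
  $B(\alpha,\lambda) = \gamma(\alpha) - \alpha\log\lambda$
  $\eta = (\alpha,\ -\lambda)$
  $A(\eta) = \gamma(\eta_1) - \eta_1\log(-\eta_2)$
  $\nabla A(\eta) = \begin{pmatrix}\psi(\eta_1) - \log(-\eta_2)\\ -\eta_1/\eta_2\end{pmatrix} = \begin{pmatrix}\psi(\alpha) - \log\lambda\\ \alpha/\lambda\end{pmatrix}$
  $\nabla^2 A(\eta) = \begin{pmatrix}\psi'(\eta_1) & -1/\eta_2\\ -1/\eta_2 & \eta_1/\eta_2^2\end{pmatrix}$
  $I(\alpha,\lambda) = \begin{pmatrix}\psi'(\alpha) & -1/\lambda\\ -1/\lambda & \alpha/\lambda^2\end{pmatrix}$

$\mathsf{Ge}(p)$
  $f(x) = p\,q^x$, $x = 0,1,2,\dots$
  $T = x$
  $B(p) = -\log p$
  $\eta = \log q$
  $A(\eta) = -\log(1-e^{\eta})$, with $p = 1 - e^{\eta}$
  $\nabla A(\eta) = \frac{e^{\eta}}{1-e^{\eta}} = q/p$
  $\nabla^2 A(\eta) = \frac{e^{\eta}}{(1-e^{\eta})^2} = q/p^2$
  $I(p) = 1/(p^2 q)$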
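Exponential Family Examples (cont'd)

$\mathsf{IG}(a,b)$
  $f(x) = a\,e^{-(a-bx)^2/2x}\big/\sqrt{2\pi x^3}$, $x > 0$
  $T = (1/x,\ x)$
  $B(a,b) = -ab - \log a$
  $\eta = (-a^2/2,\ -b^2/2)$
  $A(\eta) = -2\sqrt{\eta_1\eta_2} - \tfrac12\log(-2\eta_1)$, with $a = \sqrt{-2\eta_1}$, $b = \sqrt{-2\eta_2}$
  $\nabla A(\eta) = \begin{pmatrix}\sqrt{\eta_2/\eta_1} - 1/(2\eta_1)\\ \sqrt{\eta_1/\eta_2}\end{pmatrix} = \begin{pmatrix}b/a + 1/a^2\\ a/b\end{pmatrix}$
  $\nabla^2 A(\eta) = \frac12\begin{pmatrix}\sqrt{\eta_2/\eta_1^3} + 1/\eta_1^2 & -1/\sqrt{\eta_1\eta_2}\\ -1/\sqrt{\eta_1\eta_2} & \sqrt{\eta_1/\eta_2^3}\end{pmatrix}$
  $I(a,b) = \begin{pmatrix}b/a + 2/a^2 & -1\\ -1 & a/b\end{pmatrix}$

$\mathsf{NB}(\alpha,p)$
  $f(x) = \binom{-\alpha}{x}\,p^{\alpha}(-q)^x = \binom{\alpha+x-1}{x}\,p^{\alpha}q^x$, $x = 0,1,2,\dots$
  $T = x$
  $B(p) = -\alpha\log p$
  $\eta = \log q$
  $A(\eta) = -\alpha\log(1-e^{\eta})$, with $p = 1 - e^{\eta}$
  $\nabla A(\eta) = \frac{\alpha e^{\eta}}{1-e^{\eta}} = \alpha q/p$
  $\nabla^2 A(\eta) = \frac{\alpha e^{\eta}}{(1-e^{\eta})^2} = \alpha q/p^2$
  $I(p) = \alpha/(p^2 q)$

$\mathsf{No}(\mu,\sigma^2)$
  $f(x) = e^{-(x-\mu)^2/2\sigma^2}\big/\sqrt{2\pi\sigma^2}$
  $T = (x,\ x^2)$
  $B(\mu,\sigma^2) = \mu^2/(2\sigma^2) + \tfrac12\log\sigma^2$
  $\eta = (\mu\sigma^{-2},\ -\sigma^{-2}/2)$
  $A(\eta) = -\eta_1^2/(4\eta_2) - \tfrac12\log(-2\eta_2)$
  $\nabla A(\eta) = \begin{pmatrix}-\eta_1/(2\eta_2)\\ \eta_1^2/(4\eta_2^2) - 1/(2\eta_2)\end{pmatrix} = \begin{pmatrix}\mu\\ \mu^2+\sigma^2\end{pmatrix}$
  $\nabla^2 A(\eta) = \frac12\begin{pmatrix}-1/\eta_2 & \eta_1/\eta_2^2\\ \eta_1/\eta_2^2 & -\eta_1^2/\eta_2^3 + 1/\eta_2^2\end{pmatrix}$
  $I(\mu,\sigma^2) = \begin{pmatrix}\sigma^{-2} & 0\\ 0 & \sigma^{-4}/2\end{pmatrix}$

$\mathsf{Po}(\lambda)$
  $f(x) = \lambda^x e^{-\lambda}/x!$, $x = 0,1,2,\dots$
  $T = x$
  $B(\lambda) = \lambda$
  $\eta = \log\lambda$
  $A(\eta) = e^{\eta}$, with $\lambda = e^{\eta}$
  $\nabla A(\eta) = e^{\eta} = \lambda$
  $\nabla^2 A(\eta) = e^{\eta} = \lambda$
  $I(\lambda) = 1/\lambda$

$\mathsf{Pa}(\alpha,\beta)$
  $f(x) = \beta\,\alpha^{\beta}/x^{\beta+1}$, $x > \alpha$ ($\alpha$ known)
  $T = \log x$
  $B(\beta) = -\log\beta - \beta\log\alpha$
  $\eta = -\beta$
  $A(\eta) = -\log(-\eta) + \eta\log\alpha$
  $\nabla A(\eta) = \log\alpha - 1/\eta = \log\alpha + 1/\beta$
  $\nabla^2 A(\eta) = \eta^{-2}$
  $I(\beta) = \beta^{-2}$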
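The identities $\mathsf{E}[t] = \nabla A(\eta)$ and $\mathsf{V}[t] = \nabla^2 A(\eta)$ give a quick numerical sanity check on entries like those tabulated above. The sketch below is an illustration only: it uses central finite differences in place of exact derivatives, the helper names (`d1`, `d2`, `table_rows`, `check_table`) and the parameter values are hypothetical choices, and only four one-parameter rows (Bi, Po, Ge, Ex) are checked against their textbook means and variances.

```python
import math
from typing import Callable, Dict, Tuple


def d1(A: Callable[[float], float], eta: float, h: float = 1e-4) -> float:
    """Central-difference approximation to A'(eta)."""
    return (A(eta + h) - A(eta - h)) / (2.0 * h)


def d2(A: Callable[[float], float], eta: float, h: float = 1e-4) -> float:
    """Central-difference approximation to A''(eta)."""
    return (A(eta + h) - 2.0 * A(eta) + A(eta - h)) / (h * h)


def table_rows() -> Dict[str, Tuple[Callable[[float], float], float, float, float]]:
    """Map name -> (A, eta, mean of T, variance of T) for four scalar families.

    Parameter values are arbitrary illustrations, not taken from the notes.
    """
    m, p = 10, 0.3   # binomial size and success probability
    lam = 2.5        # Poisson mean
    pg = 0.4         # geometric success probability
    rate = 2.0       # exponential rate
    return {
        "Bi": (lambda e: m * math.log1p(math.exp(e)),
               math.log(p / (1 - p)), m * p, m * p * (1 - p)),
        "Po": (math.exp, math.log(lam), lam, lam),
        "Ge": (lambda e: -math.log(1.0 - math.exp(e)),
               math.log(1 - pg), (1 - pg) / pg, (1 - pg) / pg ** 2),
        "Ex": (lambda e: -math.log(-e), -rate, 1 / rate, 1 / rate ** 2),
    }


def check_table(tol: float = 1e-5) -> Dict[str, Tuple[bool, bool]]:
    """For each family report (A' matches the mean, A'' matches the variance)."""
    report = {}
    for name, (A, eta, mean, var) in table_rows().items():
        report[name] = (abs(d1(A, eta) - mean) < tol,
                        abs(d2(A, eta) - var) < tol)
    return report


if __name__ == "__main__":
    for name, flags in check_table().items():
        print(name, flags)
```

Finite differences keep the sketch within the standard library; for the two-parameter rows (Be, Ga, IG, No) the same check would need partial derivatives, and for Be and Ga a digamma routine from a numerical library.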