Deriving the Conditional PMF of the Pareto/NBD Model Peter S. Fader www.petefader.com Bruce G. S. Hardie www.brucehardie.com January 24 Abstract This note presents the derivation of an expression for the Pareto/NBD model conditional PMF, PXT, T t = x x, t x, T. Preliminaries Recall the basic Pareto/NBD model results Fader and Hardie 25: i The individual-level likelihood function for someone with purchase history x, t x, T is Lλ, µ x, t x, T = λ x e λµt λx µ λ µ e λµtx λx µ λ µ e λµt = λx µ λ µ e λµtx λx λ µ e λµt. 2 ii The likelihood function for a randomly chosen individual with purchase history x, t x, T is Lr, α, s, β x, t x, T = Γr xαr β s { s r x A A 2 3 Γr where A = 2F, s ; ; αt x α t x rsx if α β 2F, r x; ; βt x β t x rsx if α < β c 24 Peter S. Fader and Bruce G.S. Hardie. This document can be found at <http://brucehardie.com/notes/28/>.
and A 2 = 2F, s; ; αt α T rsx if α β 2F, r x ; ; βt β T rsx if α < β iii The joint posterior distribution of Λ and M is where gλ, µ r,α,s, β;x,t x, T = Lλ, µ x, t x, Tgλ r, αgµ s,β, 4 Lr, α, s, β x, t x, T gλ r,α = αr λ r e λα Γr is the gamma distribution that captures heterogeneity in transaction rates across customers, and gµ s, β = βs µ s e µβ 6 Γs is the gamma distribution that captures heterogeneity in death rates across customers. iv The probability that a customer with purchase history x, t x, T is alive at time T is the probability that the unobserved time at which he dies ω occurs after T. Conditional on λ and µ, this is PΩ > T λ, µ; x, t x, T = 5 λx e λµt Lλ, µ x,t x, T. 7 It follows from that the probability that a customer with purchase history x, t x, T is dead at time T is PΩ T λ, µ, x, t x, T = Removing the conditioning on λ and µ gives us λ x µ λ µ e λµtx λx µ λ µ e λµt Lλ, µ x,t x, T. 8 PΩ > T r, α, s,β; x,t x, T Γr xα r β s / = Γrα T rx β T s Lr, α, s, β x, t x, T. 9 As we proceed with the derivation, we will need evaluate a double integral of the following form several times: A = λ a µ b λ µ ce λµd gλ r,αgµ s,β dλdµ. Consider the transformation Y = M/Λ M and Z = Λ M. Noting that the Jacobian of this transformation is λ λ y z J = µ µ = z, y 2 z
it follows from the standard transformation of random variables method Casella and Berger 22, Section 4.3, pp. 56 62; Mood et al. 974, Section 6.2, p. 24ff that the joint distribution of Y and Z is gy, z α, β, r, s = We solve in the following manner: A = = αr β s ΓrΓs = αr β s ΓrΓs αr β s ΓrΓs ys y r z rs e zα y. y b y a z ab c e zd gy, z α, β, r,s dzdy which for r s a b > c y sb y ra z rsab c e zαd y dz dy y sb y ra { = αr β s Γr s a b c ΓrΓs α d rsab c z rsab c e zαd y dz dy y sb y ra [ αd y ] rsab c dz which, recalling Euler s integral for the Gaussian hypergeometric function, = αr β s Γr s a b c ΓrΓs α d rsab c Br a, s b 2 F r s a b c, s b; r s a b; αd. Looking closely at, we see that the argument z of the Gaussian hypergeometric function, αd, is guaranteed to be bounded between and when α β since β >, thus ensuring convergence of the series representation of the function. However, when α < β we can be faced with the situation where αd <, in which case the series is divergent. Applying the linear transformation Abramowitz and Stegun 972, equation 5.3.4 gives us A = 2F a, b; c; z = z a 2F a, c b; c; z z αr β s Γr s a b c ΓrΓs β d rsab c Br a, s b 2 F r s a b c, r a; r s a b; βd. 2 We note that the argument z of the above Gaussian hypergeometric function is bounded between and when α < β. We therefore present and 2 as solutions to : we use when α β and 2 when α < β. 2F a, b; c; z = R Bb,c b tb t c b zt a dt, c > b. 3
2 Derivation Suppose we know an individual s unobserved latent characteristics λ and µ. If the customer is alive at T i.e., ω > T, it follows from the derivation of PXt = x λ, µ Fader and Hardie 26 that When x >, PXT, T t = x λ, µ, ω > T = λtx e λµt x! PXT, T t = x λ, µ; x,t x, T λ x µ x [ e λµt λ µ x i= [ λ µt ] i = PXT, T t = x λ, µ, ω > TPΩ > T λ, µ; x,t x, T. i! ]. 3 This also holds when x =. However, in this second case we also have to account for the possibly that no purchases are observed in T, T t] because the customer died at or before T. Therefore, PXT, T t = x [ λ, µ, x, t x, T = δ x = PΩ > T λ, µ; x, tx, T ] PXT, T t = x λ, µ, ω > TPΩ > T λ, µ; x, t x, T. 4 Therefore, PXT, T t = x λ, µ; x, t x, T = δ x = { λx e λµt { t x Lλ, µ x,t x, T x! λxx e λµtt x i= t i i! Lλ, µ x, t x, T λxx µ λ µ x e λµt λ xx µ λ µ x i e λµtt. 5 We remove the conditioning on λ and µ by taking the expectation of this over the joint posterior distribution of Λ and M, 4, giving us PXT, T t = x { r, α, s, β;x,t x, T = δ x = PΩ > T r, α, s, β;x,tx, T Lr, α, s, β x, t x, T { t x x! B B 2 x i= t i i! B 3i 6 where B = λ xx e λµtt gλ r,αgµ s,β dλdµ { { = λ xx e λtt gλ r,αdλ e µtt gµ s, β dµ = Γr x x Γr α r β s α T t rxx β T t s 7 4
B 2 = B 3i = λ xx µ λ µ x e λµt gλ r,αgµ s, β dλdµ = αr β s ΓrΓs ΓBr x x, s 2F, s ; x ; αt α T rsx if α β 2F, r x x ; x ; βt β T rsx if α < β λ xx µ λ µ x i e λµtt gλ r, αgµ s,β dλdµ = αr β s ΓrΓs Γ ibr x x, s 2F i, s ; x ; α T t rsxi if α β 2F i, r x x ; x ; β T t rsxi if α < β 8 9 3 The Special Case of x = We can derive a simpler expression for the special case of x = i.e., no purchasing in the interval T, T t]. Conditional on λ and µ, there are three ways in which a customer with purchase history x, t x, T could make no purchases in the interval T, T t]: The customer is dead at T, which occurs with probability PΩ T λ, µ, x,t x, T, 8, or The customer is alive at T, which occurs with probability PΩ > T λ, µ, x,t x, T, 7, and he remains alive through the interval T, T t] with probability e µt and makes no purchases in that interval the Poisson probability of which is e λt, or he dies at time ω in the interval T, Tt] and makes no purchases in the interval T, ω]: Tt Combining terms and simplifying gives us T t e λω T µe µω T dω = µ e λµs ds = µ e λµt. λ µ PXT, T t = λ, µ; x,t x, T = Lλ, µ x, t x, T { λ x µ λ µ e λµtx λx λ µ e λµtt. 2 5
Taking the expectation of this over the joint posterior distribution of λ and µ, 4, it follows from 2 that where PXT, T t = r, α,s, β; x,t x, T = Γr xαr β s A 3 = { Lr, α, s, β x, t x, T Given 3, this can be rewritten as s Γr A r x A 3, 2 2F, s; ; α T t rsx if α β 2F, r x ; ; β T t rsx if α < β PXT, T t = r, α,s, β; x,t x, T = sa r xa 3 sa r xa 2. 22 Setting x to in 6 does not give us 2. Are these two equations equivalent? We first note that evaluating 5 at x = and using 8 as the expression for PΩ > T λ, µ; x,t x, T and simplifying gives us 2. Therefore 2 must be equivalent to 6 at x =. To prove this, we first note that leads to an alternative expression for the Pareto/NBD likelihood function Fader and Hardie 25: Lr, α, s, β x, t x, T = Γr xαr β s { where for α β while for α < β Lr, α, s, β x, t x, T Γr α T rx β T s A = 2 F, s ; ; αt x α t x rsx 2 F, s ; ; αt α T rsx A = 2 F, r x; ; βt x β t x rsx Γr x Γr s 2 F, r x; ; βt β T rsx. α r β s A, 23 Let us first consider the case of α β. Substituting 9 and 23 in 6 for x = gives us { s 2 F, s ; ; αt x α T t rx β T t s s 2F α t x rsx, s ; ; α T t rsx. 6
Equivalence to 2 implies α T t s s 2 F, s ; ; β T t = r x 2 F, s; ;. Noting that 2 F a, b; b;z = z a, 2F, s; ; α T t s =, β T t equivalence implies r x 2 F, s; ; s 2 F, s ; ; 2 F, s; ; =. 24 One of the so-called Gauss relations for contiguous functions states Abramowitz and Stegun 972, equation 5.2.24 is c b 2 F a, b; c; z b 2 F a, b ; c; z c 2 F a, b; c ; z =. 25 Therefore 24 is true, which means 6 evaluated at x = and 2 are equivalent when α β. Turning to the case of α < β, substituting 9 and 23 in 6 for x = gives us Lr, α, s, β x, t x, T Γr x Γr α r β s α T t rx β T t s s 2F { s 2 F, r x; ; βt x β t x rsx, r x; ; β T t rsx. Equivalence to 2 implies β T t rx s 2 F, r x; ; α T t = r x 2 F, r x ; ;. Using the result 2 F a, b; b;z = z a, we get 2F, r x; ; β T t rx =, α T t meaning equivalence implies s 2 F, r x; ; r x 2 F, r x ; ; 2 F, r x; ; =, 26 which we see from 25 is true. Therefore 6 evaluated at x = and 2 are also equivalent when α < β. 7
References Abramowitz, Milton and Irene A. Stegun eds. 972, Handbook of Mathematical Functions, New York: Dover Publications. Fader, Peter S. and Bruce G.S. Hardie 25, A Note on Deriving the Pareto/NBD Model and Related Expressions, <http://brucehardie.com/notes/9/>. Fader, Peter S. and Bruce G.S. Hardie 26, Deriving an Expression for PXt = x Under the Pareto/NBD Model, <http://brucehardie.com/notes/2/>. 8