A note on the best attainable rates of convergence for estimates of the shape parameter of regular variation

Meitner Cadena*

*UPMC Paris 6 & CREAR, ESSEC Business School; E-mail: meitner.cadena@etu.upmc.fr or b454799@essec.edu or meitner.cadena@gmail.com

arXiv:1510.03617v1 [stat.ME] 13 Oct 2015

October 14, 2015

Abstract

Hall and Welsh gave in 1984 the lowest bound so far to rates of convergence for estimates of the shape parameter of regular variation. We show that this bound can be improved.

Keyword: Estimating parameters of regular variation

Classification: 62G05, 62G20

Hall and Welsh (hereafter HW) gave in 1984 a first lower bound on the accuracy of tail index estimation for a large class of distributions. Since then, this result has been a reference for evaluating rates of convergence of other estimators of this parameter, and has motivated extensions to other classes of distributions (see e.g. [1], [2], [5], [6], [7]).

Let $F$ be a differentiable distribution function (df) defined on the positive half-line such that, for positive constants $\alpha$, $\beta$, $C$ and $A$,

$$F'(x) = C\alpha x^{\alpha-1}\{1 + r(x)\}, \quad \text{where } |r(x)| \le A x^{\beta}, \tag{1}$$

as $x \to 0^{+}$. Considering this type of dfs, HW [4] showed in 1984 that no estimator of $\alpha$ converges at a faster rate than $n^{-\beta/(2\beta+\alpha)}$ on certain neighborhoods of Pareto distributions (see e.g. [7] or [2]). More precisely, these authors defined classes $D = D(\alpha_0, C_0, \epsilon, \rho, A)$ of dfs $F$ satisfying (1) and, in addition, $|\alpha - \alpha_0| \le \epsilon$, $|C - C_0| \le \epsilon$ and $\rho = \beta/\alpha$, for some given positive constants $\alpha_0$, $C_0$, $\epsilon$, $\rho$ and $A$. Let $\beta_0 = \rho\alpha_0$. Then it was shown (see Theorem 1 in [4]) that, if $\alpha_n$ is an estimator of $\alpha$, constructed out of a random $n$-sample $X_1, \ldots, X_n$, satisfying

$$\lim_{n\to\infty} \inf_{F \in D} P_F\big(|\alpha_n - \alpha| \le a_n\big) = 1, \tag{2}$$

then

$$\lim_{n\to\infty} n^{\beta_0/(2\beta_0+\alpha_0)}\, a_n = \infty.$$

We find that the proof of this result, developed by HW, allows one, after some adequate modifications, to also prove
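Theorem 1. Suppose that for some $\alpha_0$, $C_0$, $\epsilon$ and $\rho$, we have (2) for all $A > 0$. Then, for all $\nu \ge \beta_0/(2\beta_0+\alpha_0)$,

$$\lim_{n\to\infty} n^{\nu} a_n = \infty.$$

This means that $n^{-\beta/(2\beta+\alpha)} = n^{-\beta_0/(2\beta_0+\alpha_0)} = n^{-\rho/(2\rho+1)}$ (because $\rho = \beta/\alpha = \beta_0/\alpha_0$) is no longer a lower bound to convergence rates for estimators of shape parameters in distributions with regularly varying tails, as claimed by Theorem 1 given in [4].

The proof of Theorem 1 is the same as the one given by HW to prove Theorem 1 in [4], but with two parameters conveniently redefined. We present these redefinitions and show how, with these changes, the original proof can still be applied. In order to have a self-contained paper, we copy almost all of the proof of Theorem 1 in [4]. The main changes in that proof are pointed out.

Let $\nu \ge \beta_0/(2\beta_0+\alpha_0)$. For proving Theorem 1 in [4], HW started by constructing two densities $f_0$ and $f_1$, the first governed by fixed parameters $\alpha_0$, $C_0$ and the second by varying parameters $\alpha_1$, $C_1$, $C_2$, where $\alpha_1 = \alpha_0 + \tilde\gamma$, $\tilde\gamma = \lambda n^{-\nu}$, $\lambda > 0$, $\beta_1 = \rho\alpha_1$, and both $C_1, C_2 \to C_0$ as $n \to \infty$. Here we point out that we use $\tilde\gamma$ instead of $\gamma$. HW used $\gamma$ in the proof of Theorem 1 in [4], where these authors defined it as $\gamma = \lambda n^{-\beta_1/(2\beta_1+\alpha_1)}$.

Specifically, HW defined

$$f_0(x) = C_0\alpha_0 x^{\alpha_0-1}, \quad 0 \le x \le C_0^{-1/\alpha_0},$$

and

$$f_1(x) = \begin{cases} C_1\alpha_1 x^{\alpha_1-1} + \Delta(x), & 0 \le x \le \tilde\delta, \\ C_2\alpha_0 x^{\alpha_0-1}, & \tilde\delta < x \le C_0^{-1/\alpha_0}, \end{cases}$$

where $\tilde\delta = n^{-\nu/\beta_1}$, $k = \alpha_1 + \beta_1 - 1$ and

$$\Delta(x) = \begin{cases} x^{k}, & 0 \le x \le \tilde\delta/4, \\ (\tilde\delta/2 - x)^{k}, & \tilde\delta/4 < x \le \tilde\delta/2, \\ -(x - \tilde\delta/2)^{k}, & \tilde\delta/2 < x \le 3\tilde\delta/4, \\ -(\tilde\delta - x)^{k}, & 3\tilde\delta/4 < x \le \tilde\delta. \end{cases}$$

Here we point out that we use $\tilde\delta$ instead of $\delta$. HW used $\delta$ in the proof of Theorem 1 in [4], where these authors defined it as $\delta = n^{-1/(2\beta_1+\alpha_1)}$.

One can note that $\Delta$ is continuous on $[0, \tilde\delta]$, and that $\Delta(0) = \Delta(\tilde\delta) = \int_0^{\tilde\delta} \Delta(x)\,dx = 0$. HW chose the constants $C_1$, $C_2$ so that, for large $n$, $f_1$ is a proper, continuous density on $[0, C_0^{-1/\alpha_0}]$; that is,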
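$$C_1\alpha_1\tilde\delta^{\alpha_1-1} = C_2\alpha_0\tilde\delta^{\alpha_0-1} \tag{3}$$

and

$$C_1\tilde\delta^{\alpha_1} + C_2\big(C_0^{-1} - \tilde\delta^{\alpha_0}\big) = 1. \tag{4}$$

Note that from (3)

$$\lim_{n\to\infty} \frac{C_1}{C_2} = \lim_{n\to\infty} \frac{\alpha_0}{\alpha_1}\,\tilde\delta^{-\tilde\gamma} = 1,$$

and from (3) and (4)

$$C_2 - C_0 = C_0\big(C_2\tilde\delta^{\alpha_0} - C_1\tilde\delta^{\alpha_1}\big) = C_0 C_1 \tilde\delta^{\alpha_1}\Big(\frac{\alpha_1}{\alpha_0} - 1\Big) = C_0 C_1 \alpha_0^{-1}\,\tilde\gamma\,\tilde\delta^{\alpha_1}, \tag{5}$$

which gives

$$\lim_{n\to\infty} \frac{C_2 - C_0}{C_1 C_0} = \lim_{n\to\infty} \frac{\tilde\delta^{\alpha_1}\tilde\gamma}{\alpha_0} = 0.$$

This guarantees that $C_1, C_2 \to C_0$ as $n \to \infty$, as required for $C_1$ and $C_2$.

Then, the proof given by HW consisted initially of showing that, as $n \to \infty$,

$$\int_0^{C_0^{-1/\alpha_0}} \{f_0(x) - f_1(x)\}^2 \{f_0(x)\}^{-1}\,dx = O(n^{-1}) \tag{6}$$

and, for all large $n$,

$$f_1 \in D(\alpha_0, C_0, \epsilon, \rho, A). \tag{7}$$

Note that trivially $f_0 \in D$. The symbol $K$ denotes a positive generic constant.

By (5), as $n \to \infty$,

$$|C_2 - C_0| = O\big(\tilde\gamma\,\tilde\delta^{\alpha_1}\big). \tag{8}$$

We also have, as $n \to \infty$, $\alpha_1 - \alpha_0 = \tilde\gamma$ and

$$\int_0^{\tilde\delta} \big\{C_0\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1}\big\}^2 \big\{C_0\alpha_0 x^{\alpha_0-1}\big\}^{-1} dx = O\Big(\tilde\delta^{\alpha_1}\big(C_0 - C_1\tilde\delta^{\tilde\gamma}\big)^2 + \tilde\gamma^2\,\tilde\delta^{\alpha_1}\Big); \tag{9}$$

and, using (3) and (5),

$$\big|C_0 - C_1\tilde\delta^{\tilde\gamma}\big| = \Big|C_0 - C_2 + C_2\Big(1 - \frac{\alpha_0}{\alpha_1}\Big)\Big| = \frac{C_1\,\tilde\gamma\,\tilde\delta^{\tilde\gamma}}{\alpha_0}\,\big|1 - C_0\tilde\delta^{\alpha_0}\big| = O(\tilde\gamma). \tag{10}$$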
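The normalisation conditions (3) and (4) and the identity (5) can be checked numerically. The following is a minimal Python sketch, not part of HW's or the author's argument: the function names `hw_constants`, `bump` and `midpoint_integral` and the parameter values in the usage lines are hypothetical choices made for illustration, and the closed form for $C_2$ is obtained here by substituting (3) into (4).

```python
import math


def hw_constants(alpha0, C0, rho, lam, nu, n):
    """Solve (3)-(4) for C1 and C2 with the modified gamma~ and delta~.

    Illustrative helper (not from the paper): returns
    (gamma, alpha1, beta1, delta, C1, C2).
    """
    gamma = lam * n ** (-nu)        # gamma~ = lambda * n^(-nu)
    alpha1 = alpha0 + gamma         # alpha_1 = alpha_0 + gamma~
    beta1 = rho * alpha1            # beta_1 = rho * alpha_1
    delta = n ** (-nu / beta1)      # delta~ = n^(-nu / beta_1)
    # (3) gives C1 = C2 * (alpha0/alpha1) * delta^(-gamma); inserting this
    # into (4) yields C2 * (1/C0 - delta^alpha0 * gamma/alpha1) = 1.
    C2 = 1.0 / (1.0 / C0 - delta ** alpha0 * gamma / alpha1)
    C1 = C2 * (alpha0 / alpha1) * delta ** (-gamma)
    return gamma, alpha1, beta1, delta, C1, C2


def bump(x, delta, k):
    """The perturbation Delta(x) on [0, delta]; zero outside that interval."""
    if x < 0.0 or x > delta:
        return 0.0
    if x <= delta / 4:
        return x ** k
    if x <= delta / 2:
        return (delta / 2 - x) ** k
    if x <= 3 * delta / 4:
        return -((x - delta / 2) ** k)
    return -((delta - x) ** k)


def midpoint_integral(f, a, b, m=20000):
    """Composite midpoint rule with m equal subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))


if __name__ == "__main__":
    # Assumed example values: alpha0 = C0 = rho = lambda = 1, nu = 1/3.
    alpha0, C0 = 1.0, 1.0
    g, a1, b1, d, C1, C2 = hw_constants(alpha0, C0, 1.0, 1.0, 1.0 / 3.0, 10 ** 4)
    k = a1 + b1 - 1.0
    total_mass = (C1 * d ** a1
                  + midpoint_integral(lambda x: bump(x, d, k), 0.0, d)
                  + C2 * (1.0 / C0 - d ** alpha0))
    print("total mass of f1:", total_mass)
    print("lhs of (5):", C2 - C0)
    print("rhs of (5):", C0 * C1 * g * d ** a1 / alpha0)
```

With these assumed values, $k = \alpha_1 + \beta_1 - 1 > 0$, so the perturbation vanishes at both endpoints, and the two sides of (5) should agree up to floating-point error.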
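Next, observing that

$$\int_0^{\tilde\delta} \Delta(x)^2\, x^{-\alpha_0+1}\,dx \le K \int_0^{\tilde\delta} x^{2k-\alpha_0+1}\,dx = O\big(\tilde\delta^{\alpha_1+2\beta_1}\big) \tag{11}$$

and

$$\int_0^{C_0^{-1/\alpha_0}} \frac{\{f_0(x) - f_1(x)\}^2}{f_0(x)}\,dx = \int_0^{\tilde\delta} \frac{\{C_0\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1} - \Delta(x)\}^2}{C_0\alpha_0 x^{\alpha_0-1}}\,dx + \int_{\tilde\delta}^{C_0^{-1/\alpha_0}} \frac{\{C_0\alpha_0 x^{\alpha_0-1} - C_2\alpha_0 x^{\alpha_0-1}\}^2}{C_0\alpha_0 x^{\alpha_0-1}}\,dx$$

$$\le 2\int_0^{\tilde\delta} \frac{\{C_0\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1}\}^2}{C_0\alpha_0 x^{\alpha_0-1}}\,dx + 2\int_0^{\tilde\delta} \frac{\Delta(x)^2}{C_0\alpha_0 x^{\alpha_0-1}}\,dx + \frac{(C_0 - C_2)^2}{C_0} \int_{\tilde\delta}^{C_0^{-1/\alpha_0}} \alpha_0 x^{\alpha_0-1}\,dx,$$

then introducing (8), combining (9) and (10), and using (11), give

$$\int_0^{C_0^{-1/\alpha_0}} \{f_0(x) - f_1(x)\}^2 \{f_0(x)\}^{-1}\,dx = O\big(\tilde\gamma^2\,\tilde\delta^{\alpha_1} + \tilde\delta^{2\beta_1+\alpha_1}\big).$$

The result (6) immediately follows on taking $\gamma$ and $\delta$ instead of $\tilde\gamma$ and $\tilde\delta$, as in the proof of Theorem 1 given in [4]. Considering $\tilde\gamma$ and $\tilde\delta$, we now have

$$O\big(\tilde\gamma^2\,\tilde\delta^{\alpha_1} + \tilde\delta^{2\beta_1+\alpha_1}\big) = O\big(\lambda^2 n^{-2\nu - \nu\alpha_1/\beta_1} + n^{-\nu(2\beta_1+\alpha_1)/\beta_1}\big) = O\big(n^{-\nu(2\beta_1+\alpha_1)/\beta_1}\big),$$

and (6) then follows too, since

$$\frac{\beta_1}{2\beta_1+\alpha_1} = \frac{\beta_0}{2\beta_0+\alpha_0} \le \nu.$$

The result (7) will follow if we prove that

$$\big|C_2\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1}\big| \le K x^{\alpha_1+\beta_1-1} \tag{12}$$

uniformly in $\tilde\delta < x \le C_0^{-1/\alpha_0}$ and large $n$. By (5), and since $\tilde\delta < x$,

$$\big|C_2\alpha_0 x^{\alpha_0-1} - C_0\alpha_0 x^{\alpha_0-1}\big| = \alpha_0 x^{\alpha_0-1}\,|C_2 - C_0| = C_0 C_1 \alpha_0^{-1}\,\tilde\gamma\,\tilde\delta^{\alpha_1}\,\alpha_0 x^{\alpha_0-1}$$

$$= K n^{-\nu}\, n^{-\nu(\alpha_1-\beta_1-\tilde\gamma)/\beta_1}\, \tilde\delta^{\tilde\gamma+\beta_1}\, x^{\alpha_0-1} \le K n^{-\nu\alpha_0/\beta_1}\, x^{\alpha_1+\beta_1-1},$$

and so (12) will follow if we show that, for $\tilde\delta < x \le C_0^{-1/\alpha_0}$,

$$\big|C_0\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1}\big| \le K x^{\alpha_1+\beta_1-1}. \tag{13}$$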
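But, using (3) and (4) gives, by $\tilde\gamma = \lambda\tilde\delta^{\beta_1}$,

$$C_0\alpha_0 = C_0 C_1 \tilde\delta^{\alpha_1}(\alpha_0 - \alpha_1) + C_1\alpha_1\tilde\delta^{\tilde\gamma},$$

and

$$\big|C_0\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1}\big| = x^{\alpha_1-1}\,\big|C_0 C_1 \tilde\delta^{\alpha_1}(\alpha_0 - \alpha_1)\,x^{-\tilde\gamma} + C_1\alpha_1\tilde\delta^{\tilde\gamma} x^{-\tilde\gamma} - C_1\alpha_1\big|$$

$$\le K_1 x^{\alpha_1-1}\,\tilde\gamma\,\tilde\delta^{\alpha_1-\tilde\gamma} + K_2 x^{\alpha_1-1}\,\big|1 - (\tilde\delta/x)^{\tilde\gamma}\big| \le K_3 x^{\alpha_1+\beta_1-1}\,\tilde\delta^{\alpha_1} + K_4 x^{\alpha_1-1}\,\tilde\gamma \log(x/\tilde\delta). \tag{14}$$

Now,

$$x^{-\beta_1}\,\tilde\gamma \log(x/\tilde\delta) = \lambda\,(x/\tilde\delta)^{-\beta_1} \log(x/\tilde\delta)$$

is maximized by taking $x/\tilde\delta = e^{1/\beta_1}$. Therefore, by (14),

$$\big|C_0\alpha_0 x^{\alpha_0-1} - C_1\alpha_1 x^{\alpha_1-1}\big| \le K_3 x^{\alpha_1+\beta_1-1}\,\tilde\delta^{\alpha_1} + K_5 x^{\alpha_1+\beta_1-1} \le K_6 x^{\alpha_1+\beta_1-1}$$

uniformly in $\tilde\delta < x \le C_0^{-1/\alpha_0}$. This proves (13), and completes the proof of (7).

For what follows, the proof of Theorem 1 in [4] was inspired by Farrell (1972) [3]. Using the Cauchy-Schwarz inequality, we have

$$P_{f_1}\big(|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big) = E_{f_0}\Big[ I\big\{|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big\} \prod_{i=1}^{n} \frac{f_1(X_i)}{f_0(X_i)} \Big]$$

$$\le \Big[P_{f_0}\big(|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big)\Big]^{1/2} \Big[E_{f_0}\Big\{\prod_{i=1}^{n} \Big(\frac{f_1(X_i)}{f_0(X_i)}\Big)^{2}\Big\}\Big]^{1/2}, \tag{15}$$

and, using (6),

$$\Big[E_{f_0}\Big\{\prod_{i=1}^{n} \Big(\frac{f_1(X_i)}{f_0(X_i)}\Big)^{2}\Big\}\Big]^{1/n} = \int_0^{C_0^{-1/\alpha_0}} \frac{f_1(x)^2}{f_0(x)}\,dx = 1 + \int_0^{C_0^{-1/\alpha_0}} \frac{\{f_1(x) - f_0(x)\}^2}{f_0(x)}\,dx = 1 + O(n^{-1}).$$

Hence

$$P_{f_1}\big(|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big) \le K\,\Big[P_{f_0}\big(|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big)\Big]^{1/2}. \tag{16}$$

By hypothesis and by (7), the left-hand side of (16) tends to 1 as $n \to \infty$. Therefore $P_{f_0}\big(|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big)$ is bounded away from zero as $n \to \infty$. Also by hypothesis, $P_{f_0}\big(|\alpha_n(X_1,\ldots,X_n) - \alpha_0| \le a_n\big)$ tends to 1 as $n \to \infty$, so

$$P_{f_0}\Big(\big\{|\alpha_n(X_1,\ldots,X_n) - \alpha_1| \le a_n\big\} \cap \big\{|\alpha_n(X_1,\ldots,X_n) - \alpha_0| \le a_n\big\}\Big)$$

is bounded away from zero. Consequently, for large $n$, $|\alpha_1 - \alpha_0| \le 2a_n$, that is, $\tilde\gamma = \lambda n^{-\nu} \le 2a_n$, so

$$\liminf_{n\to\infty} n^{\nu} a_n \ge \frac{\lambda}{2}.$$

Since this is true for each $\lambda > 0$, Theorem 1 is proved.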
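The exponent arithmetic behind the bound used for (6) can be made concrete. The following is a small Python sketch under stated assumptions: the helper names `rate_exponent`, `threshold` and `bound_terms` are hypothetical, and $\alpha_1$, $\beta_1$ are treated as fixed inputs rather than recomputed from $n$. It evaluates the two terms $\tilde\gamma^2\tilde\delta^{\alpha_1}$ and $\tilde\delta^{2\beta_1+\alpha_1}$ and the common exponent $\nu(2\beta_1+\alpha_1)/\beta_1$, which is at least 1 exactly when $\nu \ge \rho/(2\rho+1)$.

```python
import math


def rate_exponent(nu, alpha1, beta1):
    """Exponent e such that both terms of the bound are of order n^(-e)."""
    return nu * (2.0 * beta1 + alpha1) / beta1


def threshold(rho):
    """The value rho / (2 rho + 1) = beta / (2 beta + alpha) when beta = rho * alpha."""
    return rho / (2.0 * rho + 1.0)


def bound_terms(n, nu, lam, alpha1, beta1):
    """Return (gamma~^2 * delta~^alpha1, delta~^(2 beta1 + alpha1))."""
    gamma = lam * n ** (-nu)       # gamma~ = lambda * n^(-nu)
    delta = n ** (-nu / beta1)     # delta~ = n^(-nu / beta_1)
    return gamma ** 2 * delta ** alpha1, delta ** (2.0 * beta1 + alpha1)


if __name__ == "__main__":
    # Assumed example values: alpha1 = 2, beta1 = 1, so rho = 1/2.
    alpha1, beta1 = 2.0, 1.0
    rho = beta1 / alpha1
    for nu in (threshold(rho), 0.5, 1.0):
        e = rate_exponent(nu, alpha1, beta1)
        t1, t2 = bound_terms(1000, nu, 2.0, alpha1, beta1)
        print(f"nu={nu:.3f}  exponent={e:.3f}  terms=({t1:.3e}, {t2:.3e})")
```

At $\nu = \rho/(2\rho+1)$ the exponent equals 1, which corresponds to the $O(n^{-1})$ in (6); for larger $\nu$ both terms decay faster.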
Acknowledgments

The author gratefully acknowledges the support of SWISS LIFE through its ESSEC research program on "Consequences of the population ageing on the insurances loss".

References

[1] Beirlant, Jan, Bouquiaux, Christel and Werker, Bas J. M. (2006). Semiparametric lower bounds for tail index estimation. J. Stat. Plan. Inference 136, 705-729.

[2] Drees, Holger (1998). Optimal Rates of Convergence for Estimates of the Extreme Value Index. Ann. Stat. 26, 434-448.

[3] Farrell, R. H. (1972). On the Best Obtainable Asymptotic Rates of Convergence in Estimation of a Density Function at a Point. Ann. Math. Stat. 43, 170-180.

[4] Hall, Peter and Welsh, A. H. (1984). Best Attainable Rates of Convergence for Estimates of Parameters of Regular Variation. Ann. Stat. 12, 1079-1084.

[5] Novak, S. Y. (2014). On the Accuracy of Inference on Heavy-Tailed Distributions. Theory Probab. Appl. 58, 509-518.

[6] Pfanzagl, J. (2000). On local uniformity for estimators and confidence limits. J. Stat. Plan. Inference 84, 27-53.

[7] Smith, Richard L. (1987). Estimating Tails of Probability Distributions. Ann. Stat. 15, 1174-1207.