Combining cluster sampling and link-tracing sampling to estimate the size of a hidden population: asymptotic properties of the estimators

arXiv:1506.00690v1 [stat.ME] 2 Jun 2015

Martín H. Félix Medina

Technical report Number: FCFM-UAS-2015- Class: Research June, 2015

Facultad de Ciencias Físico-Matemáticas, Universidad Autónoma de Sinaloa, Ciudad Universitaria, Culiacán Sinaloa, México
Abstract

Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling to estimate the size of a hidden population such as drug users, sex workers or homeless people. In their variant a sampling frame of sites where the members of the population tend to gather is constructed. The frame is not assumed to cover the whole population, but only a portion of it. A simple random sample of sites is selected; the people in the sampled sites are identified and are asked to name other members of the population, who are added to the sample. Those authors proposed maximum likelihood estimators of the population size derived from a multinomial model for the numbers of people found in the sampled sites and a model which assumes that the probability that a person is named by any element in a particular sampled site (link-probability) does not depend on the named person, that is, that the probabilities are homogeneous. Later, Félix-Medina et al. (2015) proposed unconditional and conditional maximum likelihood estimators of the population size derived from a model that takes into account the heterogeneity of the link-probabilities. In this work we consider this sampling design and set conditions for a general model for the link-probabilities that guarantee the consistency and asymptotic normality of the estimators of the population size and of the estimators of the parameters of the model for the link-probabilities. In particular, we show that both the unconditional and conditional maximum likelihood estimators of the population size are consistent and have asymptotic normal distributions which are different from each other.

Key words: Asymptotic normality, capture-recapture, chain referral sampling, hard-to-detect population, maximum likelihood estimator, snowball sampling

mhfelix@uas.edu.mx
1 Introduction

Conventional sampling methods are not appropriate for sampling hidden or hard-to-reach human populations, such as drug users, sex workers and homeless people, because of the lack of suitable sampling frames. For this reason, several specific sampling methods for this type of population have been proposed. See Magnani et al. (2005) and Kalton (2009) for reviews of some of them. One of these methods is snowball sampling, also known as link-tracing sampling (LTS) or chain referral sampling. In LTS an initial sample of members of the population is selected and the sample size is increased by asking the people in the initial sample to name other members of the population. The named people who are not in the initial sample are added to the sample and they are asked to name other members of the population. The sampling process may continue in this way until a stopping rule is satisfied. For reviews of several variants of LTS see Spreen (1992), Thompson and Frank (2000) and Johnston and Sabin (2010).

Félix-Medina and Thompson (2004) proposed a variant of LTS to estimate the size of a hidden population. In their variant they supposed that a sampling frame of sites where the members of the target population tend to gather can be constructed. Examples of sites are public parks, bars and blocks. It is worth noting that they did not suppose that the frame covers the whole population, but only a portion of it. Then an initial sample of sites is selected by a simple random sampling without replacement design and the members of the population who belong to the sampled sites are identified. Finally, the people in the initial sample are asked to name other members of the population and the named persons who are not in the initial sample are included in the sample. Those authors proposed models to describe the number of members of the population who belong to each site in the frame and to describe the probability that a person is linked to a sampled site, that is, that he or she was named by at least one person who belongs to that site.
From those models they derived maximum likelihood estimators of the population size. In that work those authors considered that the probability that a person is linked to a site (link-probability) does not depend on the person, but does depend on the site, that is, they considered homogeneous link-probabilities. Félix-Medina and Monjardin (2006) considered this same variant of LTS and derived estimators of the population size using a Bayesian-assisted approach, that is, they derived the estimators using the Bayesian approach, but the inferences were made under a frequentist approach. Those authors considered a homogeneous two-stage normal model for the logits of the link-probabilities. Later, Félix-Medina et al. (2015) extended the work by Félix-Medina and Thompson (2004) to the case in which the link-probabilities are heterogeneous, that is, they depend on the named people. Those authors modeled the heterogeneity of the link-probabilities by means of a mixed logistic normal model proposed by Coull and Agresti (1999) in the context of capture-recapture studies. From this model they derived unconditional and conditional maximum likelihood estimators of the population size.

In this work we consider the variant of LTS proposed by Félix-Medina and Thompson (2004) and a general model for the link-probabilities from which we derive the forms of the
unconditional and conditional maximum likelihood estimators of the population size. We state conditions that guarantee the consistency and asymptotic normality of both types of estimators, and we propose estimators of the variances of the estimators of the population size. It is worth noting that our work is based on that by Sanathanan (1972), in which she derived asymptotic properties of both unconditional and conditional maximum likelihood estimators of the size of a multinomial distribution from an incomplete observation of the cell totals, a situation that occurs in capture-recapture studies. Thus, our work is basically an adaptation of that by Sanathanan (1972) to the estimators used in the sampling variant proposed by Félix-Medina and Thompson (2004).

The structure of this document is as follows. In Section 2 we describe the variant of LTS proposed by Félix-Medina and Thompson (2004). In Section 3 we present probability models that describe the numbers of people that belong to the sites in the frame and the probabilities of links between the members of the population and the sites. From these models we construct the likelihood function that allows us to derive the unconditional and conditional maximum likelihood estimators of the parameters of the assumed model for the link-probabilities and of the population size. In addition, we present conditions that guarantee the consistency of the proposed estimators. In Section 4, which is the central part of this paper, we define the asymptotic framework under which the asymptotic properties of the proposed estimators are derived. In Section 5 we propose a method for estimating the variance-covariance matrices of the estimators of the different vectors of parameters that appear in the assumed models. Finally, in Section 6 we discuss some points to be considered when the results of this paper are applied in actual situations.

2 Link-tracing sampling design

In this section we will describe the LTS variant proposed by Félix-Medina and Thompson (2004). Thus, let $U$ be a finite population of $\tau$ people.
Let $U_1$ be the portion of $U$ that is covered by a sampling frame of $N$ sites $A_1,\ldots,A_N$, which are places where members of the population tend to gather. We will assume that each one of the $\tau_1$ persons who are in $U_1$ belongs to only one site $A_i$ in the frame. Notice that this does not imply that a person cannot be found in distinct places, but that, as in ordinary cluster sampling, the researcher has a criterion that allows him or her to assign a person to only one site. Let $M_i$ be the number of people in $U_1$ that belong to the site $A_i$, $i=1,\ldots,N$. The previous assumption implies that $\tau_1=\sum_{i=1}^{N}M_i$. Let $\tau_2=\tau-\tau_1$ be the number of people that belong to the portion $U_2=U-U_1$ of $U$ that is not covered by the sampling frame.

The sampling procedure is as follows. An initial simple random sample without replacement (SRSWOR) $S_A$ of $n$ sites $A_1,\ldots,A_n$ is selected from the frame and the members of the population who belong to each sampled site are identified. Let $S_0$ be the set of people in the initial sample. Notice that the size of $S_0$ is $M=\sum_{i=1}^{n}M_i$. Then, from each sampled site $A_i$, $i=1,\ldots,n$, the people who belong to that site are asked to name other members of the population. A person and a sampled site are said to be linked if any of the persons who belong to that site names that person. Let $S_1$ and $S_2$ be the sets of people in $U_1-S_0$ and in
$U_2$, respectively, who are linked to at least one site in $S_A$. Finally, from each named person the following information is obtained: the portion of $U$ where that person is located, that is, $U_1-S_0$, $A_i\in S_A$ or $U_2$, and the subset of sampled sites that are linked to him or her.

3 Unconditional and conditional maximum likelihood estimators

3.1 Probability models

As in Félix-Medina and Thompson (2004), we will suppose that the numbers $M_1,\ldots,M_N$ of people who belong to the sites $A_1,\ldots,A_N$ are independent Poisson random variables with mean $\lambda_1$. Therefore, the joint conditional distribution of $(M_1,\ldots,M_n,\tau_1-M)$ given that $\sum_{i=1}^{N}M_i=\tau_1$ is multinomial with probability mass function (pmf):

$$f(m_1,\ldots,m_n,\tau_1-m\mid\tau_1)=\frac{\tau_1!}{\prod_{i=1}^{n}m_i!\,(\tau_1-m)!}\left(\frac{1}{N}\right)^{m}\left(1-\frac{n}{N}\right)^{\tau_1-m}. \tag{1}$$

To model the links between the members of the population and the sampled sites we will define for person $j$ in $U_k-S_0$ the vector of link-indicator variables $\mathbf{X}_j^{(k)}=(X_{1j}^{(k)},\ldots,X_{nj}^{(k)})$, where $X_{ij}^{(k)}=1$ if person $j$ is linked to site $A_i$ and $X_{ij}^{(k)}=0$ otherwise. Notice that $\mathbf{X}_j^{(k)}$ indicates which sites in $S_A$ are linked to person $j$. We will suppose that given $S_A$, and consequently the values $M_i$'s of the sampled sites, the $X_{ij}^{(k)}$'s are Bernoulli random variables with means $p_{ij}^{(k)}$'s and that the vectors $\mathbf{X}_j^{(k)}$ are independent. Let $\Omega=\{(x_1,\ldots,x_n): x_i=0,1;\ i=1,\ldots,n\}$, that is, the set of all the $n$-dimensional vectors such that each one of their elements is 0 or 1. For $\mathbf{x}=(x_1,\ldots,x_n)\in\Omega$ we will denote by $\pi_{\mathbf{x}}^{(k)}$ the probability that the vector of link-indicator variables associated with a randomly selected person from $U_k-S_0$ equals $\mathbf{x}$, that is, the probability that the person is linked only to the sites $A_i$ such that the $i$-th element of $\mathbf{x}$ equals 1. We will suppose that $\pi_{\mathbf{x}}^{(k)}$ depends on a $q_k$-dimensional parameter $\theta_k=(\theta_1^{(k)},\ldots,\theta_{q_k}^{(k)})\in\Theta_k\subseteq\mathbb{R}^{q_k}$, that is, $\pi_{\mathbf{x}}^{(k)}=\pi_{\mathbf{x}}^{(k)}(\theta_k)$, $k=1,2$. In this work we will assume that $\theta_k$ does not depend on the observed $M_i$'s.

Similarly, for person $j$ in $A_i\in S_A$, we will define the vector of link-indicator variables $\mathbf{X}_j^{(A_i)}=(X_{1j}^{(A_i)},\ldots,X_{i-1,j}^{(A_i)},X_{i+1,j}^{(A_i)},\ldots,X_{nj}^{(A_i)})$, where $X_{i'j}^{(A_i)}=1$ if person $j$ is linked to site $A_{i'}$, $i'\neq i$, $i'=1,\ldots,n$, and $X_{i'j}^{(A_i)}=0$ otherwise. We will suppose that given $S_A$ the $X_{i'j}^{(A_i)}$'s are Bernoulli random variables with means $p_{i'j}^{(A_i)}$'s and that the vectors $\mathbf{X}_j^{(A_i)}$ are independent.
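For each $A_i\in S_A$, let $\Omega_{-i}=\{(x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_n): x_j=0,1;\ j\neq i,\ j=1,\ldots,n\}$, that is, the set of all $(n-1)$-dimensional vectors obtained from the vectors in $\Omega$ by omitting their $i$-th coordinate. For $\mathbf{x}=(x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_n)\in\Omega_{-i}$ we will denote by $\pi_{\mathbf{x}}^{(A_i)}$ the probability that the vector of link-indicator variables associated with a randomly selected person from $A_i$ equals $\mathbf{x}$. We will suppose that $\pi_{\mathbf{x}}^{(A_i)}$ depends on the $q_1$-dimensional parameter $\theta_1=(\theta_1^{(1)},\ldots,\theta_{q_1}^{(1)})\in\Theta_1$, that is, $\pi_{\mathbf{x}}^{(A_i)}=\pi_{\mathbf{x}}^{(A_i)}(\theta_1)$, $i=1,\ldots,n$.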
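For instance, Félix-Medina and Monjardin (2006) modeled the link-probability between person $j$ in $U_k-A_i$ and site $A_i\in S_A$ by $p_{ij}^{(k)}=\Pr(X_{ij}^{(k)}=1\mid S_A)=\exp(\alpha_i^{(k)})/[1+\exp(\alpha_i^{(k)})]$, where the conditional distribution of $\alpha_i^{(k)}$ given $\psi_k$ is normal with mean $\psi_k$ and variance $\sigma_k^2$, which we denote by $\alpha_i^{(k)}\mid\psi_k\sim N(\psi_k,\sigma_k^2)$, and $\psi_k\sim N(\mu_k,\gamma_k^2)$. Thus, in this case $\theta_k=(\mu_k,\gamma_k,\sigma_k)\in\Theta_k=\mathbb{R}\times(0,\infty)\times(0,\infty)$, and

$$\pi_{\mathbf{x}}^{(k)}(\theta_k)=\left[\iint\frac{\exp(\alpha)}{1+\exp(\alpha)}f_k(\alpha\mid\psi)f_k(\psi)\,d\alpha\,d\psi\right]^{t}\left[\iint\frac{1}{1+\exp(\alpha)}f_k(\alpha\mid\psi)f_k(\psi)\,d\alpha\,d\psi\right]^{n-t},$$

where $\mathbf{x}=(x_1,\ldots,x_n)\in\Omega$, $t=\sum_{i=1}^{n}x_i$, and $f_k(\alpha\mid\psi)$ and $f_k(\psi)$ denote the probability density functions of the distributions $N(\psi_k,\sigma_k^2)$ and $N(\mu_k,\gamma_k^2)$, respectively. It is worth noting that those authors did not compute $\pi_{\mathbf{x}}^{(k)}(\theta_k)$ because they followed a Bayesian approach and focused on computing the posterior distribution of the parameters.

As another example, Félix-Medina et al. (2015) modeled the link-probability between person $j$ in $U_k-A_i$ and site $A_i\in S_A$ by the following Rasch model: $p_{ij}^{(k)}=\Pr(X_{ij}^{(k)}=1\mid S_A)=\exp(\alpha_i^{(k)}+\beta_j^{(k)})/[1+\exp(\alpha_i^{(k)}+\beta_j^{(k)})]$, where $\alpha_i^{(k)}$ is a fixed (not random) effect associated with the site $A_i$ and $\beta_j^{(k)}$ is a normal random effect with mean zero and variance $\sigma_k^2$ associated with person $j$ in $U_k-A_i$. Therefore

$$\pi_{\mathbf{x}}^{(k)}(\theta_k)=\int\prod_{i=1}^{n}\frac{\exp[x_i(\alpha_i^{(k)}+\sigma_k z)]}{1+\exp(\alpha_i^{(k)}+\sigma_k z)}\,\phi(z)\,dz,$$

where $\mathbf{x}=(x_1,\ldots,x_n)\in\Omega$, $\theta_k=(\alpha_1^{(k)},\ldots,\alpha_n^{(k)},\sigma_k)\in\Theta_k=\mathbb{R}^n\times(0,\infty)$, and $\phi(\cdot)$ denotes the probability density function of the standard normal distribution. Those authors computed $\pi_{\mathbf{x}}^{(k)}(\theta_k)$ by means of a Gaussian quadrature formula.

Notice that in the first example the parameter $\theta_k$ is defined prior to the selection of the initial sample because the $\alpha_i^{(k)}$'s are a random sample from a probability distribution indexed by $\theta_k$, and consequently this parameter does not represent characteristics of the particular selected sample. On the other hand, in the second example the parameter $\theta_k$ is defined once the initial sample of sites is selected because the $\alpha_i^{(k)}$'s represent characteristics of the particular sites in $S_A$. In either case, as long as $\theta_k$ does not depend on the $M_i$'s, the results derived in this work are valid.

3.2 Likelihood function

To compute the likelihood function we will factorize it into different components.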
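Returning briefly to the Rasch-model example of Subsection 3.1, the integral defining the pattern probabilities can be approximated by Gauss-Hermite quadrature. The following is only a minimal sketch under stated assumptions: the function name `rasch_pattern_probs`, the number of nodes and the parameter values are hypothetical choices made for illustration, and this is not claimed to be the implementation used by Félix-Medina et al. (2015).

```python
import itertools

import numpy as np


def rasch_pattern_probs(alpha, sigma, n_nodes=40):
    """Approximate pi_x(theta) for every x in Omega = {0,1}^n under the Rasch model
    p_ij = exp(alpha_i + beta_j) / (1 + exp(alpha_i + beta_j)), beta_j ~ N(0, sigma^2).

    alpha, sigma and n_nodes are illustrative inputs (assumptions of this sketch).
    """
    alpha = np.asarray(alpha, dtype=float)
    # Nodes and weights for the weight function exp(-z^2 / 2).
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)  # rescale so the rule integrates against the N(0,1) density
    # p[i, k]: link probability to site i evaluated at quadrature node k.
    p = 1.0 / (1.0 + np.exp(-(alpha[:, None] + sigma * z[None, :])))
    probs = {}
    for x in itertools.product((0, 1), repeat=len(alpha)):
        xv = np.array(x)[:, None]
        # Product over sites of p^x (1 - p)^(1 - x), evaluated at each node.
        integrand = np.prod(np.where(xv == 1, p, 1.0 - p), axis=0)
        probs[x] = float(np.dot(w, integrand))
    return probs


# Hypothetical parameter values for n = 3 sampled sites.
probs = rasch_pattern_probs(alpha=[-1.0, -0.5, 0.2], sigma=0.8)
pi_zero = probs[(0, 0, 0)]  # probability of being linked to no sampled site
```

Because the integrand sums to one over $\Omega$ at every node, the approximated probabilities add up to one regardless of the number of nodes, which gives a quick sanity check on the output.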
One component, $L_{MULT}(\tau_1)$, is given by the probability of observing the particular sizes $m_1,\ldots,m_n$ of the sites in $S_A$; therefore, it is specified by the multinomial distribution (1). Two additional
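factors are given by the probabilities of the configurations of the links between the people in $U_k-S_0$, $k=1,2$, and the sites $A_i\in S_A$. To obtain those factors we will denote by $R_{\mathbf{x}}^{(k)}$, $\mathbf{x}=(x_1,\ldots,x_n)\in\Omega$, the random variable that indicates the number of distinct people in $U_k-S_0$ whose vectors of link-indicator variables are equal to $\mathbf{x}$, and by $R_k$ the random variable that indicates the number of distinct people in $U_k-S_0$ who are linked to at least one site $A_i\in S_A$. Notice that $R_k=\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}R_{\mathbf{x}}^{(k)}$, where $\mathbf{0}$ denotes the $n$-dimensional vector of zeros. Because of the assumptions we made about the vectors $\mathbf{X}_j^{(k)}$ of link-indicator variables, the conditional joint probability distribution of the variables $\{R_{\mathbf{x}}^{(1)}\}_{\mathbf{x}\in\Omega}$ given $S_A$ is a multinomial distribution with size parameter $\tau_1-m$ and probabilities $\{\pi_{\mathbf{x}}^{(1)}(\theta_1)\}_{\mathbf{x}\in\Omega}$, whereas that of the variables $\{R_{\mathbf{x}}^{(2)}\}_{\mathbf{x}\in\Omega}$ is a multinomial distribution with size parameter $\tau_2$ and probabilities $\{\pi_{\mathbf{x}}^{(2)}(\theta_2)\}_{\mathbf{x}\in\Omega}$. Therefore, the factors of the likelihood function associated with the probabilities of the configurations of links between the people in $U_k-S_0$, $k=1,2$, and the sites $A_i\in S_A$ are

$$L_1(\tau_1,\theta_1)=\frac{(\tau_1-m)!}{(\tau_1-m-r_1)!\prod_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}r_{\mathbf{x}}^{(1)}!}\prod_{\mathbf{x}\in\Omega}\left[\pi_{\mathbf{x}}^{(1)}(\theta_1)\right]^{r_{\mathbf{x}}^{(1)}}$$

and

$$L_2(\tau_2,\theta_2)=\frac{\tau_2!}{(\tau_2-r_2)!\prod_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}r_{\mathbf{x}}^{(2)}!}\prod_{\mathbf{x}\in\Omega}\left[\pi_{\mathbf{x}}^{(2)}(\theta_2)\right]^{r_{\mathbf{x}}^{(2)}}.$$

Notice that $r_{\mathbf{0}}^{(1)}=\tau_1-m-r_1$ and $r_{\mathbf{0}}^{(2)}=\tau_2-r_2$.

The last factor of the likelihood function is given by the probability of the configuration of links between the people in $S_0$ and the sites $A_i\in S_A$. To obtain this factor, we will denote by $R_{\mathbf{x}}^{(A_i)}$, $\mathbf{x}=(x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_n)\in\Omega_{-i}$, the random variable that indicates the number of distinct people in $A_i\in S_A$ such that their vectors of link-indicator variables equal $\mathbf{x}$, and by $R^{(A_i)}$ the random variable that indicates the number of distinct people in $A_i\in S_A$ who are linked to at least one site $A_j\in S_A$, $j\neq i$. Notice that $R^{(A_i)}=\sum_{\mathbf{x}\in\Omega_{-i}-\{\mathbf{0}\}}R_{\mathbf{x}}^{(A_i)}$, where $\mathbf{0}$ denotes the $(n-1)$-dimensional vector of zeros, and $R_{\mathbf{0}}^{(A_i)}=m_i-R^{(A_i)}$. Then, as in the previous cases, the conditional joint probability distribution of the variables $\{R_{\mathbf{x}}^{(A_i)}\}_{\mathbf{x}\in\Omega_{-i}}$ given $S_A$ is a multinomial distribution with size parameter $m_i$ and probabilities $\{\pi_{\mathbf{x}}^{(A_i)}(\theta_1)\}_{\mathbf{x}\in\Omega_{-i}}$.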
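The counts that enter these multinomial factors can be tabulated directly from the observed link-indicator vectors. The sketch below is illustrative only: the function name `link_pattern_counts` and the data are hypothetical, and it assumes each named person is recorded as a tuple of 0/1 link indicators, one per sampled site.

```python
from collections import Counter


def link_pattern_counts(link_vectors, n):
    """Tabulate r_x for the observed link-indicator vectors and their total r.

    link_vectors: iterable of length-n 0/1 sequences, one per named person
    (a hypothetical input format chosen for this sketch).
    """
    zero = (0,) * n
    counts = Counter(tuple(int(v) for v in vec) for vec in link_vectors)
    # People with the all-zero vector are never named, so they are not observed;
    # drop such rows defensively in case the input contains them.
    counts.pop(zero, None)
    r_total = sum(counts.values())
    return dict(counts), r_total


# Hypothetical data: five named people and n = 3 sampled sites.
observed = [(1, 0, 0), (1, 0, 0), (0, 1, 1), (1, 1, 1), (0, 1, 1)]
r_x, r_total = link_pattern_counts(observed, n=3)
```

The count for the zero vector is not observable; in the likelihood it is expressed through the unknown population size, which is what makes the size parameter estimable.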
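Therefore, the probability of the configuration of links between the people in $S_0$ and the sites $A_i\in S_A$ is given by the product of the previous multinomial probabilities (one for each $A_i\in S_A$), and consequently the factor of the likelihood function associated with that probability is

$$L_0(\theta_1)=\prod_{i=1}^{n}\frac{m_i!}{\prod_{\mathbf{x}\in\Omega_{-i}}r_{\mathbf{x}}^{(A_i)}!}\prod_{\mathbf{x}\in\Omega_{-i}-\{\mathbf{0}\}}\left[\pi_{\mathbf{x}}^{(A_i)}(\theta_1)\right]^{r_{\mathbf{x}}^{(A_i)}}\left[\pi_{\mathbf{0}}^{(A_i)}(\theta_1)\right]^{m_i-r^{(A_i)}}.$$

From the previous results we have that the likelihood function is given by

$$L(\tau_1,\tau_2,\theta_1,\theta_2)=L_{(1)}(\tau_1,\theta_1)\,L_{(2)}(\tau_2,\theta_2), \tag{2}$$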
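where

$$L_{(1)}(\tau_1,\theta_1)=L_{MULT}(\tau_1)\,L_1(\tau_1,\theta_1)\,L_0(\theta_1)\quad\text{and}\quad L_{(2)}(\tau_2,\theta_2)=L_2(\tau_2,\theta_2). \tag{3}$$

3.3 Unconditional and conditional maximum likelihood estimators of $(\tau_k,\theta_k)$

In this section we will derive unconditional and conditional maximum likelihood estimators of the parameters of the previously specified models. Henceforth we will suppose that, conditional on the initial sample $S_A$ of sites, the following regularity conditions are satisfied:

(1) $\theta_k^{*}$ is the true value of $\theta_k$.

(2) $\theta_k^{*}$ is an interior point of $\Theta_k$.

(3) $\pi_{\mathbf{x}}^{(k)}(\theta_k^{*})>0$, $\mathbf{x}\in\Omega$, and $\pi_{\mathbf{x}}^{(A_i)}(\theta_1^{*})>0$, $\mathbf{x}\in\Omega_{-i}$, $i=1,\ldots,n$.

(4) $\partial\pi_{\mathbf{x}}^{(k)}(\theta_k)/\partial\theta_j^{(k)}$, $\mathbf{x}\in\Omega$, and $\partial\pi_{\mathbf{x}}^{(A_i)}(\theta_1)/\partial\theta_j^{(1)}$, $\mathbf{x}\in\Omega_{-i}$, $i=1,\ldots,n$; $j=1,\ldots,q_k$, exist at any $\theta_k\in\Theta_k$ and $\theta_1\in\Theta_1$, and are continuous in neighborhoods of $\theta_k^{*}$ and $\theta_1^{*}$, respectively.

(5) Given a $\delta_1>0$, it is possible to find an $\varepsilon_1>0$ such that

$$\inf_{\|\theta_1-\theta_1^{*}\|>\delta_1}\left\{\left(1-\frac{n}{N}\right)\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})\ln\frac{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})/[1-\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]}{\pi_{\mathbf{x}}^{(1)}(\theta_1)/[1-\pi_{\mathbf{0}}^{(1)}(\theta_1)]}+\frac{1}{N}\sum_{i=1}^{n}\sum_{\mathbf{x}\in\Omega_{-i}}\pi_{\mathbf{x}}^{(A_i)}(\theta_1^{*})\ln\frac{\pi_{\mathbf{x}}^{(A_i)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(A_i)}(\theta_1)}\right\}\geq\varepsilon_1.$$

(6) Given a $\delta_2>0$, it is possible to find an $\varepsilon_2>0$ such that

$$\inf_{\|\theta_2-\theta_2^{*}\|>\delta_2}\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{\pi_{\mathbf{x}}^{(2)}(\theta_2^{*})}{1-\pi_{\mathbf{0}}^{(2)}(\theta_2^{*})}\ln\frac{\pi_{\mathbf{x}}^{(2)}(\theta_2^{*})/[1-\pi_{\mathbf{0}}^{(2)}(\theta_2^{*})]}{\pi_{\mathbf{x}}^{(2)}(\theta_2)/[1-\pi_{\mathbf{0}}^{(2)}(\theta_2)]}\geq\varepsilon_2.$$

Remark. For a differentiable function $f:\mathbb{R}^q\to\mathbb{R}$, the notation $\partial f(\mathbf{x}_0)/\partial x_j$ represents $\partial f(\mathbf{x})/\partial x_j|_{\mathbf{x}=\mathbf{x}_0}$.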
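The regularity conditions (1)-(4) and (6), or conditions equivalent to them, have been assumed by several authors such as Birch (1964), Rao (1973, Ch. 5), Bishop et al. (1975, Ch. 14), Sanathanan (1972) and Agresti (2002, Ch. 14), among others, in the context of deriving asymptotic properties of estimators of the parameters of models for the probabilities of a multinomial distribution. The particular form of condition (6) comes from Sanathanan (1972), who took it from the first edition of Rao (1973, Ch. 5), and it is known as a strong identifiability condition. Condition (5) is a modification of (6) to meet the requirements of our particular sampling design. In general, these conditions imply the existence and consistency of the UMLEs and CMLEs of $\theta_1$ and $\theta_2$, and that they can be obtained by differentiating the likelihood function with respect to $\theta_1$ and $\theta_2$.

3.3.1 Unconditional and conditional maximum likelihood estimators of $\tau_1$ and $\theta_1$

Let us first consider the unconditional maximum likelihood estimators (UMLEs) $\hat\tau_1^{(U)}$ and $\hat\theta_1^{(U)}$ of $\tau_1$ and $\theta_1$. The log-likelihood function of $\tau_1$ and $\theta_1$ is

$$l_{(1)}(\tau_1,\theta_1)=\ln L_{(1)}(\tau_1,\theta_1)=\ln(\tau_1!)-\ln[(\tau_1-m-r_1)!]+\tau_1\ln(1-n/N)+\sum_{\mathbf{x}\in\Omega}r_{\mathbf{x}}^{(1)}\ln\pi_{\mathbf{x}}^{(1)}(\theta_1)+\sum_{i=1}^{n}\sum_{\mathbf{x}\in\Omega_{-i}}r_{\mathbf{x}}^{(A_i)}\ln\pi_{\mathbf{x}}^{(A_i)}(\theta_1)+C,$$

where $C$ does not depend on $\tau_1$ and $\theta_1$, and recall that $r_{\mathbf{0}}^{(1)}=\tau_1-m-r_1$ and $r_{\mathbf{0}}^{(A_i)}=m_i-r^{(A_i)}$. Then, the UMLE $\hat\theta_1^{(U)}$ of $\theta_1$ is the solution to the following equations:

$$\frac{\partial l_{(1)}(\tau_1,\theta_1)}{\partial\theta_j^{(1)}}=\sum_{\mathbf{x}\in\Omega}\frac{r_{\mathbf{x}}^{(1)}}{\pi_{\mathbf{x}}^{(1)}(\theta_1)}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1)}{\partial\theta_j^{(1)}}+\sum_{i=1}^{n}\sum_{\mathbf{x}\in\Omega_{-i}}\frac{r_{\mathbf{x}}^{(A_i)}}{\pi_{\mathbf{x}}^{(A_i)}(\theta_1)}\frac{\partial\pi_{\mathbf{x}}^{(A_i)}(\theta_1)}{\partial\theta_j^{(1)}}=0,\quad j=1,\ldots,q_1. \tag{4}$$

Since $\tau_1$ is an integer we will use the ratio method to maximize $l_{(1)}(\tau_1,\theta_1)$. See Feller (1968, Ch. 3). Thus

$$\frac{L_{(1)}(\tau_1,\theta_1)}{L_{(1)}(\tau_1-1,\theta_1)}=\frac{\tau_1(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1)}{\tau_1-m-r_1}.$$

Since this ratio is greater than or equal to 1 if $\tau_1\leq(m+r_1)/[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1)]$ and it is smaller than or equal to 1 if $\tau_1$ is greater than or equal to that quantity, it follows that $\hat\tau_1^{(U)}$ is given by

$$\hat\tau_1^{(U)}=\left\lfloor\frac{M+R_1}{1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(U)})}\right\rfloor, \tag{5}$$

where $\lfloor x\rfloor$ denotes the largest integer not greater than $x$. Notice that the right-hand side of (5) is not a closed form for $\hat\tau_1^{(U)}$ since this expression depends on $\hat\theta_1^{(U)}$, which in turn depends on $\hat\tau_1^{(U)}$ through $r_{\mathbf{0}}^{(1)}$ in (4). In fact, $\hat\tau_1^{(U)}$ and $\hat\theta_1^{(U)}$ are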
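obtained by simultaneously solving the set of equations (4) and (5), which is generally done by numerical methods.

Let us now consider the conditional maximum likelihood estimators (CMLEs) $\hat\tau_1^{(C)}$ and $\hat\theta_1^{(C)}$ of $\tau_1$ and $\theta_1$. It is worth noting that this type of estimator was proposed by Sanathanan (1972) in the context of estimating the size parameter of a multinomial distribution from an incomplete observation of the cell frequencies. The approach we will follow to derive $\hat\tau_1^{(C)}$ and $\hat\theta_1^{(C)}$ is an adaptation of Sanathanan's (1972) approach to our case. Thus, from (2) we have that

$$\begin{aligned}L_1(\tau_1,\theta_1)&=f(\{r_{\mathbf{x}}^{(1)}\}_{\mathbf{x}\in\Omega}\mid\{m_i\},\tau_1,\theta_1)\\&=f(\{r_{\mathbf{x}}^{(1)}\}_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\mid r_1,\{m_i\},\tau_1,\theta_1)\,f(r_1\mid\{m_i\},\tau_1,\theta_1)\\&=\frac{r_1!}{\prod_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}r_{\mathbf{x}}^{(1)}!}\prod_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\left[\frac{\pi_{\mathbf{x}}^{(1)}(\theta_1)}{1-\pi_{\mathbf{0}}^{(1)}(\theta_1)}\right]^{r_{\mathbf{x}}^{(1)}}\times\frac{(\tau_1-m)!}{(\tau_1-m-r_1)!\,r_1!}\left[1-\pi_{\mathbf{0}}^{(1)}(\theta_1)\right]^{r_1}\left[\pi_{\mathbf{0}}^{(1)}(\theta_1)\right]^{\tau_1-m-r_1}\\&=L_{11}(\theta_1)\,L_{12}(\tau_1,\theta_1).\end{aligned} \tag{6}$$

Notice that the first factor $L_{11}(\theta_1)$ is given by the joint pmf of the multinomial distribution with size parameter $r_1$ and probabilities $\{\pi_{\mathbf{x}}^{(1)}(\theta_1)/[1-\pi_{\mathbf{0}}^{(1)}(\theta_1)]\}_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}$, and that this distribution does not depend on $\tau_1$. Note also that the second factor $L_{12}(\tau_1,\theta_1)$ is given by the pmf of the binomial distribution with size parameter $\tau_1-m$ and probability $1-\pi_{\mathbf{0}}^{(1)}(\theta_1)$. Thus, the CMLE $\hat\theta_1^{(C)}$ of $\theta_1$ is the solution to the following system of equations:

$$\frac{\partial}{\partial\theta_j^{(1)}}\ln[L_{11}(\theta_1)L_0(\theta_1)]=\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{r_{\mathbf{x}}^{(1)}}{\pi_{\mathbf{x}}^{(1)}(\theta_1)}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1)}{\partial\theta_j^{(1)}}+\frac{r_1}{1-\pi_{\mathbf{0}}^{(1)}(\theta_1)}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\theta_1)}{\partial\theta_j^{(1)}}+\sum_{i=1}^{n}\sum_{\mathbf{x}\in\Omega_{-i}}\frac{r_{\mathbf{x}}^{(A_i)}}{\pi_{\mathbf{x}}^{(A_i)}(\theta_1)}\frac{\partial\pi_{\mathbf{x}}^{(A_i)}(\theta_1)}{\partial\theta_j^{(1)}}=0,\quad j=1,\ldots,q_1. \tag{7}$$

The CMLE $\hat\tau_1^{(C)}$ of $\tau_1$ is obtained by the ratio method. Thus, since

$$\frac{L_{MULT}(\tau_1)\,L_{12}(\tau_1,\theta_1)}{L_{MULT}(\tau_1-1)\,L_{12}(\tau_1-1,\theta_1)}=\frac{\tau_1(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1)}{\tau_1-m-r_1},$$

it follows that

$$\hat\tau_1^{(C)}=\left\lfloor\frac{M+R_1}{1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(C)})}\right\rfloor. \tag{8}$$

Note that (8) is a closed form for $\hat\tau_1^{(C)}$ since $\hat\theta_1^{(C)}$ is first obtained from (7).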
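Once an estimate of the probability of the zero vector is available, the closed form (8) is a one-line computation, and the ratio method can be used to confirm that the floor value is where the likelihood ratio crosses 1. The sketch below assumes that an estimate `pi0_hat` has already been obtained from (7); the function names and the numerical inputs are hypothetical.

```python
import math


def tau1_closed_form(M, R1, n, N, pi0_hat):
    """Evaluate floor((M + R1) / (1 - (1 - n/N) * pi0_hat)), as in (8).

    pi0_hat is an assumed, previously computed estimate of the probability
    that a person in U_1 - S_0 is linked to no sampled site.
    """
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    if not 0.0 <= pi0_hat < 1.0:
        raise ValueError("pi0_hat must lie in [0, 1)")
    denom = 1.0 - (1.0 - n / N) * pi0_hat
    return math.floor((M + R1) / denom)


def likelihood_ratio(tau1, M, R1, n, N, pi0):
    """Ratio L(tau1) / L(tau1 - 1) = tau1 (1 - n/N) pi0 / (tau1 - M - R1)."""
    return tau1 * (1.0 - n / N) * pi0 / (tau1 - M - R1)


# Hypothetical inputs: 120 people in the sampled sites, 60 linked people outside them.
tau_hat = tau1_closed_form(M=120, R1=60, n=5, N=20, pi0_hat=0.6)
ratio_at_hat = likelihood_ratio(tau_hat, 120, 60, 5, 20, 0.6)
ratio_after = likelihood_ratio(tau_hat + 1, 120, 60, 5, 20, 0.6)
```

With these inputs the ratio is at least 1 at `tau_hat` and below 1 at `tau_hat + 1`, which is the behaviour the ratio method relies on.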
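3.3.2 Unconditional and conditional maximum likelihood estimators of $\tau_2$ and $\theta_2$

By an analysis similar to that conducted in the previous subsection we have that the UMLEs $\hat\tau_2^{(U)}$ and $\hat\theta_2^{(U)}$ of $\tau_2$ and $\theta_2$ are the solution to the following equations:

$$\sum_{\mathbf{x}\in\Omega}\frac{r_{\mathbf{x}}^{(2)}}{\pi_{\mathbf{x}}^{(2)}(\theta_2)}\frac{\partial\pi_{\mathbf{x}}^{(2)}(\theta_2)}{\partial\theta_j^{(2)}}=0,\quad j=1,\ldots,q_2,$$

and

$$\hat\tau_2^{(U)}=\left\lfloor\frac{R_2}{1-\pi_{\mathbf{0}}^{(2)}(\hat\theta_2^{(U)})}\right\rfloor, \tag{9}$$

where recall that $r_{\mathbf{0}}^{(2)}=\tau_2-r_2$.

With respect to the conditional estimators, we have that the CMLE $\hat\theta_2^{(C)}$ of $\theta_2$ is the solution to the following equations:

$$\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{r_{\mathbf{x}}^{(2)}}{\pi_{\mathbf{x}}^{(2)}(\theta_2)}\frac{\partial\pi_{\mathbf{x}}^{(2)}(\theta_2)}{\partial\theta_j^{(2)}}+\frac{r_2}{1-\pi_{\mathbf{0}}^{(2)}(\theta_2)}\frac{\partial\pi_{\mathbf{0}}^{(2)}(\theta_2)}{\partial\theta_j^{(2)}}=0,\quad j=1,\ldots,q_2.$$

The CMLE $\hat\tau_2^{(C)}$ of $\tau_2$ is given by (9), but replacing $\hat\theta_2^{(U)}$ by $\hat\theta_2^{(C)}$. Note that in this case (9) is a closed form for $\hat\tau_2^{(C)}$.

3.3.3 Unconditional and conditional maximum likelihood estimators of $\tau=\tau_1+\tau_2$

The UMLE and CMLE of $\tau=\tau_1+\tau_2$ are given by $\hat\tau^{(U)}=\hat\tau_1^{(U)}+\hat\tau_2^{(U)}$ and $\hat\tau^{(C)}=\hat\tau_1^{(C)}+\hat\tau_2^{(C)}$, respectively.

4 Asymptotic properties of the unconditional and conditional maximum likelihood estimators

The structure of this section is as follows. First we will define the asymptotic framework under which we will derive the asymptotic properties of the estimators. Next we will state and prove a theorem that guarantees the asymptotic multivariate normal distribution of any estimator of $(\tau_1,\theta_1)$ that satisfies the conditions expressed in the theorem. Since not every estimator of $(\tau_1,\theta_1)$ satisfies the conditions of the theorem (in particular, the CMLE does not), we will state and prove another theorem that guarantees the asymptotic multivariate normal distribution of any estimator of $\theta_1$ that satisfies the conditions of that theorem. Then, we will prove that the UMLE of $(\tau_1,\theta_1)$ satisfies the conditions of the first theorem, whereas the CMLE of $\theta_1$ satisfies those of the second one. In addition, we will prove that, in spite of that result, the CMLE $\hat\tau_1^{(C)}$ does have an asymptotic normal distribution, although it is not the same as that of $\hat\tau_1^{(U)}$. After that we will consider the asymptotic properties of estimators of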
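$(\tau_2,\theta_2)$. Since this problem is exactly the same as that considered by Sanathanan (1972), we will only state a theorem that guarantees the asymptotic multivariate normal distribution of any estimator of $(\tau_2,\theta_2)$ that satisfies the conditions expressed in the theorem, but we will omit its proof, as well as the proofs that both the UMLE and the CMLE of $(\tau_2,\theta_2)$ satisfy the conditions of that theorem. Finally, we will obtain the asymptotic properties of the estimators $\hat\tau^{(U)}$ and $\hat\tau^{(C)}$ of $\tau$.

4.1 Basic assumptions

To derive the asymptotic properties of the UMLEs and CMLEs of $\tau_k$ and $\theta_k$, $k=1,2$, we will make the following assumptions:

A. $\tau_k\to\infty$, $k=1,2$.

B. $\tau_k/\tau\to\alpha_k$, $0<\alpha_k<1$, $k=1,2$.

C. $N$ and $n$ are fixed positive integers.

For convenience of notation, we will put $\tau_k$ either as a subscript or a superscript of every term that depends on $\tau_k$, $k=1,2$. In addition, convergence in distribution will be denoted by $\xrightarrow{D}$ and convergence in probability by $\xrightarrow{P}$.

Notice that from (1) it follows that the conditional distribution of $M_i^{(\tau_1)}$ given $\tau_1$ is binomial with size parameter $\tau_1$ and probability $1/N$, that is, $M_i^{(\tau_1)}\mid\tau_1\sim\mathrm{Bin}(\tau_1,1/N)$; consequently $M_i^{(\tau_1)}/\tau_1$ is stochastically bounded, that is, $M_i^{(\tau_1)}=O_p(\tau_1)$. This means that the size of $U_1^{(\tau_1)}$ is increased by increasing the sizes of the clusters, even though their number $N$ is kept fixed. In the same manner, the number of people in the initial sample $S_0^{(\tau_1)}$, given by $M^{(\tau_1)}=\sum_{i=1}^{n}M_i^{(\tau_1)}\mid\tau_1\sim\mathrm{Bin}(\tau_1,n/N)$, is increased because of the increase of the $M_i^{(\tau_1)}$, $i=1,\ldots,n$, even though $n$ is kept fixed. On the other hand, since $\tau_1-M^{(\tau_1)}\mid\tau_1\sim\mathrm{Bin}(\tau_1,1-n/N)$, $R_1^{(\tau_1)}\mid S_A^{(\tau_1)}\sim\mathrm{Bin}(\tau_1-M^{(\tau_1)},1-\pi_{\mathbf{0}}^{(1)})$ and $R_2^{(\tau_2)}\mid S_A^{(\tau_1)}\sim\mathrm{Bin}(\tau_2,1-\pi_{\mathbf{0}}^{(2)})$, it follows that $R_1^{(\tau_1)}\mid\tau_1\sim\mathrm{Bin}(\tau_1,(1-n/N)(1-\pi_{\mathbf{0}}^{(1)}))$ and $R_2^{(\tau_2)}\mid\tau_2\sim\mathrm{Bin}(\tau_2,1-\pi_{\mathbf{0}}^{(2)})$; therefore $R_1^{(\tau_1)}=O_p(\tau_1)$ and $R_2^{(\tau_2)}=O_p(\tau_2)$. Thus, the sizes of the sets $S_1^{(\tau_1)}$ and $S_2^{(\tau_2)}$ are increased because $\tau_1$ and $\tau_2$ are increased, even though the probabilities $\{\pi_{\mathbf{x}}^{(1)}\}_{\mathbf{x}\in\Omega}$ and $\{\pi_{\mathbf{x}}^{(2)}\}_{\mathbf{x}\in\Omega}$ are kept fixed.

We will end this subsection by presenting the conditional and unconditional distributions of the variables $R_{\mathbf{x}}^{(\tau_1)}$, $R_{\mathbf{x}}^{(A_i)}$ and $R_{\mathbf{x}}^{(\tau_2)}$, which will be used later in this work. Thus, from the multinomial distributions indicated in Subsection 3.1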
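it follows that $R_{\mathbf{x}}^{(\tau_1)}\mid S_A^{(\tau_1)}\sim\mathrm{Bin}(\tau_1-M^{(\tau_1)},\pi_{\mathbf{x}}^{(1)})$, $R_{\mathbf{x}}^{(A_i)}\mid M_i^{(\tau_1)}\sim\mathrm{Bin}(M_i^{(\tau_1)},\pi_{\mathbf{x}}^{(A_i)})$ and $R_{\mathbf{x}}^{(\tau_2)}\mid S_A^{(\tau_1)}\sim\mathrm{Bin}(\tau_2,\pi_{\mathbf{x}}^{(2)})$; therefore $R_{\mathbf{x}}^{(\tau_1)}\mid\tau_1\sim\mathrm{Bin}(\tau_1,(1-n/N)\pi_{\mathbf{x}}^{(1)})$, $R_{\mathbf{x}}^{(A_i)}\mid\tau_1\sim\mathrm{Bin}(\tau_1,\pi_{\mathbf{x}}^{(A_i)}/N)$ and $R_{\mathbf{x}}^{(\tau_2)}\mid\tau_2\sim\mathrm{Bin}(\tau_2,\pi_{\mathbf{x}}^{(2)})$.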
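The marginal binomial distributions of Subsection 4.1 can be checked informally by simulation: drawing the initial-sample size from $\mathrm{Bin}(\tau_1,n/N)$ and then the number of linked people from a binomial on the remainder should give a proportion of linked people close to $(1-n/N)(1-\pi_{\mathbf{0}}^{(1)})$. The sketch below is only a numerical illustration; the function name, seed and parameter values are arbitrary choices.

```python
import numpy as np


def simulate_linked_counts(tau1, N, n, pi0, reps, seed=0):
    """Draw (M, R_1) pairs under the two-stage binomial structure of Subsection 4.1.

    All arguments are illustrative; pi0 plays the role of the probability that a
    person in U_1 - S_0 is linked to no sampled site.
    """
    rng = np.random.default_rng(seed)
    # M | tau1 ~ Bin(tau1, n/N): people falling in the n sampled sites.
    M = rng.binomial(tau1, n / N, size=reps)
    # R_1 | M ~ Bin(tau1 - M, 1 - pi0): linked people among the remaining ones.
    R1 = rng.binomial(tau1 - M, 1.0 - pi0)
    return M, R1


tau1, N, n, pi0 = 2000, 20, 5, 0.6
M, R1 = simulate_linked_counts(tau1, N, n, pi0, reps=2000)
target = (1 - n / N) * (1 - pi0)   # marginal success probability of R_1 given tau1
empirical = R1.mean() / tau1       # Monte Carlo estimate of the same quantity
```

The agreement between `empirical` and `target` reflects that thinning a binomial count by an independent Bernoulli step yields another binomial count with the product probability.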
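4.2 Asymptotic multivariate normal distribution of estimators of $(\tau_1,\theta_1)$

Theorem 1. Let $\theta_1^{*}=(\theta_1^{*(1)},\ldots,\theta_{q_1}^{*(1)})$ be the true value of $\theta_1$. Let $\hat\tau_1^{(\tau_1)}$ and $\hat\theta_1^{(\tau_1)}=(\hat\theta_1^{(\tau_1)},\ldots,\hat\theta_{q_1}^{(\tau_1)})$ be estimators of $\tau_1$ and $\theta_1$ such that

(i) $\hat\theta_1^{(\tau_1)}\xrightarrow{P}\theta_1^{*}$.

(ii) $\tau_1^{-1/2}\left\{\hat\tau_1^{(\tau_1)}-\dfrac{M^{(\tau_1)}+R_1^{(\tau_1)}}{1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\right\}\xrightarrow{P}0$.

(iii) $\tau_1^{-1/2}\,\partial l_{(1)}^{(\tau_1)}(\hat\tau_1^{(\tau_1)},\hat\theta_1^{(\tau_1)})/\partial\theta_j^{(1)}\xrightarrow{P}0$, $j=1,\ldots,q_1$.

In addition, let $\Sigma_1^{-1}$ be the $(q_1+1)\times(q_1+1)$ matrix whose elements are

$$[\Sigma_1^{-1}]_{1,1}=\frac{1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})},$$

$$[\Sigma_1^{-1}]_{1,j+1}=[\Sigma_1^{-1}]_{j+1,1}=-\frac{1}{\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}},\quad j=1,\ldots,q_1,$$

$$[\Sigma_1^{-1}]_{i+1,j+1}=\left(1-\frac{n}{N}\right)\sum_{\mathbf{x}\in\Omega}\frac{1}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_i^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\frac{1}{N}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_i^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}},\quad i,j=1,\ldots,q_1,$$

and which is assumed to be a non-singular matrix. Then

$$\left[\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1),\ \tau_1^{1/2}(\hat\theta_1^{(\tau_1)}-\theta_1^{*})\right]\xrightarrow{D}N_{q_1+1}(\mathbf{0},\Sigma_1),$$

where $\Sigma_1$ is the inverse of $\Sigma_1^{-1}$ and $\mathbf{0}=(0,\ldots,0)\in\mathbb{R}^{q_1+1}$.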
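Proof. Evaluating equation (4) at $(\hat\tau_1^{(\tau_1)},\hat\theta_1^{(\tau_1)})$ we get

$$\begin{aligned}\frac{\partial l_{(1)}^{(\tau_1)}(\hat\tau_1^{(\tau_1)},\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}&=\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\frac{\hat\tau_1^{(\tau_1)}-M^{(\tau_1)}-R_1^{(\tau_1)}}{\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\\&=\sum_{\mathbf{x}\in\Omega}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\frac{\hat\tau_1^{(\tau_1)}-\tau_1}{\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}},\end{aligned} \tag{10}$$

where $R_{\mathbf{0}}^{(\tau_1)}=\tau_1-M^{(\tau_1)}-R_1^{(\tau_1)}$. Since

$$\sum_{\mathbf{x}\in\Omega}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}=0\quad\text{and}\quad\sum_{\mathbf{x}\in\Omega_{-l}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}=0, \tag{11}$$

from (10) we get that

$$\begin{aligned}&\tau_1^{-1/2}\sum_{\mathbf{x}\in\Omega}\frac{R_{\mathbf{x}}^{(\tau_1)}-(\tau_1-M^{(\tau_1)})\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\tau_1^{-1/2}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}-M_l^{(\tau_1)}\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}-\tau_1^{-1/2}\frac{\partial l_{(1)}^{(\tau_1)}(\hat\tau_1^{(\tau_1)},\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\\&\quad=-\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)\frac{1}{\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\tau_1^{1/2}\left\{\frac{\tau_1-M^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega}\frac{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\frac{M_l^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\right\}.\end{aligned} \tag{12}$$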
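Let $Y_{\mathbf{x}}^{(\tau_1)}=R_{\mathbf{x}}^{(\tau_1)}-(\tau_1-M^{(\tau_1)})\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})$, $Y_{\mathbf{x}}^{(A_l,\tau_1)}=R_{\mathbf{x}}^{(A_l,\tau_1)}-M_l^{(\tau_1)}\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})$ and

$$\begin{aligned}Z_{j+1}^{(\tau_1)}&=\tau_1^{-1/2}\left\{\sum_{\mathbf{x}\in\Omega}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}\\&=\tau_1^{-1/2}\left\{\sum_{\mathbf{x}\in\Omega}\frac{Y_{\mathbf{x}}^{(\tau_1)}}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{Y_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\},\end{aligned}$$

where the last equality is obtained using (11) but replacing $\hat\theta_1^{(\tau_1)}$ by $\theta_1^{*}$. Then, the difference between the left-hand side of (12) and $Z_{j+1}^{(\tau_1)}$ is given by

$$\begin{aligned}&\tau_1^{-1/2}\sum_{\mathbf{x}\in\Omega}Y_{\mathbf{x}}^{(\tau_1)}\left\{\frac{1}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}-\frac{1}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}\\&\quad+\tau_1^{-1/2}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}Y_{\mathbf{x}}^{(A_l,\tau_1)}\left\{\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}-\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}-\tau_1^{-1/2}\frac{\partial l_{(1)}^{(\tau_1)}(\hat\tau_1^{(\tau_1)},\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}.\end{aligned} \tag{13}$$

Since unconditionally $E(Y_{\mathbf{x}}^{(\tau_1)})=0$ and $V(Y_{\mathbf{x}}^{(\tau_1)})=\tau_1(1-n/N)\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})[1-\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})]$, and also $E(Y_{\mathbf{x}}^{(A_l,\tau_1)})=0$ and $V(Y_{\mathbf{x}}^{(A_l,\tau_1)})=\tau_1(1/N)\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})[1-\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})]$, it follows that $\tau_1^{-1/2}Y_{\mathbf{x}}^{(\tau_1)}=O_p(1)$ and $\tau_1^{-1/2}Y_{\mathbf{x}}^{(A_l,\tau_1)}=O_p(1)$. Consequently, these results along with conditions (3)-(4) and conditions (i) and (iii) of the theorem imply that (13) converges to zero in probability.

On the other hand, by the mean value theorem for functions of several variables we have that

$$\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})=\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\pi_{\mathbf{x}}^{(1)}(\tilde\theta_{\mathbf{x}}^{(\tau_1)})}{\partial\theta_i^{(1)}}\quad\text{and}\quad\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})=\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)})}{\partial\theta_i^{(1)}}, \tag{14}$$

where $\tilde\theta_{\mathbf{x}}^{(\tau_1)}$ and $\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)}$ are between $\hat\theta_1^{(\tau_1)}$ and $\theta_1^{*}$. Since the difference between the right-hand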
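side of (12) and $Z_{j+1}^{(\tau_1)}$ also converges to zero in probability, we have that

$$\begin{aligned}&-\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)\frac{1}{\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\tau_1^{1/2}\Bigg\{\frac{\tau_1-M^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega}\frac{1}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\pi_{\mathbf{x}}^{(1)}(\tilde\theta_{\mathbf{x}}^{(\tau_1)})}{\partial\theta_i^{(1)}}\\&\qquad+\sum_{l=1}^{n}\frac{M_l^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)})}{\partial\theta_i^{(1)}}\Bigg\}-Z_{j+1}^{(\tau_1)}\\&\quad=[\hat\Sigma_1^{-1}]_{j+1,1}\,\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)+\sum_{i=1}^{q_1}[\hat\Sigma_1^{-1}]_{j+1,i+1}\,\tau_1^{1/2}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})-Z_{j+1}^{(\tau_1)}\xrightarrow{P}0,\end{aligned} \tag{15}$$

where

$$\begin{aligned}[\hat\Sigma_1^{-1}]_{j+1,1}&=-\frac{1}{\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\quad\text{and}\\ [\hat\Sigma_1^{-1}]_{j+1,i+1}&=\frac{\tau_1-M^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega}\frac{1}{\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\tilde\theta_{\mathbf{x}}^{(\tau_1)})}{\partial\theta_i^{(1)}}+\sum_{l=1}^{n}\frac{M_l^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)})}{\partial\theta_i^{(1)}}.\end{aligned} \tag{16}$$

Expression (15) suggests writing the following equality in terms of $\hat\tau_1^{(\tau_1)}-\tau_1$ and $\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})$:

$$\begin{aligned}&\tau_1^{-1/2}\left\{\hat\tau_1^{(\tau_1)}\left[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})\right]-(M^{(\tau_1)}+R_1^{(\tau_1)})\right\}\\&\quad=\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)\left[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})\right]-\tau_1^{-1/2}\left\{M^{(\tau_1)}+R_1^{(\tau_1)}-\tau_1\left[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]\right\}-\tau_1^{1/2}(1-n/N)\left[\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right].\end{aligned}$$

By condition (ii) of the theorem it follows that the left-hand side of the previous equation converges to zero in probability. Therefore, if we divide the right-hand side of this equation by $(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})$ and use (14), we will get that the following expression also converges to zero in probability, that is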
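$$\begin{aligned}&\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}-\sum_{j=1}^{q_1}\tau_1^{1/2}(\hat\theta_j^{(\tau_1)}-\theta_j^{*(1)})\frac{1}{\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\tilde\theta_{\mathbf{0}}^{(\tau_1)})}{\partial\theta_j^{(1)}}-\tau_1^{-1/2}\frac{M^{(\tau_1)}+R_1^{(\tau_1)}-\tau_1\left[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\\&\quad=[\hat\Sigma_1^{-1}]_{1,1}\,\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)+\sum_{j=1}^{q_1}[\hat\Sigma_1^{-1}]_{1,j+1}\,\tau_1^{1/2}(\hat\theta_j^{(\tau_1)}-\theta_j^{*(1)})-Z_1^{(\tau_1)}\xrightarrow{P}0,\end{aligned} \tag{17}$$

where $\tilde\theta_{\mathbf{0}}^{(\tau_1)}$ is between $\hat\theta_1^{(\tau_1)}$ and $\theta_1^{*}$, and

$$[\hat\Sigma_1^{-1}]_{1,1}=\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})},\quad[\hat\Sigma_1^{-1}]_{1,j+1}=-\frac{1}{\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\tilde\theta_{\mathbf{0}}^{(\tau_1)})}{\partial\theta_j^{(1)}}\quad\text{and}\quad Z_1^{(\tau_1)}=\tau_1^{-1/2}\frac{M^{(\tau_1)}+R_1^{(\tau_1)}-\tau_1\left[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}. \tag{18}$$

Let $W^{(\tau_1)}=[\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1),\ \tau_1^{1/2}(\hat\theta_1^{(\tau_1)}-\theta_1^{*})]'$ and $Z^{(\tau_1)}=[Z_1^{(\tau_1)},Z_2^{(\tau_1)},\ldots,Z_{q_1+1}^{(\tau_1)}]'$. By the previous results we have that

$$\hat\Sigma_1^{-1}W^{(\tau_1)}-Z^{(\tau_1)}\xrightarrow{P}\mathbf{0}, \tag{19}$$

where $\hat\Sigma_1^{-1}$ is the $(q_1+1)\times(q_1+1)$ matrix whose elements are defined in (16) and (18). Notice that from the definitions of the matrices $\Sigma_1^{-1}$ and $\hat\Sigma_1^{-1}$, conditions (3)-(4) and condition (i) of the theorem, along with the fact that $(\tau_1-M^{(\tau_1)})/\tau_1\xrightarrow{P}1-n/N$ and $M_l^{(\tau_1)}/\tau_1\xrightarrow{P}1/N$, it follows that $\hat\Sigma_1^{-1}\xrightarrow{P}\Sigma_1^{-1}$.

We will show that $Z^{(\tau_1)}\xrightarrow{D}Z\sim N_{q_1+1}(\mathbf{0},\Sigma_1^{-1})$ as $\tau_1\to\infty$. To do this, we will associate with each element $t\in U_1$, $t=1,\ldots,\tau_1$, a random vector $V_t=[V_{t,1},\ldots,V_{t,q_1+1}]'$ such that

(a) $V_{t,1}=1$ and $V_{t,j+1}=[1/\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})]\,\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})/\partial\theta_j^{(1)}$, $j=1,\ldots,q_1$, if $t\in U_1-S_0$ and its associated vector $\mathbf{X}_t$ of link-indicator variables equals the vector $\mathbf{x}\in\Omega-\{\mathbf{0}\}$;

(b) $V_{t,1}=-[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]/[(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]$ and $V_{t,j+1}=[1/\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]\,\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})/\partial\theta_j^{(1)}$, $j=1,\ldots,q_1$, if $t\in U_1-S_0$ and its associated vector $\mathbf{X}_t$ of link-indicator variables equals the vector $\mathbf{0}\in\Omega$, and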
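(c) $V_{t,1}=1$ and $V_{t,j+1}=[1/\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})]\,\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})/\partial\theta_j^{(1)}$, $j=1,\ldots,q_1$, if $t\in A_l\in S_A$ and its associated vector $\mathbf{X}_t$ of link-indicator variables equals the vector $\mathbf{x}\in\Omega_{-l}$.

Since

$$\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_{t,1}=\tau_1^{-1/2}\left\{M^{(\tau_1)}+R_1^{(\tau_1)}-(\tau_1-M^{(\tau_1)}-R_1^{(\tau_1)})\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\right\}=Z_1^{(\tau_1)}$$

and

$$\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_{t,j+1}=\tau_1^{-1/2}\left\{\sum_{\mathbf{x}\in\Omega}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}=Z_{j+1}^{(\tau_1)},\quad j=1,\ldots,q_1,$$

it follows that $Z^{(\tau_1)}=\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_t$.

From the definition of $V_t$ we have that

$$\Pr\{V_{t,1}=1\}=\left(1-\frac{n}{N}\right)\left[1-\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]+\frac{n}{N}=1-\left(1-\frac{n}{N}\right)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*}),$$

$$\Pr\left\{V_{t,1}=-\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\right\}=\left(1-\frac{n}{N}\right)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*}),$$

$$\Pr\left\{V_{t,j+1}=\frac{1}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}=\left(1-\frac{n}{N}\right)\pi_{\mathbf{x}}^{(1)}(\theta_1^{*}),\quad\mathbf{x}\in\Omega,\ j=1,\ldots,q_1,$$

and

$$\Pr\left\{V_{t,j+1}=\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}=\frac{1}{N}\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*}),\quad\mathbf{x}\in\Omega_{-l},\ j=1,\ldots,q_1,\ l=1,\ldots,n;$$

therefore, the expected values of the variables $V_{t,j}$ are

$$E(V_{t,1})=1-\left(1-\frac{n}{N}\right)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})-\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\left(1-\frac{n}{N}\right)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})=0$$

and

$$E(V_{t,j+1})=\left(1-\frac{n}{N}\right)\sum_{\mathbf{x}\in\Omega}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\frac{1}{N}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}=0,\quad j=1,\ldots,q_1,$$

because of (11). Thus, $E(V_t)=\mathbf{0}$, $t=1,\ldots,\tau_1$. Furthermore, their variances are

$$V(V_{t,1})=1-\left(1-\frac{n}{N}\right)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})+\left[\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\right]^{2}\left(1-\frac{n}{N}\right)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})=\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}$$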
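and

$$V(V_{t,j+1})=\left(1-\frac{n}{N}\right)\sum_{\mathbf{x}\in\Omega}\frac{1}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\left[\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right]^{2}+\frac{1}{N}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\left[\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right]^{2},\quad j=1,\ldots,q_1,$$

and their covariances are

$$\begin{aligned}\mathrm{Cov}(V_{t,1},V_{t,j+1})&=\left(1-\frac{n}{N}\right)\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}-\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\left(1-\frac{n}{N}\right)\frac{\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\frac{1}{N}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\\&=-\frac{1}{\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}},\quad j=1,\ldots,q_1,\end{aligned}$$

and

$$\mathrm{Cov}(V_{t,i+1},V_{t,j+1})=\left(1-\frac{n}{N}\right)\sum_{\mathbf{x}\in\Omega}\frac{1}{\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_i^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\frac{1}{N}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_i^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}},\quad i,j=1,\ldots,q_1,\ i\neq j.$$

Therefore, the variance-covariance matrix of $V_t$ is $\Sigma_1^{-1}$. Finally, since the $V_t$, $t=1,\ldots,\tau_1$, are independent and identically distributed random vectors, by the central limit theorem it follows that

$$Z^{(\tau_1)}=\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_t\xrightarrow{D}Z\sim N_{q_1+1}(\mathbf{0},\Sigma_1^{-1}).$$

Consequently, by (19),

$$W^{(\tau_1)}=\left[\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1),\ \tau_1^{1/2}(\hat\theta_1^{(\tau_1)}-\theta_1^{*})\right]'\xrightarrow{D}\Sigma_1Z\sim N_{q_1+1}(\mathbf{0},\Sigma_1)$$

as $\hat\Sigma_1^{-1}\xrightarrow{P}\Sigma_1^{-1}$.

4.3 Asymptotic multivariate normal distribution of estimators of $\theta_1$

Theorem 2. Let $\theta_1^{*}=(\theta_1^{*(1)},\ldots,\theta_{q_1}^{*(1)})$ be the true value of $\theta_1$. Let $\hat\theta_1^{(\tau_1)}=(\hat\theta_1^{(\tau_1)},\ldots,\hat\theta_{q_1}^{(\tau_1)})$ be an estimator of $\theta_1$ such that

(i) $\hat\theta_1^{(\tau_1)}\xrightarrow{P}\theta_1^{*}$.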
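(ii) $\tau_1^{-1/2}\,\partial\ln[L_{11}^{(\tau_1)}(\hat\theta_1^{(\tau_1)})L_0^{(\tau_1)}(\hat\theta_1^{(\tau_1)})]/\partial\theta_j^{(1)}\xrightarrow{P}0$, $j=1,\ldots,q_1$.

In addition, let $\Psi_1^{-1}$ be the $q_1\times q_1$ matrix whose elements are

$$[\Psi_1^{-1}]_{i,j}=\left(1-\frac{n}{N}\right)\left[1-\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{1}{\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_i^{(1)}}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\frac{1}{N}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_i^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}},\quad i,j=1,\ldots,q_1,$$

where $\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1)=\pi_{\mathbf{x}}^{(1)}(\theta_1)/[1-\pi_{\mathbf{0}}^{(1)}(\theta_1)]$, $\mathbf{x}\in\Omega-\{\mathbf{0}\}$, and suppose that $\Psi_1^{-1}$ is a non-singular matrix. Then

$$\tau_1^{1/2}(\hat\theta_1^{(\tau_1)}-\theta_1^{*})\xrightarrow{D}N_{q_1}(\mathbf{0},\Psi_1),$$

where $\Psi_1$ is the inverse of $\Psi_1^{-1}$ and $\mathbf{0}=(0,\ldots,0)\in\mathbb{R}^{q_1}$. Furthermore, if $\hat\tau_1^{(\tau_1)}$ is an estimator of $\tau_1$ such that

(iii) $\tau_1^{-1/2}\left\{\hat\tau_1^{(\tau_1)}-\dfrac{M^{(\tau_1)}+R_1^{(\tau_1)}}{1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}\right\}\xrightarrow{P}0$,

then

$$\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)\xrightarrow{D}N(0,\sigma_1^{2}),$$

where

$$\sigma_1^{2}=\frac{(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}+\frac{(1-n/N)^{2}}{\left[1-(1-n/N)\,\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]^{2}}\left[\nabla\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]'\Psi_1\left[\nabla\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]$$

and $\nabla\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})=[\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})/\partial\theta_1^{(1)},\ldots,\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})/\partial\theta_{q_1}^{(1)}]'$ is the gradient of $\pi_{\mathbf{0}}^{(1)}(\theta_1)$ evaluated at $\theta_1^{*}$.

Proof. From the definitions of $L_{11}^{(\tau_1)}(\theta_1)$ and $L_0^{(\tau_1)}(\theta_1)$ we have that

$$\frac{\partial}{\partial\theta_j^{(1)}}\ln\left[L_{11}^{(\tau_1)}(\hat\theta_1^{(\tau_1)})L_0^{(\tau_1)}(\hat\theta_1^{(\tau_1)})\right]=\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}. \tag{21}$$

Since

$$\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}=0\quad\text{and}\quad\sum_{\mathbf{x}\in\Omega_{-l}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}=0, \tag{22}$$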
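from (21) we get that

$$\begin{aligned}&\tau_1^{-1/2}\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{R_{\mathbf{x}}^{(\tau_1)}-R_1^{(\tau_1)}\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\tau_1^{-1/2}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}-M_l^{(\tau_1)}\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}-\tau_1^{-1/2}\frac{\partial}{\partial\theta_j^{(1)}}\ln\left[L_{11}^{(\tau_1)}(\hat\theta_1^{(\tau_1)})L_0^{(\tau_1)}(\hat\theta_1^{(\tau_1)})\right]\\&\quad=\tau_1^{1/2}\left\{\frac{R_1^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})-\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\frac{M_l^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\right\}.\end{aligned} \tag{23}$$

Let $Y_{\mathbf{x}}^{(\tau_1)}=R_{\mathbf{x}}^{(\tau_1)}-R_1^{(\tau_1)}\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})$, $Y_{\mathbf{x}}^{(A_l,\tau_1)}=R_{\mathbf{x}}^{(A_l,\tau_1)}-M_l^{(\tau_1)}\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})$ and

$$\begin{aligned}Z_j^{(\tau_1)}&=\tau_1^{-1/2}\left\{\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}\\&=\tau_1^{-1/2}\left\{\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{Y_{\mathbf{x}}^{(\tau_1)}}{\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{Y_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\},\end{aligned}$$

where the last equality is obtained using (22) but replacing $\hat\theta_1^{(\tau_1)}$ by $\theta_1^{*}$. Then, the difference between the left-hand side of (23) and $Z_j^{(\tau_1)}$ is given by

$$\begin{aligned}&\tau_1^{-1/2}\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}Y_{\mathbf{x}}^{(\tau_1)}\left\{\frac{1}{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}-\frac{1}{\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}\\&\quad+\tau_1^{-1/2}\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}Y_{\mathbf{x}}^{(A_l,\tau_1)}\left\{\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}-\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}-\tau_1^{-1/2}\frac{\partial}{\partial\theta_j^{(1)}}\ln\left[L_{11}^{(\tau_1)}(\hat\theta_1^{(\tau_1)})L_0^{(\tau_1)}(\hat\theta_1^{(\tau_1)})\right].\end{aligned} \tag{24}$$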
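Since $\tau_1^{-1/2}Y_{\mathbf{x}}^{(\tau_1)}=O_p(1)$ and $\tau_1^{-1/2}Y_{\mathbf{x}}^{(A_l,\tau_1)}=O_p(1)$, these results along with conditions (3)-(4) and conditions (i) and (ii) of the theorem imply that (24) converges to zero in probability.

On the other hand, by the mean value theorem for functions of several variables we have that

$$\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})-\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})=\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\tilde\theta_{\mathbf{x}}^{(\tau_1)})}{\partial\theta_i^{(1)}}\quad\text{and}\quad\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})-\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})=\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)})}{\partial\theta_i^{(1)}}, \tag{25}$$

where $\tilde\theta_{\mathbf{x}}^{(\tau_1)}$ and $\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)}$ are between $\hat\theta_1^{(\tau_1)}$ and $\theta_1^{*}$. Since the difference between the right-hand side of (23) and $Z_j^{(\tau_1)}$ also converges to zero in probability, we have that

$$\begin{aligned}&\tau_1^{1/2}\Bigg\{\frac{R_1^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{1}{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\tilde\theta_{\mathbf{x}}^{(\tau_1)})}{\partial\theta_i^{(1)}}\\&\qquad+\sum_{l=1}^{n}\frac{M_l^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\sum_{i=1}^{q_1}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)})}{\partial\theta_i^{(1)}}\Bigg\}-Z_j^{(\tau_1)}\\&\quad=\sum_{i=1}^{q_1}[\hat\Psi_1^{-1}]_{j,i}\,\tau_1^{1/2}(\hat\theta_i^{(\tau_1)}-\theta_i^{*(1)})-Z_j^{(\tau_1)}\xrightarrow{P}0,\end{aligned} \tag{26}$$

where

$$[\hat\Psi_1^{-1}]_{j,i}=\frac{R_1^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{1}{\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\tilde\theta_{\mathbf{x}}^{(\tau_1)})}{\partial\theta_i^{(1)}}+\sum_{l=1}^{n}\frac{M_l^{(\tau_1)}}{\tau_1}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{1}{\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\hat\theta_1^{(\tau_1)})}{\partial\theta_j^{(1)}}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\tilde\theta_{\mathbf{x}}^{(A_l,\tau_1)})}{\partial\theta_i^{(1)}}. \tag{27}$$

Notice that from the definitions of the matrices $\Psi_1^{-1}$ and $\hat\Psi_1^{-1}$, conditions (3)-(4) and condition (i) of the theorem, along with the fact that $R_1^{(\tau_1)}/\tau_1\xrightarrow{P}(1-n/N)[1-\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]$ and $M_l^{(\tau_1)}/\tau_1\xrightarrow{P}1/N$, it follows that $\hat\Psi_1^{-1}\xrightarrow{P}\Psi_1^{-1}$.

By condition (iii) of the theorem and using exactly the same procedure as that used to obtain expression (17), we get that expression, which we will put in the following terms:

$$\hat a_0\,\tau_1^{-1/2}(\hat\tau_1^{(\tau_1)}-\tau_1)-\sum_{j=1}^{q_1}\hat a_j\,\tau_1^{1/2}(\hat\theta_j^{(\tau_1)}-\theta_j^{*(1)})-Z_0^{(\tau_1)}\xrightarrow{P}0, \tag{28}$$

where

$$\hat a_0=\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\hat\theta_1^{(\tau_1)})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})},\quad\hat a_j=\frac{1}{\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{0}}^{(1)}(\tilde\theta_{\mathbf{0}}^{(\tau_1)})}{\partial\theta_j^{(1)}},\ j=1,\ldots,q_1,\quad Z_0^{(\tau_1)}=\tau_1^{-1/2}\frac{M^{(\tau_1)}+R_1^{(\tau_1)}-\tau_1\left[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})\right]}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}, \tag{29}$$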
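and $\tilde\theta_{\mathbf{0}}^{(\tau_1)}$ is between $\hat\theta_1^{(\tau_1)}$ and $\theta_1^{*}$. Notice that conditions (3)-(4) and condition (i) of the theorem imply that $\hat a_j\xrightarrow{P}a_j$, $j=0,\ldots,q_1$, where $a_0=[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]/[(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]$ and $a_j=[\partial\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})/\partial\theta_j^{(1)}]/\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})$, $j=1,\ldots,q_1$.

Let $Z^{(\tau_1)}=[Z_1^{(\tau_1)},Z_2^{(\tau_1)},\ldots,Z_{q_1}^{(\tau_1)}]'$; then by the previous results we have that

$$\hat\Psi_1^{-1}\,\tau_1^{1/2}(\hat\theta_1^{(\tau_1)}-\theta_1^{*})-Z^{(\tau_1)}\xrightarrow{P}\mathbf{0}, \tag{30}$$

where $\hat\Psi_1^{-1}$ is the $q_1\times q_1$ matrix whose elements are defined in (27).

We will show that $Z^{(\tau_1)}\xrightarrow{D}Z\sim N_{q_1}(\mathbf{0},\Psi_1^{-1})$ as $\tau_1\to\infty$, where $Z=[Z_1,\ldots,Z_{q_1}]'$, and that $Z_0^{(\tau_1)}\xrightarrow{D}Z_0\sim N(0,a_0)$, where $Z_0^{(\tau_1)}$ is given by (29). To do this, we will associate with each element $t\in U_1$, $t=1,\ldots,\tau_1$, a random vector $V_t=[V_{t,1},\ldots,V_{t,q_1}]'$ and a random variable $V_{t,0}$ such that

(a) $V_{t,j}=[1/\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})]\,\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})/\partial\theta_j^{(1)}$, $j=1,\ldots,q_1$, and $V_{t,0}=1$, if $t\in U_1-S_0$ and its associated vector $\mathbf{X}_t$ of link-indicator variables equals the vector $\mathbf{x}\in\Omega-\{\mathbf{0}\}$;

(b) $V_{t,j}=0$, $j=1,\ldots,q_1$, and $V_{t,0}=-[1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]/[(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})]$, if $t\in U_1-S_0$ and its associated vector $\mathbf{X}_t$ of link-indicator variables equals the vector $\mathbf{0}\in\Omega$, and

(c) $V_{t,j}=[1/\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})]\,\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})/\partial\theta_j^{(1)}$, $j=1,\ldots,q_1$, and $V_{t,0}=1$, if $t\in A_l\in S_A$ and its associated vector $\mathbf{X}_t$ of link-indicator variables equals the vector $\mathbf{x}\in\Omega_{-l}$.

Since

$$\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_{t,j}=\tau_1^{-1/2}\left\{\sum_{\mathbf{x}\in\Omega-\{\mathbf{0}\}}\frac{R_{\mathbf{x}}^{(\tau_1)}}{\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}\frac{\partial\tilde\pi_{\mathbf{x}}^{(1)}(\theta_1^{*})}{\partial\theta_j^{(1)}}+\sum_{l=1}^{n}\sum_{\mathbf{x}\in\Omega_{-l}}\frac{R_{\mathbf{x}}^{(A_l,\tau_1)}}{\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}\frac{\partial\pi_{\mathbf{x}}^{(A_l)}(\theta_1^{*})}{\partial\theta_j^{(1)}}\right\}=Z_j^{(\tau_1)},\quad j=1,\ldots,q_1,$$

and

$$\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_{t,0}=\tau_1^{-1/2}\left\{M^{(\tau_1)}+R_1^{(\tau_1)}-(\tau_1-M^{(\tau_1)}-R_1^{(\tau_1)})\frac{1-(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}{(1-n/N)\pi_{\mathbf{0}}^{(1)}(\theta_1^{*})}\right\}=Z_0^{(\tau_1)},$$

it follows that $Z^{(\tau_1)}=\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_t$ and $Z_0^{(\tau_1)}=\tau_1^{-1/2}\sum_{t=1}^{\tau_1}V_{t,0}$.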
From the defnton ofv t, and V t we have that { } Pr V t, = π θ π θ / θ = n/nπ θ, Ω {}, { } Pr V t, = { Pr V t, = π A l θ π A l θ / θ and { Pr V t = and { } Pr V t = n/nπ θ / = n/n =,...,q, = n/nπ θ, =,...,q, } =/Nπ A l θ, Ω l, =,...,q, l =,...,n, π θ n/n and } n/nπ θ = n/nπ θ ; therefore, the epected values of the varablesv t, and V t are = E E V t V t, Ω {} because of 22. Thus, E varances are V n l= π θ / θ n/n π θ Ω l π Al θ / θ /N =, =,...,q, = n/n π θ n/n n/nπ θ = V t, V t = and E V t = n/n π θ N n l= Ω l π A l θ =, t =,...,τ. Furthermore, ther Ω {} π π θ θ 2 θ θ 2, =,...,q, π A l θ and V V t 2 n/nπ = n/n π θ θ n/n n/nπ θ = n/nπ θ, n/nπ θ 23
and ther covarances are Cov V t,,v t, = n N and Cov N n l= π θ Ω l π A l Ω {} π A π l θ θ θ V t,v t, = n π N θ π Ω {} π θ θ θ π θ θ π A l θ,, =,...,q θ,, θ N θ n l= Ω l π A l θ θ Therefore, the varance-covarance matr ofv t sψ. Fnally, snce the V t,v t, t =,...,τ, are ndependent and dentcally dstrbuted random vectors, by the central lmt theorem t follows that Z τ,z τ = τ /2 τ τ /2 t= V t,v t D Z,Z N q q, Ψ a Thus,Z τ D Z N q,ψ and Zτ D Z N,a. Consequently by 3 ˆθτ θ D Ψ Z N q,ψ as ˆΨ P Ψ. At last, from 28 and the prevous results ˆτ τ D τ { Z q a τ /2 = n/nπ θ n/nπ θ Z whereψ Z s the-th element ofψ Z and σ 2 = n/n n/nπ θ π θ = a Ψ Z } π θ π θ Ψ Z N,σ 2, Ψ n/n π θ π n/nπ θ =.. θ. 4.4 Consstency of the UMLE and CMLE ofτ,θ To prove the consstency of the UMLE and CMLE we wll use condton 5 and the followng nequalty of nformaton theory: If a and b are convergent seres of postve numbers such that a b, then a logb a loga, and the equalty s attaned f and only fa = b. See Rao 973, p. 58. 24
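The information-theory inequality quoted above states that, for positive terms with $\sum_i a_i \ge \sum_i b_i$, one has $\sum_i a_i \log b_i \le \sum_i a_i \log a_i$, with equality if and only if $a_i = b_i$ for all $i$. The following sketch checks this numerically for finite sequences. The function name `information_gap` and the numeric values are illustrative assumptions, not part of the original report.

```python
import math


def information_gap(a, b):
    """Return sum(a_i * log a_i) - sum(a_i * log b_i) for positive sequences."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    if min(a) <= 0 or min(b) <= 0:
        raise ValueError("all terms must be positive")
    return sum(ai * (math.log(ai) - math.log(bi)) for ai, bi in zip(a, b))


# Hypothetical sequences with sum(a) = 1.0 >= sum(b) = 0.9.
a = [0.5, 0.3, 0.2]
b = [0.2, 0.3, 0.4]

gap_ab = information_gap(a, b)  # nonnegative by the inequality
gap_aa = information_gap(a, a)  # zero: the equality case a_i = b_i
```

In the consistency proofs the role of the $a_i$ is played by observed proportions and that of the $b_i$ by model probabilities evaluated at the estimates, so a gap that tends to zero forces the two sets of quantities together.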
4.4. Consstency of the UMLE Let us frst consder get that l ˆτ U, n = Ω {} ˆθ U. Usng 3 and 6 and the defnton of the UMLE U ˆθ = { R τ Ω {} R A,τ Ω R τ π A ˆθU π ˆθU / π L MULT { π θ / π θ } n = ˆθU } ˆτ U, ˆτ U L 2 ˆτ U, L MULT τ L 2 τ,θ C = l τ,θ, R A,τ π A θ Ω U ˆθ we U ˆθ C where C depends only on observable varables. Snce L MULT ˆτ U and L 2 ˆτ U, ˆθ U are nonpostve we have that Ω {} Ω {} R τ R τ R τ R τ π π ˆθU ˆθU π θ π θ n = n = M τ R τ M τ R τ A R,τ Ω M τ A R,τ Ω M τ π A π A θ ˆθU L MULT τ /R τ L 2 τ,θ /R τ. 3 Now, snce = Ω {} R τ R τ = Ω {} π π ˆθU ˆθ U and = Ω R A,τ M τ ˆθU π A Ω = =,...,n,, usng n tmes the prevously ndcated nformaton theory nequalty we have that Ω {} Ω {} R τ R τ R τ /Rτ R τ R τ π π ˆθU ˆθU n = M τ R τ n = M τ R τ Ω R A,τ M τ Ω R A,τ M τ R A,τ /M τ π A ˆθU.32 25
Thus, by 3 and 32 we get that Ω {} n = Ω {} R τ R τ M τ R τ R τ R τ π ˆθU A R,τ Ω M τ π A θ R A,τ /M τ / π R /R τ πa π θ / π R /R τ ˆθU ˆθU R A,τ /M τ θ n = M τ R τ A R,τ Ω M τ L MULT τ /R τ L 2 τ,θ /R τ. 33 From the uncondtonal dstrbutons ofm τ,m τ andr τ,r A,τ andr A,τ n Subsecton 4., t follows that R τ /R τ P π θ / π θ, R A,τ π A θ andmτ /R τ P /{N n π θ ndcated /M τ }. Therefore, the frst two summands of the last term of the double nequalty 33 converges to zero n probablty, In addton, snce R τ /τ P π θ, and from well known results of large devatons theory see Varadhan, 28, we have that for the bnomal probabltyl 2 τ,θ : L 2 τ,θ R τ R τ { = τ M τ R τ τ R τ R τ τ M τ /τ M τ π θ R τ /τ M τ } π θ P π θ π θ =, π θ and for the multnomal probabltyl MULT τ : { n L MULT τ = τ M τ M τ /τ τ /N τ R τ o p P R τ { n = = N n/n τ M τ o R τ p τ M τ τ τ R τ τ M τ P } τ M τ /τ n/n } { } / π θ n/n =. The prevous results mply that the last term of the double nequalty 33 converges to zero n probablty, and consequently so does the mddle term. 26
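The limit used above for the binomial term can be illustrated numerically. The sketch below assumes that the quantity of interest is the logarithm of a binomial probability, evaluated at a count close to its expected value and divided by that count. The function name `log_binom_pmf` and the value `p = 0.6` are hypothetical choices made only for illustration.

```python
import math


def log_binom_pmf(k, m, p):
    """Log of the Binomial(m, p) probability of observing k successes."""
    log_coef = math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
    return log_coef + k * math.log(p) + (m - k) * math.log(1 - p)


p = 0.6  # hypothetical probability that a person is linked to the sample
ratios = []
for m in (10**2, 10**4, 10**6):
    k = round(m * p)  # a count near its expected value
    # The log-probability is of order log(m), so dividing by k shrinks it to 0.
    ratios.append(log_binom_pmf(k, m, p) / k)
```

The log-probability near the mode decreases only logarithmically in the number of trials, which is why the normalized term vanishes as the population size grows.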
Thus, { as M τ Ω {} π = N n Ω {} π θ θ R τ R τ n M τ = R τ Ω {} n = Ω π θ π Ω {} π A R,τ Ω M τ π θ θ π n ˆθU / π ˆθU π θ / π θ π A θ = Ω ˆθ U / π ˆθ U R τ /R τ πa ˆθU R A,τ /M τ π θ π θ R τ π R τ π A θ R N n π θ N n π θ π ˆθU / π ˆθU π θ / n ˆθU A R τ R τ /R τ π A = Ω π πa π A ˆθU θ / π R τ /R τ πa θ ˆθU ˆθU R A,τ /M τ θ R A,τ /M τ π A θ } / R τ /R τ and π A ˆθU P / R A,τ / are bounded as τ otherwse the mddle term of the nequalty 33 would not{ converge to zero. } Fnally, condton 5 mples that for any δ > we have that ˆθU Pr θ U P δ, that s, ˆθ θ. Straghtforward results of the prevous one are the followng: π ˆθU P π θ, ˆθU Ω, and π A assumed to be contnuous functons ofθ. P π A θ, Ω, =,...,n, as π θ and π A θ are Wth respect to ˆτ U, from epresson 5 we have that the dfference { between ˆτ U and M τ R τ / n/nπ ˆθU s less than. Thus, ˆτ U M τ R τ / } n/nπ ˆθU / τ = ˆτ U /τ M τ R τ /τ / n/n 27
π ˆθU P, and snce the second term of the last dfference converges to n probablty so does ˆτ U /τ. 4.4.2 Consstency of the CMLE By the defnton of the CMLE L ˆθ C L ˆθ C R τ ˆθ C, we have that = Ω {} n = Ω {} n = R τ R τ M τ R τ R τ R τ M τ R τ π π A R,τ Ω M τ π θ ˆθC ˆθC π θ A R,τ Ω M τ = L θ L θ, R τ π A ˆθC π A θ C wherec depends only on observable varables. Usng the same procedure as that used n the case of the UMLE ˆθ U C we wll get the double C U nequalty 33 but n terms of ˆθ nstead of ˆθ and wthout the termsl MULT τ /R τ and L 2 τ,θ /R τ C P. Consequently, we wll also have that ˆθ θ, π ˆθC P π θ, Ω, πa ˆθ C P π A θ, Ω, =,...,n, and ˆτ C /τ P, where the last result s obtaned by usng epresson 8 and the same arguments as those used to prove that ˆτ U /τ P. 4.5 Asymptotc dstrbutons of the UMLEs and CMLEs ofτ andθ 4.5. Asymptotc multvarate normal dstrbuton of the UMLE ofτ,θ We wll prove the asymptotc multvarate normal dstrbuton of τ /2 ˆτ U,τ /2 ˆθ U by provng that ths estmator satsfes the condtons of Theorem. Condton was already proved n the prevous secton. From epresson 5 t follows that ˆτ U, ˆθ U satsfes condton. Fnally, by the defnton of the UMLEs we have that condton s also satsfed. Thus, by Theorem, τ /2 ˆτ U τ,τ /2 ˆθU θ D N q,σ. Ths result 28
mples thatτ /2 ˆτ U ˆθU D τ N,σU 2 and τ/2 θ D N q,σ 22, where σ 2 U = Σ 22 = n/n n/nπ θ { π θ n/n n/nπ θ π θ Σ 22 π θ }, 34 Σ 22 n/n π π θ n/nπ θ θ π θ π θ s the gradent of π θ evaluated at θ and Σ Σ obtaned by removng ts frst row and frst column. 35 22 s the q q submatr of U 4.5.2 Asymptotc multvarate normal dstrbuton of the CMLE ˆθ and asymptotc normal dstrbuton of the CMLE ˆτ C The CMLE τ /2 ˆτ C,τ /2 ˆθ C does not have an asymptotc multvarate normal dstrbuton snce ths estmator { does not satsfy condton } of Theorem. To see ths, notce that ˆθC ˆθC by 7 t follows that L L / θ =. Therefore, τ /2 θ l ˆτ C, C ˆθ =τ /2 θ π =τ /2 L 2 ˆτ C, ˆθC θ ˆτC C ˆθ M τ R τ π ˆθC By usng epresson 8 and after some algebrac steps we get that ˆθC τ /2 l ˆτ C C, ˆθ = π τ /2 θ ˆτ C M τ R τ / τ /2 π θ n/nπ ˆθ C M τ /τ n/n π n/nπ ˆθC ˆθC π ˆθC R τ π R τ ˆθC,. /τ n/n. 36 ˆθC 29
From 8 and the fact that π ˆθC P π θ, t follows that the order of magntude of the frst term n the curly brackets of 36 so p τ /2. On the other hand, sncem τ /τ = n/no p τ /2,R τ /τ = n/n π θ O p τ /2 and, as we wll show C n the net paragraph, ˆθ = θ O p τ /2, t follows that the order of the second term n the curly brackets of 36 so p ; therefore 36 does not converge to zero n probablty. Nevertheless, although τ /2 ˆτ C τ,τ /2 ˆθC θ does not have an asymptotc multvarate normal dstrbuton, τ /2 ˆθ C θ does have. To prove ths, we wll show that condtons and of Theorem 2 are satsfed. In the prevous secton we proved C C ˆθ ˆθ satsfes condton. Thus that satsfes condton, and from 7 we have that by Theorem 2,τ ˆθ /2 C θ D N q,ψ. Now, τ /2 ˆθ C ˆθC θ ˆτ C τ has also an asymptotc normal dstrbuton because n addton that satsfes condtons and of Theorem 2, ˆτ C satsfes condton. Thus by Theorem 2, τ /2 ˆτ C D τ N,σC 2, whereσ2 C s gven by 2. It s worth notng that the asymptotc margnal dstrbutons of τ /2 ˆτ C τ and τ /2 are not the same as those of τ /2 ˆτ U τ and τ /2. To show ths, we wll frstly prove that Ψ = Σ 22 π θ n/n π θ ˆθU θ π θ π θ, 37 whereψ s theq q matr defned n the statement of Theorem 2 andσ 22 s theq q submatr of the matr Σ, defned n the statement of Theorem, obtaned by removng ts frst row and frst column. Snce π θ = π θ / π θ, t follows that π θ θ = = π θ / θ π θ π θ θ θ π θ π θ / θ 2 π θ π π θ 2 π θ π θ θ. 3
Then n/n π θ Ω {} = n/n π θ π θ 2 π θ = n/n Ω {} θ π θ = n/n π Ω {} π π θ θ π θ θ Ω {} π n/n π π θ θ θ Ω {} π θ θ θ n/n π Ω {} π θ π π θ θ π θ π θ π θ π θ π θ π θ θ θ θ θ π θ θ π θ θ θ θ π θ θ θ π θ π θ θ n/n π Ω {} π θ θ π θ θ π θ 2 π θ π θ π θ θ 2 n/n π θ θ π θ π θ θ n/n 2 π θ θ θ π θ θ = n/n = n/n Ω θ π θ Ω {} π π θ π θ θ θ. π π θ θ θ θ π θ θ π θ θ Therefore, from the defntons ofψ and Σ 22 we have that and 37 s proved. Ψ = Σ, 22, n/n π θ π 3 n/n π π θ θ θ n/n π θ π θ π θ θ θ π θ θ π θ θ,
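From (34) and (37) it follows that $\sigma^2_{C1} \neq \sigma^2_{U1}$, and hence $\tau_1^{-1/2}(\hat\tau_1^{(U)} - \tau_1)$ and $\tau_1^{-1/2}(\hat\tau_1^{(C)} - \tau_1)$ do not have the same asymptotic normal distribution. In addition, (35) and (37) imply that $\Psi^{-1} \neq \Sigma_{1,22}$, and consequently that $\tau_1^{1/2}(\hat\theta_1^{(U)} - \theta_1)$ and $\tau_1^{1/2}(\hat\theta_1^{(C)} - \theta_1)$ do not have the same asymptotic normal distribution. Notice also that even though the asymptotic marginal distributions of the UMLEs and CMLEs of $\tau_1$ and $\theta_1$ are not the same, from (37) it follows that if $n/N$ were small enough so that $(1 - n/N)\pi_0^{(1)}(\theta_1) \approx \pi_0^{(1)}(\theta_1)$, then $\Psi^{-1} \approx \Sigma_{1,22}$ and their asymptotic marginal distributions would be very similar to each other.

4.6 Asymptotic properties of unconditional and conditional maximum likelihood estimators of $(\tau_2, \theta_2)$

The unconditional and conditional maximum likelihood estimators of $(\tau_2, \theta_2)$ are exactly the same as those used in capture-recapture studies. Sanathanan (1972) assumed conditions similar to (1)-(4) and (6) and proved the following results:

$\hat\theta_2^{(U)} \xrightarrow{P} \theta_2$ and $\hat\theta_2^{(C)} \xrightarrow{P} \theta_2$ as $\tau_2 \to \infty$.

$\hat\tau_2^{(U)}/\tau_2 \xrightarrow{P} 1$ and $\hat\tau_2^{(C)}/\tau_2 \xrightarrow{P} 1$ as $\tau_2 \to \infty$.

$[\tau_2^{-1/2}(\hat\tau_2^{(U)} - \tau_2), \tau_2^{1/2}(\hat\theta_2^{(U)} - \theta_2)] \xrightarrow{D} N_{q_2+1}(0, \Sigma_2)$ and $[\tau_2^{-1/2}(\hat\tau_2^{(C)} - \tau_2), \tau_2^{1/2}(\hat\theta_2^{(C)} - \theta_2)] \xrightarrow{D} N_{q_2+1}(0, \Sigma_2)$ as $\tau_2 \to \infty$,

where $\Sigma_2$ is the inverse of the $(q_2+1) \times (q_2+1)$ matrix $\Sigma_2^{-1}$ defined by

$$[\Sigma_2^{-1}]_{1,1} = \frac{1 - \pi_0^{(2)}(\theta_2)}{\pi_0^{(2)}(\theta_2)},$$

$$[\Sigma_2^{-1}]_{1,j+1} = [\Sigma_2^{-1}]_{j+1,1} = -\frac{1}{\pi_0^{(2)}(\theta_2)} \frac{\partial \pi_0^{(2)}(\theta_2)}{\partial \theta_{2j}}, \quad j = 1, \ldots, q_2,$$

$$[\Sigma_2^{-1}]_{i+1,j+1} = [\Sigma_2^{-1}]_{j+1,i+1} = \sum_{x \in \Omega} \frac{1}{\pi_x^{(2)}(\theta_2)} \frac{\partial \pi_x^{(2)}(\theta_2)}{\partial \theta_{2i}} \frac{\partial \pi_x^{(2)}(\theta_2)}{\partial \theta_{2j}}, \quad i, j = 1, \ldots, q_2,$$

and which is assumed to be a non-singular matrix.

Because the proofs of these results are exactly the same as those given by Sanathanan (1972), we will omit them. It is worth noting that unlike the CMLE $(\tau_1^{-1/2}\hat\tau_1^{(C)}, \tau_1^{1/2}\hat\theta_1^{(C)})$, the estimator $(\tau_2^{-1/2}\hat\tau_2^{(C)}, \tau_2^{1/2}\hat\theta_2^{(C)})$ does have an asymptotic multivariate normal distribution.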
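As a small numerical illustration of how an asymptotic variance for the population-size estimator follows from a matrix of this capture-recapture type, the sketch below builds the $2 \times 2$ matrix with entries $(1-\pi_0)/\pi_0$, $-\pi_0^{-1}\,\partial\pi_0/\partial\theta$ and $\sum_x \pi_x^{-1}(\partial\pi_x/\partial\theta)^2$, inverts it, and compares the first diagonal element of the inverse with the block-inverse expression $\pi_0/(1-\pi_0) + (\partial\pi_0/\partial\theta)^2\,\Sigma_{22}/(1-\pi_0)^2$. The toy model (each of `n` sites links to a person independently with the same probability `theta`), the function name `toy_information` and the numeric values are assumptions made only for this example; they are not the general model of the report.

```python
from itertools import product


def toy_information(theta, n):
    """Entries of the 2x2 information-type matrix under a toy homogeneous model.

    Assumed model: pi_x = theta**s * (1 - theta)**(n - s), with s = sum(x).
    """
    def pi(x):
        s = sum(x)
        return theta**s * (1 - theta)**(n - s)

    def dpi(x):
        # Derivative of pi_x with respect to theta.
        s = sum(x)
        return pi(x) * (s / theta - (n - s) / (1 - theta))

    zero = (0,) * n
    pi0, g0 = pi(zero), dpi(zero)
    a = (1 - pi0) / pi0                      # element (1, 1)
    b = -g0 / pi0                            # elements (1, 2) and (2, 1)
    d = sum(dpi(x)**2 / pi(x) for x in product((0, 1), repeat=n))  # element (2, 2)
    return a, b, d, pi0, g0


a, b, d, pi0, g0 = toy_information(theta=0.3, n=3)
det = a * d - b * b
var_tau = d / det      # (1, 1) element of the inverse matrix
var_theta = a / det    # (2, 2) element of the inverse matrix

# Block-inverse expression for the same (1, 1) element.
closed_form = pi0 / (1 - pi0) + g0**2 * var_theta / (1 - pi0)**2
```

The sum over all $2^n$ link patterns in `d` is the kind of computation that becomes expensive when the number of sampled sites is large.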
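The previous results imply that $\tau_2^{-1/2}(\hat\tau_2^{(U)} - \tau_2) \xrightarrow{D} N(0, \sigma_2^2)$ and $\tau_2^{-1/2}(\hat\tau_2^{(C)} - \tau_2) \xrightarrow{D} N(0, \sigma_2^2)$, where

$$\sigma_2^2 = \frac{1}{[1 - \pi_0^{(2)}(\theta_2)]^2} \left\{ \pi_0^{(2)}(\theta_2)[1 - \pi_0^{(2)}(\theta_2)] + \nabla\pi_0^{(2)}(\theta_2)' \, \Sigma_{2,22} \, \nabla\pi_0^{(2)}(\theta_2) \right\},$$

$\nabla\pi_0^{(2)}(\theta_2)$ is the gradient of $\pi_0^{(2)}(\theta_2)$ evaluated at $\theta_2$ and $\Sigma_{2,22}$ is the $q_2 \times q_2$ submatrix of $\Sigma_2$ obtained by removing its first row and first column.

4.7 Consistency and asymptotic normality of the unconditional and conditional maximum likelihood estimators of $\tau = \tau_1 + \tau_2$

The UMLE and CMLE of $\tau = \tau_1 + \tau_2$ were defined in Subsection 3.3.3 by $\hat\tau^{(U)} = \hat\tau_1^{(U)} + \hat\tau_2^{(U)}$ and $\hat\tau^{(C)} = \hat\tau_1^{(C)} + \hat\tau_2^{(C)}$. From assumptions A and B and the previous results we have that

$$\hat\tau^{(U)}/\tau = (\tau_1/\tau)(\hat\tau_1^{(U)}/\tau_1) + (\tau_2/\tau)(\hat\tau_2^{(U)}/\tau_2) \xrightarrow{P} \alpha_1 + \alpha_2 = 1,$$

as $\tau_1 \to \infty$ and $\tau_2 \to \infty$. Similarly, $\hat\tau^{(C)}/\tau \xrightarrow{P} 1$ as $\tau_1 \to \infty$ and $\tau_2 \to \infty$. Furthermore,

$$\tau^{-1/2}(\hat\tau^{(U)} - \tau) = (\tau_1/\tau)^{1/2} \tau_1^{-1/2}(\hat\tau_1^{(U)} - \tau_1) + (\tau_2/\tau)^{1/2} \tau_2^{-1/2}(\hat\tau_2^{(U)} - \tau_2) \xrightarrow{D} N(0, \sigma_U^2),$$

as $\tau_1 \to \infty$ and $\tau_2 \to \infty$, where $\sigma_U^2 = \alpha_1 \sigma_{U1}^2 + \alpha_2 \sigma_2^2$. Likewise, $\tau^{-1/2}(\hat\tau^{(C)} - \tau) \xrightarrow{D} N(0, \sigma_C^2)$, where $\sigma_C^2 = \alpha_1 \sigma_{C1}^2 + \alpha_2 \sigma_2^2$.

5 Estimation of the matrices $\Sigma_k^{-1}$ and $\Psi$

Although estimates of $\Sigma_k^{-1}$, $k = 1, 2$, and $\Psi$ can be obtained by replacing the parameters $\theta_k$ and $\theta_1$ by their respective estimates in the expressions for these matrices, this procedure requires the computation of sums of $2^n$ terms. This is not a problem if $n$ is small, but if $n$ is large enough, say greater than or equal to 20, the number of these terms is very large and the calculation of the estimates of $\Sigma_k^{-1}$ and $\Psi$ could be computationally expensive.

A procedure that requires a much smaller number of calculations is based on estimates of the vectors $V_t^{(k)}$, $t = 1, \ldots, \tau_k$, $k = 1, 2$. The vectors $V_t^{(1)}$ were defined in the proofs of Theorems 1 and 2, whereas the vectors $V_t^{(2)}$ are defined in Sanathanan (1972) and we will give their definition later in this section. As was shown in the proofs of Theorems 1 and 2, the vectors $V_t^{(1)}$ are independent and identically distributed with mean vector equal to the zero vector and covariance matrix equal to $\Sigma_1^{-1}$ in the case of Theorem 1, and $\Psi$ in the case of Theorem 2. The same result holds in the case of the vectors $V_t^{(2)}$, but the covariance matrix is $\Sigma_2^{-1}$.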
Therefore, the sample covariance matrix of the vectors $V_t^{(k)}$ is an estimate of their covariance matrix $\Sigma_k^{-1}$ or $\Psi$ based only on $\tau_k$ observations.
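A minimal sketch of this sample covariance computation is given below. Because every element with the same link pattern shares the same vector, the covariance can be accumulated over the distinct vectors weighted by their counts, with one extra group for the unobserved elements. The function name `grouped_covariance`, the numeric vectors, the group sizes and the divisor (total count minus one) are all assumptions made for illustration; the text does not prescribe a divisor.

```python
def grouped_covariance(values, counts):
    """Sample covariance matrix of vectors given as distinct values with counts."""
    total = sum(counts)
    dim = len(values[0])
    # Weighted mean of each coordinate.
    mean = [sum(c * v[j] for v, c in zip(values, counts)) / total
            for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for v, c in zip(values, counts):
        dev = [v[j] - mean[j] for j in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += c * dev[i] * dev[j]
    return [[cij / (total - 1) for cij in row] for row in cov]


# Hypothetical estimated vectors: three observed link patterns plus the
# unobserved group, whose size would be the estimated population size
# minus the number of observed people.
values = [(1.0, 0.5), (1.0, -0.2), (1.0, 1.1), (-2.0, -0.8)]
counts = [40, 35, 25, 50]
cov_hat = grouped_covariance(values, counts)
```

The cost of this computation grows with the number of distinct observed patterns rather than with $2^n$.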
To implement this procedure we need to estimate the vectors $V_t^{(k)}$, which are unknown because they depend on $\theta_k$ and $\tau_k$. In the case of the vectors $V_t^{(1)}$ defined in Theorem 1, an estimate $\hat V_t^{(1)}$ of $V_t^{(1)}$ could be obtained by replacing $\theta_1$ by $\hat\theta_1^{(U)}$ in the expression for $V_t^{(1)}$, and $\tau_1$ could be estimated by $\hat\tau_1^{(U)}$. In the case of the vectors $V_t^{(1)}$ defined in Theorem 2, estimates of $V_t^{(1)}$ could be obtained by replacing $\theta_1$ by $\hat\theta_1^{(C)}$ in the expression for $V_t^{(1)}$, and $\tau_1$ could be estimated by $\hat\tau_1^{(C)}$. Estimates of the vectors $V_t^{(2)}$ and of $\tau_2$ could be obtained as in the case of Theorem 1, and in this situation both the UMLE and the CMLE could be used. Thus, once $\hat\tau_k$ and the vectors $\hat V_t^{(k)}$ are obtained, their sample covariance matrix can be computed and used as an estimate of $\Sigma_k^{-1}$ or $\Psi$.

The vectors $V_t^{(2)} = (V_{t,0}^{(2)}, V_{t,1}^{(2)}, \ldots, V_{t,q_2}^{(2)})$, $t = 1, \ldots, \tau_2$, are defined as follows:

(a) $V_{t,0}^{(2)} = 1$ and $V_{t,j}^{(2)} = [\pi_x^{(2)}(\theta_2)]^{-1} \partial\pi_x^{(2)}(\theta_2)/\partial\theta_{2j}$, $j = 1, \ldots, q_2$, if the vector $X_t^{(2)}$ of link-indicator variables associated with the $t$-th element in $U_2$ equals the vector $x \in \Omega - \{0\}$;

(b) $V_{t,0}^{(2)} = -[1 - \pi_0^{(2)}(\theta_2)]/\pi_0^{(2)}(\theta_2)$ and $V_{t,j}^{(2)} = [\pi_0^{(2)}(\theta_2)]^{-1} \partial\pi_0^{(2)}(\theta_2)/\partial\theta_{2j}$, $j = 1, \ldots, q_2$, if the vector $X_t^{(2)}$ of link-indicator variables associated with the $t$-th element in $U_2$ equals the vector $0 \in \Omega$.

6 Conclusions

Whenever we want to apply the results obtained in this research to an actual situation, we need to determine whether or not the assumed conditions are reasonably well satisfied by those observed in the actual scenario. In particular, we have assumed that the numbers $M_i$ of people found in the sampled sites follow a multinomial distribution with homogeneous cell probabilities and that the $M_i$ go to infinity while the number of sites $n$ in the sample and $N$ in the frame are fixed. These assumptions imply that in the actual scenario the $M_i$ should be relatively large and not very variable. However, we do not know how large they should be so that the results can be safely used. Therefore, Monte Carlo studies are required to assess the reliability of the asymptotic results under different scenarios with finite samples and populations.
In addition, although we have assumed a general parametric model for the link-probabilities which allows the parameter to depend or not on the sampled sites, the model precludes the probabilities from depending on the $M_i$ as they go to infinity. Furthermore, this assumption ensures that the estimators of $\tau_1$ and $\tau_2$ are independent and not only conditionally independent given the $M_i$. An alternative asymptotic framework to the one considered in this work is to assume that the numbers of sites $n$ in the sample and $N$ in the frame go to infinity whereas the $M_i$ are fixed. However, this would involve dealing with multinomial distributions with infinite numbers of cells. An approach that could be used to derive asymptotic properties of estimators under this framework is the one considered by Rao (1958), who derived asymptotic properties of a maximum likelihood estimator of a parameter that determines the cell probabilities of
a multinomial distribution with an infinite number of cells. However, this is a topic for future research.

Acknowledgements

This research was partially supported by Grant PIFI-23-25-73-.4.3-8 from the Secretaría de Educación Pública to Universidad Autónoma de Sinaloa.

References

Agresti, A. 2002. Categorical Data Analysis, Second edition. New York: Wiley.

Birch, M.W. 1964. A new proof of the Pearson-Fisher theorem. Annals of Mathematical Statistics 35:817-824.

Bishop, Y.M.M., S.E. Fienberg, and P.W. Holland. 1975. Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press.

Coull, B.A., and A. Agresti. 1999. The use of mixed logit models to reflect heterogeneity in capture-recapture studies. Biometrics 55:294-301.

Félix-Medina, M.H., and P.E. Monjardin. 2006. Combining link-tracing sampling and cluster sampling to estimate the size of hidden populations: a Bayesian assisted approach. Survey Methodology 32:187-195.

Félix-Medina, M.H., P.E. Monjardin, and A.N. Aceves-Castro. 2015. Combining link-tracing sampling and cluster sampling to estimate the size of a hidden population in presence of heterogeneous link-probabilities. Survey Methodology. To appear.

Félix-Medina, M.H., and S.K. Thompson. 2004. Combining cluster sampling and link-tracing sampling to estimate the size of hidden populations. Journal of Official Statistics 20:19-38.

Feller, W. 1968. An Introduction to Probability Theory and its Applications, Third edition. Volume 1. New York: Wiley.

Johnston, L.G., and K. Sabin. 2010. Sampling hard-to-reach populations with respondent driven sampling. Methodological Innovations Online 5(2):38-48.

Kalton, G. 2009. Methods for oversampling rare populations in social surveys. Survey Methodology 35:125-141.

Magnani, R., K. Sabin, T. Saidel, and D. Heckathorn. 2005. Review of sampling hard-to-reach populations for HIV surveillance. AIDS 19:S67-S72.

Rao, C.R. 1958. Maximum likelihood estimation for the multinomial distribution with infinite number of cells. Sankhyā: The Indian Journal of Statistics 20:211-218.

Rao, C.R. 1973. Linear Statistical Inference and its Applications, Second edition. New York: Wiley.
Sanathanan, L. 1972. Estimating the size of a multinomial population. Annals of Mathematical Statistics 43:142-152.

Spreen, M. 1992. Rare populations, hidden populations and link-tracing designs: what and why? Bulletin de Méthodologie Sociologique 36:34-58.

Thompson, S.K., and O. Frank. 2000. Model-based estimation with link-tracing sampling designs. Survey Methodology 26:87-98.

Varadhan, S.R.S. 2008. Large deviations. The Annals of Probability 36(2):397-419.