A researcher has data on weight, height and schooling for 450 respondents in the US National Longitudinal Survey of Youth for the year 2002. Using the data on weight and height, he computes the body mass index for each individual. If the body mass index is 30 or greater, the individual is defined to be obese. He defines a binary variable, OBESE, which is equal to 1 for the 164 obese individuals and 0 for the other 176. He wishes to investigate whether obesity is related to schooling and fits an OLS regression of OBESE on S (years of schooling), obtaining:
$\widehat{\Pr}(OBESE = 1) = 0.595 - 0.021\,S$, (1)
He also estimates a logit model, obtaining
$\widehat{\Pr}(OBESE = 1) = \Lambda(0.588 - 0.105\,S)$, (2)
where $\Lambda(x) = 1/(1 + e^{-x})$ is the logistic distribution function.
i) Interpret the estimates of the coefficients in (1) and outline the limitations of the linear probability model (LPM).
ii) Interpret the coefficients displayed in (2) in terms of odds ratios.
iii) Consider the results given in (2). Derive the marginal effect of S on the probability of being obese (as a function of S).
iv) Logit models are estimated by Maximum Likelihood. Briefly describe this estimation procedure.
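
As a numerical check on parts i) to iv), the following is a minimal Python sketch, not a model answer. It only evaluates the two fitted equations using the coefficients reported in (1) and (2). The function names are illustrative assumptions, and the log-likelihood function takes whatever data the reader supplies, since the underlying survey data are not given in the question.

```python
import math

# Coefficients copied from the fitted equations (1) and (2) in the question.
LPM_CONST, LPM_SLOPE = 0.595, -0.021
LOGIT_CONST, LOGIT_SLOPE = 0.588, -0.105


def logistic(z):
    """Logistic distribution function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def lpm_prob(s):
    """Fitted probability of obesity from the linear probability model (1).

    Nothing constrains this value to lie in [0, 1].
    """
    return LPM_CONST + LPM_SLOPE * s


def logit_prob(s):
    """Fitted probability of obesity from the logit model (2)."""
    return logistic(LOGIT_CONST + LOGIT_SLOPE * s)


def odds_ratio_per_year():
    """Multiplicative change in the odds p / (1 - p) per extra year of S."""
    return math.exp(LOGIT_SLOPE)


def logit_marginal_effect(s):
    """Derivative of the logit probability with respect to S.

    By the chain rule this is slope * p * (1 - p), so it varies with S.
    """
    p = logit_prob(s)
    return LOGIT_SLOPE * p * (1.0 - p)


def log_likelihood(b0, b1, schooling, obese):
    """Logit log-likelihood for candidate coefficients (b0, b1).

    schooling and obese are equal-length sequences supplied by the caller;
    maximum likelihood picks the (b0, b1) that make this sum as large as
    possible.
    """
    total = 0.0
    for s, y in zip(schooling, obese):
        p = logistic(b0 + b1 * s)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


if __name__ == "__main__":
    for s in (8, 12, 16, 20):
        print(s, lpm_prob(s), logit_prob(s), logit_marginal_effect(s))
    print("odds ratio per year of schooling:", odds_ratio_per_year())
```

Calling `lpm_prob` with a sufficiently large value of S returns a negative number, which illustrates one limitation of the LPM, whereas `logit_prob` always stays strictly between 0 and 1.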