Interpretation of simple linear regression:
Suppose you are interested in estimating the effect of hours spent in an SAT preparation course (hours) on total SAT score (sat). The population is all college-bound high school seniors for the particular year.
(i) Suppose you are given a grant to run a controlled experiment. Explain how you would structure the experiment in order to estimate the causal effect of hours on sat.
(ii) Consider the more realistic case where students choose how much time to spend in a preparation course, and you can only randomly sample sat and hours from the population. Write the population model as:
sat = β0 + β1 hours + u
where, as usual in a model with an intercept, we can assume E(u) = 0. List at least two factors contained in u. Are these likely to have positive or negative correlation with hours?
(iii) In the equation from part (ii), what should be the sign of β1 if the preparation course is effective?
(iv) In the equation from part (ii), what is the interpretation of β0?
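
The contrast between parts (i) and (ii) can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not a solution to the exercise: the parameter values (β0 = 1000, β1 = 2), the unobserved factor called ability, and the assumption that more able students choose fewer hours are all hypothetical and chosen purely for illustration. The helper names ols and simulate are likewise made up here.

```python
import numpy as np


def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    x_bar, y_bar = x.mean(), y.mean()
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    return intercept, slope


def simulate(n=50_000, beta0=1000.0, beta1=2.0, seed=0):
    """Compare randomly assigned hours with self-selected hours.

    All numbers are hypothetical. The error term u contains an
    unobserved factor ("ability") plus pure noise, so E(u) = 0.
    """
    rng = np.random.default_rng(seed)
    ability = rng.normal(0.0, 1.0, n)           # unobserved, part of u
    u = 40.0 * ability + rng.normal(0.0, 50.0, n)

    # Part (i): hours are assigned at random, so they are independent of u.
    hours_assigned = rng.choice([0.0, 5.0, 10.0, 15.0, 20.0], size=n)
    sat_assigned = beta0 + beta1 * hours_assigned + u

    # Part (ii): students choose hours. ASSUMPTION: higher ability leads
    # to fewer hours, so hours and u are negatively correlated.
    hours_chosen = np.clip(10.0 - 3.0 * ability + rng.normal(0.0, 2.0, n), 0.0, None)
    sat_chosen = beta0 + beta1 * hours_chosen + u

    return {
        "experiment": ols(hours_assigned, sat_assigned),
        "observational": ols(hours_chosen, sat_chosen),
    }


if __name__ == "__main__":
    for design, (b0_hat, b1_hat) in simulate().items():
        print(f"{design}: intercept = {b0_hat:.1f}, slope = {b1_hat:.2f}")
```

Under these assumptions, the slope from the randomized design should land close to the assumed β1, while the slope from the self-selected sample is pulled downward because hours is correlated with u. If the correlation were assumed positive instead (for example, if more motivated students both study more and score higher), the bias would go the other way.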