3. Consider the linear regression model with a single scalar regressor, no intercept, and endogeneity:
$Y_i = X_i\beta + e_i$, $E[X_ie_i] = \delta > 0$. (2)
Let $Y_i$ and $X_i$ be the final exam score and the class attendance rate for student $i$, respectively. The
variables are standardized so that $E[Y_i] = E[X_i] = 0$ and $var(Y_i) = var(X_i) = 1$.
(a) Derive the probability limit of the OLS estimator. Does OLS under-estimate or over-estimate $\beta$?
(b) Let $Z_{1i}$ be an indicator for whether student $i$ is retaking the course. Let $Z_{2i}$ be the student's
travel time to and from campus. Can $Z_{1i}$ and $Z_{2i}$ be used as instruments? Argue using the IV
validity conditions: (i) exogeneity; (ii) relevance; (iii) no redundant IV.
(c) Regardless of your answer to (b), you have decided to give it a go. Write down the formula for the
IV estimator using each instrument one at a time, and derive its probability limit under the IV validity
assumptions given above.
(d) What is the implication of Imbens and Angrist (1994, Econometrica) regarding your answer to (c)?
(e) Maintain the framework of Imbens and Angrist (1994). Propose a GMM-type estimator that combines
the two IV estimators in (c). What is your justification for the estimator?
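
As a numerical sanity check for the setup in (2), the sketch below simulates the model and compares the no-intercept OLS ratio with a single-instrument IV ratio. It is only an illustration under assumptions that are not part of the question: the data are jointly normal, the coefficient $\beta$ is constant across students, and the instrument `z` is a hypothetical continuous variable (it is neither $Z_{1i}$ nor $Z_{2i}$). The values of `beta`, `delta`, and `pi` (the correlation between `z` and $X_i$) are arbitrary choices, and `simulate` is an illustrative helper name.

```python
import numpy as np


def simulate(n=200_000, beta=0.5, delta=0.2, pi=0.5, seed=0):
    """Simulate Y = X*beta + e with E[X e] = delta and var(X) = var(Y) = 1.

    All parameter values are assumptions made for illustration only.
    Returns the no-intercept OLS estimate and the IV estimate using z.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)  # hypothetical instrument, independent of e
    v = rng.standard_normal(n)  # part of X that is correlated with e
    w = rng.standard_normal(n)  # part of e that is unrelated to X and z

    s = np.sqrt(1.0 - pi**2)
    x = pi * z + s * v  # var(X) = pi^2 + (1 - pi^2) = 1

    # Choose var(e) so that var(Y) = beta^2 + 2*beta*delta + var(e) = 1.
    var_e = 1.0 - beta**2 - 2.0 * beta * delta
    a = delta / s  # makes cov(X, e) = s * a = delta
    b2 = var_e - a**2
    if b2 < 0:
        raise ValueError("parameters are incompatible with var(Y) = 1")
    e = a * v + np.sqrt(b2) * w

    y = x * beta + e

    ols = (x @ y) / (x @ x)  # sample analogue of E[XY] / E[X^2]
    iv = (z @ y) / (z @ x)   # sample analogue of E[ZY] / E[ZX]
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate()
    print(f"OLS estimate: {ols:.3f}")
    print(f"IV estimate:  {iv:.3f}")
```

In this design the OLS ratio should settle near `beta + delta`, because $E[X_i^2] = 1$, while the IV ratio should settle near `beta`. The constant-coefficient assumption matters: it is exactly what parts (d) and (e) ask you to relax.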