Consider the regression model with heterogeneous regression coefficients:
Yi = b0 + b1iXi + vi
where (vi, Xi, b1i) are independently and identically distributed (i.i.d.) random variables with b1 = E(b1i).
a) Show that the model can be written as:
Yi = b0 + b1Xi + ui
where ui = (b1i - b1)Xi + vi.
b) Suppose Xi is randomly assigned, so that E[b1i | Xi] = b1 and E[vi | Xi] = 0. Show that E[ui | Xi] = 0.
c) Using the result of part (b), show that the first two least squares assumptions hold: (1) the error term ui has a conditional mean of 0 given Xi; (2) (Xi, Yi), i = 1,...,n, are i.i.d. draws from their joint distribution.
d) Suppose outliers are rare, so that (ui, Xi) have finite fourth moments. Is it appropriate to use OLS and the methods of Chapter 4 (Linear Regression with One Regressor) and Chapter 5 (Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals) to estimate, and carry out inference about, b0 and the average value of b1i?
e) Now suppose Xi is not randomly assigned. E[vi | Xi] = 0 still holds, but b1i and Xi are positively correlated, so that observations with larger-than-average values of Xi tend to have larger-than-average values of b1i. Are the three least squares assumptions satisfied: (1) the error term ui has a conditional mean of 0 given Xi; (2) (Xi, Yi), i = 1,...,n, are i.i.d. draws from their joint distribution; (3) large outliers are unlikely, that is, Xi and Yi have nonzero finite fourth moments? If not, which assumption(s) is (are) violated? Will the OLS estimator of b1 be unbiased for E(b1i)?
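
The contrast between the randomly assigned case in part (b) and the correlated case in part (e) can be checked numerically. The following is a minimal simulation sketch, not part of the exercise. It assumes normal draws for Xi, vi, and the idiosyncratic part of b1i, and a linear dependence b1i = b1 + rho*(Xi - mu_x) + noise, so that E(b1i) = b1 by construction. The function names (ols_slope, mean_ols_slope), the parameter rho, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np


def ols_slope(x, y):
    # OLS slope with a single regressor: sample cov(x, y) / sample var(x).
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))


def mean_ols_slope(rho, n=2000, reps=500, b0=1.0, b1=2.0, mu_x=1.0, seed=0):
    """Average OLS slope over repeated samples from Yi = b0 + b1i*Xi + vi.

    rho = 0 mimics random assignment: E[b1i | Xi] = b1.
    rho > 0 makes b1i and Xi positively correlated, as in part (e).
    All distributions and parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    slopes = np.empty(reps)
    for r in range(reps):
        x = rng.normal(mu_x, 1.0, size=n)
        eta = rng.normal(0.0, 1.0, size=n)
        b1i = b1 + rho * (x - mu_x) + eta  # E(b1i) = b1 for any rho
        v = rng.normal(0.0, 1.0, size=n)   # E[vi | Xi] = 0
        y = b0 + b1i * x + v
        slopes[r] = ols_slope(x, y)
    return float(slopes.mean())


if __name__ == "__main__":
    print("random assignment (rho = 0.0):", mean_ols_slope(rho=0.0))
    print("correlated b1i and Xi (rho = 0.5):", mean_ols_slope(rho=0.5))
```

Under this particular design, E[ui | Xi] = rho*(Xi - mu_x)*Xi, which is zero only when rho = 0. The population OLS slope works out to b1 + rho*mu_x, so with the values above the average slope should be close to 2.0 under random assignment and close to 2.5 when b1i and Xi are correlated. The simulation only illustrates the answers to parts (b) through (e); it does not replace the derivations the exercise asks for.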