A Modern Approach

Jeffrey M. Wooldridge

Chapter 2

The Simple Regression Model

Chapter Questions

Problem 1

Let $kids$ denote the number of children ever born to a woman, and let $educ$ denote years of education for the woman. A simple model relating fertility to years of education is
$$kids=\beta_{0}+\beta_{1} educ+u,$$
where $u$ is the unobserved error.
i. What kinds of factors are contained in $u$? Are these likely to be correlated with level of education?
ii. Will a simple regression analysis uncover the ceteris paribus effect of education on fertility? Explain.

Oluwadamilola Ameobi

Numerade Educator

Problem 2

In the simple linear regression model $y=\beta_{0}+\beta_{1} x+u,$ suppose that $\mathrm{E}(u) \neq 0 .$ Letting $\alpha_{0}=\mathrm{E}(u),$ show that the model can always be rewritten with the same slope, but a new intercept and error, where the new error has a zero expected value.

Oluwadamilola Ameobi

Numerade Educator

Problem 3

The following table contains the $ACT$ scores and the $GPA$ (grade point average) for eight college students. Grade point average is based on a four-point scale and has been rounded to one digit after the decimal.
$$\begin{array}{|ccc|}
\hline \text { Student } & GPA & ACT \\
\hline 1 & 2.8 & 21 \\
2 & 3.4 & 24 \\
3 & 3.0 & 26 \\
4 & 3.5 & 27 \\
5 & 3.6 & 29 \\
6 & 3.0 & 25 \\
7 & 2.7 & 25 \\
8 & 3.7 & 30 \\
\hline
\end{array}$$
i. Estimate the relationship between $GPA$ and $ACT$ using OLS; that is, obtain the intercept and slope estimates in the equation
$$\widehat{GPA}=\widehat{\beta}_{0}+\widehat{\beta}_{1} ACT.$$
Comment on the direction of the relationship. Does the intercept have a useful interpretation here? Explain. How much higher is the $GPA$ predicted to be if the $ACT$ score is increased by five points?
ii. Compute the fitted values and residuals for each observation, and verify that the residuals (approximately) sum to zero.
iii. What is the predicted value of $GPA$ when $ACT=20$?
iv. How much of the variation in $GPA$ for these eight students is explained by $ACT$? Explain.
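
As a rough check on the hand calculations this problem asks for, the following sketch applies the usual simple-regression formulas (slope as the ratio of the sum of cross-products to the sum of squared deviations of $ACT$) to the eight observations in the table. The variable names are illustrative and not from the text.

```python
# Data from the table above (eight students).
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]
act = [21, 24, 26, 27, 29, 25, 25, 30]

n = len(gpa)
act_bar = sum(act) / n
gpa_bar = sum(gpa) / n

# Sum of cross-products and sum of squared deviations of ACT.
s_xy = sum((x - act_bar) * (y - gpa_bar) for x, y in zip(act, gpa))
s_xx = sum((x - act_bar) ** 2 for x in act)

# OLS slope and intercept for a simple regression with an intercept.
slope = s_xy / s_xx
intercept = gpa_bar - slope * act_bar

# Fitted values and residuals for each student.
fitted = [intercept + slope * x for x in act]
residuals = [y - f for y, f in zip(gpa, fitted)]

# R-squared as 1 - SSR/SST.
sst = sum((y - gpa_bar) ** 2 for y in gpa)
ssr = sum(r ** 2 for r in residuals)
r_squared = 1 - ssr / sst

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"sum of residuals = {sum(residuals):.2e}")
print(f"predicted GPA at ACT = 20: {intercept + slope * 20:.3f}")
print(f"R-squared = {r_squared:.3f}")
```

Because the table reports $GPA$ rounded to one decimal, the residual sum is zero only up to floating-point error, which is the sense of "approximately" in part (ii).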

Heather Duong

Numerade Educator

Problem 4

The data set BWGHT contains data on births to women in the United States. Two variables of interest are the dependent variable, infant birth weight in ounces ($bwght$), and an explanatory variable, average number of cigarettes the mother smoked per day during pregnancy ($cigs$). The following simple regression was estimated using data on $n=1{,}388$ births:
$$\widehat{bwght}=119.77-0.514\, cigs$$
i. What is the predicted birth weight when $cigs=0$? What about when $cigs=20$ (one pack per day)? Comment on the difference.
ii. Does this simple regression necessarily capture a causal relationship between the child's birth weight and the mother's smoking habits? Explain.
iii. To predict a birth weight of 125 ounces, what would $cigs$ have to be? Comment.
iv. The proportion of women in the sample who do not smoke while pregnant is about $.85$. Does this help reconcile your finding from part (iii)?
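
The arithmetic behind parts (i) and (iii) can be sketched directly from the estimated equation. The function name `predicted_bwght` is a hypothetical helper introduced here for illustration only.

```python
# Coefficients taken from the estimated equation above.
INTERCEPT = 119.77
SLOPE = -0.514


def predicted_bwght(cigs):
    """Predicted birth weight in ounces for a given number of cigarettes per day."""
    return INTERCEPT + SLOPE * cigs


at_zero = predicted_bwght(0)
at_pack = predicted_bwght(20)

# Part (iii): solve 125 = INTERCEPT + SLOPE * cigs for cigs.
cigs_for_125 = (125 - INTERCEPT) / SLOPE

print(f"cigs = 0:  {at_zero:.2f} ounces")
print(f"cigs = 20: {at_pack:.2f} ounces")
print(f"cigs needed for 125 ounces: {cigs_for_125:.2f}")
```

The value needed for a 125-ounce prediction comes out negative, which is what part (iii) invites you to comment on.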

Paul A.

California State Polytechnic University, Pomona

Problem 5

In the linear consumption function
$$\widehat{cons}=\widehat{\beta}_{0}+\widehat{\beta}_{1}\, inc,$$
the (estimated) marginal propensity to consume (MPC) out of income is simply the slope, $\widehat{\beta}_{1}$, while the average propensity to consume (APC) is $\widehat{cons}/inc=\widehat{\beta}_{0}/inc+\widehat{\beta}_{1}$. Using observations for 100 families on annual income and consumption (both measured in dollars), the following equation is obtained:
$$\begin{array}{c}
\widehat{cons}=-124.84+0.853\, inc \\
n=100, \quad R^{2}=0.692.
\end{array}$$
i. Interpret the intercept in this equation, and comment on its sign and magnitude.
ii. What is the predicted consumption when family income is $\$30{,}000$?
iii. With $inc$ on the $x$-axis, draw a graph of the estimated MPC and APC.
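
A small sketch of the quantities in parts (ii) and (iii), using the estimated coefficients above. The helper names `predicted_cons` and `apc` are illustrative; the income grid is an arbitrary choice made here to show the shape of the APC curve rather than to draw it.

```python
# Coefficients from the estimated consumption function above.
B0 = -124.84
B1 = 0.853  # this is also the estimated MPC, constant in income


def predicted_cons(inc):
    """Predicted annual consumption (dollars) at a given income."""
    return B0 + B1 * inc


def apc(inc):
    """Estimated average propensity to consume: predicted cons / inc."""
    return B0 / inc + B1


cons_at_30k = predicted_cons(30_000)
print(f"predicted consumption at $30,000: {cons_at_30k:.2f}")

# Illustrative income grid: the APC lies below the MPC and rises toward it.
for inc in (1_000, 5_000, 10_000, 30_000, 100_000):
    print(f"inc = {inc:>7}: APC = {apc(inc):.4f}, MPC = {B1:.3f}")
```

Since the intercept is negative, the APC stays below the MPC at every positive income and approaches it as income grows.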

Oluwadamilola Ameobi

Numerade Educator

Problem 6

Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and McClain (1995), the following equation relates housing price ($price$) to the distance from a recently built garbage incinerator ($dist$):
$$\begin{aligned}
\widehat{\log (\text {price})} &=9.40+0.312 \log (\text {dist}) \\
n &=135, R^{2}=0.162.
\end{aligned}$$
i. Interpret the coefficient on $\log(dist)$. Is the sign of this estimate what you expect it to be?
ii. Do you think simple regression provides an unbiased estimator of the ceteris paribus elasticity of $price$ with respect to $dist$? (Think about the city's decision on where to put the incinerator.)
iii. What other factors about a house affect its price? Might these be correlated with distance from the incinerator?

Paul A.

California State Polytechnic University, Pomona

Problem 7

Consider the savings function
$$sav=\beta_{0}+\beta_{1}\, inc+u, \quad u=\sqrt{inc} \cdot e,$$
where $e$ is a random variable with $\mathrm{E}(e)=0$ and $\operatorname{Var}(e)=\sigma_{e}^{2}$. Assume that $e$ is independent of $inc$.
i. Show that $\mathrm{E}(u \mid inc)=0$, so that the key zero conditional mean assumption (Assumption SLR.4) is satisfied. [Hint: If $e$ is independent of $inc$, then $\mathrm{E}(e \mid inc)=\mathrm{E}(e)$.]
ii. Show that $\operatorname{Var}(u \mid inc)=\sigma_{e}^{2}\, inc$, so that the homoskedasticity Assumption SLR.5 is violated. In particular, the variance of $sav$ increases with $inc$. [Hint: $\operatorname{Var}(e \mid inc)=\operatorname{Var}(e)$ if $e$ and $inc$ are independent.]
iii. Provide a discussion that supports the assumption that the variance of savings increases with family income.
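
The variance claim in part (ii) can be illustrated, though not proved, by simulation. The sketch below holds $inc$ fixed at a few values, draws $e$ independently, and compares the sample variance of $u=\sqrt{inc}\cdot e$ with $\sigma_{e}^{2}\, inc$. The normal distribution for $e$, the value of `SIGMA_E`, the income levels, and the number of draws are all assumptions made for this illustration; the problem only specifies the mean and variance of $e$.

```python
import random
import statistics

random.seed(0)

SIGMA_E = 2.0      # assumed standard deviation of e (not given in the problem)
N_DRAWS = 100_000  # assumed simulation size


def simulated_var_u(inc):
    """Sample variance of u = sqrt(inc) * e with inc held fixed."""
    draws = [inc ** 0.5 * random.gauss(0.0, SIGMA_E) for _ in range(N_DRAWS)]
    return statistics.pvariance(draws)


for inc in (1.0, 4.0, 9.0):
    simulated = simulated_var_u(inc)
    theoretical = SIGMA_E ** 2 * inc
    print(f"inc = {inc}: simulated Var(u|inc) = {simulated:.3f}, "
          f"sigma_e^2 * inc = {theoretical:.3f}")
```

The simulated variances should track $\sigma_{e}^{2}\, inc$ closely and grow in proportion to $inc$, which is the heteroskedasticity the problem describes.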

Oluwadamilola Ameobi

Numerade Educator

Problem 8

Consider the standard simple regression model $y=\beta_{0}+\beta_{1} x+u$ under the Gauss-Markov Assumptions SLR.1, SLR.2, SLR.3, SLR.4, and SLR.5. The usual OLS estimators $\widehat{\beta}_{0}$ and $\widehat{\beta}_{1}$ are unbiased for their respective population parameters. Let $\tilde{\beta}_{1}$ be the estimator of $\beta_{1}$ obtained by assuming the intercept is zero (see Section 2-6).
i. Find $\mathrm{E}\left(\tilde{\beta}_{1}\right)$ in terms of the $x_{i}$, $\beta_{0}$, and $\beta_{1}$. Verify that $\tilde{\beta}_{1}$ is unbiased for $\beta_{1}$ when the population intercept $\left(\beta_{0}\right)$ is zero. Are there other cases where $\tilde{\beta}_{1}$ is unbiased?
ii. Find the variance of $\tilde{\beta}_{1}$. (Hint: The variance does not depend on $\beta_{0}$.)
iii. Show that $\operatorname{Var}\left(\tilde{\beta}_{1}\right) \leq \operatorname{Var}\left(\widehat{\beta}_{1}\right)$. [Hint: For any sample of data, $\sum_{i=1}^{n} x_{i}^{2} \geq \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}$, with strict inequality unless $\bar{x}=0$.]
iv. Comment on the tradeoff between bias and variance when choosing between $\widehat{\beta}_{1}$ and $\tilde{\beta}_{1}$.
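
One way to build intuition for parts (i) and (iii) is a deterministic numeric check. The sketch below assumes the through-origin estimator is $\tilde{\beta}_{1}=\sum x_{i} y_{i} / \sum x_{i}^{2}$ and sets every error to zero, so each $y_{i}$ equals its conditional mean; because the estimator is linear in the $y_{i}$, its value on this data equals its expectation conditional on the $x_{i}$. The data and the parameter values `beta0` and `beta1` are hypothetical.

```python
# Hypothetical regressor values and population parameters (illustration only).
x = [1.0, 2.0, 4.0, 7.0, 9.0]
beta0, beta1 = 3.0, 0.5

# Errors set to zero, so y_i equals E(y_i | x_i).
y = [beta0 + beta1 * xi for xi in x]

sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)

# Through-origin slope estimator (assumed form: sum x*y / sum x^2).
beta1_tilde = sum(xi * yi for xi, yi in zip(x, y)) / sum_x2

# Candidate expression for the conditional expectation of the estimator.
implied_mean = beta1 + beta0 * sum_x / sum_x2

# Inequality from the hint in part (iii).
x_bar = sum_x / len(x)
sst_x = sum((xi - x_bar) ** 2 for xi in x)

print(f"beta1_tilde on error-free data: {beta1_tilde:.6f}")
print(f"beta1 + beta0 * sum(x) / sum(x^2): {implied_mean:.6f}")
print(f"sum x^2 = {sum_x2:.1f} >= SST_x = {sst_x:.1f}")
```

Setting `beta0 = 0.0`, or choosing `x` values that sum to zero, makes the two printed slope values coincide with `beta1`, which is a useful prompt for the last question in part (i).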

Oluwadamilola Ameobi

Numerade Educator

Problem 9

i. Let $\widehat{\beta}_{0}$ and $\widehat{\beta}_{1}$ be the intercept and slope from the regression of $y_{i}$ on $x_{i}$, using $n$ observations. Let $c_{1}$ and $c_{2}$, with $c_{2} \neq 0$, be constants. Let $\tilde{\beta}_{0}$ and $\tilde{\beta}_{1}$ be the intercept and slope from the regression of $c_{1} y_{i}$ on $c_{2} x_{i}$. Show that $\tilde{\beta}_{1}=\left(c_{1} / c_{2}\right) \widehat{\beta}_{1}$ and $\tilde{\beta}_{0}=c_{1} \widehat{\beta}_{0}$, thereby verifying the claims on units of measurement in Section 2-4. [Hint: To obtain $\tilde{\beta}_{1}$, plug the scaled versions of $x$ and $y$ into (2.19). Then, use (2.17) for $\tilde{\beta}_{0}$, being sure to plug in the scaled $x$ and $y$ and the correct slope.]
ii. Now, let $\tilde{\beta}_{0}$ and $\tilde{\beta}_{1}$ be from the regression of $\left(c_{1}+y_{i}\right)$ on $\left(c_{2}+x_{i}\right)$ (with no restriction on $c_{1}$ or $c_{2}$). Show that $\tilde{\beta}_{1}=\widehat{\beta}_{1}$ and $\tilde{\beta}_{0}=\widehat{\beta}_{0}+c_{1}-c_{2} \widehat{\beta}_{1}$.
iii. Now, let $\widehat{\beta}_{0}$ and $\widehat{\beta}_{1}$ be the OLS estimates from the regression of $\log \left(y_{i}\right)$ on $x_{i}$, where we must assume $y_{i}>0$ for all $i$. For $c_{1}>0$, let $\tilde{\beta}_{0}$ and $\tilde{\beta}_{1}$ be the intercept and slope from the regression of $\log \left(c_{1} y_{i}\right)$ on $x_{i}$. Show that $\tilde{\beta}_{1}=\widehat{\beta}_{1}$ and $\tilde{\beta}_{0}=\log \left(c_{1}\right)+\widehat{\beta}_{0}$.
iv. Now, assuming that $x_{i}>0$ for all $i$, let $\tilde{\beta}_{0}$ and $\tilde{\beta}_{1}$ be the intercept and slope from the regression of $y_{i}$ on $\log \left(c_{2} x_{i}\right)$. How do $\tilde{\beta}_{0}$ and $\tilde{\beta}_{1}$ compare with the intercept and slope from the regression of $y_{i}$ on $\log \left(x_{i}\right)$?
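
The algebraic claims in parts (i) through (iii) can be sanity-checked numerically before attempting the proofs. The sketch below uses a small hypothetical data set and hypothetical constants `c1` and `c2`; the `ols` helper is written here for illustration and simply applies the standard simple-regression formulas.

```python
import math


def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


# Hypothetical data and constants (illustration only).
x = [1.0, 2.0, 4.0, 7.0, 9.0]
y = [2.0, 3.5, 3.0, 6.0, 8.5]
c1, c2 = 100.0, 0.5

b0, b1 = ols(x, y)

# Part (i): regress c1*y on c2*x.
s0, s1 = ols([c2 * xi for xi in x], [c1 * yi for yi in y])
print(math.isclose(s1, (c1 / c2) * b1), math.isclose(s0, c1 * b0))

# Part (ii): regress (c1 + y) on (c2 + x).
a0, a1 = ols([c2 + xi for xi in x], [c1 + yi for yi in y])
print(math.isclose(a1, b1), math.isclose(a0, b0 + c1 - c2 * b1))

# Part (iii): regress log(c1*y) on x and compare with log(y) on x.
l0, l1 = ols(x, [math.log(yi) for yi in y])
m0, m1 = ols(x, [math.log(c1 * yi) for yi in y])
print(math.isclose(m1, l1), math.isclose(m0, math.log(c1) + l0))
```

A numeric check on one data set is not a proof, but a `False` here would signal an algebra slip worth tracking down. The same helper can be reused to experiment with part (iv).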

Oluwadamilola Ameobi

Numerade Educator

Problem 10

Let $\hat{\beta}_{0}$ and $\hat{\beta}_{1}$ be the OLS intercept and slope estimators, respectively, and let $\bar{u}$ be the sample average of the errors (not the residuals!).
i. Show that $\hat{\beta}_{1}$ can be written as $\widehat{\beta}_{1}=\beta_{1}+\sum_{i=1}^{n} w_{i} u_{i},$ where $w_{i}=d_{i} / \mathrm{SST}_{x}$ and $d_{i}=x_{i}-\bar{x}$.
ii. Use part (i), along with $\sum_{i=1}^{n} w_{i}=0,$ to show that $\widehat{\beta}_{1}$ and $\bar{u}$ are uncorrelated. [Hint: You are being asked to show that $\left.\mathrm{E}\left[\left(\widehat{\beta}_{1}-\beta_{1}\right) \cdot \bar{u}\right]=0 .\right]$
iii. Show that $\widehat{\beta}_{0}$ can be written as $\widehat{\beta}_{0}=\beta_{0}+\bar{u}-\left(\widehat{\beta}_{1}-\beta_{1}\right) \bar{x}$.
iv. Use parts (ii) and (iii) to show that $\operatorname{Var}\left(\widehat{\beta}_{0}\right)=\sigma^{2} / n+\sigma^{2}(\bar{x})^{2} / \mathrm{SST}_{x}$.
v. Do the algebra to simplify the expression in part (iv) to equation (2.58). [Hint: $\mathrm{SST}_{x} / n=n^{-1} \sum_{i=1}^{n} x_{i}^{2}-(\bar{x})^{2}$.]

Oluwadamilola Ameobi

Numerade Educator

Problem 11

Suppose you are interested in estimating the effect of hours spent in an SAT preparation course (hours) on total SAT score (sat). The population is all college-bound high school seniors for a particular year.
i. Suppose you are given a grant to run a controlled experiment. Explain how you would structure the experiment in order to estimate the causal effect of hours on sat.
ii. Consider the more realistic case where students choose how much time to spend in a preparation course, and you can only randomly sample sat and hours from the population. Write the population model as
$$\text {sat}=\beta_{0}+\beta_{1} \text {hours}+u$$
where, as usual in a model with an intercept, we can assume $\mathrm{E}(u)=0 .$ List at least two factors contained in $u$. Are these likely to have positive or negative correlation with hours?
iii. In the equation from part (ii), what should be the sign of $\beta_{1}$ if the preparation course is effective?
iv. In the equation from part (ii), what is the interpretation of $\beta_{0} ?$

Oluwadamilola Ameobi

Numerade Educator

Problem 12

Consider the problem described at the end of Section 2-6, running a regression and only estimating an intercept.
i. Given a sample $\left\{y_{i}: i=1,2, \ldots, n\right\},$ let $\tilde{\beta}_{0}$ be the solution to
$$\min _{b_{0}} \sum_{i=1}^{n}\left(y_{i}-b_{0}\right)^{2}$$
Show that $\tilde{\beta}_{0}=\bar{y}$, that is, the sample average minimizes the sum of squared residuals. (Hint: You may use one-variable calculus or you can show the result directly by adding and subtracting $\bar{y}$ inside the squared residual and then doing a little algebra.)
ii. Define residuals $\tilde{u}_{i}=y_{i}-\bar{y} .$ Argue that these residuals always sum to zero.
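
Both claims are easy to check numerically on a small sample before proving them. The data below are hypothetical, and the comparison against a handful of nearby candidate intercepts only illustrates the minimization result; it does not establish it.

```python
# Hypothetical sample (illustration only).
y = [3.0, 1.5, 4.0, 2.5, 6.0]


def ssr(b0):
    """Sum of squared residuals when the only parameter is the intercept b0."""
    return sum((yi - b0) ** 2 for yi in y)


y_bar = sum(y) / len(y)

# Compare the SSR at the sample mean with the SSR at nearby candidates.
candidates = [y_bar + d for d in (-1.0, -0.1, -0.01, 0.01, 0.1, 1.0)]
mean_is_best = all(ssr(y_bar) < ssr(b) for b in candidates)

# Part (ii): residuals around the sample mean sum to zero.
residual_sum = sum(yi - y_bar for yi in y)

print(f"y_bar = {y_bar}, SSR at y_bar = {ssr(y_bar):.4f}")
print(f"lower SSR than all nearby candidates: {mean_is_best}")
print(f"sum of residuals = {residual_sum:.2e}")
```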

Rashmi Sinha

Numerade Educator

Problem 13

Let $y$ be any response variable and $x$ a binary explanatory variable. Let $\left\{\left(x_{i}, y_{i}\right): i=1, \ldots, n\right\}$ be a sample of size $n$. Let $n_{0}$ be the number of observations with $x_{i}=0$ and $n_{1}$ the number of observations with $x_{i}=1 .$ Let $\bar{y}_{0}$ be the average of the $y_{i}$ with $x_{i}=0$ and $\bar{y}_{1}$ the average of the $y_{i}$ with $x_{i}=1$.
i. Explain why we can write
$$n_{0}=\sum_{i=1}^{n}\left(1-x_{i}\right), n_{1}=\sum_{i=1}^{n} x_{i}.$$

Show that $\bar{x}=n_{1} / n$ and $(1-\bar{x})=n_{0} / n$. How do you interpret $\bar{x}$?
ii. Argue that
$$\bar{y}_{0}=n_{0}^{-1} \sum_{i=1}^{n}\left(1-x_{i}\right) y_{i}, \bar{y}_{1}=n_{1}^{-1} \sum_{i=1}^{n} x_{i} y_{i}.$$
[Hint: Write $\left.y_{i}=\left(1-x_{i}\right) y_{i}+x_{i} y_{i} .\right]$
iv. Show that when $x_{i}$ is binary,
$$n^{-1} \sum_{i=1}^{n} x_{i}^{2}-(\bar{x})^{2}=\bar{x}(1-\bar{x}).$$
[Hint: When $\left.x_{i} \text { is binary, } x_{i}^{2}=x_{i} .\right]$
v. Show that
$$n^{-1} \sum_{i=1}^{n} x_{i} y_{i}-\bar{x} \bar{y}=\bar{x}(1-\bar{x})\left(\bar{y}_{1}-\bar{y}_{0}\right).$$
vi. Use parts (iv) and (v) to obtain (2.74).
vii. Derive equation (2.73).
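
The identities in this problem lead to a standard result: with a binary regressor, the OLS slope equals the difference in group means and the intercept equals the mean of the $x_{i}=0$ group. The sketch below checks that result, along with the identities in parts (iv) and (v), on a small hypothetical sample. It does not assume which of the numbered equations in the text corresponds to which result.

```python
# Hypothetical sample with a binary regressor (illustration only).
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
y = [2.0, 5.5, 4.0, 3.0, 6.5, 1.0, 2.5, 5.0, 4.5, 3.5]

n = len(x)
n1 = sum(x)
n0 = sum(1 - xi for xi in x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Group means written as in part (ii).
y0_bar = sum((1 - xi) * yi for xi, yi in zip(x, y)) / n0
y1_bar = sum(xi * yi for xi, yi in zip(x, y)) / n1

# Left-hand sides of the identities in parts (iv) and (v).
var_term = sum(xi ** 2 for xi in x) / n - x_bar ** 2
cov_term = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar

# OLS slope and intercept from the usual formulas.
slope = cov_term / var_term
intercept = y_bar - slope * x_bar

print(f"x_bar = {x_bar}, n1/n = {n1 / n}")
print(f"slope = {slope:.4f}, y1_bar - y0_bar = {y1_bar - y0_bar:.4f}")
print(f"intercept = {intercept:.4f}, y0_bar = {y0_bar:.4f}")
```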

Oluwadamilola Ameobi

Numerade Educator

Problem 14

In the context of Problem 2.13, suppose $y_{i}$ is also binary. For concreteness, $y_{i}$ indicates whether worker $i$ is employed after a job training program, where $y_{i}=1$ means has a job and $y_{i}=0$ means does not have a job. Here, $x_{i}$ indicates participation in the job training program. Argue that $\widehat{\beta}_{1}$ is the difference in employment rates between those who participated in the program and those who did not.

Victor Salazar

Numerade Educator

Problem 15

Consider the potential outcomes framework from Section 2-7a, where $y_{i}(0)$ and $y_{i}(1)$ are the potential outcomes in each treatment state.
i. Show that if we could observe $y_{i}(0)$ and $y_{i}(1)$ for all $i$ then an unbiased estimator of $\tau_{\text {ate }}$ would be
$$n^{-1} \sum_{i=1}^{n}\left[y_{i}(1)-y_{i}(0)\right]=\bar{y}(1)-\bar{y}(0).$$
This is sometimes called the sample average treatment effect.
ii. Explain why the observed sample averages, $\bar{y}_{0}$ and $\bar{y}_{1}$, are not the same as $\bar{y}(0)$ and $\bar{y}(1)$ respectively, by writing $\bar{y}_{0}$ and $\bar{y}_{1}$ in terms of $y_{i}(0)$ and $y_{i}(1),$ respectively.

Oluwadamilola Ameobi

Numerade Educator

Problem 16

In the potential outcomes framework, suppose that program eligibility is randomly assigned but participation cannot be enforced. To formally describe this situation, for each person $i, z_{i}$ is the eligibility indicator and $x_{i}$ is the participation indicator. Randomized eligibility means $z_{i}$ is independent of $\left[y_{i}(0), y_{i}(1)\right]$ but $x_{i}$ might not satisfy the independence assumption.
i. Explain why the difference in means estimator is generally no longer unbiased.
ii. In the context of a job training program, what kind of individual behavior would cause bias?

Tyler Tebbs

Numerade Educator

Problem 17

In the potential outcomes framework with heterogeneous (nonconstant) treatment effect, write the error as
$$u_{i}=\left(1-x_{i}\right) u_{i}(0)+x_{i} u_{i}(1).$$
Let $\sigma_{0}^{2}=\operatorname{Var}\left[u_{i}(0)\right]$ and $\sigma_{1}^{2}=\operatorname{Var}\left[u_{i}(1)\right]$. Assume random assignment.
i. Find $\operatorname{Var}\left(u_{i} \mid x_{i}\right)$.
ii. When is $\operatorname{Var}\left(u_{i} \mid x_{i}\right)$ constant?

Rashmi Sinha

Numerade Educator

Problem 18

Let $x$ be a binary explanatory variable and suppose $P(x=1)=\rho$ for $0<\rho<1$.
i. If you draw a random sample of size $n$, find the probability (call it $\gamma_{n}$) that Assumption SLR.3 fails. [Hint: Find the probability of observing all zeros or all ones for the $x_{i}$.] Argue that $\gamma_{n} \rightarrow 0$ as $n \rightarrow \infty$.
ii. If $\rho=0.5$, compute the probability in part (i) for $n=10$ and $n=100$. Discuss.
iii. Do the calculations from part (ii) with $\rho=0.9 .$ How do your answers compare with part (ii)?
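
Following the hint in part (i), and assuming independent draws, the probability of observing all ones is $\rho^{n}$ and the probability of observing all zeros is $(1-\rho)^{n}$, so $\gamma_{n}=\rho^{n}+(1-\rho)^{n}$. The sketch below evaluates that expression for the cases in parts (ii) and (iii); the function name `gamma_n` is chosen here for illustration.

```python
def gamma_n(rho, n):
    """Probability that n independent binary draws are all ones or all zeros."""
    return rho ** n + (1 - rho) ** n


for rho in (0.5, 0.9):
    for n in (10, 100):
        print(f"rho = {rho}, n = {n}: gamma_n = {gamma_n(rho, n):.6g}")
```

The comparison between the two values of $\rho$ at $n=10$ is the part worth discussing: when one outcome is much more common, a sample with no variation in $x$ is far more likely at small $n$.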

Manik Pulyani

Numerade Educator