
Introductory Econometrics

Jeffrey M. Wooldridge

Chapter 14

Advanced Panel Data Methods - all with Video Answers


Chapter Questions


Problem 1

Use the data in RENTAL for this exercise. The data on rental prices and other variables for college
towns are for the years 1980 and 1990. The idea is to see whether a stronger presence of students
affects rental rates. The unobserved effects model is
$\begin{aligned} \log(rent_{it}) =\; & \beta_{0}+\delta_{0}\, y90_{t}+\beta_{1} \log(pop_{it})+\beta_{2} \log(avginc_{it}) \\ &+\beta_{3}\, pctstu_{it}+a_{i}+u_{it} \end{aligned}$
where pop is city population, avginc is average income, and pctstu is student population as a percentage
of city population (during the school year).
(i) Estimate the equation by pooled OLS and report the results in standard form. What do you
make of the estimate on the 1990 dummy variable? What do you get for $\hat{\beta}_{pctstu}$?
(ii) Are the standard errors you report in part (i) valid? Explain.
(iii) Now, difference the equation and estimate by OLS. Compare your estimate of $\beta_{\text {pctstu}}$ with that
from part (i). Does the relative size of the student population appear to affect rental prices?
(iv) Estimate the model by fixed effects to verify that you get identical estimates and standard errors
to those in part (iii).
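
The equivalence claimed in part (iv) can be checked numerically before touching the real data. The following is a minimal sketch on synthetic data (not the RENTAL file), assuming a balanced two-period panel with one regressor; the sample size, coefficients, and variable names are hypothetical. It compares only the coefficient estimates, not the standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                                      # hypothetical number of cities
a = rng.normal(size=n)                      # unobserved city effect a_i
x = rng.normal(size=(n, 2)) + a[:, None]    # regressor correlated with a_i
y90 = np.array([0.0, 1.0])                  # second-period dummy
u = rng.normal(scale=0.1, size=(n, 2))      # idiosyncratic error u_it
y = 1.0 + 0.3 * y90 + 0.5 * x + a[:, None] + u

# First differencing: regress dy on a constant (the y90 effect) and dx.
dy = y[:, 1] - y[:, 0]
dx = x[:, 1] - x[:, 0]
X_fd = np.column_stack([np.ones(n), dx])
b_fd = np.linalg.lstsq(X_fd, dy, rcond=None)[0]

# Fixed effects: time-demean within each city, then run OLS
# on the demeaned period dummy and the demeaned regressor.
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
dd = np.tile(y90 - y90.mean(), n)
X_fe = np.column_stack([dd, xd])
b_fe = np.linalg.lstsq(X_fe, yd, rcond=None)[0]

print(b_fd, b_fe)   # the two coefficient vectors coincide when T = 2
```

With only two periods, each demeaned observation is plus or minus half of the corresponding first difference, which is why the two regressions return the same coefficients.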

Heather Duong
Numerade Educator

Problem 2

Use CRIMEA for this exercise.
(i) Reestimate the unobserved effects model for crime in Example 13.9 but use fixed effects rather
than differencing. Are there any notable sign or magnitude changes in the coefficients? What
about statistical significance?
(ii) Add the logs of each wage variable in the data set and estimate the model by fixed effects. How
does including these variables affect the coefficients on the criminal justice variables in part (i)?
(iii) Do the wage variables in part (ii) all have the expected sign? Explain. Are they jointly significant?


Problem 3

For this exercise, we use JTRAIN to determine the effect of the job training grant on hours of job training
per employee. The basic model for the three years is
$\begin{aligned} hrsemp_{it} =\; & \beta_{0}+\delta_{1}\, d88_{t}+\delta_{2}\, d89_{t}+\beta_{1}\, grant_{it}+\beta_{2}\, grant_{i,t-1} \\ &+\beta_{3} \log(employ_{it})+a_{i}+u_{it} \end{aligned}$
(i) Estimate the equation using fixed effects. How many firms are used in the FE estimation? How
many total observations would be used if each firm had data on all variables (in particular,
hrsemp) for all three years?
(ii) Interpret the coefficient on grant and comment on its significance.
(iii) Is it surprising that $grant_{-1}$ is insignificant? Explain.
(iv) Do larger firms provide their employees with more or less training, on average? How big are
the differences? (For example, if a firm has 10% more employees, what is the change in average
hours of training?)


Problem 4

In Example 13.8, we used the unemployment claims data from Papke (1994) to estimate the effect of enterprise zones on unemployment claims. Papke also uses a model that allows each city to have its
own time trend:
$\log(uclms_{it})=a_{i}+c_{i} t+\beta_{1}\, ez_{it}+u_{it}$
where $a_{i}$ and $c_{i}$ are both unobserved effects. This allows for more heterogeneity across cities.
(i) Show that, when the previous equation is first differenced, we obtain
$\Delta \log(uclms_{it})=c_{i}+\beta_{1} \Delta ez_{it}+\Delta u_{it}, \quad t=2, \ldots, T$
Notice that the differenced equation contains a fixed effect, $c_{i}$.
(ii) Estimate the differenced equation by fixed effects. What is the estimate of $\beta_{1}$? Is it very
different from the estimate obtained in Example 13.8? Is the effect of enterprise zones still
statistically significant?
(iii) Add a full set of year dummies to the estimation in part (ii). What happens to the estimate
of $\beta_{1}$?
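
The differencing step in part (i) can be written out term by term. This is a sketch that uses only the symbols of the model above: subtracting the equation for period $t-1$ from the equation for period $t$ gives
$\begin{aligned} \log(uclms_{it})-\log(uclms_{i,t-1}) &=(a_{i}-a_{i})+c_{i}[t-(t-1)]+\beta_{1}(ez_{it}-ez_{i,t-1})+(u_{it}-u_{i,t-1}) \\ &=c_{i}+\beta_{1} \Delta ez_{it}+\Delta u_{it} \end{aligned}$
The level effect $a_{i}$ cancels, and the trend term $c_{i} t$ leaves $c_{i}$ because $t-(t-1)=1$.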


Problem 5

(i) In the wage equation in Example 14.4, explain why dummy variables for occupation might be
important omitted variables for estimating the union wage premium.
(ii) If every man in the sample stayed in the same occupation from 1981 through 1987, would you
need to include the occupation dummies in a fixed effects estimation? Explain.
(iii) Using the data in WAGEPAN, include eight of the occupation dummy variables in the equation
and estimate the equation using fixed effects. Does the coefficient on union change by much?
What about its statistical significance?


Problem 6

Add the interaction term $union_{it} \cdot t$ to the equation estimated in Table 14.2 to see if wage growth depends on union status. Estimate the equation by random and fixed effects and compare the results.


Problem 7

Use the state-level data on murder rates and executions in MURDER for the following exercise.
(i) Consider the unobserved effects model
$mrdrte_{it}=\eta_{t}+\beta_{1}\, exec_{it}+\beta_{2}\, unem_{it}+a_{i}+u_{it}$
where $\eta_{t}$ simply denotes different year intercepts and $a_{i}$ is the unobserved state effect. If past executions of convicted murderers have a deterrent effect, what should be the sign of $\beta_{1}$? What sign do you think $\beta_{2}$ should have? Explain.
(ii) Using just the years 1990 and 1993, estimate the equation from part (i) by pooled OLS. Ignore
the serial correlation problem in the composite errors. Do you find any evidence for a deterrent
effect?
(iii) Now, using 1990 and 1993, estimate the equation by fixed effects. You may use first
differencing since you are only using two years of data. Is there evidence of a deterrent effect?
How strong?
(iv) Compute the heteroskedasticity-robust standard error for the estimation in part (ii).
(v) Find the state that has the largest number for the execution variable in 1993. The variable
exec is total executions in 1991, 1992, and 1993. How much bigger is this value than the next
highest value?
(vi) Estimate the equation using first differencing, dropping Texas from the analysis. Compute the
usual and heteroskedasticity-robust standard errors. Now, what do you find? What is going on?
(vii) Use all three years of data and estimate the model by fixed effects. Include Texas in the
analysis. Discuss the size and statistical significance of the deterrent effect compared with only
using 1990 and 1993.


Problem 8

Use the data in MATHPNL for this exercise. You will do a fixed effects version of the first differencing
done in Computer Exercise 11 in Chapter 13. The model of interest is
$\begin{aligned} math4_{it} =\; & \delta_{1}\, y94_{t}+\ldots+\delta_{5}\, y98_{t}+\gamma_{1} \log(rexpp_{it})+\gamma_{2} \log(rexpp_{i,t-1}) \\ &+\psi_{1} \log(enrol_{it})+\psi_{2}\, lunch_{it}+a_{i}+u_{it} \end{aligned}$
where the first available year (the base year) is 1993 because of the lagged spending variable.
(i) Estimate the model by pooled OLS and report the usual standard errors. You should include an
intercept along with the year dummies to allow $a_{i}$ to have a nonzero expected value. What are
the estimated effects of the spending variables? Obtain the OLS residuals, $\hat{v}_{it}$.
(ii) Is the sign of the $lunch_{it}$ coefficient what you expected? Interpret the magnitude of the coefficient. Would you say that the district poverty rate has a big effect on test pass rates?
(iii) Compute a test for AR(1) serial correlation using the regression of $\hat{v}_{it}$ on $\hat{v}_{i,t-1}$. You should use the years 1994 through 1998 in the regression. Verify that there is strong positive serial correlation and discuss why.
(iv) Now, estimate the equation by fixed effects. Is the lagged spending variable still significant?
(v) Why do you think, in the fixed effects estimation, the enrollment and lunch program variables
are jointly insignificant?
(vi) Define the total, or long-run, effect of spending as $\theta_{1}=\gamma_{1}+\gamma_{2}$. Use the substitution $\gamma_{1}=\theta_{1}-\gamma_{2}$ to obtain a standard error for $\theta_{1}$. [Hint: Standard fixed effects estimation using $\log(rexpp_{it})$ and $z_{it}=\log(rexpp_{i,t-1})-\log(rexpp_{it})$ as explanatory variables should do it.]
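
The substitution in part (vi) is a general regression device, so it can be illustrated outside the panel setting. The sketch below uses plain OLS on synthetic cross-sectional data rather than fixed effects on MATHPNL; the variables `x1` and `x2`, the coefficient values, and the helper `ols` are all hypothetical stand-ins. It shows that replacing `x2` with `z = x2 - x1` makes the coefficient on `x1` equal to the sum of the two original slopes, so its reported standard error is the standard error of that sum.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)                 # stands in for log(rexpp_it)
x2 = 0.6 * x1 + rng.normal(size=n)      # stands in for log(rexpp_{i,t-1})
y = 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)


def ols(X, y):
    """Return OLS coefficients and the usual (nonrobust) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, np.sqrt(np.diag(sigma2 * XtX_inv))


const = np.ones(n)

# Original parameterization: y on x1 and x2.
b_orig, _ = ols(np.column_stack([const, x1, x2]), y)

# Substituted parameterization: y on x1 and z = x2 - x1.
z = x2 - x1
b_sub, se_sub = ols(np.column_stack([const, x1, z]), y)

theta_hat, se_theta = b_sub[1], se_sub[1]
print(theta_hat, b_orig[1] + b_orig[2], se_theta)
```

The same reparameterization carries over to fixed effects estimation, because time-demeaning is applied to both regressors before the substitution argument is used.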


Problem 9

The file PENSION contains information on participant-directed pension plans for U.S. workers. Some
of the observations are for couples within the same family, so this data set constitutes a small cluster
sample (with cluster sizes of two).
(i) Ignoring the clustering by family, use OLS to estimate the model
$\begin{aligned} pctstck =\; & \beta_{0}+\beta_{1}\, choice+\beta_{2}\, prfshr+\beta_{3}\, female+\beta_{4}\, age \\ &+\beta_{5}\, educ+\beta_{6}\, finc25+\beta_{7}\, finc35+\beta_{8}\, finc50+\beta_{9}\, finc75 \\ &+\beta_{10}\, finc100+\beta_{11}\, finc101+\beta_{12}\, wealth89+\beta_{13}\, stckin89+\beta_{14}\, irain89+u \end{aligned}$
where the variables are defined in the data set. The variable of most interest is choice, which is a
dummy variable equal to one if the worker has a choice in how to allocate pension funds among
different investments. What is the estimated effect of choice? Is it statistically significant?
(ii) Are the income, wealth, stock holding, and IRA holding control variables important? Explain.
(iii) Determine how many different families there are in the data set.
(iv) Now, obtain the standard errors for OLS that are robust to cluster correlation within a family.
Do they differ much from the usual OLS standard errors? Are you surprised?
(v) Estimate the equation by differencing across only the spouses within a family. Why do the
explanatory variables asked about in part (ii) drop out in the first-differenced estimation?
(vi) Are any of the remaining explanatory variables in part (v) significant? Are you surprised?


Problem 10

Use the data in AIRFARE for this exercise. We are interested in estimating the model
$\begin{aligned} \log(fare_{it}) =\; & \eta_{t}+\beta_{1}\, concen_{it}+\beta_{2} \log(dist_{i})+\beta_{3}[\log(dist_{i})]^{2} \\ &+a_{i}+u_{it}, \quad t=1, \ldots, 4 \end{aligned}$
where $\eta_{t}$ means that we allow for different year intercepts.
(i) Estimate the above equation by pooled OLS, being sure to include year dummies. If
$\Delta concen = .10$, what is the estimated percentage increase in fare?
(ii) What is the usual OLS 95% confidence interval for $\beta_{1}$? Why is it probably not reliable? If you have access to a statistical package that computes fully robust standard errors, find the fully
robust 95% CI for $\beta_{1}$. Compare it to the usual CI and comment.
(iii) Describe what is happening with the quadratic in log(dist). In particular, for what value of dist does the relationship between log(fare) and dist become positive? [Hint: Figure out the turning point
value for log(dist), and then exponentiate.] Is the turning point outside the range of the data?
(iv) Now estimate the equation using random effects. How does the estimate of $\beta_{1}$ change?
(v) Now estimate the equation using fixed effects. What is the FE estimate of $\beta_{1}$? Why is it fairly similar to the RE estimate? (Hint: What is $\hat{\theta}$ for RE estimation?)
(vi) Name two characteristics of a route (other than distance between stops) that are captured by $a_{i}$. Might these be correlated with $concen_{it}$?
(vii) Are you convinced that higher concentration on a route increases airfares? What is your best estimate?
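
The hint in part (iii) amounts to a two-step calculation: set the derivative of the quadratic with respect to log(dist) to zero, then exponentiate. The sketch below is one way to code that step; the function name `turning_point_dist` and the coefficient values in the usage line are hypothetical placeholders, not estimates from AIRFARE.

```python
import math


def turning_point_dist(beta2: float, beta3: float) -> float:
    """Distance at which b2*log(dist) + b3*log(dist)^2 changes slope.

    The derivative with respect to log(dist) is b2 + 2*b3*log(dist),
    which is zero at log(dist) = -b2 / (2*b3).
    """
    if beta3 == 0:
        raise ValueError("no turning point when the quadratic term is zero")
    log_dist_star = -beta2 / (2.0 * beta3)
    return math.exp(log_dist_star)


# Hypothetical coefficients chosen only to illustrate the arithmetic.
print(turning_point_dist(beta2=-1.0, beta3=0.1))
```

Comparing the returned value with the smallest and largest dist in the sample answers the last question in part (iii).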


Problem 11

This question assumes that you have access to a statistical package that computes standard errors
robust to arbitrary serial correlation and heteroskedasticity for panel data methods.
(i) For the pooled OLS estimates in Table 14.1, obtain the standard errors that allow for arbitrary
serial correlation (in the composite errors, $v_{it}=a_{i}+u_{it}$) and heteroskedasticity. How do the
robust standard errors for educ, married, and union compare with the nonrobust ones?
(ii) Now obtain the robust standard errors for the fixed effects estimates that allow arbitrary serial
correlation and heteroskedasticity in the idiosyncratic errors, $u_{it}$. How do these compare with the
nonrobust FE standard errors?
(iii) For which method, pooled OLS or FE, is adjusting the standard errors for serial correlation
more important? Why?
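
For readers without such a package, the standard errors in question can be sketched directly. The code below is a minimal cluster-robust "sandwich" calculation for pooled OLS on synthetic data, assuming no finite-sample degrees-of-freedom adjustment (statistical packages usually apply one, so their numbers would differ slightly). The function name `cluster_robust_se` and all data are hypothetical.

```python
import numpy as np


def cluster_robust_se(X, y, groups):
    """OLS coefficients with standard errors robust to arbitrary
    correlation and heteroskedasticity within each group."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    u = y - X @ b
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ u[idx]        # sum of x_it * u_it within the group
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return b, np.sqrt(np.diag(V))


# Synthetic example: 100 groups with 2 observations each (hypothetical sizes).
rng = np.random.default_rng(3)
G, m = 100, 2
groups = np.repeat(np.arange(G), m)
c = np.repeat(rng.normal(size=G), m)     # common group component in the error
x = rng.normal(size=G * m) + np.repeat(rng.normal(size=G), m)
X = np.column_stack([np.ones(G * m), x])
y = 1.0 + 0.5 * x + c + rng.normal(size=G * m)

b, se_cluster = cluster_robust_se(X, y, groups)
print(b, se_cluster)
```

For panel data the group is the cross-sectional unit, so the same function covers serial correlation within a unit. Applying it to time-demeaned data gives the corresponding calculation for fixed effects, again up to a degrees-of-freedom adjustment.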


Problem 12

Use the data in ELEM94_95 to answer this question. The data are on elementary schools in
Michigan. In this exercise, we view the data as a cluster sample, where each school is part of a
district cluster.
(i) What are the smallest and largest number of schools in a district? What is the average number
of schools per district?
(ii) Using pooled OLS (that is, pooling across all 1,848 schools), estimate a model relating lavgsal
to bs, lenrol, lstaff, and lunch; see also Computer Exercise 11 from Chapter 9. What are the
coefficient and standard error on bs?
(iii) Obtain the standard errors that are robust to cluster correlation within district (and also
heteroskedasticity). What happens to the $t$ statistic for bs?
(iv) Still using pooled OLS, drop the four observations with $bs > .5$ and obtain $\hat{\beta}_{bs}$ and its cluster-robust standard error. Now is there much evidence for a salary-benefits tradeoff?
(v) Estimate the equation by fixed effects, allowing for a common district effect for schools within
a district. Again drop the observations with $bs > .5$. Now what do you conclude about the
salary-benefits tradeoff?
(vi) In light of your estimates from parts (iv) and (v), discuss the importance of allowing teacher
compensation to vary systematically across districts via a district fixed effect.


Problem 13

The data set DRIVING includes state-level panel data (for the 48 continental U.S. states) from 1980
through 2004, for a total of 25 years. Various driving laws are indicated in the data set, including the alcohol level at which drivers are considered legally intoxicated. There are also indicators for "per se" laws (where licenses can be revoked without a trial) and seat belt laws. Some economic and
demographic variables are also included.
(i) How is the variable totfatrte defined? What is the average of this variable in the years 1980, 1992, and 2004? Run a regression of totfatrte on dummy variables for the years 1981 through 2004, and describe what you find. Did driving become safer over this period? Explain.
(ii) Add the variables bac08, bac10, perse, sbprim, sbsecon, sl70plus, gdl, perc14_24, unem, and vehicmilespc to the regression from part (i). Interpret the coefficients on bac08 and bac10. Do per se laws have a negative effect on the fatality rate? What about having a primary seat belt
law? (Note that if a law was enacted sometime within a year the fraction of the year is recorded
in place of the zero-one indicator.)
(iii) Reestimate the model from part (ii) using fixed effects (at the state level). How do the
coefficients on bac08, bac10, perse, and sbprim compare with the pooled OLS estimates?
Which set of estimates do you think is more reliable?
(iv) Suppose that vehicmilespc, the number of miles driven per capita, increases by 1,000. Using
the FE estimates, what is the estimated effect on totfatrte? Be sure to interpret the estimate as if
explaining to a layperson.
(v) If there is serial correlation or heteroskedasticity in the idiosyncratic errors of the model then the standard errors in part (iii) are invalid. If possible, use "cluster" robust standard errors for the fixed effects estimates. What happens to the statistical significance of the policy variables in
part (iii)?


Problem 14

Use the data set in AIRFARE to answer this question. The estimates can be compared with those in
Computer Exercise 10 in this chapter.
(i) Compute the time averages of the variable concen; call these concenbar. How many different
time averages can there be? Report the smallest and the largest.
(ii) Estimate the equation
$lfare_{it}=\beta_{0}+\delta_{1}\, y98_{t}+\delta_{2}\, y99_{t}+\delta_{3}\, y00_{t}+\beta_{1}\, concen_{it}+\beta_{2}\, ldist_{i}+\beta_{3}\, ldistsq_{i}+\gamma_{1}\, concenbar_{i}+a_{i}+u_{it}$
by random effects. Verify that $\hat{\beta}_{1}$ is identical to the FE estimate computed in C10.
(iii) If you drop ldist and ldistsq from the estimation in part (i) but still include concenbar, what
happens to the estimate of $\hat{\beta}_{1}$? What happens to the estimate of $\gamma_{1}$?
(iv) Using the equation in part (ii) and the usual RE standard error, test $H_{0}: \gamma_{1}=0$ against the
two-sided alternative. Report the $p$-value. What do you conclude about RE versus FE for estimating
$\beta_{1}$ in this application?
(v) If possible, for the test in part (iv) obtain a $t$ statistic (and, therefore, $p$-value) that is robust to arbitrary
serial correlation and heteroskedasticity. Does this change the conclusion reached in part (iv)?
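
The equality asserted in part (ii) can be previewed on synthetic data. The sketch below assumes a balanced panel and a fixed, hypothetical quasi-demeaning parameter `theta` instead of an estimated one; a real RE routine estimates that parameter from variance components, which this code does not do. The variable `w` plays the role of a time-constant regressor such as ldist, and all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 200, 4                              # hypothetical routes and years
a = rng.normal(size=(n, 1))                # unobserved route effect a_i
x = 0.8 * a + rng.normal(size=(n, T))      # time-varying regressor, correlated with a_i
w = rng.normal(size=(n, 1))                # time-constant regressor
y = 0.4 * x + 0.3 * w + a + rng.normal(size=(n, T))

xbar = x.mean(axis=1, keepdims=True)       # time averages of x
ybar = y.mean(axis=1, keepdims=True)

# Fixed effects (within) estimate of the slope on x.
xd = (x - xbar).ravel()
yd = (y - ybar).ravel()
b_fe = (xd @ yd) / (xd @ xd)


def cre_beta(theta):
    """Slope on x from OLS on quasi-demeaned data with xbar included.

    theta = 0 is pooled OLS; 0 < theta < 1 mimics the RE transformation.
    """
    ones = np.ones((n, T))
    X = np.column_stack([
        ((1 - theta) * ones).ravel(),          # intercept
        (x - theta * xbar).ravel(),            # x_it
        ((1 - theta) * w * ones).ravel(),      # time-constant regressor
        ((1 - theta) * xbar * ones).ravel(),   # time average of x
    ])
    yq = (y - theta * ybar).ravel()
    return np.linalg.lstsq(X, yq, rcond=None)[0][1]


print(b_fe, cre_beta(0.0), cre_beta(0.5))
```

Once the time average is in the equation, the slope on the time-varying regressor no longer depends on `theta`, which is why RE and FE agree on $\hat{\beta}_{1}$ in this setup.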


Problem 15

Use the data in COUNTYMURDERS to answer this question. The data set covers murders and executions
(capital punishment) for 2,197 counties in the United States. See also Computer Exercise C16 in Chapter 13.
(i) Consider the model
$\begin{aligned} murdrate_{it} =\; & \theta_{t}+\delta_{0}\, execs_{it}+\delta_{1}\, execs_{i,t-1}+\delta_{2}\, execs_{i,t-2}+\delta_{3}\, execs_{i,t-3} \\ &+\beta_{5}\, percblack_{it}+\beta_{6}\, percmale_{it}+\beta_{7}\, perc1019_{it}+\beta_{8}\, perc2029_{it}+a_{i}+u_{it} \end{aligned}$
where $\theta_{t}$ represents a different intercept for each time period, $a_{i}$ is the county fixed effect, and $u_{i t}$ is the idiosyncratic error. Why does it make sense to include lags of the key variable, execs, in the equation?
(ii) Apply OLS to the equation from part (i) and report the estimates of $\delta_{0}$, $\delta_{1}$, $\delta_{2}$, and $\delta_{3}$, along with the usual pooled OLS standard errors. Do you estimate that executions have a deterrent effect on murders? Provide an explanation that involves $a_{i}$.
(iii) Now estimate the equation in part (i) using fixed effects to remove $a_{i}$. What are the new
estimates of the $\delta_{j}$? Are they very different from the estimates from part (ii)?
(iv) Obtain the long-run propensity from estimates in part (iii). Using the usual FE standard errors,
is the LRP statistically different from zero?
(v) If possible, obtain the standard errors for the FE estimates that are robust to arbitrary
heteroskedasticity and serial correlation in the $\{u_{it}\}$. What happens to the statistical significance of the $\hat{\delta}_{j}$? What about the estimated LRP?

