1. What is the assumption of normality of the error term, and why is it important
in linear regression analysis (OLS)? (Hint: what would happen to hypothesis
testing if the error term were not normally distributed?)
2. If a dependent variable takes only the values 0 and 1 (e.g., whether someone
is employed or not), can we assume that the error term in a linear regression
is normally distributed? What if the dependent variable takes
only positive or nonnegative values? Explain.
3. Would transforming a nonnegative variable such as "wages" into logs, that is,
using log(wage) instead, lead to normality in the error term?
4. Suppose that an estimated coefficient $\hat{\beta}_j$ is normally distributed with mean $\beta_j$
and variance Var($\hat{\beta}_j$). How can we calculate a standardized estimator so that it
follows a normal distribution with a mean of 0 and a variance of 1?
5. Suppose that we estimate the following linear regression with a sample of
n = 250 observations:
$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + u.$$
In the expression
$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)} \sim t_{n-k-1},$$
• What is the meaning of this expression?
• What are n and k?
• Calculate the degrees of freedom for this Student's t distribution.
• What would happen if n - k - 1 were small, say, less than 30?
• Suppose that the estimate $\hat{\beta}_1$ is 1.34 and its standard error is 0.28.
For a significance test, what is the t-statistic to be calculated
for $\beta_1$? Based on the calculated t-statistic, is $\hat{\beta}_1$ statistically significant or
not? Explain.
6. Explain the meaning of the significance test in general, specifying the
null and alternative hypotheses.
7. In hypothesis testing, what does $\alpha$ represent? Interpret the meaning of
$\alpha = 0.05$.
8. What is the difference between a one-tailed and a two-tailed hypothesis test? How
do the alternative hypotheses differ in each case?
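
The arithmetic behind questions 5 to 8 can be checked with a short Python sketch. It is only an illustration under stated assumptions: the null hypothesis is taken to be that the coefficient on x1 equals zero, k counts the three slope coefficients, and the critical values come from the standard normal distribution, which is an approximation that is reasonable only because the degrees of freedom are large here. The exact Student's t critical values would be slightly larger. All variable names are hypothetical.

```python
from statistics import NormalDist

# Values taken from question 5.
n = 250           # sample size
k = 3             # number of slope coefficients (x1, x2, x3)
beta_hat = 1.34   # estimated coefficient on x1
se = 0.28         # its standard error
beta_null = 0.0   # assumed null hypothesis: the true coefficient is zero

# Degrees of freedom of the t distribution: n - k - 1.
df = n - k - 1

# t-statistic: distance of the estimate from the null value, in standard errors.
t_stat = (beta_hat - beta_null) / se

# Significance level (question 7): probability of rejecting a true null.
alpha = 0.05

# Assumption: with df this large, standard normal quantiles are used as an
# approximation to the t critical values.
crit_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)  # alternative: coefficient != 0
crit_one_tailed = NormalDist().inv_cdf(1 - alpha)      # alternative: coefficient > 0

reject_two_tailed = abs(t_stat) > crit_two_tailed
reject_one_tailed = t_stat > crit_one_tailed

print(f"degrees of freedom: {df}")
print(f"t-statistic: {t_stat:.3f}")
print(f"two-tailed critical value (approx.): {crit_two_tailed:.3f}")
print(f"one-tailed critical value (approx.): {crit_one_tailed:.3f}")
print(f"reject H0, two-tailed: {reject_two_tailed}")
print(f"reject H0, one-tailed: {reject_one_tailed}")
```

When n - k - 1 is small, the normal approximation in this sketch is no longer adequate and the critical values should be taken from a t table or a statistics library instead.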