Problem 3 [26 points]: Consider multiple linear regression with 3 explanatory variables (EVs) X1, X2 and X3. Two hypothesis tests were performed on models with selected EVs, and the results are summarized in the two ANOVA tables below:
H0: E(Y | X = x) = β0
H0: E(Y | X = x) = β0 + β1x1 + β2x2
H1: E(Y | X = x) = β0 + β1x1 + β2x2 + β3x3
ANOVA Table
Source        SS        MS        F        P-value
Regression    210.590   105.295   92.229   <1e-16
Residuals     52.517    1.142
Total         263.107
ANOVA Table
Source        SS        MS        F         P-value
Regression    141.740   141.740   125.385   1.332e-14
Residuals     50.870    1.130
Total         192.610
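The tables above do not print the degrees of freedom, but they can be recovered from the identities MS = SS / df and F = MS(Regression) / MS(Residuals). The Python sketch below is only an illustrative consistency check on the printed numbers; the dictionary layout and the function name `implied_df` are assumptions made for this example, not part of the problem.

```python
# SS, MS and F values copied from the two ANOVA tables in the problem.
tables = {
    "first": {"ss_reg": 210.590, "ms_reg": 105.295, "f": 92.229, "ss_res": 52.517},
    "second": {"ss_reg": 141.740, "ms_reg": 141.740, "f": 125.385, "ss_res": 50.870},
}


def implied_df(table):
    """Recover (regression df, residual df) from MS = SS / df and F = MS_reg / MS_res."""
    df_reg = round(table["ss_reg"] / table["ms_reg"])
    ms_res = table["ms_reg"] / table["f"]  # residual mean square implied by F
    df_res = round(table["ss_res"] / ms_res)
    return df_reg, df_res


for name, table in tables.items():
    df_reg, df_res = implied_df(table)
    # Recompute F from the recovered df as a sanity check against the table.
    f_check = table["ms_reg"] / (table["ss_res"] / df_res)
    print(name, "regression df:", df_reg, "residual df:", df_res, "F:", round(f_check, 3))
```

Rounding to the nearest integer is a design choice here: the printed values carry only three or four decimals, so the raw ratios are close to, but not exactly, whole numbers.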
It is known that the sample correlations between y and each of the X variables are 78.32%, 29.84% and 17.15%, respectively. That is, Corr(y,x1) = 0.7832, Corr(y,x2) = 0.2984 and Corr(y,x3) = 0.1715.
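For a regression on a single EV, R² equals the squared sample correlation between y and that EV, so RSS = TSS × (1 − r²). The sketch below illustrates this arithmetic in Python. It assumes that the Total of the first ANOVA table (263.107) is the total sum of squares about the mean, which holds if that table's null model is the constant-only model; the function name `rss_simple_regression` is hypothetical.

```python
# Assumption: Total SS of the first ANOVA table is the sum of squares about the mean.
tss = 263.107

# Sample correlations between y and each EV, as stated in the problem.
correlations = {"x1": 0.7832, "x2": 0.2984, "x3": 0.1715}


def rss_simple_regression(tss, r):
    """RSS of a one-EV regression, using R^2 = r^2 for simple linear regression."""
    return tss * (1.0 - r ** 2)


for name, r in correlations.items():
    print(name, round(rss_simple_regression(tss, r), 3))
```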
(a) [12 points] Replicate the table below and fill in ALL the missing values (to at least 4 significant figures). (The df and RSS of Model 5: E(Y | X = x) = β0 + β1x1 + β2x2 have already been included in the table.)
Model   Explanatory Variable(s)        df    RSS
1       Null (No EV, constant only)
2       x1
3       x2
4       x3
5       x1,x2                          46    58.493
6       x1,x3
7       x2,x3
8       x1,x2,x3
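The residual degrees of freedom of each candidate model follow from df = n − (number of EVs + 1). The Python sketch below enumerates every subset of the three EVs and prints its residual df. The sample size n = 49 is an assumption, inferred from the two-EV Model 5 having 46 residual degrees of freedom; the helper name `residual_df` is illustrative.

```python
from itertools import combinations

n = 49  # assumed sample size: a two-EV model with 46 residual df implies n = 46 + 3
evs = ["x1", "x2", "x3"]


def residual_df(n, k):
    """Residual df of a linear model with an intercept and k explanatory variables."""
    return n - (k + 1)


# Enumerate all subsets of the EVs, from the constant-only model to the full model.
rows = []
for k in range(len(evs) + 1):
    for subset in combinations(evs, k):
        rows.append((subset, residual_df(n, k)))

for subset, df in rows:
    label = ",".join(subset) if subset else "Null (constant only)"
    print(label, df)
```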
(b) [14 points] Test the hypothesis on the following mean functions:
H0: E(Y | X = x) = β0 + β1x1
H1: E(Y | X = x) = β0 + β1x1 + β2x2 + β3x3
You should set up the appropriate ANOVA table and follow the 4 steps of hypothesis testing as in Ch3, page 49. (The p-value can be obtained from an R command like "> 1-pf(F0, df1, df2)", which gives the right-hand tail probability of the F distribution with df1 and df2 degrees of freedom.)
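A test of nested mean functions compares the residual sums of squares of the two models: F0 = [(RSS_H0 − RSS_H1) / (df_H0 − df_H1)] / (RSS_H1 / df_H1). The Python sketch below shows this calculation and checks it against the first ANOVA table of the problem rather than against part (b). The function names are hypothetical, and the residual df values 48 and 46 are assumptions inferred from that table. The tail probability uses a closed form that is valid only when the numerator df equals 2; for other degrees of freedom, the R command `1-pf(F0, df1, df2)` remains the general route.

```python
def partial_f(rss_null, df_null, rss_alt, df_alt):
    """F statistic for a null model nested in an alternative model.

    Returns (F0, numerator df, denominator df).
    """
    df1 = df_null - df_alt
    f0 = ((rss_null - rss_alt) / df1) / (rss_alt / df_alt)
    return f0, df1, df_alt


def f_upper_tail_df1_2(f0, df2):
    """P(F > f0) for an F(2, df2) variable; this closed form holds only for df1 = 2."""
    return (1.0 + 2.0 * f0 / df2) ** (-df2 / 2.0)


# Check against the first ANOVA table: Total SS = 263.107, Residual SS = 52.517.
# Assumed residual df: 48 for the null model and 46 for the alternative model.
f0, df1, df2 = partial_f(263.107, 48, 52.517, 46)
p_value = f_upper_tail_df1_2(f0, df2)
print("F0 =", round(f0, 3), "on", df1, "and", df2, "df; p-value =", p_value)
```

The resulting F0 and p-value can be compared with the F and P-value entries printed in that table before the same steps are applied to other pairs of nested models.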