A commercial real estate company wanted to evaluate the effects of age ($X_1$), operating expenses and
taxes ($X_2$), vacancy rates ($X_3$), and total square footage ($X_4$) on rental rates ($Y$) for commercial
properties in a large metropolitan area. A main-effects multiple linear regression model was considered
for fitting the data. An automatic procedure was implemented for variable selection. The R printout of
one step of the procedure is given below.

Step:  AIC=23.97
Y ~ X4 + X1 + X2

       Df Sum of Sq     RSS    AIC
<none>               98.650 23.968
+ X3    1     0.420  98.231 25.622
- X2    1    27.857 126.508 42.114
- X4    1    50.287 148.937 55.335
- X1    1    60.841 159.491 60.881

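As a reading aid for the AIC column, here is a minimal sketch of how each entry relates to its RSS. For a linear model, R's `extractAIC()` (which `step()` calls) reports $n \ln(\mathrm{RSS}/n) + 2p$, where $p$ counts the estimated coefficients including the intercept. The sample size is not stated in the problem; the value `N_ASSUMED = 81` below is an assumption, inferred only because it reproduces the printed rows. The function name `step_aic` is hypothetical.

```python
import math


def step_aic(rss: float, n: int, n_params: int) -> float:
    """AIC on the scale used by R's extractAIC() for lm fits: n*log(RSS/n) + 2*p."""
    return n * math.log(rss / n) + 2 * n_params


# ASSUMPTION: the sample size is not given in the problem statement.
# n = 81 is inferred from the printout, not taken from the source.
N_ASSUMED = 81

# <none> row: intercept + X4 + X1 + X2 gives 4 estimated coefficients.
aic_current = step_aic(98.650, N_ASSUMED, 4)

# "- X2" row: dropping X2 leaves intercept + X4 + X1, i.e. 3 coefficients.
aic_drop_x2 = step_aic(126.508, N_ASSUMED, 3)

print(f"<none>: {aic_current:.3f}")
print(f"- X2:   {aic_drop_x2:.3f}")
```

Under that assumed sample size, both values agree with the printed AIC column to within the rounding of the RSS. Each row of the printout is the AIC of the model that would result from the listed addition or deletion, not the change in AIC.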
(A) (2 points) What selection procedure was used?
(B) (2 points) Which variable, if any, would you expect to be included next? Explain.
(C) (2 points) What is the AIC value for the model with predictor variables $X_1$ and $X_2$?
(D) (2 points) What is the AIC value for the model with all four predictor variables included?
(E) (2 points) Report the final subset of predictor variables included in the model.