Question Four
a. Drink Water, a water drilling company engaged by USAID Zambia to sink boreholes in Nampundwe District, has indicated that an exploratory water well drilled in a particular area strikes water with probability 0.46.
i. Identify the operational probability distribution. [1 Mark]
ii. Find the probability that the third non-dry borehole sunk is the fifth well drilled. [3 Marks]
iii. Calculate the mean and variance of the distribution. [3 Marks]
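The calculations in parts ii and iii can be sketched numerically. This is a minimal sketch, not a model answer: it assumes the wells are independent with a constant strike probability of 0.46, and that the random variable counts the total number of wells drilled up to and including the third successful one (the "number of trials" form of the negative binomial distribution). The function name neg_binomial_pmf is a hypothetical helper introduced for illustration only.

```python
from math import comb


def neg_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(the r-th success occurs exactly on trial n), assuming independent trials."""
    if n < r:
        return 0.0
    # The first n-1 trials must contain exactly r-1 successes; trial n is a success.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


p = 0.46  # probability that a well strikes water (from the question)
r = 3     # target number of non-dry boreholes

# Part ii: third success on the fifth well drilled.
prob_third_on_fifth = neg_binomial_pmf(5, r, p)

# Part iii: mean and variance of the number of wells drilled
# (trials parameterization, an assumption of this sketch).
mean_wells = r / p
var_wells = r * (1 - p) / p**2

print(f"P(third strike on fifth well) = {prob_third_on_fifth:.4f}")
print(f"mean = {mean_wells:.4f}, variance = {var_wells:.4f}")
```

If the distribution is instead defined over the number of dry wells before the third success, the variance is unchanged but the mean becomes r(1 - p)/p.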
b. The data come from Zambia's 2018 health survey, conducted by the Zambia Statistical Agency. The sample consisted of 14,048 individuals, representative of the Zambian population as a whole. The variables are defined as follows:
- sleep: minutes spent sleeping during the previous 24 hours
- gender: a dummy variable equal to 1 if the person is male
- educ: the number of completed years of education up to a maximum of 12
- higheduc: a dummy variable equal to 1 if the person has post-secondary education
- age: age in years
- age2: the square of age
You are a new statistician trying to estimate the determinants of sleep times. Answer the following questions using the regression output below.
reg sleep gender educ higheduc age age2
      Source |       SS       df       MS      Number of obs =    14048
-------------+------------------------------   F(5, 14042)   =   221.40
       Model |  18043620.7     5  3608724.14   Prob > F      =   0.0000
    Residual |              14042  16299.7525   R-squared     =   0.0727
-------------+------------------------------   Adj R-squared =   0.0727
       Total |   246924745 14047  17579.4684   Root MSE      =   127.67
------------------------------------------------------------------------------
       sleep |      Coef.  Std. Err.      t     P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |  -15.51972   2.161812    -7.18   0.000     -19.7255   -11.31394
        educ |  -5.156947   .3215722   -16.03   0.000    -5.787449   -4.526445
    higheduc |  -29.19108   3.672536    -7.95   0.000     -36.4076   -21.97456
         age |  -2.854019   .2580255   -11.07   0.000     -3.36125   -2.346788
        age2 |   .0432833   .0031449    13.76   0.000     .0371077    .0494589
       _cons |    657.187   5.007259   131.25   0.000     647.3837    666.9903
------------------------------------------------------------------------------
i. Find the missing degrees of freedom (df) of the regression equation above. [1 Mark]
ii. Calculate the Residual Sum of Squares (RSS) for the regression equation above. [2 Marks]
iii. Calculate and interpret the Coefficient of Determination (R-squared). [Hint: R-squared] [2 Marks]
iv. Using Student's t-distribution, carry out the hypothesis test on the variable educ by way of the classical hypothesis testing procedure. [3 Marks]
v. Using the Confidence Interval approach, carry out the hypothesis test on the variable age by way of the classical hypothesis testing procedure. [3 Marks]
vi. Using Snedecor's F-distribution, carry out the hypothesis test on whether the estimated model is well-specified or not. [2 Marks] [Hint: H0: Model is well-specified, H1: Model is inappropriately specified]
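The arithmetic behind parts i to vi can be sketched from the figures printed in the regression output. This is an illustrative sketch under stated assumptions, not a marking scheme: it uses a 5% significance level (the question does not state one), replaces the Student's t critical value with the standard normal quantile because the residual degrees of freedom are very large, and hard-codes an approximate table value for the F critical value. All variable names are hypothetical.

```python
from statistics import NormalDist

# Values read off the regression output above.
df_model, df_total = 5, 14047
ss_model, ss_total = 18043620.7, 246924745.0
ms_model, ms_resid = 3608724.14, 16299.7525

# (i) Degrees of freedom: total df = model df + residual df.
df_resid = df_total - df_model

# (ii) Residual sum of squares: total SS = model SS + residual SS.
rss = ss_total - ss_model

# (iii) Coefficient of determination: share of total variation explained.
r_squared = ss_model / ss_total

# Two-sided 5% critical value; the normal quantile is used as a
# large-sample stand-in for the t critical value (assumption of this sketch).
crit = NormalDist().inv_cdf(0.975)

# (iv) t test of H0: coefficient on educ = 0.
coef_educ, se_educ = -5.156947, 0.3215722
t_educ = coef_educ / se_educ
reject_educ = abs(t_educ) > crit

# (v) Confidence-interval test of H0: coefficient on age = 0.
coef_age, se_age = -2.854019, 0.2580255
ci_age = (coef_age - crit * se_age, coef_age + crit * se_age)
reject_age = not (ci_age[0] <= 0.0 <= ci_age[1])

# (vi) Overall F statistic: model mean square over residual mean square.
f_stat = ms_model / ms_resid
F_CRIT_5PCT = 2.21  # assumed approximate table value for F(5, large df) at 5%
reject_f = f_stat > F_CRIT_5PCT

print(f"residual df = {df_resid}, RSS = {rss:.1f}, R-squared = {r_squared:.4f}")
print(f"t(educ) = {t_educ:.2f}, reject H0: {reject_educ}")
print(f"95% CI for age = ({ci_age[0]:.4f}, {ci_age[1]:.4f}), reject H0: {reject_age}")
print(f"F = {f_stat:.2f}, exceeds critical value: {reject_f}")
```

As a cross-check, the residual sum of squares can also be recovered as the residual df multiplied by the residual mean square; the two routes should agree up to rounding.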