Problem 5
Consider the dataset in the file data_problem5.RDS (you can read this file in R using the readRDS function).
Assume that the data comes from the model that you considered in Problem 1a:
$Y_t = \epsilon_t + 0.8 \epsilon_{t-1}, \quad t = 1, \dots, 1000$,
where $\epsilon_t$, $t = 0, \dots, 1000$, are iid $N(0, 1)$ variables. The file contains a data frame with two columns, index
and y: y holds the value of the process at time $t$, and index holds the corresponding $t$.
You will now predict the value of the process at time 1001 given all the data:
a. Compute the covariance function of $Y_t$.
b. Use the formula for the conditional expectation of a multivariate Gaussian to compute the prediction:
$\hat{Y}_{1001} = \rho^T \Sigma_{AA}^{-1} Y_A$,
where $Y_A = (Y_1, \dots, Y_{1000})^T$, $\rho = (\rho_1, \dots, \rho_{1000})^T$ with $\rho_i = r(1001 - i)$, $\Sigma_{AA}$ is the $1000 \times 1000$ matrix with $(i, j)$th element
$r(|i - j|)$, and $r(\cdot)$ is the covariance function of $Y_t$.
c. What is the value of $\hat{Y}_{1001}$? Plot the prediction together with the data.
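
As a rough illustration of parts a and b, here is a minimal sketch in Python (not R), run on simulated data rather than on data_problem5.RDS. It relies on the fact that, for this model, $r(0) = 1 + 0.8^2 = 1.64$, $r(\pm 1) = 0.8$, and $r(h) = 0$ for $|h| \geq 2$, so $\Sigma_{AA}$ is tridiagonal and only the last entry of $\rho$ is non-zero. The names `ma1_autocov`, `solve_symmetric_tridiagonal`, and `predict_next` are hypothetical helpers introduced here, and the tridiagonal (Thomas) solver is a choice made for this sketch in place of a general matrix inverse.

```python
import random

THETA = 0.8  # MA(1) coefficient from the model
N = 1000     # number of observations


def ma1_autocov(h, theta=THETA):
    """Covariance function r(h) of Y_t = eps_t + theta * eps_{t-1}, unit-variance noise."""
    h = abs(h)
    if h == 0:
        return 1.0 + theta ** 2
    if h == 1:
        return theta
    return 0.0


def solve_symmetric_tridiagonal(diag, off, rhs):
    """Solve T x = rhs for a tridiagonal T with constant diagonal and off-diagonal
    entries, using the Thomas algorithm (forward sweep, then back substitution)."""
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def predict_next(y, theta=THETA):
    """Return rho^T Sigma_AA^{-1} y, the prediction of the next value given y."""
    n = len(y)
    # rho_i = r(n + 1 - i) for i = 1, ..., n
    rho = [ma1_autocov(n + 1 - i, theta) for i in range(1, n + 1)]
    # w = Sigma_AA^{-1} y, where Sigma_AA has r(0) on the diagonal and r(1) beside it
    w = solve_symmetric_tridiagonal(ma1_autocov(0, theta), ma1_autocov(1, theta), y)
    return sum(r_i * w_i for r_i, w_i in zip(rho, w))


# Simulated stand-in for the data file (assumption: the real data follow the same model)
random.seed(1)
eps = [random.gauss(0.0, 1.0) for _ in range(N + 1)]        # eps_0, ..., eps_N
y = [eps[t] + THETA * eps[t - 1] for t in range(1, N + 1)]  # Y_1, ..., Y_N
y_hat = predict_next(y)
print(y_hat)
```

With the real data, `y` would be replaced by the y column of the data frame, ordered by index. On simulated data, a useful sanity check is that the prediction should be very close to $0.8 \epsilon_{1000}$, since the model is invertible. The plot requested in part c is not covered by this sketch.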