Problem 3:
Now we will use ridge regression and the lasso to fit our models.
1. The ridge regression cost function we will use is of the form
$L(X, y, a, s) = \frac{1}{n}||Xa - y||_2^2 + s||a||_2^2$.
We can solve the normal equations directly to obtain our model parameters using
$a = (X^TX + (s/n)I)^{-1}X^Ty$.
Write a Python function `ridge(X,y,s)` which returns the model parameters as a NumPy
array.
[13]: import numpy as np

      def ridge(X, y, s):
          # Solve (X^T X + (s/n) I) a = X^T y for the ridge parameters a.
          n, m = X.shape
          I = np.eye(m)
          alpha = np.linalg.solve(X.T @ X + (s / n) * I, X.T @ y)
          return alpha
2. Apply the ridge regression model to the data set from problem 2. Try a few different values
of s to find one which optimizes the model's performance. You will need to cross-validate
and compare the mean squared errors on the testing set. Start with s = 0, 1, 2, 3
and see if you can find an s that improves the model's performance over the least squares
models from the previous parts.
[ ]:
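One possible way to organize the comparison is k-fold cross-validation, sketched below. This is only an illustration under stated assumptions: the problem 2 data set is not available here, so the code generates a synthetic stand-in, and the helper name `cv_mse`, the fold count `k=5`, and the random seeds are hypothetical choices rather than part of the assignment. The `ridge` function repeats the closed form from part 1 so the block runs on its own.

```python
import numpy as np

def ridge(X, y, s):
    # Closed form from part 1: solve (X^T X + (s/n) I) a = X^T y.
    n, m = X.shape
    return np.linalg.solve(X.T @ X + (s / n) * np.eye(m), X.T @ y)

def cv_mse(X, y, s, k=5, seed=0):
    """Mean squared test error of ridge(X, y, s), averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        a = ridge(X[train], y[train], s)
        errors.append(np.mean((X[test] @ a - y[test]) ** 2))
    return float(np.mean(errors))

# Synthetic stand-in for the problem 2 data (an assumption, not the real data).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=60)

# Score each candidate s on the same folds and keep the smallest error.
scores = {s: cv_mse(X, y, s) for s in [0, 1, 2, 3]}
for s, mse in scores.items():
    print(f"s = {s}: mean test MSE = {mse:.4f}")
best_s = min(scores, key=scores.get)
print("best s:", best_s)
```

With the real data, `X` and `y` would be replaced by the design matrix and targets from problem 2. The case s = 0 reproduces ordinary least squares, so it serves as the baseline that the other values of s have to beat.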
3. For the lasso method, we can implement a stochastic gradient descent algorithm to build
our model. We will find the best fit by minimizing the objective function
$f(a) = \frac{1}{n}||Xa - y||_2^2$ subject to the constraint $||a||_1 \leq 1$ using gradient descent on the
Lagrangian of $f(a)$. Follow these steps:
a. Define the function $L_1(a, s) = f(a) + s||a||_1$.
[ ]:
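A minimal sketch of step a follows. The text writes $L_1(a, s)$ with the data implicit; passing `X` and `y` as explicit arguments is an assumption made here so that the block is self-contained, and `y` is assumed to be a one-dimensional array.

```python
import numpy as np

def f(a, X, y):
    """Objective f(a) = (1/n) * ||Xa - y||_2^2."""
    r = X @ a - y
    return (r @ r) / len(y)

def L1(a, s, X, y):
    """Penalized objective L_1(a, s) = f(a) + s * ||a||_1."""
    return f(a, X, y) + s * np.sum(np.abs(a))

# Small hand-checkable example (made-up numbers).
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0])
a = np.array([1.0, -1.0])
# Residual is (0, -3), so f(a) = 9/2 = 4.5, and ||a||_1 = 2.
print(L1(a, 0.5, X, y))  # 4.5 + 0.5 * 2 = 5.5
```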
b. Define the vector function $\nabla L_1(a, s)$.
[ ]:
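For step b, differentiating term by term gives $\nabla L_1(a, s) = \frac{2}{n}X^T(Xa - y) + s\,\mathrm{sign}(a)$ wherever no component of $a$ is zero. The absolute value is not differentiable at zero, so the sketch below relies on `np.sign`, which returns 0 there and therefore selects a valid subgradient; that convention is one choice among several. As before, passing `X` and `y` explicitly is an assumption, and the random test data are made up for the finite-difference check.

```python
import numpy as np

def L1(a, s, X, y):
    """L_1(a, s) = (1/n) * ||Xa - y||_2^2 + s * ||a||_1."""
    r = X @ a - y
    return (r @ r) / len(y) + s * np.sum(np.abs(a))

def grad_L1(a, s, X, y):
    """(2/n) X^T (Xa - y) + s * sign(a); sign(0) = 0 is a valid subgradient."""
    n = len(y)
    return (2.0 / n) * (X.T @ (X @ a - y)) + s * np.sign(a)

# Central finite-difference check at a point with no zero components.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
a = np.array([0.7, -1.2, 0.4])
s, h = 0.3, 1e-6
numeric = np.array([
    (L1(a + h * e, s, X, y) - L1(a - h * e, s, X, y)) / (2 * h)
    for e in np.eye(3)
])
print(np.allclose(numeric, grad_L1(a, s, X, y), atol=1e-5))
```

The check compares the analytic gradient with a numerical one, which is a cheap way to catch a sign or scaling mistake before using the gradient in a descent loop.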