Title: Effects of Regularization Penalties on Estimates of w in Linear Regression
In linear regression, we can add a regularization penalty of the form λ||w||_p^p to the least-squares loss to improve the accuracy of our estimates. Let's consider three cases: p=0, p=1, and p=2. For p=0, the outer exponent is not present, so the penalty is simply λ||w||_0, which counts the nonzero components of w. We want to understand how these different penalties affect the estimates of w.
To analyze this, we will use a simple problem and the provided dataset ql.pkl (opened as hw3.ql.pkl in the code that follows). To load the data, we can use the following Python code:
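import pickle

# Load the pickled dataset; the prints below assume it is a dict with keys 'X' and 'y'.
with open('hw3.ql.pkl', 'rb') as f:
    data = pickle.load(f)

print(type(data))
print(data.keys())
print(data['X'].shape)  # design matrix X
print(data['y'].shape)  # response vector y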
Assuming that the response variable is distributed according to y ~ N(Xw, σ^2) and that no regularization penalty is applied, we need to find the maximum likelihood estimate (MLE) of w. Write down the closed-form solution for w and calculate its value.
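As a hedged illustration of this step: under Gaussian noise, maximizing the likelihood is the same as minimizing ||y - Xw||^2, which leads to the normal equations (X^T X) w = X^T y. The sketch below solves them on synthetic stand-in data; the names `mle_weights`, `X_demo`, `y_demo`, and `w_true` are hypothetical and are not part of the homework dataset.

```python
import numpy as np


def mle_weights(X, y):
    """Gaussian MLE / ordinary least squares: solve (X^T X) w = X^T y."""
    # Solving the linear system avoids forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)


# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
w_true = np.array([2.0, 0.0, -1.0])
y_demo = X_demo @ w_true + 0.1 * rng.normal(size=20)

w_mle = mle_weights(X_demo, y_demo)
print(w_mle)
```

To apply this to the homework data, one would pass `data['X']` and `data['y']` instead, assuming `y` is stored as a one-dimensional array or a single column.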
Given regularization strength λ=2, we need to find the value of w for p=2 (the squared L2 penalty, i.e. ridge regression). Again, write down the closed-form solution for w and calculate its value.
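A minimal sketch of the p=2 case, assuming the objective is ||y - Xw||^2 + λ||w||_2^2 with no extra 1/2 or 1/n scaling (the problem statement does not pin this down). Setting the gradient to zero gives (X^T X + λI) w = X^T y. The function name `ridge_weights` and the synthetic data are hypothetical.

```python
import numpy as np


def ridge_weights(X, y, lam):
    """Minimize ||y - Xw||^2 + lam * ||w||_2^2 via its closed form."""
    n_features = X.shape[1]
    # Assumed scaling: the penalty enters as lam * I in the normal equations.
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = X_demo @ np.array([2.0, 0.0, -1.0]) + 0.1 * rng.normal(size=20)

lam = 2.0
w_ridge = ridge_weights(X_demo, y_demo, lam)
print(w_ridge)
```

With λ=0 the same function reduces to the unpenalized least-squares estimate, which makes it easy to see how the penalty shrinks every component toward zero.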
For p=1 and regularization strength λ=1, there is no closed-form solution in general, so we can use sklearn's Lasso model or the SciPy function scipy.optimize.fmin to find the value of w numerically. Write down the value of w.
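One possible way to carry out the p=1 step with scipy.optimize.fmin is sketched below, assuming the objective is ||y - Xw||^2 + λ||w||_1 without additional scaling. The helper `lasso_objective` and the synthetic data are hypothetical. fmin is a derivative-free Nelder-Mead search, so the result is approximate and components that should be exactly zero may come out as very small numbers.

```python
import numpy as np
from scipy.optimize import fmin


def lasso_objective(w, X, y, lam):
    """Assumed objective: ||y - Xw||^2 + lam * ||w||_1."""
    residual = y - X @ w
    return residual @ residual + lam * np.sum(np.abs(w))


# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = X_demo @ np.array([2.0, 0.0, -1.0]) + 0.1 * rng.normal(size=20)

lam = 1.0
# Start the search from the unpenalized least-squares solution.
w_start = np.linalg.lstsq(X_demo, y_demo, rcond=None)[0]
w_lasso = fmin(
    lasso_objective,
    w_start,
    args=(X_demo, y_demo, lam),
    xtol=1e-8,
    ftol=1e-8,
    maxiter=5000,
    maxfun=5000,
    disp=False,
)
print(w_lasso)
```

If sklearn's Lasso is used instead, note that it minimizes (1/(2n))||y - Xw||^2 + alpha||w||_1, so matching the unscaled objective above would require alpha = λ/(2n), and fit_intercept=False if the model has no intercept term.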
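For p=0 and regularization strength λ=1, we need to consider the L0 "norm", which is not a true norm: ||w||_0 simply counts the nonzero components of w. The penalty expression is slightly different because the exponent is dropped:
\hat{w} = \arg\min_w \; \|y - Xw\|_2^2 + \lambda \|w\|_0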
To solve this, we need to consider all the combinatorially many cases in which different components of w are set to zero. For each case, fit the remaining components by least squares and add the L0 penalty, which is λ times the number of nonzero components. Here there are 2^3 = 8 cases for the 3 unknown components w_i. Write down the value of w.
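The enumeration described above can be sketched as follows, assuming the objective is ||y - Xw||^2 + λ||w||_0. For each of the 2^d supports, the nonzero components are fitted by least squares on the selected columns and the cost adds λ per nonzero component. The function name `best_subset_l0` and the synthetic data are hypothetical.

```python
import itertools

import numpy as np


def best_subset_l0(X, y, lam):
    """Exhaustively minimize ||y - Xw||^2 + lam * ||w||_0 over all supports."""
    n_features = X.shape[1]
    best_cost, best_w = np.inf, None
    # Each mask marks which components of w are allowed to be nonzero.
    for mask in itertools.product([False, True], repeat=n_features):
        support = np.flatnonzero(mask)
        w = np.zeros(n_features)
        if support.size > 0:
            # Least squares restricted to the selected columns.
            w[support] = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
        residual = y - X @ w
        cost = residual @ residual + lam * support.size
        if cost < best_cost:
            best_cost, best_w = cost, w
    return best_w, best_cost


# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = X_demo @ np.array([2.0, 0.0, -1.0]) + 0.1 * rng.normal(size=20)

lam = 1.0
w_l0, cost_l0 = best_subset_l0(X_demo, y_demo, lam)
print(w_l0, cost_l0)
```

This brute-force search is only practical for a handful of features, since the number of supports doubles with each added component.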
Finally, write a paragraph describing the relation between the estimates of w in the four cases (no penalty, p=2, p=1, and p=0). Explain why it makes sense given the different penalties used.