Ridge Regression
We are given a training set S = {(X1, Y1), (X2, Y2), ..., (Xn, Yn)} of n examples. Let X = (X1, X2, ..., Xn)^T be the input data matrix, with one training input per row, and let Y = (Y1, Y2, ..., Yn)^T be the vector of training labels. The objective function for ridge regression is defined as follows:
arg min_w ‖Y - Xw‖^2 + λ‖w‖^2
where
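‖w‖^2 = w1^2 + w2^2 + ... + wd^2
Here d is the number of features (the number of columns of X), so w has one weight per feature rather than one per training example, and λ ≥ 0 controls the strength of the penalty. To derive the closed-form solution for w = (w1, w2, ..., wd)^T, we set the gradient of the objective with respect to w to zero: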
∇w(‖Y - Xw‖^2 + λ‖w‖^2) = 0
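Expanding the gradient gives -2 X^T (Y - Xw) + 2λw = 0, which rearranges to (X^T X + λI) w = X^T Y, so w = (X^T X + λI)^(-1) X^T Y whenever λ > 0. The following is a minimal NumPy sketch of that result, not a reference implementation. The function names ridge_closed_form and ridge_gradient, the synthetic data, and the choice λ = 0.1 are assumptions made for illustration only. It solves the linear system and then checks numerically that the gradient vanishes at the solution.

```python
import numpy as np

def ridge_closed_form(X, Y, lam):
    """Solve (X^T X + lam * I) w = X^T Y for w (hypothetical helper)."""
    d = X.shape[1]                      # number of features
    A = X.T @ X + lam * np.eye(d)       # d x d, invertible when lam > 0
    return np.linalg.solve(A, X.T @ Y)  # avoids forming an explicit inverse

def ridge_gradient(X, Y, w, lam):
    """Gradient of ||Y - Xw||^2 + lam * ||w||^2 with respect to w."""
    return -2.0 * X.T @ (Y - X @ w) + 2.0 * lam * w

# Synthetic data, assumed for illustration: n = 50 examples, d = 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.5, -2.0, 0.5])
Y = X @ true_w + 0.1 * rng.normal(size=50)

lam = 0.1
w_hat = ridge_closed_form(X, Y, lam)

# At the minimizer the gradient should be zero up to floating-point error.
grad_norm = np.linalg.norm(ridge_gradient(X, Y, w_hat, lam))
print("w_hat =", w_hat)
print("gradient norm at w_hat =", grad_norm)
```

Using np.linalg.solve instead of computing the inverse directly is a design choice for numerical stability; both express the same closed-form solution.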