We have a dataset D = {xn, tn} (n = 1, ..., N) in which each data point (xn, tn) has a weighting factor pn ≥ 0. Thus, the sum-of-squares error function becomes:
Ep(w) = Σn pn {tn - w^T xn}^2.
Derive step by step an expression for the solution w that minimizes this error function.
Give two interpretations of the weighted sum-of-squares error function (i.e. what the error function measures in terms of (i) data-dependent noise variance and (ii) replicated data points).
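
As a numerical sanity check on the two parts above, here is a minimal sketch, not a full derivation. It assumes a linear model w^T xn with each squared residual multiplied by pn, in which case setting the gradient of Ep(w) to zero gives the weighted normal equations (X^T P X) w = X^T P t with P = diag(p1, ..., pN). The function name, the toy data and the integer weights are all made up for illustration. The check covers interpretation (ii): with integer pn, the weighted fit should match an ordinary least-squares fit on a dataset in which point n appears pn times. Interpretation (i) corresponds to the usual reading pn ∝ 1/σn^2 for Gaussian noise with point-specific variance σn^2, which the code does not test.

```python
import numpy as np


def weighted_least_squares(X, t, p):
    """Minimise sum_n p[n] * (t[n] - w @ X[n])**2 over w.

    Assumes a linear model; solves the weighted normal equations
    (X^T P X) w = X^T P t, where P = diag(p).
    """
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float)
    XtP = X.T * p  # broadcasting scales column n of X^T by p[n], i.e. X^T P
    return np.linalg.solve(XtP @ X, XtP @ t)


# Hypothetical toy data: a bias column plus one input feature.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
t = np.array([0.1, 0.9, 2.2, 2.8])
p = np.array([1, 2, 1, 3])  # integer weights, so replication is well defined

w_weighted = weighted_least_squares(X, t, p)

# Replicated-data view: repeat point n exactly p[n] times, then fit unweighted.
X_rep = np.repeat(X, p, axis=0)
t_rep = np.repeat(t, p)
w_replicated = np.linalg.lstsq(X_rep, t_rep, rcond=None)[0]

print(np.allclose(w_weighted, w_replicated))
```

With all weights equal to one, the same function reduces to ordinary least squares, which is a quick way to confirm that the weighting is the only change to the standard solution.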