Q5: (20 marks) The design of robust loss functions is very important for deploying machine learning in the wild. Here, we design three different loss functions as follows:
$l_1(f(X),Y) = \sqrt{3 + (Y - f(X))^2} - 5$;
$l_2(f(X), Y) = 3(Y - f(X))^2$;
$l_3(f(X), Y) = 2(Y - f(X))^3$;
Which loss function is the least robust to outliers (or large noise)? (10 marks) Why? (10 marks)
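One way to explore the question numerically is to evaluate each loss on residuals $Y - f(X)$ of increasing magnitude and compare how quickly the values grow. The sketch below is only an illustration: the function names `l1`, `l2`, `l3` mirror the definitions above, while the helper `growth_table` and the chosen residual values are assumptions made for this example, not part of the question.

```python
import math


def l1(pred: float, y: float) -> float:
    """l_1 = sqrt(3 + (Y - f(X))^2) - 5, as defined in the question."""
    return math.sqrt(3.0 + (y - pred) ** 2) - 5.0


def l2(pred: float, y: float) -> float:
    """l_2 = 3 (Y - f(X))^2, as defined in the question."""
    return 3.0 * (y - pred) ** 2


def l3(pred: float, y: float) -> float:
    """l_3 = 2 (Y - f(X))^3, as defined in the question."""
    return 2.0 * (y - pred) ** 3


def growth_table(residuals):
    """Hypothetical helper: evaluate all three losses at each residual.

    The prediction is fixed at 0, so the target y equals the residual.
    Returns a list of (residual, l1, l2, l3) tuples.
    """
    return [(r, l1(0.0, r), l2(0.0, r), l3(0.0, r)) for r in residuals]


if __name__ == "__main__":
    # Illustrative residuals only: small, moderate, and outlier-sized.
    for r, a, b, c in growth_table([1.0, 10.0, 100.0, -100.0]):
        print(f"residual={r:8.1f}  l1={a:12.3f}  l2={b:12.1f}  l3={c:14.1f}")
```

Comparing the columns as the residual grows, and checking what happens for a negative residual, gives a concrete starting point for the written justification.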