Problem 4. ([S] Exercise 2.5) Suppose we are given a list of floating-point values x1, x2, ..., xn. The following quantity, known as their log-sum-exp, appears in many machine learning problems:
l(x1, x2, ..., xn) = log(e^(x1) + e^(x2) + ... + e^(xn))
1. The value pi = e^(xi) often represents a probability pi ∈ (0, 1]. In this case, what is the range of possible xi's?
2. Suppose many of the xi's are very negative (xi ≪ 0). Explain why evaluating the log-sum-exp formula as written above may cause numerical error in this case.
3. Show that for any real numbers x1, x2, ..., xn and any real number a,
l(x1, x2, ..., xn) = a + log(e^(x1 - a) + e^(x2 - a) + ... + e^(xn - a)).
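As a rough numerical illustration of Problem 4, the sketch below compares direct evaluation of the log-sum-exp formula with a shifted evaluation in which a constant a is subtracted from every xi before exponentiating and added back after the logarithm. The function names are hypothetical, and defaulting a to max(xi) is an assumption made for this sketch, not something the problem statement prescribes.

```python
import math

def logsumexp_naive(xs):
    # Direct evaluation of log(e^(x1) + ... + e^(xn)), as written in the problem.
    return math.log(sum(math.exp(x) for x in xs))

def logsumexp_shifted(xs, a=None):
    # Shifted evaluation: a + log(e^(x1 - a) + ... + e^(xn - a)).
    # Assumption for this sketch: a defaults to the largest xi, so the
    # largest exponent is 0 and at least one term of the sum equals 1.
    if a is None:
        a = max(xs)
    return a + math.log(sum(math.exp(x - a) for x in xs))

if __name__ == "__main__":
    moderate = [0.1, 0.2, 0.3]
    print(logsumexp_naive(moderate), logsumexp_shifted(moderate))

    very_negative = [-1000.0, -1000.0]
    print(logsumexp_shifted(very_negative))
    try:
        logsumexp_naive(very_negative)
    except ValueError as err:
        # Every e^(xi) underflows to 0.0, so log is called on 0.0.
        print("naive evaluation failed:", err)
```

For moderate inputs the two functions agree to rounding error. For the very negative inputs, each e^(xi) underflows to zero in double precision, so the direct formula fails while the shifted one still returns a finite value.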
Problem 5. ([A]) In mathematics, we use Stirling's formula to approximate factorials:
n! ≈ √(2πn) * (n/e)^n
1. Write a program to compute the absolute and relative errors in Stirling's approximation for n = 1, 2, ..., 10 (assume double precision).
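One possible shape for such a program is sketched below. It assumes Python floats (IEEE double precision) and takes math.factorial as the exact reference value. The helper names stirling and stirling_errors are hypothetical.

```python
import math

def stirling(n):
    # Stirling's approximation: sqrt(2*pi*n) * (n/e)^n.
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n

def stirling_errors(n_max=10):
    # Returns one tuple (n, exact, approx, abs_err, rel_err) per n = 1..n_max.
    rows = []
    for n in range(1, n_max + 1):
        exact = float(math.factorial(n))   # exact n!, representable for small n
        approx = stirling(n)
        abs_err = abs(approx - exact)      # absolute error
        rel_err = abs_err / exact          # relative error
        rows.append((n, exact, approx, abs_err, rel_err))
    return rows

if __name__ == "__main__":
    print(f"{'n':>2}  {'n!':>10}  {'Stirling':>14}  {'abs err':>12}  {'rel err':>10}")
    for n, exact, approx, abs_err, rel_err in stirling_errors():
        print(f"{n:2d}  {exact:10.0f}  {approx:14.4f}  {abs_err:12.4e}  {rel_err:10.4e}")
```

Over this range the absolute error grows with n while the relative error shrinks, which is the contrast the exercise is meant to expose.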