Question

Problem 4. ([S] Exercise 2.5) Suppose we are given a list of floating-point values $x_1, x_2, \dots, x_n$. The following quantity, known as their "log-sum-exp", appears in many machine learning problems:
$l(x_1, \dots, x_n) = \ln \left( \sum_{i=1}^n e^{x_i} \right)$.
1. The value $p_k = e^{x_k}$ often represents a probability $p_k \in (0, 1)$. In this case, what is the range of possible $x_k$'s?
2. Suppose many of the $x_i$'s are very negative ($x_i \ll 0$). Explain why evaluating the log-sum-exp formula as written above may cause numerical error in this case.
3. Show that for any $a \in \mathbb{R}$,
$l(x_1, \dots, x_n) = a + \ln \left( \sum_{i=1}^n e^{x_i - a} \right)$.
To avoid the issues you explained in question 2, suggest a value $a$ that may improve computing $l(x_1, \dots, x_n)$.
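For experimenting with parts 2 and 3, here is a minimal Python sketch that compares the formula as written with the shifted form. The function names are hypothetical, and the choice $a = \max_i x_i$ is an assumption (one common choice, not necessarily the intended answer).

```python
import math


def logsumexp_naive(xs):
    """Evaluate ln(sum_i exp(x_i)) exactly as written."""
    # For very negative x_i, exp(x_i) underflows to 0.0 in double precision,
    # so the sum can become 0.0 and the logarithm is then undefined.
    return math.log(sum(math.exp(x) for x in xs))


def logsumexp_shifted(xs):
    """Evaluate a + ln(sum_i exp(x_i - a)) with the assumed shift a = max(xs)."""
    a = max(xs)
    # After shifting, the largest term is exp(0) = 1, so the sum is at least 1.
    return a + math.log(sum(math.exp(x - a) for x in xs))
```

With inputs such as `[-1000.0, -1000.0]`, every `math.exp(x)` underflows to `0.0`, so the naive version fails inside `math.log`, while the shifted version returns a finite value. For moderate inputs the two versions should agree up to rounding.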
Problem 5. ([A]) In mathematics, we use Stirling's formula to approximate factorials:
$n! \approx \sqrt{2\pi n} (n/e)^n$.
1. Write a program to compute the absolute and relative errors in Stirling's approximation for $n = 1, 2, \dots, 10$ (assume double precision).
2. Does the absolute error grow or shrink as $n$ increases? Does the relative error grow or shrink as $n$ increases?
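One possible starting point for part 1 is sketched below in Python, whose `float` is IEEE double precision. The helper names `stirling` and `stirling_errors` are hypothetical, and the exact factorial is taken from `math.factorial`.

```python
import math


def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n, evaluated in double precision."""
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n


def stirling_errors(n_max=10):
    """Return rows (n, exact, approx, abs_err, rel_err) for n = 1, ..., n_max."""
    rows = []
    for n in range(1, n_max + 1):
        exact = math.factorial(n)        # exact integer value of n!
        approx = stirling(n)
        abs_err = abs(exact - approx)    # absolute error
        rel_err = abs_err / exact        # relative error
        rows.append((n, exact, approx, abs_err, rel_err))
    return rows


if __name__ == "__main__":
    print(f"{'n':>2} {'n!':>8} {'Stirling':>16} {'abs err':>14} {'rel err':>14}")
    for n, exact, approx, abs_err, rel_err in stirling_errors():
        print(f"{n:2d} {exact:8d} {approx:16.6f} {abs_err:14.6e} {rel_err:14.6e}")
```

Running the script prints one row per $n$, so the trends asked about in part 2 can be read off the last two columns.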

Added by Haley H.
