10 Generalized Linear Regression
In the problems of this section,
\[ x^T \beta = \beta_0 + \sum_{i=1}^{p} \beta_i x_i . \]
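As a small illustration of this convention, the following sketch evaluates the linear predictor in plain Python. The function names `linear_predictor` and `linear_predictor_augmented` and all numerical values are hypothetical, chosen only for the example. Prepending a constant 1 to x so that the intercept becomes part of the inner product is one common reading of the notation, assumed here rather than stated in the text.

```python
def linear_predictor(x, beta0, beta):
    """Return beta0 + sum_i beta[i] * x[i] for one predictor vector x."""
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    return beta0 + sum(b * xi for b, xi in zip(beta, x))


def linear_predictor_augmented(x, beta_full):
    """Same quantity as an inner product, with x augmented by a leading 1.

    beta_full = [beta0, beta_1, ..., beta_p] (assumed layout).
    """
    x_aug = [1.0] + list(x)
    if len(x_aug) != len(beta_full):
        raise ValueError("beta_full must have length len(x) + 1")
    return sum(b * xi for b, xi in zip(beta_full, x_aug))


# Made-up values for illustration only.
x = [2.0, -1.0, 0.5]
beta0 = 0.5
beta = [1.0, 2.0, -4.0]

eta = linear_predictor(x, beta0, beta)
eta_aug = linear_predictor_augmented(x, [beta0] + beta)
```

Both functions should return the same number, which is the point of the shorthand: the intercept can be carried separately or folded into the coefficient vector.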
Problem 10.6.
This is a multiclass generalization of logistic regression known as multinomial logistic
regression. Here $Y$ is a random variable that takes $K$ values in a discrete set $S = \{1, \dots, K\}$. Note that
the elements in S are symbols, whose meaning depends on the context of the data analysis at
hand. The training set is
\[ D_{\mathrm{tr}} = \{x_i, Y_i\}_{i=1} \]
where $x_i$ is a predictor with values in (some subset of) $\mathbb{R}^d$. We model the symbol probabilities
by
\[ \ln P(Y_i = k \mid x_i) = x_i^T \beta_k - \ln Z . \]
Here $\ln Z$ is a normalization term, the logarithm of the partition function $Z$. $Z$ is determined by the
constraint
\[ \sum_{k=1}^{K} P(Y_i = k \mid x_i) = 1 . \]
Show that
\[ \bigl(P(Y_i = 1 \mid x_i), \dots, P(Y_i = K \mid x_i)\bigr) = \operatorname{softmax}\bigl(x_i^T \beta_1, \dots, x_i^T \beta_K\bigr) . \]
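A numerical sanity check can accompany the derivation, though it is not a proof. The sketch below builds the class probabilities directly from the model equation, with ln Z fixed by the normalization constraint, and compares them with a separately written softmax. The helper names (`log_partition`, `class_probabilities`, `softmax`) and the coefficient values are hypothetical. The sketch assumes that any intercept is already absorbed into x and the coefficient vectors.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def log_partition(scores):
    """ln Z = ln sum_k exp(score_k), shifted by the max for numerical stability."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def class_probabilities(x, betas):
    """P(Y = k | x) obtained from ln P = x^T beta_k - ln Z, for k = 1..K."""
    scores = [dot(x, beta_k) for beta_k in betas]
    ln_z = log_partition(scores)
    return [math.exp(s - ln_z) for s in scores]


def softmax(scores):
    """Plain softmax: exp(score_k) / sum_j exp(score_j)."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up predictor and K = 3 coefficient vectors, for illustration only.
x = [1.0, 0.5, -2.0]
betas = [
    [0.1, 0.2, 0.3],
    [0.0, -1.0, 0.5],
    [1.0, 1.0, 1.0],
]

scores = [dot(x, beta_k) for beta_k in betas]
probs = class_probabilities(x, betas)
probs_softmax = softmax(scores)
```

In this example `probs` should sum to 1 and agree with `probs_softmax` up to floating-point rounding, which mirrors the identity the problem asks for.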