00:01
Okay, so we derive the MLE.
00:05
Assume that we have $m$ random vectors $x_1, \dots, x_m$, each of size $p$, where each random vector can be interpreted as an observation, or data point, across $p$ variables.
00:27
If the $x_i$ are i.i.d. multivariate Gaussian vectors, we have $x_i \sim N_p(\mu, \Sigma)$, with mean vector $\mu$ and covariance matrix $\Sigma$.
00:45
So to obtain the estimates of $\mu$ and the covariance matrix $\Sigma$, we can use the method of maximum likelihood and maximize the log-likelihood function.
00:57
Now note that, by the independence of the random vectors, the joint density of the data is the product of the individual densities, so that the joint density is $f(x_1, \dots, x_m \mid \mu, \Sigma) = \prod_{i=1}^{m} f(x_i \mid \mu, \Sigma)$.
01:32
Now taking the logarithm gives the log-likelihood function, which is

$$
\begin{aligned}
\ell(\mu, \Sigma \mid x_1, \dots, x_m)
&= \log \prod_{i=1}^{m} f(x_i \mid \mu, \Sigma) \\
&= \log \prod_{i=1}^{m} \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\!\left( -\frac{1}{2} (x_i - \mu)^T \Sigma^{-1} (x_i - \mu) \right) \\
&= \sum_{i=1}^{m} \left( -\frac{p}{2} \log(2\pi) - \frac{1}{2} \log|\Sigma| - \frac{1}{2} (x_i - \mu)^T \Sigma^{-1} (x_i - \mu) \right),
\end{aligned}
$$

so that we have

$$
\ell(\mu, \Sigma \mid x_1, \dots, x_m) = -\frac{mp}{2} \log(2\pi) - \frac{m}{2} \log|\Sigma| - \frac{1}{2} \sum_{i=1}^{m} (x_i - \mu)^T \Sigma^{-1} (x_i - \mu).
$$
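As a quick sanity check on that algebra, here is a minimal NumPy sketch that evaluates the log-likelihood two ways: by summing the log of each individual density, and by the collapsed closed form. The function names, the example mean and covariance, and the sample size are all illustrative assumptions, not something from the lecture.

```python
import numpy as np

def log_likelihood(X, mu, Sigma):
    """Collapsed form: -mp/2 log(2*pi) - m/2 log|Sigma| - 1/2 sum_i quad_i."""
    m, p = X.shape
    diff = X - mu                                # shape (m, p), rows are x_i - mu
    _, logdet = np.linalg.slogdet(Sigma)         # log|Sigma|, stable for PD Sigma
    solved = np.linalg.solve(Sigma, diff.T)      # columns are Sigma^{-1}(x_i - mu)
    quad = np.sum(diff.T * solved)               # sum_i (x_i-mu)^T Sigma^{-1} (x_i-mu)
    return -0.5 * m * p * np.log(2 * np.pi) - 0.5 * m * logdet - 0.5 * quad

def log_likelihood_from_densities(X, mu, Sigma):
    """Sum of log f(x_i), evaluating each Gaussian density separately."""
    p = X.shape[1]
    Sigma_inv = np.linalg.inv(Sigma)
    norm_const = 1.0 / ((2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(Sigma)))
    total = 0.0
    for x in X:
        d = x - mu
        total += np.log(norm_const * np.exp(-0.5 * d @ Sigma_inv @ d))
    return total

# Illustrative parameters (assumed for this example only).
rng = np.random.default_rng(0)
mu_true = np.array([1.0, -2.0, 0.5])
Sigma_true = np.array([[2.0, 0.3, 0.0],
                       [0.3, 1.0, 0.2],
                       [0.0, 0.2, 0.5]])
X = rng.multivariate_normal(mu_true, Sigma_true, size=200)

ll_closed = log_likelihood(X, mu_true, Sigma_true)
ll_summed = log_likelihood_from_densities(X, mu_true, Sigma_true)
print(ll_closed, ll_summed)
```

The two numbers should agree up to floating-point error. The closed form uses `slogdet` and `solve` rather than an explicit determinant and inverse, which is usually the more stable choice.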
03:57
Now, the first thing we want to do is derive $\hat{\mu}$.
04:04
To take the derivative with respect to $\mu$ and equate it to 0, we will make use of the following matrix calculus identity: $\frac{\partial\, w^T A w}{\partial w} = 2Aw$, if $A$ does not depend on $w$ and $A$ is symmetric.
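The gradient identity $\partial (w^T A w) / \partial w = 2Aw$ for symmetric $A$ can be checked numerically with central finite differences. This is only a sketch: the random matrix, the dimension, and the step size are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
B = rng.standard_normal((p, p))
A = B + B.T                      # symmetric, and fixed (does not depend on w)
w = rng.standard_normal(p)

def quad_form(v):
    """Quadratic form v^T A v."""
    return v @ A @ v

# Central finite-difference approximation of the gradient, one coordinate at a time.
eps = 1e-6
numeric_grad = np.array([
    (quad_form(w + eps * e) - quad_form(w - eps * e)) / (2 * eps)
    for e in np.eye(p)
])

analytic_grad = 2 * A @ w        # the identity being checked
print(np.max(np.abs(numeric_grad - analytic_grad)))
```

In the derivation, $w$ plays the role of $x_i - \mu$ and $A$ the role of $\Sigma^{-1}$, which is symmetric because $\Sigma$ is.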
04:31
So we have $\frac{\partial \ell}{\partial \mu} = \sum_{i=1}^{m} \Sigma^{-1} (x_i - \mu) = 0$.
04:55
Now, since $\Sigma$ is positive definite, $\Sigma^{-1}$ is invertible, so we can cancel it and are left with $\sum_{i=1}^{m} (x_i - \mu) = 0$. That implies $\hat{\mu} = \frac{1}{m} \sum_{i=1}^{m} x_i = \bar{x}$...
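To see the result concretely, the sketch below draws Gaussian data, takes the sample mean as $\hat{\mu}$, and checks that the score $\sum_i \Sigma^{-1}(x_i - \mu)$ vanishes there and that moving away from it lowers the $\mu$-dependent part of the log-likelihood. The parameters, the perturbation, and the helper names are assumptions made up for this example, and $\Sigma$ is treated as known.

```python
import numpy as np

rng = np.random.default_rng(2)
mu_true = np.array([0.5, -1.0])           # illustrative values only
Sigma = np.array([[1.0, 0.4],
                  [0.4, 2.0]])            # symmetric positive definite
X = rng.multivariate_normal(mu_true, Sigma, size=500)

mu_hat = X.mean(axis=0)                   # the MLE: sample mean x-bar

def score_mu(mu):
    """Gradient of the log-likelihood in mu: sum_i Sigma^{-1} (x_i - mu)."""
    return np.linalg.solve(Sigma, (X - mu).sum(axis=0))

def quad_term(mu):
    """The only part of the log-likelihood that depends on mu."""
    diff = X - mu
    return -0.5 * np.sum(diff * np.linalg.solve(Sigma, diff.T).T)

score_at_hat = score_mu(mu_hat)
perturbed = mu_hat + np.array([0.1, -0.05])

print(score_at_hat)                               # numerically zero
print(quad_term(mu_hat) - quad_term(perturbed))   # positive: mu_hat is better
```

Note that the sample mean solves the score equation regardless of which positive definite $\Sigma$ is plugged in, which matches the derivation: $\Sigma^{-1}$ drops out.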