Guessers and Swots
Professor Partridge has a class of N students. Every student in the class is either a Swot or a Guesser. Swots study hard, while Guessers only guess. The proportion of Swots in the class, s, is unknown. Partridge wants to use some test results to estimate the proportion of Swots, s.
• There are N students in the class.
• Each student is a Swot with probability s, or a Guesser with probability 1 - s.
• Define Ω = {All students}, and events S = {Swot}; G = Sᶜ = {Guesser}.
• Mixture of two distributions. On a test of 10 questions, let X be the number of correct answers a student gets. The distribution of X depends on whether the student is a Guesser or a Swot. Look carefully at the following notation. We write:
X | G ~ Binomial(10, 0.5) ; X | S ~ Binomial(10, 0.75)
• Using the Partition Theorem, you will show in Q1(a) below that:
$$P(X = x) = (1 - s)\binom{10}{x}(0.5)^{x}(0.5)^{10-x} + s\binom{10}{x}(0.75)^{x}(0.25)^{10-x}$$
The overall distribution of X is a mixture of a Bin(10, 0.5) distribution for Guessers, and a Bin(10, 0.75) distribution for Swots, where s is the mixture probability.
• Professor Partridge wants to estimate the proportion of Swots in the class, s.
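As an informal check of the mixture formula, the sketch below evaluates P(X = x) for a chosen value of s using only the Python standard library. The function names binom_pmf and mixture_pmf, and the trial value s = 0.3, are illustrative assumptions and are not part of the question.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) when X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mixture_pmf(x, s, n=10, p_guesser=0.5, p_swot=0.75):
    """Mixture of Bin(n, p_guesser) and Bin(n, p_swot) with mixture probability s."""
    return (1 - s) * binom_pmf(x, n, p_guesser) + s * binom_pmf(x, n, p_swot)


# Assumed trial value of s, chosen only for illustration.
s_trial = 0.3

# A valid probability function must sum to 1 over x = 0, 1, ..., 10.
total = sum(mixture_pmf(x, s_trial) for x in range(11))
print(total)
```

Setting s = 0 or s = 1 in mixture_pmf recovers the pure Guesser or pure Swot distribution, which is a quick way to confirm that the mixture is put together correctly.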
1.(a) We said above that $P(X = x) = (1 - s)\binom{10}{x}(0.5)^{x}(0.5)^{10-x} + s\binom{10}{x}(0.75)^{x}(0.25)^{10-x}$.
Show this using proper probability notation. Your answer should be two lines long. The first line should be of the form P(X = x) = P(X = x | ...)P(...) + P(X = x | ...)P(...), and the second line should fill in all quantities to produce the desired result. (1)
(b) Use the formula from (a) to show that P(X = 4) = 0.21 - 0.19s. Round all decimals to 2 d.p. (1)
(c) The same procedure as in (b) using X = 8 gives: P(X = 8) = 0.04 + 0.24s.
If two students take the test (N = 2), and they score marks 4 and 8 respectively, the likelihood function is L(s; 4, 8) = P(X = 4)P(X = 8). Use the information in (b) and (c) to write down the likelihood explicitly in terms of s. Remember to give the range of values of s over which the likelihood is defined. (1)
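The rounded coefficients quoted in (b) and (c) can be checked numerically. The sketch below rounds each binomial probability to 2 d.p. first, as the question instructs, and then collects the constant and the coefficient of s. The helper name linear_coefficients is hypothetical.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) when X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def linear_coefficients(x, n=10, p_guesser=0.5, p_swot=0.75):
    """Return (a, b) such that P(X = x) is approximately a + b*s.

    Each binomial probability is rounded to 2 d.p. before combining,
    following the rounding convention stated in part (b).
    """
    guesser = round(binom_pmf(x, n, p_guesser), 2)
    swot = round(binom_pmf(x, n, p_swot), 2)
    # P(X = x) = (1 - s)*guesser + s*swot = guesser + (swot - guesser)*s
    return guesser, round(swot - guesser, 2)


a4, b4 = linear_coefficients(4)
a8, b8 = linear_coefficients(8)
print(a4, b4)
print(a8, b8)
```

Rounding before combining matters here: the later parts use the rounded values, so the check follows the same order of operations.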
(d) Using (c), solve dL/ds = 0 to find the maximum likelihood estimate of s when there are two students who score 4 and 8 marks respectively. For convenience, use the rounded values 0.21, 0.19, 0.04, and 0.24 from (b) and (c), not their unrounded values. (4)
Instead of maximizing the likelihood function, statisticians usually maximize the log-likelihood function. In this example, the log-likelihood is:
log (L(s; x₁, x₂)) = log ((0.21 - 0.19s)(0.04 + 0.24s)),
where log refers to the natural logarithm, logₑ. Maximizing the log-likelihood always gives the same value of s as maximizing the likelihood, because log(L) is a strictly increasing function of L. Using the log-likelihood has the convenient effect of transforming awkward products, like (0.21 - 0.19s)(0.04 + 0.24s), into sums:
log (L(s; x₁, x₂)) = log(0.21 - 0.19s) + log(0.04 + 0.24s).
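The claim that the likelihood and the log-likelihood peak at the same value of s can be illustrated numerically. The sketch below evaluates both functions on a grid over 0 ≤ s ≤ 1 and compares the maximizing grid points. The grid spacing of 0.0001 and the variable names are assumptions made for illustration, and a grid search only approximates the calculus answer asked for in the questions.

```python
from math import log


def likelihood(s):
    """L(s; 4, 8) using the rounded values from (b) and (c)."""
    return (0.21 - 0.19 * s) * (0.04 + 0.24 * s)


def log_likelihood(s):
    """log L(s; 4, 8), written as a sum of logs (natural logarithm)."""
    return log(0.21 - 0.19 * s) + log(0.04 + 0.24 * s)


# Assumed grid over the valid range 0 <= s <= 1; both factors are
# strictly positive on this range, so the logarithms are defined.
grid = [i / 10000 for i in range(10001)]

s_hat_lik = max(grid, key=likelihood)
s_hat_loglik = max(grid, key=log_likelihood)

# The two maximizers agree because log is strictly increasing.
print(s_hat_lik == s_hat_loglik)
```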
(e) Differentiate log (L(s; x₁, x₂)) from above, and solve the equation d log(L)/ds = 0 for s. The value of s that you get maximizes the log-likelihood. Is this the same value of s that maximized the likelihood from part (d)?
[Hint: remember that if y = log(x), then dy/dx = 1/x. In this question you need to consider something like dy/dx when y = log(a + bx).] (4)
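The hint can be made concrete with the chain rule. The following is a general sketch, with a and b standing for arbitrary constants rather than the specific numbers in the question, and u introduced only as an intermediate symbol:

$$y = \log(a + bx), \quad u = a + bx \quad\Longrightarrow\quad \frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx} = \frac{1}{u}\cdot b = \frac{b}{a + bx}.$$

Each term of the log-likelihood has this form, with s in place of x.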