Problem 2. Bayes' Theorem & Naïve Bayes Classifier
1) Consider a study to determine the effectiveness of a new drug against an infectious disease. There were 10,000 test subjects, some of whom were given the real drug while the rest were given a placebo. At the end of the study, 65% of the test subjects had recovered from the disease, and half of those who recovered had taken the real drug. Among the test subjects who did not recover, more than half (55%) had taken the real drug. Based on this information, will taking the drug help a patient recover from the disease? Also, find the proportion of test subjects who were given the real drug. Show your steps clearly.
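One way to check the arithmetic for part 1 is the minimal Python sketch below. It is not a required solution format. It applies the law of total probability and Bayes' theorem to the three percentages stated in the problem. The function name `drug_study` and its parameter names are illustrative assumptions, not part of the problem statement.

```python
def drug_study(p_recover, p_drug_given_recover, p_drug_given_not_recover):
    """Return (P(drug), P(recover | drug), P(recover | placebo)).

    Hypothetical helper: names are illustrative, inputs are the
    three probabilities given in the problem statement.
    """
    p_not_recover = 1.0 - p_recover

    # Law of total probability: P(D) = P(D|R)P(R) + P(D|not R)P(not R)
    p_drug = (p_drug_given_recover * p_recover
              + p_drug_given_not_recover * p_not_recover)

    # Bayes' theorem: P(R|D) = P(D|R)P(R) / P(D)
    p_recover_given_drug = p_drug_given_recover * p_recover / p_drug

    # Same idea for the placebo group, using the complements.
    p_recover_given_placebo = ((1.0 - p_drug_given_recover) * p_recover
                               / (1.0 - p_drug))

    return p_drug, p_recover_given_drug, p_recover_given_placebo


# Values taken from the problem: 65% recovered, half of those took the
# drug, and 55% of the non-recovered took the drug.
p_drug, p_rec_drug, p_rec_placebo = drug_study(0.65, 0.50, 0.55)
print(f"P(drug) = {p_drug:.4f}")
print(f"P(recover | drug) = {p_rec_drug:.4f}")
print(f"P(recover | placebo) = {p_rec_placebo:.4f}")
```

Comparing P(recover | drug) with P(recover | placebo) answers the question of whether the drug helps. P(drug) is the requested proportion of subjects who received the real drug.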
2) Consider a training set with 3 features, X1, X2 and X3, for a binary classification problem. The distribution of the data set is shown in the table below.
a) Based on the information above, determine whether X1 and X2 are independent of each other.
b) Determine whether X1 and X2 are conditionally independent of each other given the class.
c) Compute the class conditional probabilities P(X1 = 1 | +), P(X1 = 1 | -), P(X2 = 1 | +), P(X2 = 1 | -), P(X3 = 1 | +), and P(X3 = 1 | -).
d) Use the class conditional probabilities computed in part (c) to predict the class label of each example in the training set above. Use your results to compute the training error of the naïve Bayes classifier.
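The prediction and training-error steps in part (d) can be sketched in Python as below, assuming binary features and that each table row lists a feature combination, a class label, and a count. The functions `predict` and `training_error` are hypothetical helpers. All numbers in the usage example are made-up placeholders for illustration only and are NOT the values from the problem's table.

```python
from math import prod


def predict(x, priors, cond):
    """Naive Bayes prediction for one binary feature vector.

    x      : tuple of 0/1 feature values, e.g. (1, 0, 1)
    priors : {label: P(label)}
    cond   : {label: [P(X1=1|label), P(X2=1|label), ...]}
    """
    scores = {}
    for label, prior in priors.items():
        # Naive assumption: features are conditionally independent
        # given the class, so the likelihood is a product.
        likelihood = prod(p if xi == 1 else 1.0 - p
                          for xi, p in zip(x, cond[label]))
        scores[label] = prior * likelihood
    # Pick the class with the largest (unnormalised) posterior.
    return max(scores, key=scores.get)


def training_error(rows, priors, cond):
    """rows: list of (x, true_label, count) tuples from the training table."""
    total = sum(count for _, _, count in rows)
    wrong = sum(count for x, label, count in rows
                if predict(x, priors, cond) != label)
    return wrong / total


# Placeholder numbers for illustration only (not from the problem's table).
priors = {"+": 0.5, "-": 0.5}
cond = {"+": [0.8, 0.5, 0.6], "-": [0.2, 0.5, 0.4]}
rows = [((1, 0, 1), "+", 3), ((0, 0, 0), "+", 1), ((0, 1, 0), "-", 4)]
print(training_error(rows, priors, cond))
```

To use this on the actual problem, replace `priors`, `cond`, and `rows` with the values read from the table and computed in part (c).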