Implement a Naive Bayes classifier in R or Python and apply it to the task of classifying handwritten digits. Test: https://drive.google.com/file/d/1cJ2v8iu3Ba7uw5WQ__0ciJ1NkWXEvgVj/view?usp=sharing Train: https://drive.google.com/file/d/10N4-gFhdD63gtrHAJvnnuh-XVlRgubU4/view?usp=sharing
For this problem, you cannot use an existing Naive Bayes classifier implementation package.
Files mnist-train and mnist-test contain training and test digits, together with their ground truth labels (first column). Each row in these files corresponds to a different digit. Each image is 28x28, hence there are 784 pixels in every image. Columns 2-785 in the data files hold the pixel intensities, each a value between 0 and 255. Column 1 holds the correct label for each digit.
You should convert the pixel intensities to a single binary indicator feature (Fi) for each pixel. Specifically, if the intensity is smaller than or equal to 127, map it to zero; otherwise, map it to one.
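The thresholding rule above can be sketched in Python as follows. This is a minimal illustration, not a required implementation: it assumes NumPy is available, and the function name `binarize` and the toy array `sample` are hypothetical names introduced here.

```python
import numpy as np

def binarize(pixels, threshold=127):
    """Map intensities <= threshold to 0 and intensities > threshold to 1."""
    pixels = np.asarray(pixels)
    return (pixels > threshold).astype(np.uint8)

# Toy row covering both sides of the threshold (not real MNIST data).
sample = np.array([[0, 127, 128, 255]])
features = binarize(sample)
print(features)
```

Note that the comparison is strict, so an intensity of exactly 127 maps to zero, as the rule requires.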
(10 points) Estimate the priors P(class) based on the frequencies of different classes in the training set. Report the values in a table. Round to 3 decimal places.
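One possible way to compute the priors is to count how often each label occurs and divide by the number of training examples. The sketch below assumes the labels have already been read from column 1 into an integer array; `estimate_priors` and `toy_labels` are hypothetical names, and the toy labels are illustrative only.

```python
import numpy as np

def estimate_priors(labels, n_classes=10):
    """Return an array whose entry c is the fraction of examples with label c."""
    labels = np.asarray(labels, dtype=int)
    # minlength keeps a slot for every digit, even one absent from the data.
    counts = np.bincount(labels, minlength=n_classes)
    return counts / counts.sum()

# Toy labels (not the real training set).
toy_labels = np.array([0, 0, 1, 9])
priors = estimate_priors(toy_labels)
for digit, p in enumerate(priors):
    print(f"P(class={digit}) = {p:.3f}")
```

The `:.3f` format rounds each value to 3 decimal places for the table.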
(15 points) Estimate the likelihoods P(Fi|class) for every pixel location and for every digit class from 0 to 9. The likelihood estimate is:
P(Fi = f|class) = (Number of times pixel has value f in training examples from this class) / (Total number of training examples from this class)
Note that you have to smooth the likelihoods to ensure that there are no zero counts. Laplace smoothing is a very simple method that increases the observation count of every value f by some constant k. This corresponds to adding k to the numerator above, and k*V to the denominator (where V is the number of possible values the feature can take on). The higher the value of k, the stronger the smoothing. Experiment with different integer values of k from 1 to 5. While you need to compute all the likelihoods for k=1 to 5, include only the following values in your report: for k=1 and k=5, P(F682 = 0|class 8) and P(F772 = 1|class 9). Round to 3 decimal places.
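The smoothed estimate described above, (count + k) / (class total + k*V), can be sketched as follows for binary features, where V = 2. This is a sketch under stated assumptions: NumPy is available, the features are already binarized, and `estimate_likelihoods` and the toy arrays are hypothetical names with illustrative data. How a pixel name such as F682 maps to an array column (0-based versus 1-based) is a convention you need to fix yourself; the code does not decide it.

```python
import numpy as np

def estimate_likelihoods(features, labels, k=1, n_classes=10, n_values=2):
    """Return P(F_i = 1 | class) as an array of shape (n_classes, n_pixels)."""
    features = np.asarray(features)
    labels = np.asarray(labels, dtype=int)
    p_one = np.zeros((n_classes, features.shape[1]))
    for c in range(n_classes):
        rows = features[labels == c]      # training examples of class c
        ones = rows.sum(axis=0)           # per-pixel count of value 1
        # Laplace smoothing: add k to the count and k * V to the class total.
        p_one[c] = (ones + k) / (rows.shape[0] + k * n_values)
    return p_one

# Toy data: three examples with two pixels each (not real MNIST data).
toy_features = np.array([[1, 0], [1, 1], [0, 0]])
toy_labels = np.array([8, 8, 9])

p_k1 = estimate_likelihoods(toy_features, toy_labels, k=1)
p_k5 = estimate_likelihoods(toy_features, toy_labels, k=5)

# Because each feature is binary, P(F_i = 0 | class) = 1 - P(F_i = 1 | class).
print(f"k=1: P(F=0 | class 8) at column 0 = {1 - p_k1[8, 0]:.3f}")
print(f"k=5: P(F=1 | class 9) at column 1 = {p_k5[9, 1]:.3f}")
```

Storing only P(F_i = 1 | class) is a design choice that works because the two probabilities for a binary feature sum to one under this smoothing.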