Build a CNN with three convolutional layers and train it on the MNIST image dataset.

For layer 1 and layer 2, use kernel size 3 and stride 2 with shape-preserving ("same") zero-padding. For layer 3, use kernel size 2, stride 3, and no padding. Use batch normalization and max-pooling in the first two layers but not in the last one. Use the Tanh nonlinearity. Use 5 nodes in each of the first two layers and 3 nodes in the last layer. Tune the learning rate for best accuracy by trying two different learning rates.

a- What are the shapes of the kernels in each layer?
b- Take an image from the dataset. Compute the outputs of the 1st layer.
c- Compute the output of the 1st node in the 2nd layer for the image you picked in part b.
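
As a starting point for part a, the kernel and feature-map shapes can be traced by hand with the usual convolution size formula, out = floor((in + 2*padding - kernel) / stride) + 1. The sketch below does this in plain Python. It rests on several assumptions that the problem statement does not fix: "nodes" are read as output channels, the shape-preserving padding for a 3x3 kernel is taken as 1 pixel per side, the max-pooling window is 2x2 with stride 2, and kernel shapes are written in (out_channels, in_channels, height, width) order. The names `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical helpers, not part of any library.

```python
# Shape bookkeeping for the three-layer CNN described above (a sketch, not a model).

POOL = 2  # ASSUMPTION: 2x2 max-pooling with stride 2; the problem does not specify it


def conv_out(size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMPTION: padding=1 stands in for "shape-preserving" padding with a 3x3 kernel.
LAYERS = [
    {"out_ch": 5, "kernel": 3, "stride": 2, "padding": 1, "pool": True},
    {"out_ch": 5, "kernel": 3, "stride": 2, "padding": 1, "pool": True},
    {"out_ch": 3, "kernel": 2, "stride": 3, "padding": 0, "pool": False},
]


def trace_shapes(in_ch=1, size=28):
    """Return (kernel_shape, output_shape) per layer for a 1x28x28 MNIST image."""
    rows = []
    for spec in LAYERS:
        # Kernel tensor in (out_channels, in_channels, height, width) order
        k_shape = (spec["out_ch"], in_ch, spec["kernel"], spec["kernel"])
        size = conv_out(size, spec["kernel"], spec["stride"], spec["padding"])
        if spec["pool"]:
            size = conv_out(size, POOL, POOL)
        rows.append((k_shape, (spec["out_ch"], size, size)))
        in_ch = spec["out_ch"]  # next layer consumes this layer's channels
    return rows


if __name__ == "__main__":
    for i, (k_shape, out_shape) in enumerate(trace_shapes(), start=1):
        print(f"layer {i}: kernel {k_shape}, output {out_shape}")
```

Under these assumptions the spatial size goes 28, 14 (conv), 7 (pool), 4 (conv), 2 (pool), 1 (conv), so the last layer emits a 3x1x1 tensor. A different pooling window or padding convention would change these numbers, so it is worth rerunning the trace with whatever settings your framework actually uses.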