Consider a CNN made up of a single convolution layer with a single filter F of shape 2 x 2, followed by a max-pooling layer of shape 2 x 2. The input image I has shape 3 x 3. The output of the CNN is calculated as:
output = Pool(ReLU(Conv(I)))
where ReLU is the rectified linear activation function given by:
ReLU(x) = max(0, x)
Also assume a stride of 1 for both the convolution and pooling layers, with no padding applied.
For the following values of the image I and weights of the filter F, compute the value of the output of the CNN:
I = [5 0 4
     3 8 0
     1 0 2]
F = [0 3
     0 0]
Output:
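One way to check the result is the pure-Python sketch below. It assumes that Conv means the cross-correlation commonly used in CNN layers (the filter is not flipped); the helper names conv2d_valid, relu, and max_pool are hypothetical and chosen only for this illustration.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image with no padding (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(0, len(image) - kh + 1, stride):
        row = []
        for c in range(0, len(image[0]) - kw + 1, stride):
            # Element-wise product of the window and the kernel, summed.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


def relu(fmap):
    """Apply ReLU(x) = max(0, x) to every entry."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2, stride=1):
    """Take the maximum over each size x size window, no padding."""
    out = []
    for r in range(0, len(fmap) - size + 1, stride):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, stride):
            row.append(max(fmap[r + i][c + j]
                           for i in range(size) for j in range(size)))
        out.append(row)
    return out


I = [[5, 0, 4],
     [3, 8, 0],
     [1, 0, 2]]
F = [[0, 3],
     [0, 0]]

feature_map = conv2d_valid(I, F)  # 2 x 2 map: [[0, 12], [24, 0]]
activated = relu(feature_map)     # unchanged, since no entry is negative
output = max_pool(activated)      # 1 x 1 map: [[24]]
print(feature_map, activated, output)
```

Under this assumption, the only nonzero weight in F (the 3 in the top-right corner) multiplies the top-right pixel of each 2 x 2 window, so the feature map is [[0, 12], [24, 0]] and the pooled output is 24. If Conv is instead read as a true convolution with a flipped filter, the feature map becomes [[9, 24], [3, 0]] and the pooled output is still 24.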