Consider the neural network shown below, with two inputs x1, x2, one output y, and one hidden layer with three neurons. The weights are as shown in the diagram; assume that all biases are zero and that the sigmoid function is used as the activation throughout.
[Figure: network diagram (inputs x1, x2; weights W1, W2; output y)]
(a) By representing the weight parameters in matrix form, write down matrix-vector expressions for a1, the output of the hidden layer, and y, the network output. Here W1 and W2 are the weight matrices between the inputs and the hidden layer, and between the hidden layer and the output, respectively.
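As a sketch of the form such expressions usually take, assuming column-vector inputs, zero biases, and an elementwise sigmoid $\sigma$ (the shapes follow from two inputs, three hidden neurons, and one output; the row/column convention is an assumption, not something the question fixes):

```latex
x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad
W_1 \in \mathbb{R}^{3 \times 2}, \qquad
W_2 \in \mathbb{R}^{1 \times 3}

a_1 = \sigma(W_1 x) \in \mathbb{R}^{3}, \qquad
y = \sigma(W_2 a_1) \in \mathbb{R}, \qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```

With the opposite convention (row-vector inputs), the matrices would be transposed and the products written as $x W_1^{T}$ and $a_1 W_2^{T}$.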
(c) Let W1 = 0.4 0.5 and W2 = 0.8 0.6 0.9 0.7 0.1. Calculate the output values at each node in the hidden layer and at the output y for input values x1 = 0, x2 = 1.
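The forward pass asked for here can be sketched in a few lines of plain Python. This is an illustration only: the weights below are placeholder values chosen for the example, not the values from the question, and the helper names (`sigmoid`, `matvec`, `forward`) are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def forward(W1, W2, x):
    """Return (a1, y) for a 2-3-1 network with zero biases."""
    a1 = [sigmoid(z) for z in matvec(W1, x)]          # hidden-layer outputs
    y = sigmoid(sum(w * a for w, a in zip(W2, a1)))   # network output
    return a1, y


# ASSUMPTION: placeholder weights, not the ones given in the question.
W1 = [[0.1, 0.2],
      [0.3, 0.4],
      [0.5, 0.6]]      # 3 x 2: inputs -> hidden layer
W2 = [0.7, 0.8, 0.9]   # 1 x 3: hidden layer -> output

a1, y = forward(W1, W2, [0.0, 1.0])
print("hidden outputs:", a1)
print("network output:", y)
```

Because x1 = 0, only the second column of `W1` contributes to the hidden pre-activations for this input.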
(d) The mean-squared error loss function for N examples is defined as
$$C = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$

where $y_i$ is the network output at the output node for example $i$, and $\hat{y}_i$ is the target output label for that example. For input x1 = 0, x2 = 1 and target output 1, compute the updated network weights by performing one step of gradient descent. Show all steps in your calculation.
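One gradient-descent step for a single example can be sketched as follows, using the chain rule through the sigmoid (whose derivative is a(1 - a)). Several things here are assumptions rather than part of the question: the weights are placeholders, the learning rate `lr = 0.1` is arbitrary since the question does not state one, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(W1, W2, x):
    """Forward pass of a 2-3-1 sigmoid network with zero biases."""
    a1 = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sigmoid(sum(w * a for w, a in zip(W2, a1)))
    return a1, y


def gradients(W1, W2, x, target):
    """Gradients of C = (y - target)^2 with respect to W1 and W2."""
    a1, y = forward(W1, W2, x)
    # dC/dz_out = dC/dy * dy/dz_out = 2 (y - target) * y (1 - y)
    delta_out = 2.0 * (y - target) * y * (1.0 - y)
    grad_W2 = [delta_out * a for a in a1]
    # Propagate back through W2 and the hidden sigmoids.
    delta_hidden = [delta_out * w * a * (1.0 - a) for w, a in zip(W2, a1)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_W1, grad_W2


def gradient_step(W1, W2, x, target, lr):
    """Return updated weights after one step W <- W - lr * dC/dW."""
    grad_W1, grad_W2 = gradients(W1, W2, x, target)
    new_W1 = [[w - lr * g for w, g in zip(w_row, g_row)]
              for w_row, g_row in zip(W1, grad_W1)]
    new_W2 = [w - lr * g for w, g in zip(W2, grad_W2)]
    return new_W1, new_W2


# ASSUMPTIONS: placeholder weights and an arbitrary learning rate.
W1 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
W2 = [0.7, 0.8, 0.9]
x, target, lr = [0.0, 1.0], 1.0, 0.1

new_W1, new_W2 = gradient_step(W1, W2, x, target, lr)
print("updated W1:", new_W1)
print("updated W2:", new_W2)
```

Since x1 = 0, every gradient entry in the first column of `W1` is zero, so only the second column of `W1` and all of `W2` change in this step.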