Consider a 5-layer neural network with the following architecture:
$n^{[0]} = 1024, n^{[1]} = 256, n^{[2]} = 128, n^{[3]} = 32, n^{[4]} = 8, n^{[5]} = 3$
Which layer should be assigned the highest dropout probability when using dropout regularization?
3
1
4
2
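
One way to reason about the choices is to count the weights feeding each layer. This assumes the common heuristic that the layer with the largest weight matrix is the most prone to overfitting and so gets the strongest dropout; the question does not state that heuristic. The names `layer_sizes` and `weight_counts` are illustrative, and biases are ignored.

```python
# Layer sizes from the question: index 0 is the input, indices 1..5 are the layers.
layer_sizes = [1024, 256, 128, 32, 8, 3]


def weight_counts(sizes):
    """Map each layer index l to the number of entries in W[l], shape (n[l], n[l-1])."""
    return {l: sizes[l] * sizes[l - 1] for l in range(1, len(sizes))}


counts = weight_counts(layer_sizes)
for l, c in counts.items():
    print(f"layer {l}: W shape ({layer_sizes[l]}, {layer_sizes[l - 1]}), {c} weights")

# Under the assumed heuristic, the layer with the most weights is the
# candidate for the highest dropout probability.
largest = max(counts, key=counts.get)
print("largest weight matrix feeds layer", largest)
```

Under this heuristic, the later layers, whose weight matrices are far smaller, would get a lower dropout probability or none at all.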