Problem discussed: Width (number of neurons in a layer) vs. Depth (number of layers) in a deep neural network.

Architecture: The authors propose an architecture that uses the residual blocks introduced by He et al. [1], but instead of going deeper they experiment with increasing the width of the network. They use dropout to regularize the wider networks. For comparison, they experiment with architectures of different depths and widths but the same number of parameters, and show that the wider version achieves better accuracy on the CIFAR-10 and CIFAR-100 classification tasks. Test error on CIFAR-10 and CIFAR-100 is shown in the table below.


Wide ResNet 40-4 (WRN-40-4) means the network is 40 layers deep and its convolutional layers are 4 times wider than the baseline ResNet.
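To see how depth n and widening factor k trade off against each other, here is a rough parameter-count sketch. It assumes the three-group WRN layout from the paper [2] (group widths 16k, 32k, 64k, with (n-4)/6 two-convolution blocks per group) and counts only the 3x3 convolution weights, ignoring batch norm, shortcut projections, and the final classifier:

```python
def wrn_conv_params(depth, k):
    """Approximate 3x3-conv parameter count of a WRN-depth-k.

    Counts the stem conv plus the two 3x3 convs in each residual
    block; shortcut/BN/FC parameters are ignored for simplicity.
    """
    assert (depth - 4) % 6 == 0, "WRN depth must satisfy n = 6N + 4"
    blocks_per_group = (depth - 4) // 6

    stem_width = 16
    group_widths = [16 * k, 32 * k, 64 * k]

    params = 3 * 3 * 3 * stem_width  # stem: 3x3 conv, RGB -> 16 channels
    in_ch = stem_width
    for width in group_widths:
        for _ in range(blocks_per_group):
            params += 3 * 3 * in_ch * width   # first conv of the block
            params += 3 * 3 * width * width   # second conv of the block
            in_ch = width
    return params

narrow = wrn_conv_params(40, 1)   # roughly 0.56M conv weights
wide = wrn_conv_params(40, 2)     # roughly 2.2M conv weights
print(narrow, wide, wide / narrow)
```

Widening every layer by k multiplies the per-block cost by roughly k^2 (here the ratio is about 4 for k = 2), which is why a much shallower network can match a deep one in parameter count.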

The code is available here.


[1] Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

[2] Wide Residual Networks. Sergey Zagoruyko, Nikos Komodakis.