Plain Network vs. Residual Network
ImageNet Classification
The ImageNet 2012 classification dataset consists of 1000 classes. The models are trained on the 1.28 million training images and evaluated on the 50k validation images.
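The tables in this section report top-1 error on those validation images, that is, the percentage of images whose highest-scoring class is not the true label. A minimal sketch of that metric is shown here; the function name `top1_error` and the toy scores are made up for illustration and are not taken from the paper.

```python
def top1_error(scores, labels):
    """Return the top-1 error in percent.

    scores: list of per-image score lists (one score per class).
    labels: list of ground-truth class indices.
    """
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    wrong = 0
    for row, label in zip(scores, labels):
        # The prediction is the class with the highest score.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label:
            wrong += 1
    return 100.0 * wrong / len(labels)


# Toy example (hypothetical data): 4 images, 3 classes instead of 1000.
toy_scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.2, 0.3, 0.5],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
toy_labels = [1, 0, 1, 0]
print(top1_error(toy_scores, toy_labels))  # 25.0
```

Only the third toy image is misclassified, so the error is 1 out of 4, or 25%.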
Plain Network
The plain network is a simple stack of convolutional layers without residual connections. The details of the plain network are as follows:
The authors first evaluated 18-layer and 34-layer plain nets. The results are as follows:
Model | Top-1 Error (%) |
---|---|
18-layer plain net | 27.94 |
34-layer plain net | 28.54 |
The deeper 34-layer plain net has a higher error rate than the shallower 18-layer plain net (28.54% vs. 27.94%), so adding layers made the plain network worse rather than better.
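To make the depths concrete, the layer counts can be reproduced with a little arithmetic. The sketch below assumes the stage layout given in Table 1 of the ResNet paper (not restated in these notes): one initial 7×7 convolution, four stages of blocks with two 3×3 convolutions each, and one final fully connected layer. The names `STAGE_BLOCKS` and `count_weighted_layers` are hypothetical.

```python
# Number of two-convolution blocks in each of the four stages.
# Assumption: values taken from Table 1 of the ResNet paper.
STAGE_BLOCKS = {
    18: [2, 2, 2, 2],
    34: [3, 4, 6, 3],
}


def count_weighted_layers(blocks_per_stage, convs_per_block=2):
    # One initial convolution + the stacked 3x3 convolutions + one final FC layer.
    return 1 + convs_per_block * sum(blocks_per_stage) + 1


for depth, blocks in STAGE_BLOCKS.items():
    print(depth, count_weighted_layers(blocks))
```

Both networks share the same first and last layers; the 34-layer variant simply stacks twice as many 3×3 convolutions in between (32 instead of 16).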
Residual Network
The residual network has basically the same architecture as the plain network, but with a shortcut (residual) connection added around each pair of 3×3 filters, as shown in the previous picture. The results are as follows:
Model | Top-1 Error (%) |
---|---|
18-layer ResNet | 27.88 |
34-layer ResNet | 25.03 |
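The 34-layer ResNet has a lower error rate than the 18-layer ResNet (25.03% vs. 27.88%), so the extra depth now helps. The authors attribute this to the residual connections: when an identity mapping is what a block needs, the network only has to push the residual toward zero instead of fitting an identity function with a stack of nonlinear layers, which makes the deeper network easier to optimize.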
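The identity argument can be illustrated with a toy block. This is only a sketch under simplifying assumptions: small dense layers stand in for the 3×3 convolutions, batch normalization is omitted, and the names `plain_block` and `residual_block` are made up for this example. With all weights set to zero, the plain block outputs zeros, while the residual block passes its input through unchanged.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def linear(weights, v):
    # weights is a square matrix stored as a list of rows.
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def plain_block(w1, w2, x):
    # Two stacked weight layers with no shortcut: y = relu(W2 relu(W1 x)).
    return relu(linear(w2, relu(linear(w1, x))))


def residual_block(w1, w2, x):
    # Same two layers plus an identity shortcut: y = relu(F(x) + x).
    f = linear(w2, relu(linear(w1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])


zeros = [[0.0, 0.0, 0.0] for _ in range(3)]
x = [0.5, 1.0, 2.0]  # non-negative, like the output of a previous ReLU

print(plain_block(zeros, zeros, x))     # [0.0, 0.0, 0.0]
print(residual_block(zeros, zeros, x))  # [0.5, 1.0, 2.0]
```

In this toy setting the residual block reaches the identity mapping with zero weights, whereas the plain block would need non-trivial weights to reproduce its input.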
Comparison
In the picture above, thin curves denote training error and bold curves denote validation error on the center crops. For the plain network, the 34-layer model has a higher error than the 18-layer model. For the residual network the situation is reversed: the 34-layer model has a lower error than the 18-layer model. Finally, the 34-layer ResNet has a lower error than the 34-layer plain network, which suggests that adding residual connections improves accuracy at this depth.
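The size of each gap can be read off the two tables in this section. The short sketch below does the subtraction using only the four top-1 error values quoted there; the variable names are chosen for this example.

```python
# Top-1 error (%) copied from the two tables in this section.
top1_error_pct = {
    ("plain", 18): 27.94,
    ("plain", 34): 28.54,
    ("resnet", 18): 27.88,
    ("resnet", 34): 25.03,
}


def gap(a, b):
    # Difference in percentage points, rounded to hide floating-point noise.
    return round(top1_error_pct[a] - top1_error_pct[b], 2)


plain_depth_penalty = gap(("plain", 34), ("plain", 18))    # 0.6
resnet_depth_gain = gap(("resnet", 18), ("resnet", 34))    # 2.85
residual_gain_at_34 = gap(("plain", 34), ("resnet", 34))   # 3.51
residual_gain_at_18 = gap(("plain", 18), ("resnet", 18))   # 0.06

print(plain_depth_penalty, resnet_depth_gain,
      residual_gain_at_34, residual_gain_at_18)
```

Going from 18 to 34 layers costs the plain network 0.60 points but gains the ResNet 2.85 points. At 34 layers the residual connections are worth 3.51 points, while at 18 layers the two networks differ by only 0.06 points.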