ResNet50: bottom blob of expand layer #18
Hi!
@MarcelSimon Sorry, I didn't make it clear. I mean in the prototxt cnn-models/ResNet_preact/ResNet50_cvgj/train.prototxt, lines 295-318, which define the first expanding layer. Yours is expanded from "layer_64_1_conv1". In He's implementation, deep-residual-networks/prototxt/ResNet-50-deploy.prototxt (I cannot find a train.prototxt), lines 60-72 define the layer "res2a_branch1" (which corresponds to your expand layer; both use a 1x1 convolution to increase the number of channels), and its bottom is bottom: "pool1", which means he expands from the previous pooling layer. It is the same for all expanding layers. I think it is quite a big difference.
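For reference, here is a rough side-by-side of the two definitions being discussed (abridged sketches, not exact copies of either file; the layer names and bottoms follow the quoted prototxts, the remaining parameters are illustrative):

```
# This repository (ResNet50_cvgj/train.prototxt): the expand/projection
# convolution takes the output of the first 1x1 convolution as its bottom.
layer {
  name: "layer_64_1_conv_expand"
  type: "Convolution"
  bottom: "layer_64_1_conv1"   # expands from the preceding convolution
  top: "layer_64_1_conv_expand"
  # convolution_param omitted
}

# He et al. (ResNet-50-deploy.prototxt): the corresponding projection
# convolution takes the pooling layer output as its bottom.
layer {
  name: "res2a_branch1"
  type: "Convolution"
  bottom: "pool1"              # expands from the previous pooling layer
  top: "res2a_branch1"
  convolution_param {
    num_output: 256            # 1x1 convolution increasing the channel count
    kernel_size: 1
    stride: 1
    bias_term: false
  }
}
```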
I see, thanks a lot for pointing that out! The difference occurs only at the first expand layer, the other ones are correct.
As you mentioned sharing the batch norm and scale layers, it reminds me of another difference between your implementation and He's. If you check their implementation: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006 (graph).
The implementation you are referring to is the original ResNet, not the pre-activation variant. Please see https://github.com/facebook/fb.resnet.torch/blob/master/models/preresnet.lua and https://github.com/KaimingHe/resnet-1k-layers/blob/master/resnet-pre-act.lua#L63 for the pre-activation variant.
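For anyone comparing the two variants: in the pre-activation formulation, batch normalization and the nonlinearity come before each convolution inside the residual branch, rather than after it. A minimal sketch in Caffe prototxt (layer names and parameters here are illustrative, not taken from either network):

```
# Pre-activation ordering: BatchNorm -> Scale -> ReLU -> Convolution
layer { name: "block_bn"    type: "BatchNorm"   bottom: "input"    top: "block_bn" }
layer { name: "block_scale" type: "Scale"       bottom: "block_bn" top: "block_bn"
        scale_param { bias_term: true } }
layer { name: "block_relu"  type: "ReLU"        bottom: "block_bn" top: "block_bn" }
layer { name: "block_conv"  type: "Convolution" bottom: "block_bn" top: "block_conv"
        convolution_param { num_output: 64 kernel_size: 3 pad: 1 bias_term: false } }

# The original (non-pre-activation) ResNet instead orders each branch as
# Convolution -> BatchNorm -> Scale -> ReLU.
```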
I see. Thanks a lot @MarcelSimon
Hi there,
Thanks for sharing the pre-trained models.
I am studying ResNet-50 and have a question about the architecture. It seems that quite a few places differ from the original ResNet.
However, I noticed another main difference in the expanding convolution layer. For example, the first one:
```
layer {
  name: "layer_64_1_conv_expand"
  type: "Convolution"
  bottom: "layer_64_1_conv1"
  top: "layer_64_1_conv_expand"
  .......
```
It shows that the bottom blob comes from "layer_64_1_conv1", whereas it was "conv1_pool" in the original architecture. Is this a modification? Your results show that you consistently improve accuracy compared to the original implementation; is this the reason?