ResNet50: bottom blob of expand layer #18
Hi!
@MarcelSimon Sorry, I didn't make it clear. I mean in the prototxt cnn-models/ResNet_preact/ResNet50_cvgj/train.prototxt, lines 295-318, which define the first expanding layer. Yours is expanded from "layer_64_1_conv1". In He's implementation, deep-residual-networks/prototxt/ResNet-50-deploy.prototxt (I cannot find a train.prototxt), lines 60-72 define the layer "res2a_branch1" (which corresponds to your expand layer; both use a 1x1 convolution to increase the number of channels), and its bottom is bottom: "pool1", which means he expands from the previous pooling layer. It is the same for all expanding layers. I think it is quite a big difference.
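For reference, here is a rough side-by-side of the two definitions being discussed (abridged sketches, not exact copies of either file; the layer names and bottoms follow the quoted prototxts, the remaining parameters are illustrative):

```
# This repository (ResNet50_cvgj/train.prototxt): the expand/projection
# convolution takes the output of the first 1x1 convolution as its bottom.
layer {
  name: "layer_64_1_conv_expand"
  type: "Convolution"
  bottom: "layer_64_1_conv1"   # expands from the preceding convolution
  top: "layer_64_1_conv_expand"
  # convolution_param omitted
}

# He et al. (ResNet-50-deploy.prototxt): the corresponding projection
# convolution takes the pooling layer output as its bottom.
layer {
  name: "res2a_branch1"
  type: "Convolution"
  bottom: "pool1"              # expands from the previous pooling layer
  top: "res2a_branch1"
  convolution_param {
    num_output: 256            # 1x1 convolution increasing the channel count
    kernel_size: 1
    stride: 1
    bias_term: false
  }
}
```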
I see, thanks a lot for pointing that out! The difference occurs only at the first expand layer, the other ones are correct.
As you mentioned sharing the batch norm and scale layers, it reminds me of another difference between your implementation and He's. If you check their implementation: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006 (graph).
The implementation you are referring to is the original ResNet, not the pre-activation variant. Please see https://github.com/facebook/fb.resnet.torch/blob/master/models/preresnet.lua and https://github.com/KaimingHe/resnet-1k-layers/blob/master/resnet-pre-act.lua#L63 for the pre-activation variant.
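For anyone comparing the two variants: in the pre-activation formulation, batch normalization and the nonlinearity come before each convolution inside the residual branch, rather than after it. A minimal sketch in Caffe prototxt (layer names and parameters here are illustrative, not taken from either network):

```
# Pre-activation ordering: BatchNorm -> Scale -> ReLU -> Convolution
layer { name: "block_bn"    type: "BatchNorm"   bottom: "input"    top: "block_bn" }
layer { name: "block_scale" type: "Scale"       bottom: "block_bn" top: "block_bn"
        scale_param { bias_term: true } }
layer { name: "block_relu"  type: "ReLU"        bottom: "block_bn" top: "block_bn" }
layer { name: "block_conv"  type: "Convolution" bottom: "block_bn" top: "block_conv"
        convolution_param { num_output: 64 kernel_size: 3 pad: 1 bias_term: false } }

# The original (non-pre-activation) ResNet instead orders each branch as
# Convolution -> BatchNorm -> Scale -> ReLU.
```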
I see. Thanks a lot @MarcelSimon
Hi there,
Thanks for sharing the pre-trained models.
I am studying ResNet-50 and have a question about the architecture. It seems that quite a few places differ from the original ResNet.
However, I noticed another main difference in the expanding convolution layer. For example, the first one:
```
layer {
  name: "layer_64_1_conv_expand"
  type: "Convolution"
  bottom: "layer_64_1_conv1"
  top: "layer_64_1_conv_expand"
  .......
```
It shows that the bottom blob comes from "layer_64_1_conv1", whereas it was "conv1_pool" in the original architecture. Is this a modification? Your results show that you consistently improve accuracy compared to the original implementation; is this the reason?