I noticed that the DropPath class is implemented, but as far as I can tell the drop_path_rate parameter is never set to a value greater than 0.0 anywhere in the code. If the drop probability stays at its default of 0.0, what is the purpose of including this class in the implementation? Is there a scenario where drop_prob is expected to be greater than 0.0, or is the class intended for future use or customization? Could you clarify the reasoning behind its inclusion?
# in DSTformer.py
if self.st_mode == 'stage_st':
    x = x + self.drop_path(self.attn_s(self.norm1_s(x), sequence_length))
    x = x + self.drop_path(self.mlp_s(self.norm2_s(x)))
    x = x + self.drop_path(self.attn_t(self.norm1_t(x), sequence_length))
    x = x + self.drop_path(self.mlp_t(self.norm2_t(x)))
elif self.st_mode == 'stage_ts':
    x = x + self.drop_path(self.attn_t(self.norm1_t(x), sequence_length))
    x = x + self.drop_path(self.mlp_t(self.norm2_t(x)))
    x = x + self.drop_path(self.attn_s(self.norm1_s(x), sequence_length))
    x = x + self.drop_path(self.mlp_s(self.norm2_s(x)))
# but here the default is drop_path_rate: float = 0.0,
# and I can't find any code that sets this parameter to a value > 0.
# Why?
self.drop_path = DropPath(drop_path_rate) if drop_path_rate > 0.0 else nn.Identity()
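
For reference, this is my understanding of what a drop-path (stochastic depth) layer usually does when the rate is above 0, written as a framework-free sketch. The function name `drop_path`, its arguments, and the list-based batch are my own illustration and are not taken from this repository; the repository's DropPath class may differ in details.

```python
import random


def drop_path(samples, drop_prob=0.0, training=True, rng=random):
    """Zero the whole residual branch of a sample with probability drop_prob
    and rescale surviving samples by 1 / (1 - drop_prob)."""
    # With drop_prob == 0.0 (or at inference) nothing happens, which is
    # why the module can be swapped for nn.Identity() in that case.
    if drop_prob == 0.0 or not training:
        return [list(s) for s in samples]

    keep_prob = 1.0 - drop_prob
    out = []
    for s in samples:
        # One random decision per sample, not per element.
        keep = rng.random() < keep_prob
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([v * scale for v in s])
    return out


# Hypothetical branch outputs for a batch of two samples.
branch = [[1.0, 2.0], [3.0, 4.0]]
print(drop_path(branch, drop_prob=0.0))  # unchanged copy of the input
print(drop_path(branch, drop_prob=0.5, rng=random.Random(0)))
```

If that reading is right, then with drop_path_rate left at 0.0 every `self.drop_path(...)` call in the block above just returns its input, so the residual connections behave as plain `x = x + branch(x)`.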