Remove zero initialization of `to_attn_bias` weights, since for these `bias=False` #302
Line 8 of Algorithm 24 uses a `LinearNoBias` layer following a `LayerNorm` to merge the pairwise representations with the attention bias representations. However, the code also initializes the `LinearNoBias` weights with zeros, meaning both the weights and the biases of these modules are initially zero (the bias is null, since `bias=False`), which leads to these weights receiving no gradients throughout training.
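A minimal sketch of the pattern in question, assuming PyTorch and illustrative names (`to_attn_bias`, `dim_pairwise`, `heads` are not the repository's actual identifiers). It shows that zero-initializing the weight of a bias-free linear layer makes its output identically zero, and that the proposed fix is simply to keep the default weight initialization:

```python
import torch
from torch import nn

# Illustrative dimensions, not taken from the repository
dim_pairwise, heads = 128, 16

# Algorithm 24, line 8 pattern: LayerNorm followed by a bias-free Linear
# projecting pairwise representations into per-head attention biases
to_attn_bias = nn.Sequential(
    nn.LayerNorm(dim_pairwise),
    nn.Linear(dim_pairwise, heads, bias=False),  # LinearNoBias
)

# The problematic initialization zeroes the weight of a layer that has
# no bias term, so the module's output is identically zero
linear = to_attn_bias[1]
nn.init.zeros_(linear.weight)

pairwise = torch.randn(2, 32, 32, dim_pairwise)
assert to_attn_bias(pairwise).abs().sum() == 0  # every attention bias is zero

# The change proposed here: drop the zero init and keep PyTorch's
# default (Kaiming-uniform) weight initialization
linear.reset_parameters()
assert to_attn_bias(pairwise).abs().sum() > 0  # biases are now non-trivial
```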