- Generative pretraining on SMILES and amino acid sequences ✅
- Prove that learnt embeddings are meaningful ✅
- Determine which models work best for drug-target interaction (DTI) prediction (contrastive learning-based, cross-entropy-based, etc.)
- Interpretability (attention-based models, or adding an attention mechanism to the alignment of SMILES and amino acid embeddings)
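
To make the DTI comparison item concrete, here is a minimal sketch of the two loss families it names, written in plain Python so it runs without dependencies. Everything in it is an assumption for illustration, not the project's actual training code: the function names `info_nce_loss` and `bce_loss`, the dot-product scoring, the `temperature` value, and the convention that row `i` of the drug embeddings is the known partner of row `i` of the target embeddings.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(drug_embs, target_embs, temperature=0.1):
    """Contrastive (InfoNCE-style) loss over a batch of paired embeddings.

    Assumption: drug_embs[i] and target_embs[i] are a known interacting pair,
    and every other target in the batch is treated as a negative.
    """
    drugs = [normalize(d) for d in drug_embs]
    targets = [normalize(t) for t in target_embs]
    total = 0.0
    for i, d in enumerate(drugs):
        logits = [dot(d, t) / temperature for t in targets]
        # Numerically stable log-sum-exp over all targets in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / len(drugs)


def bce_loss(drug_embs, target_embs, labels):
    """Binary cross-entropy on explicit (drug, target, label) triples.

    Assumption: the interaction logit is the raw dot product of the two
    embeddings; labels are 1 (interacts) or 0 (does not interact).
    """
    total = 0.0
    for d, t, y in zip(drug_embs, target_embs, labels):
        score = dot(d, t)
        # Stable form of BCE with logits: max(s, 0) - s*y + log(1 + exp(-|s|)).
        total += max(score, 0.0) - score * y + math.log1p(math.exp(-abs(score)))
    return total / len(labels)
```

The practical difference is in the supervision each one needs: the contrastive loss only requires known positive pairs and draws negatives from the rest of the batch, while the cross-entropy loss needs an explicit label for every pair it scores.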