Accurate prediction of the oligomerization state of coiled-coil domains based on the representation of monomeric structures obtained with Colabfold
DC2_oligo is a program for predicting the oligomerization state of homooligomeric coiled-coil domains. It is based on a logistic regression model trained on the 217 embeddings generated by the AlphaFold2 multimer_v3 model for a single sequence.
Training requires a data frame (/tests/set5_homooligomers.csv) and the embeddings (.npy files) generated by colabfold with the --save-representations option. Only the embeddings are required for prediction.
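As a rough illustration of how such an .npy file can be inspected, the sketch below saves and reloads a synthetic array with NumPy and averages it over the residue axis to obtain a fixed-length vector. The array shape, the file name, and the mean-pooling step are assumptions made for illustration; they are not taken from the dc2_oligo code.

```python
import os
import tempfile

import numpy as np


def pool_embedding(path):
    """Load a per-residue embedding and average it over the residue axis."""
    emb = np.load(path)  # assumed shape: (n_residues, n_features)
    if emb.ndim != 2:
        raise ValueError(f"expected a 2D array, got shape {emb.shape}")
    return emb.mean(axis=0)  # fixed-length vector of size n_features


# Synthetic stand-in for a colabfold representation file (hypothetical shape).
rng = np.random.default_rng(0)
fake_embedding = rng.normal(size=(30, 8))

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_single_repr.npy")  # hypothetical name
    np.save(demo_path, fake_embedding)
    pooled = pool_embedding(demo_path)

print(pooled.shape)
```

Pooling over residues is only one way to turn a variable-length embedding into a fixed-size input for a classifier; the features actually used by DC2_oligo may differ.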
Example of running AlphaFold2:
colabfold_batch 0_monomer.csv . --num-models 5 --model-type alphafold2_multimer_v3 --num-recycle 5 --save-single-representations
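The input file passed to colabfold_batch above can be prepared with a few lines of Python. This is a minimal sketch, assuming the CSV layout with an `id,sequence` header that colabfold accepts; the helper name `write_colabfold_csv` and the record identifier are hypothetical, and the sequence is only an example.

```python
import csv
import os
import tempfile


def write_colabfold_csv(path, records):
    """Write (id, sequence) pairs to a CSV with an id,sequence header."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "sequence"])
        for seq_id, sequence in records:
            # Normalise whitespace and case before writing.
            writer.writerow([seq_id, sequence.strip().upper()])


# Illustrative record; the identifier is a placeholder.
records = [("example_cc", "RMKQLEDKVEELLSKNYHLENEVARLKKLVGER")]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "0_monomer.csv")
    write_colabfold_csv(csv_path, records)
    with open(csv_path, newline="") as handle:
        rows = list(csv.reader(handle))

print(rows)
```

Each row holds a single chain, which matches the single-sequence input described above.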
- Clone repository
git clone https://github.com/labstructbioinf/dc2_oligo
- Create a virtual environment (using conda, for example) and install the requirements
conda create -n dc2_oligo
cd dc2_oligo
pip install .
- Check if everything works with pytest
cd dc2_oligo
python -m pytest
python predict.py --cf_results DIR --save_csv STR
Argument | Description |
---|---|
--cf_results | Colabfold output directory with saved embeddings via --save-representations option |
--save_csv | Save csv with input filename (optional) |
python predict.py --cf_results tests/data/0 --save_csv testoutput.csv
For best results, enter sequences that contain only the coiled-coil domain. You can easily identify such a domain with the DeepCoil predictor. Please use the AlphaFold2 multimer embeddings (alphafold2_multimer_v3).
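Trimming a sequence to its coiled-coil region can be sketched as follows, assuming a per-residue coiled-coil probability is already available for each position (for example from a predictor such as DeepCoil). The helper `longest_segment`, the 0.5 threshold, and the scores below are hypothetical; the sketch does not call DeepCoil itself.

```python
def longest_segment(scores, threshold=0.5):
    """Return (start, end) of the longest run with score >= threshold.

    The end index is exclusive; (0, 0) means no residue passed the threshold.
    """
    best = (0, 0)
    start = None
    # A sentinel below any threshold closes a run that reaches the last residue.
    for i, score in enumerate(list(scores) + [float("-inf")]):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


# Synthetic example: flanking residues with low scores around a high-scoring core.
n_term, core, c_term = "GSGS", "LEELLKKLAELLKK", "GPG"
sequence = n_term + core + c_term
scores = [0.1] * len(n_term) + [0.9] * len(core) + [0.2] * len(c_term)

start, end = longest_segment(scores)
trimmed = sequence[start:end]
print(start, end, trimmed)
```

Keeping only the longest segment is a simplification; sequences with several separate coiled-coil regions would need to be split and submitted individually.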