
Are the downloadable finetuned weights for secondary structure prediction intra- or inter-family trained? #16

shanry opened this issue Feb 24, 2025 · 7 comments



shanry commented Feb 24, 2025

I noticed that the split in the archiveII dataset (fam-fold) is intra-family based. However, the paper claims that the fine-tuning on archiveII used an inter-family split.

Could the authors clarify how the downloadable weights were fine-tuned?


RJPenic commented Feb 24, 2025

Hello 😄,
As stated in the paper: "The dataset of 3865 RNAs from nine families was split nine times, and in each split, a different family was held out for evaluation while the other eight families were used for training and validation."

I just tried out the weights that were used for the SRP family evaluation and the results are as expected. Are you sure you are using the right weights? The file rinalmo_giga_ss_archiveII-srp_ft.pt contains the weights of the model trained (and validated) on all families except SRP.
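
For anyone trying to follow along, a minimal sketch of the leave-one-family-out split described above could look like the following. The family names and the records_by_family mapping are illustrative placeholders, not the repository's actual data-loading code:

```python
# Minimal sketch of the inter-family ("leave one family out") split described above.
# Family names and the records_by_family mapping are placeholders.
FAMILIES = ["5s", "srp", "tRNA", "tmRNA", "RNaseP", "grp1", "16s", "23s", "telomerase"]

def leave_one_family_out(records_by_family):
    """Yield (held_out_family, train_records, test_records) for each of the nine splits."""
    for held_out in FAMILIES:
        test = records_by_family[held_out]
        train = [rec for fam in FAMILIES if fam != held_out
                     for rec in records_by_family[fam]]
        yield held_out, train, test
```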


shanry commented Feb 26, 2025

Thank you for your clarification! I’m now able to obtain the expected results for SRP using rinalmo_giga_ss_archiveII-srp_ft.pt.

If I want to replicate the fine-tuning process as well, do I need to reorganize the folder hierarchy so that it follows a "one family vs. all other families" structure? Could you confirm whether this is the correct approach?


shanry commented Feb 27, 2025

I did use the default data splits to fine-tune the pre-trained model (it seems the files are already organized in an inter-family format). However, the test results are far from the expected values. Could you please check whether the hyperparameters are properly set?

I was using the command: /nfs/hpc/share/zhoutian/repos/RiNALMo/train_sec_struct_prediction.py ./ss_data2 --pretrained_rinalmo_weights ./weights/rinalmo_giga_pretrained.pt --output_dir myft_5s --dataset archiveII_5s --accelerator gpu --max_epochs 15 --wandb --ft_schedule ft_schedules/giga_sec_struct_ft.yaml

The summary metrics are:
{
"_runtime": 12287.808371543884,
"_step": 6562,
"_timestamp": 1740607260.1589224,
"_wandb.runtime": 12289,
"epoch": 15,
"lr-Adam": 0.00008306000000000176,
"lr-Adam/pg1": 0.000050000000000001425,
"lr-Adam/pg2": 0.000050000000000001425,
"lr-Adam/pg3": 0.000050000000000001425,
"lr-Adam/pg4": 0.000050000000000001425,
"lr-Adam/pg5": 0.000050000000000001425,
"test/f1": 0.35909613966941833,
"test/loss": 0.029532011243427175,
"test/precision": 0.563310444355011,
"test/recall": 0.2690052092075348,
"train/loss": 0.0020109469678422976,
"trainer/global_step": 34860,
"val/f1": 0.9452812671661376,
"val/loss": 0.00011648952716895472,
"val/threshold": 0.12999999523162842
}

Notice that the validation F1 is very high while the test F1 is very low.


RJPenic commented Feb 28, 2025

I found a few discrepancies in the code compared to our internal version that caused the learning rate to be higher than expected. Thanks for pointing this out. High learning rates during fine-tuning tend to overwrite pre-trained knowledge of the LM. I pushed new changes to the main branch. Could you please pull the latest commit and try repeating the experiments? Results should now align with what we reported in the paper.

You can use this command to run the experiment: python train_sec_struct_prediction.py ./ss_data/ --pretrained_rinalmo_weights ./weights/rinalmo_giga_pretrained.pt --output_dir ./tmp_out --dataset archiveII_5s --accelerator gpu --devices 1 --max_epochs 15 --tune_threshold_every_n_epoch 15 --ft_schedule ft_schedules/giga_sec_struct_ft.yaml --precision bf16-mixed --num_workers 4
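
As a side note for anyone reproducing this, the lr-Adam/pg1 ... pg5 entries in the metrics above suggest the fine-tuning learning rate is controlled per parameter group. A rough sketch of that idea is shown below; build_optimizer, backbone, and head are hypothetical names, and this is not the repository's actual ft_schedule mechanism:

```python
# Illustrative only: keep the learning rate small for the pre-trained LM so that
# fine-tuning does not overwrite its pre-trained knowledge, while allowing a
# larger rate for the freshly initialized prediction head. Names are hypothetical.
import torch

def build_optimizer(backbone, head, backbone_lr=5e-5, head_lr=1e-4):
    param_groups = [
        {"params": backbone.parameters(), "lr": backbone_lr},  # pre-trained LM
        {"params": head.parameters(), "lr": head_lr},          # new SS prediction head
    ]
    return torch.optim.Adam(param_groups)
```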


shanry commented Mar 1, 2025

Thank you for double-checking the hyperparameters and updating the code—I really appreciate it. I pulled the latest version and fine-tuned the model on three datasets: archiveII_5s, archiveII_srp, and bpRNA. Here are the test metrics for reference:

5s: F1 = 0.860, Precision = 0.969, Recall = 0.780
srp: F1 = 0.695, Precision = 0.781, Recall = 0.649
bpRNA: F1 = 0.742, Precision = 0.777, Recall = 0.726
The results for srp and bpRNA align well with the paper, though there is a gap in the F1 score for the 5s family.

Before proceeding with fine-tuning on other archiveII families, I wanted to check if there’s anything I might have overlooked. For instance, I noticed that the default batch size is set to 1, which is uncommon in machine learning. Was this the batch size used in the paper?

Looking forward to your thoughts!


RJPenic commented Mar 1, 2025

Yes, the batch size was set to one in the paper as well. While it is true that such a batch size is uncommon in machine learning, it isn't that unusual for DL-based SS prediction (for example, MXfold2 and UFold also had their training batch sizes set to one). The main problem is that for SS prediction you usually need to featurize/model all possible nucleotide pairings, which leads to quadratic memory complexity. As we conducted the fine-tuning experiments on somewhat "weaker" GPUs (12-16 GB) compared to the GPUs we used for pre-training (A100, ~80 GB), we set the batch size to one to be able to process somewhat longer sequences during training.
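
To make the quadratic-memory point concrete, here is a back-of-the-envelope sketch. The sequence length, embedding dimension, and concatenation-style pair featurization are assumptions for illustration, not the exact featurization used in the repository:

```python
# Rough estimate of why modelling all nucleotide pairings is memory-hungry:
# with per-nucleotide embeddings of shape (L, d), a pairwise representation of
# shape (L, L, 2*d) already reaches tens of GB for a single long sequence,
# which is why a batch size of 1 is used on 12-16 GB GPUs.
L, d = 1500, 1280                      # example sequence length and embedding dim
pair_bytes = L * L * (2 * d) * 4       # fp32 bytes for an (L, L, 2*d) pair tensor
print(f"{pair_bytes / 1e9:.1f} GB")    # ~23.0 GB for one sequence
```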

@fhidalgor

Hey Rafael - Thanks for the amazing work, I enjoyed reading your manuscript. I was wondering if it would be possible to make the secondary structure prediction fine-tuned weights for the 150M parameter model available. I only see the pre-trained ones. Thanks!
