In this repository you will find all methods, resources and scripts used and described in the following paper:
Sandin MM, Renaudie J, Suzuki N, Not F. Diversity and evolution of Radiolaria: Beyond the stars of the ocean. bioRxiv 2024.10.02.614131; doi: 10.1101/2024.10.02.614131
Briefly:
All publicly available environmental 18S rDNA sequences associated with Radiolaria (as of July 2020) were taxonomically curated as detailed in curation_pipeline.md and are publicly accessible in the Protist Ribosomal Reference (PR2) database (from v4.14.0). In addition, the near full-length rDNA sequences from Jamy et al. (2022) associated with Radiolaria were incorporated into our dataset.
The alignments folder contains:
- A concatenated alignment used to infer the main phylogeny of Radiolaria (trimmed).
- A raw alignment containing 18S rDNA sequences for future studies. This alignment was obtained with MAFFT (using the L-INS-i algorithm and 1000 refinement cycles) and, after trimming (with trimAl and a 5% gap threshold), was used to generate the concatenated alignment.
- A raw alignment containing 28S rDNA sequences for future studies. This alignment was obtained in the same way as the 18S alignment.
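
The alignment and trimming steps described above can be sketched as command lines assembled in Python. This is only an illustrative sketch, not the scripts used in the study: the file names are hypothetical, and mapping the "5% gap threshold" to trimAl's `-gt 0.05` option is an assumption based on the description above. The MAFFT flags `--localpair --maxiterate 1000` correspond to the L-INS-i strategy with 1000 refinement cycles.

```python
from __future__ import annotations

from pathlib import Path


def mafft_linsi_command(fasta_in: Path, max_iterate: int = 1000) -> list[str]:
    """Build a MAFFT L-INS-i command (local pairwise + iterative refinement).

    MAFFT writes the alignment to stdout, so the caller must redirect it.
    """
    return ["mafft", "--localpair", "--maxiterate", str(max_iterate), str(fasta_in)]


def trimal_command(aln_in: Path, aln_out: Path, gap_threshold: float = 0.05) -> list[str]:
    """Build a trimAl command using a gap threshold (-gt).

    Assumption: the 5% gap threshold is passed as `-gt 0.05`.
    """
    return ["trimal", "-in", str(aln_in), "-out", str(aln_out), "-gt", str(gap_threshold)]


# Hypothetical file names, used only for illustration.
raw_fasta = Path("radiolaria_18S.fasta")
raw_alignment = Path("radiolaria_18S_linsi.fasta")
trimmed_alignment = Path("radiolaria_18S_linsi_trim05.fasta")

mafft_cmd = mafft_linsi_command(raw_fasta)
trimal_cmd = trimal_command(raw_alignment, trimmed_alignment)

# Print the equivalent shell commands instead of running them,
# so the sketch works without MAFFT or trimAl installed.
print(" ".join(mafft_cmd), ">", raw_alignment)
print(" ".join(trimal_cmd))
```

To actually run the steps, each list can be passed to `subprocess.run(..., check=True)`, with the MAFFT call redirecting `stdout` to the raw alignment file.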
Here are the phylogenetic trees, in nexus format, from the different phylogenetic analyses: RAxML under the nucleotide substitution model GTR+CAT with 1000 bootstraps, RAxML-ng under the substitution model GTR+G, and a third approach implemented in IQ-TREE under the model GTR+F+R10 (selected by ModelFinder according to the Bayesian Information Criterion):
- all_filtered_align-linsi_trim05_raxmlCAT.tre
- all_filtered_align-linsi_trim05_raxml-ngGTRg.tre
- all_filtered_align-linsi_trim05_iqtreeGTRg.tre
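
For orientation, the three analyses listed above could be launched with command lines along the following lines. This is a hedged sketch rather than the exact invocations used in the study: the alignment file name, run names, random seed, executable names (`raxmlHPC-PTHREADS`, `raxml-ng`, `iqtree`) and the choice of RAxML's rapid bootstrap mode (`-f a`) are all assumptions; only the substitution models and the 1000 RAxML bootstraps come from the description above. The scripts folder of this repository remains the reference.

```python
from __future__ import annotations

# Hypothetical alignment file name, inferred loosely from the tree file names.
ALIGNMENT = "all_filtered_align-linsi_trim05.fasta"


def raxml_cat_command(msa: str, bootstraps: int = 1000, seed: int = 12345) -> list[str]:
    """RAxML under GTR+CAT with bootstrapping.

    Assumption: rapid bootstrap plus best-tree search (-f a); the seed is arbitrary.
    """
    return [
        "raxmlHPC-PTHREADS", "-f", "a", "-m", "GTRCAT",
        "-#", str(bootstraps), "-p", str(seed), "-x", str(seed),
        "-s", msa, "-n", "raxmlCAT",
    ]


def raxml_ng_command(msa: str) -> list[str]:
    """RAxML-NG tree search under GTR+G."""
    return ["raxml-ng", "--search", "--msa", msa, "--model", "GTR+G"]


def iqtree_command(msa: str) -> list[str]:
    """IQ-TREE under the fixed model GTR+F+R10."""
    return ["iqtree", "-s", msa, "-m", "GTR+F+R10"]


commands = {
    "raxml": raxml_cat_command(ALIGNMENT),
    "raxml-ng": raxml_ng_command(ALIGNMENT),
    "iqtree": iqtree_command(ALIGNMENT),
}

# Print the equivalent shell commands without executing anything.
for name, cmd in commands.items():
    print(f"{name}: {' '.join(cmd)}")
```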
The fossil-calibrated trees obtained with MCMCTree (from the PAML package) and with BEAST2 are also available:
- all_filtered_align-linsi_trim05_raxmlCAT_BEAST2.tre
- all_filtered_align-linsi_trim05_raxmlCAT_MCMCTree.tre
This folder contains the control and XML files needed to replicate the molecular clock analyses, as well as a TSV version of the calibration model and the metadata for the metabarcoding analyses.
Finally, this folder contains all scripts used in this study, organized by main analysis: phylogenetic, molecular clock, and metabarcoding.