Cocos and libcocos are a parallel implementation of the
Approximately Unbiased [1] test for phylogenetic tree selection.
The test is designed for any selection problem that uses the RELL bootstrap method [4], but the cocos binary currently supports only TREE-PUZZLE-style [2] per-site log-likelihood files (which are also generated by RAxML-NG [3]) for bootstrapping.
The library takes pre-parsed log-likelihood vectors as input and can therefore be used to apply the AU test to other inputs as well.
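To illustrate what "pre-parsed log-likelihood vectors" could look like, here is a minimal, dependency-free sketch of a parser for site-likelihood text. It assumes the commonly seen layout of a header line with the number of trees and the number of sites, followed by one line per tree holding a name and its per-site values. The names SiteLh and parse_site_lh are hypothetical and not part of the libcocos API.

```rust
/// Per-site log-likelihoods of one candidate tree (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct SiteLh {
    pub name: String,
    pub values: Vec<f64>,
}

/// Parses site-likelihood text, assuming the layout
/// "<trees> <sites>" followed by "<name> <lh_1> ... <lh_sites>" per tree.
pub fn parse_site_lh(text: &str) -> Result<Vec<SiteLh>, String> {
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());

    // Header line: number of trees, then number of sites.
    let header = lines.next().ok_or("empty input")?;
    let mut fields = header.split_whitespace();
    let n_trees: usize = fields
        .next()
        .and_then(|f| f.parse().ok())
        .ok_or("bad tree count in header")?;
    let n_sites: usize = fields
        .next()
        .and_then(|f| f.parse().ok())
        .ok_or("bad site count in header")?;

    let mut trees = Vec::with_capacity(n_trees);
    for line in lines {
        let mut fields = line.split_whitespace();
        let name = fields.next().ok_or("missing tree name")?.to_string();
        // Every remaining field must be a floating-point log-likelihood.
        let values = fields
            .map(|f| f.parse::<f64>().map_err(|e| format!("{}: {}", name, e)))
            .collect::<Result<Vec<f64>, String>>()?;
        if values.len() != n_sites {
            return Err(format!(
                "{}: expected {} values, found {}",
                name,
                n_sites,
                values.len()
            ));
        }
        trees.push(SiteLh { name, values });
    }
    if trees.len() != n_trees {
        return Err(format!("expected {} trees, found {}", n_trees, trees.len()));
    }
    Ok(trees)
}
```

The resulting vectors (one Vec<f64> per tree, all of equal length) are the kind of input a RELL-based test needs, whatever file format they originally came from.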
The library and CLI are still under development. Currently, they do not allow changing the scaling factors of the multiscale bootstrap process, nor the number of generated bootstrap replicates.
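For readers unfamiliar with the two parameters mentioned above, the following sketch shows where a scaling factor and a replicate count enter a RELL bootstrap. It is not the libcocos implementation: the function rell_bp, its parameters, and the small xorshift generator (a stand-in for a proper RNG) are assumptions made for illustration. The scale is taken to be the ratio of replicate length to original sequence length, and only bootstrap proportions (BP) are computed; the AU p-value fit across scales is omitted.

```rust
/// Minimal xorshift64* generator, used only to keep the sketch dependency-free.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must be non-zero.
        XorShift64(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        self.0.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Index in `0..n`; the slight modulo bias is ignored in this sketch.
    pub fn below(&mut self, n: usize) -> usize {
        ((self.next_u64() >> 33) % n as u64) as usize
    }
}

/// Scalar dot product of per-site log-likelihoods and resampling counts.
pub fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Bootstrap proportion of each tree at one scale (hypothetical signature).
/// `site_lh[t][s]` is the log-likelihood of tree `t` at site `s`.
pub fn rell_bp(
    site_lh: &[Vec<f64>],
    scale: f64,
    replicates: usize,
    rng: &mut XorShift64,
) -> Vec<f64> {
    assert!(!site_lh.is_empty() && replicates > 0);
    let n_sites = site_lh[0].len();
    assert!(n_sites > 0);

    // Multiscale bootstrap: each replicate draws round(scale * n_sites) sites.
    let draws = ((scale * n_sites as f64).round() as usize).max(1);
    let mut wins = vec![0usize; site_lh.len()];
    let mut counts = vec![0.0f64; n_sites];

    for _ in 0..replicates {
        // Resample sites with replacement, recording how often each is drawn.
        counts.iter_mut().for_each(|c| *c = 0.0);
        for _ in 0..draws {
            counts[rng.below(n_sites)] += 1.0;
        }
        // RELL: a tree's replicate log-likelihood is the dot product of its
        // per-site values with the counts, without re-estimating parameters.
        let mut best = (0usize, f64::NEG_INFINITY);
        for (i, lh) in site_lh.iter().enumerate() {
            let total = dot(lh, &counts);
            if total > best.1 {
                best = (i, total);
            }
        }
        wins[best.0] += 1;
    }
    // BP: the fraction of replicates in which each tree scored highest.
    wins.iter().map(|&w| w as f64 / replicates as f64).collect()
}
```

Calling rell_bp once per scale yields one BP vector per scale, which is the raw material a multiscale test works from. Expressing each replicate as a dot product with a count vector is also why a fast dot product matters for this kind of workload.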
The binary provides a complete Unix-style CLI. Use cocos --help to list all available command-line options and usage
information.
Basic usage example:
    cocos -i dataset.raxml.siteLH -o dataset.au.tsv

This generates BP values and AU p-values in dataset.au.tsv.
The random seed can be specified with -s SEED, and the number of threads with -t N.
If no thread number is given, cocos defaults to the number of available cores.
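The fallback described above can be sketched with the standard library alone. This is an assumption about how such a default might be resolved, not a description of the cocos source; resolve_threads is a hypothetical helper. Note that std::thread::available_parallelism reports the parallelism available to the process, which can differ from the physical core count (for example under CPU quotas).

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Resolves the worker-thread count: an explicit positive request wins,
/// otherwise fall back to the parallelism reported for this process.
pub fn resolve_threads(requested: Option<usize>) -> usize {
    match requested {
        Some(n) if n > 0 => n,
        // If the platform cannot report a value, run single-threaded.
        _ => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1),
    }
}
```

With a helper like this, resolve_threads(Some(4)) returns 4, while resolve_threads(None) returns whatever the platform reports, and at least 1.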
The libcocos crate provides the following crate-features:
- rayon: This adds parallel versions of the exposed functions built on top of Rayon's global thread pool.
- serde: This adds the Serialize and Deserialize traits to BpTable to allow serializing bootstrap results.
- simd: This replaces the scalar dot product with a portable_simd implementation (nightly only), which compiles to the target platform's preferred vector instructions.
The cocos crate uses simd and rayon, so compiling it requires the nightly toolchain.
[1]: Hidetoshi Shimodaira, An Approximately Unbiased Test of Phylogenetic Tree Selection, Systematic Biology, Volume 51, Issue 3, 1 May 2002, Pages 492–508, https://doi.org/10.1080/10635150290069913
[2]: Heiko A. Schmidt, Korbinian Strimmer, Martin Vingron, Arndt von Haeseler, TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing, Bioinformatics, Volume 18, Issue 3, March 2002, Pages 502–504, https://doi.org/10.1093/bioinformatics/18.3.502
[3]: Alexey M Kozlov, Diego Darriba, Tomáš Flouri, Benoit Morel, Alexandros Stamatakis, RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference, Bioinformatics, Volume 35, Issue 21, November 2019, Pages 4453–4455, https://doi.org/10.1093/bioinformatics/btz305
[4]: Kishino, H., Miyata, T. & Hasegawa, M. Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J Mol Evol 31, 151–160 (1990). https://doi.org/10.1007/BF02109483