Ribosomal proteins is a study and a general-purpose tool for analyzing ribosomal proteins and picking the best candidates for evolutionary studies.
Protein selection was based on the PhyloSift Reference Marker Genes, available here: https://phylosift.wordpress.com/tutorials/scripts-markers/
This project draws heavily on the findings reported by Zhu et al. (2019) in Nature Communications:
Phylogenomics of 10,575 genomes reveals evolutionary proximity between domains Bacteria and Archaea (2019). https://www.nature.com/articles/s41467-019-13443-4
Qiyun Zhu, Uyen Mai, Wayne Pfeiffer, Stefan Janssen, Francesco Asnicar, Jon G. Sanders, Pedro Belda-Ferre, Gabriel A. Al-Ghalith, Evguenia Kopylova, Daniel McDonald, Tomasz Kosciolek, John B. Yin, Shi Huang, Nimaichand Salam, Jian-Yu Jiao, Zijun Wu, Zhenjiang Z. Xu, Kalen Cantrell, Yimeng Yang, Erfan Sayyari, Maryam Rabiee, James T. Morton, Sheila Podell, Dan Knights, Wen-Jun Li, Curtis Huttenhower, Nicola Segata, Larry Smarr, Siavash Mirarab & Rob Knight
Paper's repository: https://biocore.github.io/wol/
The analysis plan and all stages of this work can be summed up in two main points:
- Construct a table of organisms (RefSeq; Bacteria, Archaea, Eukaryotes) by ribosomal protein: mark what is known and record the appropriate IDs (UniProt, RefSeq, PDB) and data.
- Analyze the outcomes to pick the best ribosomal proteins for in-depth structural studies (computational and molecular) from an evolutionary perspective.
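The two points above can be illustrated with a minimal, dependency-free sketch. It is not the notebook's actual code: the organism names, identifiers and the `build_table` and `coverage_by_protein` helpers are hypothetical placeholders, and ranking proteins by the fraction of organisms with a known identifier is only one assumed selection criterion.

```python
from collections import defaultdict

# Hypothetical records: (domain, organism, ribosomal protein, identifier or None).
# These rows are illustrative placeholders, not data from the project.
records = [
    ("Bacteria", "organism_A", "uL2", "ID_A1"),
    ("Bacteria", "organism_A", "uS3", "ID_A2"),
    ("Archaea", "organism_B", "uL2", "ID_B1"),
    ("Archaea", "organism_B", "uS3", None),  # protein listed, identifier unknown
    ("Eukaryota", "organism_C", "uL2", "ID_C1"),
]


def build_table(rows):
    """Return {organism: {protein: identifier}}, keeping only known entries."""
    table = defaultdict(dict)
    for _domain, organism, protein, identifier in rows:
        if identifier is not None:
            table[organism][protein] = identifier
    return dict(table)


def coverage_by_protein(rows):
    """Return {protein: fraction of organisms with a known identifier}."""
    organisms = {organism for _d, organism, _p, _i in rows}
    proteins = {protein for _d, _o, protein, _i in rows}
    table = build_table(rows)
    return {
        protein: sum(protein in table.get(org, {}) for org in organisms) / len(organisms)
        for protein in proteins
    }


coverage = coverage_by_protein(records)
# Assumed criterion: highest coverage first, ties broken alphabetically.
ranked = sorted(coverage, key=lambda p: (-coverage[p], p))
print(ranked)
```

With real data, the same organism-by-protein table could be held in a Pandas DataFrame, which makes filtering by domain and plotting coverage more convenient.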
The notebook contains step-by-step instructions, which provide explanatory information for each stage of the analysis.
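All steps of the project were performed in a Python 3 Jupyter Notebook inside a Conda environment.
Packages to install: Pandas, NumPy, Seaborn, Requests and Matplotlib. The io and json modules are also used, but they are part of the Python standard library and need no separate installation.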
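Before running the notebook, it may help to confirm that the environment provides the tools listed above. The following is a small sketch using only the standard library; the `REQUIRED` list and the `missing_modules` helper are illustrative names, not part of the project.

```python
import importlib.util

# Import names for the tools listed above; io and json ship with Python itself.
REQUIRED = ["pandas", "numpy", "seaborn", "requests", "matplotlib", "io", "json"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed into the active Conda environment before opening the notebook.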
Ribosomal proteins is an open-source project that anyone can use as a source, a starting point or a reference for their own work. Pull requests and other contributions from the community are welcome.