Do bioinformatics, not sysadmin work - run the playbook and get back to work!
We assume some familiarity with Ansible.
- Bring up a VM (AWS, OpenStack / NeCTAR, etc.)
- Add your ssh keys to `~/.ssh/authorized_keys` on the instance
- Edit the `hosts` file to add the target VM IP address, and edit the `group_vars/all` file to change:
  - `sudo_guy` to the username used to log into the remote machine
  - `main_guy` to a username that will be created for installing software (can be the same as `sudo_guy`)
- Run the playbook:
ansible-playbook -i hosts all.yml
bio-ansible is multi-potent: it can set up a whole fleet of servers with a bioinformatics (genomics) focus from scratch, or just install a handful of selected tools. A subset of the bio-ansible playbooks can be run as a non-privileged user, in particular if you are just installing bio-tools in your home directory on a shared system (e.g. HPC).
However you still might need to install some "common" dependencies and for that
you might need sudo. Also note that Ansible tasks are intended to be
‘idempotent’, meaning if you run them again, they will generally only make the
changes they must in order to bring the system to the desired state. This means
it is safe to rerun the same playbook multiple times.
These playbooks target Ubuntu 20.04 and 22.04 - they may work with small modifications on newer Ubuntu releases and other Debian-flavoured distros. YMMV.
-
mkdir ~/.virtualenvs
virtualenv -p python3 ~/.virtualenvs/ansible
source ~/.virtualenvs/ansible/bin/activate
pip3 install -U pip
# bio-ansible requires Ansible 8 (ansible-core 2.15.x), newer versions may work
pip3 install -U "ansible==8"
-
Clone the git repo:
git clone https://github.com/MonashBioinformaticsPlatform/bio-ansible.git
-
Install required Ansible Galaxy collections:
ansible-galaxy collection install community.crypto
-
Edit the `hosts` file to add the remote host IP addresses to the appropriate group. If running against remote host(s), set up your ssh keys and use `ssh-add` to add them to the local ssh-agent.
-
Edit the `group_vars/all` file to set your username as the `main_guy` variable (this is the username used to access the target host[s]).
-
Optional: Download any tar archives for non-FOSS software into `tarballs/` (or the path set in the `tarballs_path` variable) - see the section on manually downloading tarballs below.
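Once the setup steps above are done, it can be worth confirming that the active virtualenv provides the ansible-core series the playbooks expect. The sketch below is one way to do that. The function names `core_version` and `is_supported_core` are hypothetical helpers, not part of bio-ansible, and the parsing assumes the first line of `ansible --version` has the form `ansible [core 2.15.3]`.

```shell
# core_version: read `ansible --version` output on stdin and print the
# ansible-core version found on the first line.
# Assumes the first line looks like: ansible [core 2.15.3]
core_version() {
    sed -n '1s/.*\[core \([0-9][0-9.]*\).*/\1/p'
}

# is_supported_core VERSION: succeed if VERSION belongs to the 2.15.x series
# that this README says bio-ansible requires.
is_supported_core() {
    case "$1" in
        2.15|2.15.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_supported_core "$(ansible --version | core_version)" || echo "untested ansible-core version"` prints a warning when a different series is installed. Newer versions may still work, so treat this as a hint rather than a hard failure.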
Install many bioinformatics tools as 'modules'.
This is often possible as a non-privileged user without sudo.
The user defined in the main_guy variable is used:
ansible-playbook -i hosts bio.yml
Install system-wide dependencies and packages - sudo privilege is required:
ansible-playbook -i hosts common.yml
Interactive web-based services - sudo privilege is required:
ansible-playbook -i hosts common.yml
Or, if you want to try installing everything above in one go (sudo privilege is required on the target host[s]):
ansible-playbook -i hosts all.yml
Alternatively, you can install specific tools without running the whole playbook by using tags:
ansible-playbook -i hosts bio.yml --tags samtools,star,subread
You can see all available tags for a playbook with:
ansible-playbook bio.yml --list-tags
Protip: You can always add the -v or -vvv option for verbose mode to help diagnose failures.
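If you install tools by tag often, a small wrapper can save retyping the comma-separated list. This is a minimal sketch, not part of bio-ansible: `install_tools` and the `DRY_RUN` variable are hypothetical names, and the wrapper assumes it is run from the playbook directory with the `hosts` inventory used above.

```shell
# install_tools TAG [TAG...]: run bio.yml restricted to the given tags.
# Set DRY_RUN=1 to print the command instead of running it.
install_tools() {
    if [ "$#" -eq 0 ]; then
        echo "usage: install_tools TAG [TAG...]" >&2
        return 1
    fi
    # Join all arguments with commas, e.g. "samtools,star,subread"
    tags=$(IFS=,; printf '%s' "$*")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "ansible-playbook -i hosts bio.yml --tags $tags"
    else
        ansible-playbook -i hosts bio.yml --tags "$tags"
    fi
}
```

Running `DRY_RUN=1 install_tools samtools star subread` prints the same command shown in the tags example above, which is a cheap way to check the tag list before touching the target host.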
Some modules are installed via shpc, which formalizes wrapping up Singularity containers as LMOD modules.
Users can also install their own modules with a small amount of configuration. You can find many tools
pre-packaged for shpc at the shpc-registry.
Users should run:
shpc config inituser
# Create a directory for all user shpc containers and module definitions
mkdir $HOME/shpc
shpc config set container_base $HOME/shpc/containers
shpc config set module_base $HOME/shpc/modules
shpc config set views_base $HOME/shpc/views
# Make LMOD aware of the user's module definitions
module use $HOME/shpc/modules
# Make the MODULEPATH setting more permanent
echo -e '\nexport MODULEPATH=$HOME/shpc/modules:$MODULEPATH' >>~/.bashrc
See README.docker.md
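Note that the `echo ... >>~/.bashrc` step in the shpc setup appends a new line every time it is run. If users are likely to repeat the setup, a guard in the same spirit as Ansible's idempotent tasks avoids duplicate entries. The following is a sketch only; `ensure_line` is a hypothetical helper, not something shipped with bio-ansible or shpc.

```shell
# ensure_line FILE LINE: append LINE to FILE unless an identical line
# is already present, so repeated runs leave the file unchanged.
ensure_line() {
    touch "$1"
    # -x matches whole lines only, -F treats LINE as a fixed string
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}
```

It could be used as `ensure_line ~/.bashrc 'export MODULEPATH=$HOME/shpc/modules:$MODULEPATH'`, keeping the single quotes so that the variables are expanded when `~/.bashrc` is sourced rather than when the line is written.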
- Playbook breakdown
- Full list of available tags
- List of required dependencies
- List of python pip packages
- List of supported tarball packages
- To Do list
Because of licensing restrictions, some installation files need to be manually
downloaded into a 'tarballs' directory. By default this is tarballs in the playbook
base path - this location can be changed using the tarballs_path variable if required.
The playbooks will skip installation of those packages if they don't find
the archive files in that directory.
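Because missing archives are skipped rather than reported as errors, a quick pre-flight check can show which packages will actually be installed. The sketch below assumes you supply the archive names yourself; `check_tarballs` is a hypothetical helper and the file names in the usage note are placeholders, not the names the playbooks look for.

```shell
# check_tarballs DIR NAME [NAME...]: report which archives exist in DIR.
# Returns non-zero if at least one archive is missing.
check_tarballs() {
    dir="$1"
    shift
    status=0
    for name in "$@"; do
        if [ -f "$dir/$name" ]; then
            echo "found: $name"
        else
            echo "missing: $name"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_tarballs tarballs tool-a.tar.gz tool-b.tar.gz` (placeholder names) lists each archive as found or missing in the default tarballs directory. Pass a different first argument if you have changed tarballs_path.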
There are scripts to download various databases in scripts/. These have
deliberately not been added to ansible.