
A tutorial for Metagenome-Atlas given at ECCB


AIqbal94/Tutorial

 
 


Metagenome-Atlas Tutorial for ECCB

This is a tutorial for Metagenome-Atlas at ECCB. Metagenome-Atlas is an easy-to-use pipeline for analyzing metagenomic data. It handles all steps from quality control (QC) and assembly to binning and annotation.

⁉️ If you have any questions or run into errors, write to us.

Do you need some music to work? Have a look at this Spotify playlist for bioinformaticians.

Tutorial part 1: What can you get out of Atlas?

Analyze the output of Atlas

Figure: CheckM quality plot

Before installing a program, I usually want to make sure that it gives the output I need. Therefore, we start by analyzing the output of Metagenome-Atlas. In part 2 of the tutorial, you will learn how to run Metagenome-Atlas on some test data or on your own data.

Example output of Metagenome-Atlas

A real Metagenome-Atlas run can take more than a day. Therefore, we have prepared some output of a previous run for you in the Example folder.
This cool report✨ shows the most interesting output of Atlas.

Here are some questions that guide you through the Atlas output files:

  • The main goal of Atlas is to create metagenome-assembled genomes (MAGs). The genomes are in the folder genomes/genomes. How many are there?

  • The Taxonomy is based on the Genome Taxonomy Database (GTDB) which tries to be consistent with genome distances. The taxonomy can be found in Results/taxonomy.tsv. Which is the phylum with the most genomes? How many species are new/unnamed?

  • MAGs are usually not complete and can contain some contamination. Atlas estimates the quality of genomes using CheckM. Which is the species with the highest completeness and lowest contamination? (You can zoom in on the plot in the summary report.)

  • Quantification is based on mapping the reads to the genomes using bbmap. Do you think the mapping rate is good? If reads from the host (mouse) had been filtered out, do you think the mapping rate would be higher?

  • The output is based on 15 mouse fecal samples from the project PRJEB7759. Eight of the mice were fed a high-fat diet and became obese. Can you see the difference between the groups in the PCA?

  • We use the median coverage over the genomes to calculate the relative abundance. What is the most abundant species in these metagenomes?

  • Functional annotation is based on the output of EggNOG mapper for all the genes. Using the link between the genes and the genomes (genomes/gene2genome.tsv.gz), we can calculate which function is present in which genome. Finally, we can calculate the relative abundance of functions (Results/annotations).
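The first two questions above can also be answered with a few lines of Python. The following is a minimal sketch, not part of Atlas itself. It assumes that the genome files end in `.fasta` and that the taxonomy table is tab-separated with columns named `phylum` and `species`, where an empty species field marks a new or unnamed species. These names are assumptions, so check the header of your own file and adjust the parameters.

```python
import csv
from collections import Counter
from pathlib import Path


def count_genomes(genome_dir, suffix=".fasta"):
    """Count genome files in a folder (the file suffix is an assumption)."""
    return sum(1 for p in Path(genome_dir).iterdir() if p.name.endswith(suffix))


def summarize_taxonomy(taxonomy_tsv, phylum_col="phylum", species_col="species"):
    """Tally genomes per phylum and count genomes without a species name.

    The column names are assumptions; a missing column raises a KeyError
    so that a mismatch with the real table is noticed immediately.
    """
    phyla = Counter()
    unnamed = 0
    with open(taxonomy_tsv, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            phyla[row[phylum_col] or "unclassified"] += 1
            if not (row[species_col] or "").strip():
                unnamed += 1
    return phyla, unnamed


# Possible usage from inside the example output folder:
# print(count_genomes("genomes/genomes"))
# phyla, unnamed = summarize_taxonomy("Results/taxonomy.tsv")
# print(phyla.most_common(1), unnamed)
```

Using only the standard library keeps the sketch independent of the analysis environment; inside the notebook, a pandas `value_counts()` on the phylum column would do the same job.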

Metagenome-Atlas produces a lot of other outputs from the QC and assembly steps. They are summarized in reports such as these:

Run script for differential abundance analysis

We prepared a Jupyter notebook with the code for differential abundance analysis. The goal is to find out which changes are associated with high-fat-diet-induced obesity in mice. To analyze the data, we will install some Python packages.

Picture of obese mice

Setup

This part of the tutorial works on Linux and macOS, and maybe on Windows. You need to install the conda package manager. Download this repo with git, or download it as a zip archive and extract it.

git clone https://github.com/metagenome-atlas/Tutorial.git
cd Tutorial

The scripts use various Python packages for analysis and plotting. Set them up by running:

cd scripts
./setup.sh

This creates a conda environment so that the packages do not interfere with other software on your computer. Activate the environment, then start Jupyter:

conda activate analyze
jupyter lab

Click on Differential_abundance.ipynb and start the differential abundance analysis.

If something doesn't work, let us know. You can always have a look at the notebook to see what the output should look like.
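For orientation, the basic idea of such an analysis can be sketched in plain Python. This is an illustrative sketch only and not the code of the notebook: it turns median coverage per genome into relative abundance per sample and then compares two groups with a log2 fold change of the mean. The function names, the pseudocount, and the toy numbers are all assumptions made for illustration, and a real analysis would add a statistical test.

```python
import math
from statistics import mean


def relative_abundance(coverage):
    """Convert {sample: {genome: median coverage}} into per-sample fractions."""
    result = {}
    for sample, per_genome in coverage.items():
        total = sum(per_genome.values())
        result[sample] = {
            genome: (cov / total if total else 0.0)
            for genome, cov in per_genome.items()
        }
    return result


def log2_fold_change(rel_abund, group_a, group_b, pseudocount=1e-6):
    """Per genome: log2 of mean relative abundance in group_a over group_b.

    The pseudocount (an arbitrary small value) avoids division by zero
    for genomes that are absent from one of the groups.
    """
    genomes = sorted({g for sample in rel_abund.values() for g in sample})
    changes = {}
    for genome in genomes:
        a = mean(rel_abund[s].get(genome, 0.0) for s in group_a)
        b = mean(rel_abund[s].get(genome, 0.0) for s in group_b)
        changes[genome] = math.log2((a + pseudocount) / (b + pseudocount))
    return changes


# Made-up toy numbers, not taken from the tutorial data.
toy_coverage = {
    "hfd_1": {"MAG01": 30.0, "MAG02": 10.0},
    "chow_1": {"MAG01": 10.0, "MAG02": 30.0},
}
toy_rel = relative_abundance(toy_coverage)
toy_lfc = log2_fold_change(toy_rel, group_a=["hfd_1"], group_b=["chow_1"])
```

A positive value in `toy_lfc` means the genome is relatively more abundant in the first group. Relative abundances are compositional, so an increase in one genome lowers the fractions of all others; keep this in mind when interpreting fold changes.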

Tutorial part 2: Let's get serious!

Install and run Metagenome-Atlas

In the second part of the tutorial, you will install Metagenome-Atlas on your system and test it with a small dataset. Because a real metagenomic assembly can require more than 250 GB of RAM and multiple processors, you would ideally do this directly on a high-performance system, e.g. the cluster of your university. You can install Miniconda in your home directory if conda is not installed on your system.

Follow this link

If you only want to run the test dataset, you can do most steps on a normal laptop (Mac or Linux).
