From 46c8c10f14a8ff2f229b2dc5dbcf597e753adc35 Mon Sep 17 00:00:00 2001
From: Haris Zafeiropoulos
Date: Wed, 26 Jul 2023 13:52:57 +0200
Subject: [PATCH] add taxonomy data products; attempt for image

---
 docs/data_products.rst | 107 ++++++++++++++++++++++++++++++++++++++++-
 docs/usage.rst | 9 ++--
 2 files changed, 110 insertions(+), 6 deletions(-)

diff --git a/docs/data_products.rst b/docs/data_products.rst
index a75e8590..46e61e60 100644
--- a/docs/data_products.rst
+++ b/docs/data_products.rst
@@ -7,9 +7,9 @@ Description of ``metaGOflow``'s data products
 Quality filtering step
 -----------------------

-- ```*.fastq.trimmed.fasta`` **files**
+- ``*.fastq.trimmed.fasta`` **files**
 Filtered .fasta files of the forward (R1) and reverse (R2) reads. Its content strongly depends on the
-``fastp``-related `:doc:/args_and_params` parameters.
+``fastp``-related :doc:`/args_and_params` parameters.

 A record in a .fasta file consists of 2 parts: a *header* that always starts with a ``>``` and describes the sequence (experiment id, coordinates etc.) and the sequence.
 Example:
@@ -70,6 +70,109 @@ This file is necessary for running the `mOTUs package `_ is a widely supported binary format with native parsers available within many programming languages.
+
+
+
+
+- ``krona.html`` **files**
+
+
+A hierarchical visual component of the taxonomic profile based on the LSU and the SSU accordingly.
+
+
+.. image:: images/krona.png
+   :width: 850
+
+
+
+
+Gene prediction step
+--------------------
+
+
+Functional annotation step
+--------------------------
+
+
+Assembly step
+-------------
diff --git a/docs/usage.rst b/docs/usage.rst
index 83e383a7..38364411 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -12,8 +12,7 @@ Raw data
 The sequences file can be provided to ``metaGOflow`` directly or an ENA accession id of the run of intereste can be provided and ``metaGOflow`` will fetch the data automatically.
-
-Fill in the ``config.yml`` file and set the parameters as described in the :doc:`/args_and_params`.
+.. attention:: ``metaGOflow`` is not valid for the analysis of long reads samples

 Run ``metaGOflow``
@@ -23,6 +22,8 @@ Assuming ``metaGOflow`` is about to perform in a HPC environment where `Singular
 and that we have built a ``conda`` environment as shown in :doc:`/installation` let's break down how we would execute a run given the ``config.yml`` is set.

+About the ``config.yml`` file and how to set the parameters on it, you may see the :doc:`/args_and_params` section.
+
 .. code-block:: bash

@@ -117,11 +118,11 @@ In the same place, the output of the assembly step (``final.contigs.fa``) will b
    * - ``*.merged.fasta``
      - Merged filtered sequences
    * - ``*.merged.motus.tsv``
-     - Merged sequences MOTUs
+     - mOTUs along with their taxonomic assignment and their abundance
    * - ``*.merged.qc_summary``
      - Quality control (QC) summary of the merged sequences
    * - ``*.merged.unfiltered_fasta``
-     - Merged sequences that did not pass the filtering
+     - Merged sequences with clean headers
    * - ``fastp.html``
      - FASTP analysis of raw sequence data
    * - ``final.contigs.fa``