Merge pull request #43 from EBI-Metagenomics/dev
Dev -> main for release v1.3
mberacochea authored Jul 23, 2024
2 parents 3bfc983 + 5e2540e commit 5fe9266
Showing 11 changed files with 47 additions and 28 deletions.
2 changes: 1 addition & 1 deletion .nf-core.yml
@@ -2,14 +2,14 @@ repository_type: pipeline
lint:
files_exist:
- CODE_OF_CONDUCT.md
- CHANGELOG.md
- assets/nf-core-mettannotator_logo_light.png
- docs/images/nf-core-mettannotator_logo_light.png
- docs/images/nf-core-mettannotator_logo_dark.png
- .github/ISSUE_TEMPLATE/config.yml
- conf/test_full.config
- docs/output.md
- docs/README.md
- docs/README.md
- docs/usage.md
- conf/igenomes.config
- .github/workflows/awstest.yml
16 changes: 0 additions & 16 deletions CHANGELOG.md

This file was deleted.

4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -1,5 +1,9 @@
# ebi-metagenomics/mettannotator: Citations

## [mettannotator: a comprehensive and scalable Nextflow annotation pipeline for prokaryotic assemblies (pre-print)](https://doi.org/10.1101/2024.07.11.603040)

> Gurbich TA, Beracochea M, De Silva NH, Finn RD. mettannotator: a comprehensive and scalable Nextflow annotation pipeline for prokaryotic assemblies. doi: https://doi.org/10.1101/2024.07.11.603040
## [MGnify Genomes](https://pubmed.ncbi.nlm.nih.gov/36806692/)

> Gurbich TA, Almeida A, Beracochea M, Burdett T, Burgin J, Cochrane G, Raj S, Richardson L, Rogers AB, Sakharova E, Salazar GA and Finn RD. MGnify Genomes: A Resource for Biome-specific Microbial Genome Catalogues. J Mol Biol. 2023 Jul; 435(14). doi: https://doi.org/10.1016/j.jmb.2023.168016. PubMed PMID:
16 changes: 15 additions & 1 deletion README.md
@@ -200,7 +200,9 @@ Input/output options
--multiqc_title [string] MultiQC report title. Printed as page header, used for filename if not otherwise specified.
Reference databases
--dbs [string] Folder for the tools' reference databases used by the pipeline for downloading.
--dbs [string] Folder for the tools' reference databases used by the pipeline for downloading. It's important to note that
mixing the --dbs flag with individual database paths and versions is not allowed; they are mutually
exclusive.
--interproscan_db [string] The InterProScan reference database, ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/
--interproscan_db_version [string] The InterProScan reference database version. [default: 5.62-94.0]
--interpro_entry_list [string] TSV file listing basic InterPro entry information - the accessions, types and names,
@@ -262,6 +264,18 @@ nextflow run ebi-metagenomics/mettannotator \
> provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
#### Local execution

The pipeline can be run on a desktop or laptop, although a run will take a few hours, depending on the available resources. The `local` profile in the Nextflow config limits the total resources the pipeline can use to 8 cores and 12 GB of RAM. To use it (Docker or Singularity is still required):

```bash
nextflow run ebi-metagenomics/mettannotator \
-profile local,<docker or singularity> \
--input assemblies_sheet.csv \
--outdir <OUTDIR> \
--dbs <PATH/TO/WHERE/DBS/WILL/BE/SAVED>
```
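
Because the `local` profile still needs a container engine, a small pre-flight check can pick the second value for `-profile` before launching. This is only a sketch: the helper name `first_available` is hypothetical and is not part of the pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of mettannotator): print the first command
# from the argument list that is found on PATH, or return 1 if none is.
first_available() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer Docker, fall back to Singularity.
if engine=$(first_available docker singularity); then
    echo "Container engine found: use -profile local,${engine}"
else
    echo "Neither docker nor singularity was found on PATH" >&2
fi
```

The check only looks for the executables; it does not verify that the Docker daemon is running or that Singularity is configured.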

### Gene caller choice

By default, `mettannotator` uses Prokka to identify protein-coding genes. Users can choose to use Bakta instead by
1 change: 1 addition & 0 deletions assets/methods_description_template.yml
@@ -12,6 +12,7 @@ data: |
<p>${tool_citations}</p>
<h4>References</h4>
<ul>
<li>Gurbich, T. A., Beracochea, M., De Silva, N. H., & Finn, R. D. (2024). mettannotator: a comprehensive and scalable Nextflow annotation pipeline for prokaryotic assemblies. doi: <a href="https://doi.org/10.1101/2024.07.11.603040">10.1101/2024.07.11.603040</a></li>
<li>Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316-319. doi: <a href="https://doi.org/10.1038/nbt.3820">10.1038/nbt.3820</a></li>
<li>Ewels, P. A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M. U., Di Tommaso, P., & Nahnsen, S. (2020). The nf-core framework for community-curated bioinformatics pipelines. Nature Biotechnology, 38(3), 276-278. doi: <a href="https://doi.org/10.1038/s41587-020-0439-x">10.1038/s41587-020-0439-x</a></li>
<li>Grüning, B., Dale, R., Sjödin, A., Chapman, B. A., Rowe, J., Tomkins-Tinch, C. H., Valieris, R., Köster, J., & Bioconda Team. (2018). Bioconda: sustainable and comprehensive software distribution for the life sciences. Nature Methods, 15(7), 475–476. doi: <a href="https://doi.org/10.1038/s41592-018-0046-7">10.1038/s41592-018-0046-7</a></li>
2 changes: 1 addition & 1 deletion assets/multiqc_config.yml
@@ -1,5 +1,5 @@
report_comment: >
This report has been generated by the <a href="https://github.com/ebi-metagenomics/mettannotator/1.0dev" target="_blank">ebi-metagenomics/mettannotator</a>
This report has been generated by the <a href="https://github.com/ebi-metagenomics/mettannotator" target="_blank">ebi-metagenomics/mettannotator</a>
analysis pipeline.
report_section_order:
"ebi-metagenomics-mettannotator-methods-description":
15 changes: 11 additions & 4 deletions modules/local/eggnog.nf
@@ -17,17 +17,25 @@ process EGGNOG_MAPPER {
path "versions.yml", emit: versions

script:
def db_mem_flag = ""
/*
The executor needs at least 44 GB of memory
to load the eggNOG SQLite database into memory.
Docs: https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.12#a-few-recipes
*/
if ( task.memory >= 44.GB ) {
db_mem_flag = "--dbmem"
}
if ( mode == "mapper" )
"""
emapper.py -i ${fasta} \
--database ${eggnog_db_dir}/eggnog.db \
--dmnd_db ${eggnog_db_dir}/eggnog_proteins.dmnd \
--data_dir ${eggnog_db_dir} \
--dbmem \
-m diamond \
--no_file_comments \
--cpu ${task.cpus} \
--no_annot \
--no_annot ${db_mem_flag} \
-o ${meta.prefix}
cat <<-END_VERSIONS > versions.yml
@@ -43,8 +51,7 @@ process EGGNOG_MAPPER {
--no_file_comments \
--cpu ${task.cpus} \
--tax_scope 'prokaryota_broad' \
--dbmem \
--annotate_hits_table ${annotation_hit_table} \
--annotate_hits_table ${annotation_hit_table} ${db_mem_flag} \
-o ${meta.prefix}
cat <<-END_VERSIONS > versions.yml
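
The memory guard introduced in this file can be mirrored outside Nextflow, for example to predict whether a given allocation will run eggNOG-mapper with the database loaded in memory. This is a minimal sketch, assuming memory is given as a whole number of GB. The function name `dbmem_flag` is hypothetical; the 44 GB threshold and the `--dbmem` flag are taken from the diff.

```bash
# Hypothetical helper: echo "--dbmem" when the allocation (whole GB) reaches
# the 44 GB threshold used in the process, otherwise echo an empty string.
dbmem_flag() {
    local mem_gb="$1"
    if [ "$mem_gb" -ge 44 ]; then
        echo "--dbmem"
    else
        echo ""
    fi
}

# The substitution is left unquoted on purpose, so an empty value adds no
# argument to the command line.
echo emapper.py --no_annot $(dbmem_flag 12)   # prints: emapper.py --no_annot
echo emapper.py --no_annot $(dbmem_flag 64)   # prints: emapper.py --no_annot --dbmem
```

With the 12 GB cap of the `local` profile the flag is omitted, which matches the intent of the change: the pipeline still runs, just without the in-memory database.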
2 changes: 1 addition & 1 deletion modules/local/interproscan.nf
@@ -11,7 +11,7 @@ process INTERPROSCAN {
if (workflow.containerEngine == 'singularity') {
return "--bind ${interproscan_db}/data:/opt/interproscan-5.62-94.0/data"
} else {
return "-v ${interproscan_db}/data:/opt/interproscan-5.62-94.0/data"
return "-v ./${interproscan_db}/data:/opt/interproscan-5.62-94.0/data"
}
}

3 changes: 2 additions & 1 deletion modules/local/unifire.nf
@@ -5,11 +5,12 @@ process UNIFIRE {
label 'error_retry'

container "dockerhub.ebi.ac.uk/uniprot-public/unifire:2023.4"

containerOptions {
if (workflow.containerEngine == 'singularity') {
return "--bind unifire:/volume"
} else {
return "-v unifire:/volume"
return "-v ./unifire:/volume"
}
}

12 changes: 10 additions & 2 deletions nextflow.config
@@ -175,6 +175,14 @@ profiles {
}
test { includeConfig 'conf/test.config' }

local {
params {
// Any modern laptop / desktop should have at least...
max_memory = "12GB"
max_cpus = 8
}
}

ebi {
params.workdir = "/hps/nobackup/rdf/metagenomics/service-team/nextflow-workdir/mett-pipeline"
params.singularity_cachedir = "/hps/nobackup/rdf/metagenomics/service-team/singularity-cache/"
@@ -274,8 +282,8 @@ manifest {
description = """ME TT assembly annotation pipeline"""
mainScript = 'main.nf'
nextflowVersion = '!>=23.04.0'
version = '1.2'
doi = ''
version = '1.3'
doi = 'https://doi.org/10.1101/2024.07.11.603040'
}

// Load modules.config for DSL2 module specific options
2 changes: 1 addition & 1 deletion nextflow_schema.json
@@ -62,7 +62,7 @@
"dbs": {
"type": "string",
"format": "directory-path",
"description": "Folder for the tools' reference databases used by the pipeline for downloading.",
"description": "Folder for the tools' reference databases used by the pipeline for downloading. It's important to note that mixing the --dbs flag with individual database paths and versions is not allowed; they are mutually exclusive.",
"help_text": "Set this parameter to trigger the reference database download; otherwise, specify the databases individually."
},
"interproscan_db": {
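
The mutual exclusivity described for `--dbs` could be checked in a wrapper script before the pipeline is launched. The sketch below is not how the pipeline enforces the rule; `check_db_flags` is a hypothetical helper, and it assumes that every individual database option ends in `_db` or `_db_version`, a pattern inferred from `--interproscan_db` and `--interproscan_db_version` only.

```bash
# Hypothetical pre-flight check (not part of the pipeline): return 1 when
# --dbs is combined with any option whose name ends in _db or _db_version.
check_db_flags() {
    local has_dbs=0 has_individual=0 arg
    for arg in "$@"; do
        case "$arg" in
            --dbs|--dbs=*) has_dbs=1 ;;
            --*_db|--*_db=*|--*_db_version|--*_db_version=*) has_individual=1 ;;
        esac
    done
    if [ "$has_dbs" -eq 1 ] && [ "$has_individual" -eq 1 ]; then
        echo "--dbs cannot be combined with individual database options" >&2
        return 1
    fi
    return 0
}

# Usage: a mixed command line is rejected.
check_db_flags --dbs /data/dbs --interproscan_db /data/ipr \
    || echo "invalid combination"
```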