Commit 6893894

Merge pull request #525 from nf-core/dev
v1.2 - Bouncy Basenji Release PR
2 parents 5d3ee55 + e93eee3 commit 6893894

335 files changed, +16018 -1443 lines changed

.github/workflows/ci.yml

Lines changed: 5 additions & 3 deletions
@@ -36,7 +36,7 @@ jobs:
 - "test_motus"
 - "test_falco"
 - "test_fastp"
-- "test_adapterremoval"
+- "test_alternativepreprocessing"
 - "test_bbduk"
 - "test_prinseqplusplus"

@@ -65,8 +65,10 @@ jobs:
 if [[ "${{ matrix.tags }}" == "test_motus" ]]; then
 wget https://raw.githubusercontent.com/motu-tool/mOTUs/master/motus/downloadDB.py
 python downloadDB.py --no-download-progress
-echo 'tool,db_name,db_params,db_path' > 'database_motus.csv'
-echo "motus,db_mOTU,,db_mOTU" >> 'database_motus.csv'
+echo 'tool,db_name,db_params,db_type,db_path' > 'database_motus.csv'
+echo "motus,db1_mOTU,,short,db_mOTU" >> 'database_motus.csv'
+echo "motus,db2_mOTU,,long,db_mOTU" >> 'database_motus.csv'
+echo "motus,db3_mOTU,,short;long,db_mOTU" >> 'database_motus.csv'
 nextflow run ${GITHUB_WORKSPACE} -profile docker,${{ matrix.tags }} --databases ./database_motus.csv --outdir ./results_${{ matrix.tags }};
 else
 nextflow run ${GITHUB_WORKSPACE} -profile docker,${{ matrix.tags }} --outdir ./results_${{ matrix.tags }};
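
The `echo` lines in the CI step above build a three-row database sheet by hand. As a rough illustration only, the same sheet could be rendered with Python's `csv` module. The names `FIELDS`, `ROWS` and `render_sheet` are hypothetical and not part of the pipeline; the column order and row values are copied from the CI step.

```python
import csv
import io

# Column order taken from the header line echoed in the CI step.
FIELDS = ["tool", "db_name", "db_params", "db_type", "db_path"]

# The three rows appended in the CI step: the same database path is
# registered once per db_type value (short, long, short;long).
ROWS = [
    {"tool": "motus", "db_name": "db1_mOTU", "db_params": "", "db_type": "short", "db_path": "db_mOTU"},
    {"tool": "motus", "db_name": "db2_mOTU", "db_params": "", "db_type": "long", "db_path": "db_mOTU"},
    {"tool": "motus", "db_name": "db3_mOTU", "db_params": "", "db_type": "short;long", "db_path": "db_mOTU"},
]


def render_sheet(rows):
    """Render rows as CSV text with a header line (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


sheet_text = render_sheet(ROWS)
```

Writing `sheet_text` to `database_motus.csv` would give a file with the same four lines as the shell version; the semicolon in `short;long` needs no quoting because the delimiter is a comma.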

CHANGELOG.md

Lines changed: 41 additions & 0 deletions
@@ -3,6 +3,47 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## v1.2 - Bouncy Basenji [2024-10-03]
+
+### `Added`
+
+- [#417](https://github.com/nf-core/taxprofiler/pull/417) Added reference-free metagenome complexity/coverage estimation with Nonpareil (added by @jfy133)
+- [#466](https://github.com/nf-core/taxprofiler/pull/466) Input database sheets can specify a `db_type` column to distinguish between short- and long-read databases (added by @LilyAnderssonLee)
+- [#505](https://github.com/nf-core/taxprofiler/pull/505) Add small files to the file `tower.yml` (added by @LilyAnderssonLee)
+- [#508](https://github.com/nf-core/taxprofiler/pull/508) Add `nanoq` as a filtering tool for nanopore reads (added by @LilyAnderssonLee)
+- [#511](https://github.com/nf-core/taxprofiler/pull/511) Add `porechop_abi` as an alternative adapter removal tool for long reads nanopore data (added by @LilyAnderssonLee)
+- [#512](https://github.com/nf-core/taxprofiler/pull/512) Update all tools to the latest version and include nf-test (updated by @LilyAnderssonLee & @jfy133)
+- [#537](https://github.com/nf-core/taxprofiler/pull/537) Update the module `motus/merge` to the latest version (Updated by @sofstam & @LilyAnderssonLee)
+
+### `Fixed`
+
+- [#518](https://github.com/nf-core/taxprofiler/pull/518) Fixed a bug where Oxford Nanopore FASTA input files would not be processed (❤️ to @ikarls for reporting, fixed by @jfy133)
+- [#523](https://github.com/nf-core/taxprofiler/pull/523) Removed hardcoded `-m lca` from GANON_CLASSIFY due to more options in new version of ganon (fixed by @LilyAnderssonLee & @jfy133)
+- [#531](https://github.com/nf-core/taxprofiler/pull/531) Fix FASTA input validation in schema allowing FASTQ extension, expand allowed FASTA extensions (fixed by @jfy133)
+- [#512](https://github.com/nf-core/taxprofiler/pull/532) Minor formatting and ordering improvements in MultiQC report (by @jfy133)
+- [#532](https://github.com/nf-core/taxprofiler/pull/532) - Added missing documentation behind the 'ignore' BRACKEN_BRACKEN error strategy (❤️ to @Mavti for reporting, fixed by @jfy133)
+- [#536](https://github.com/nf-core/taxprofiler/pull/536) - Redefine `contents_re` for filtlong to fix its missing from the MultiQC report (fixed by @LilyAnderssonLee)
+
+### `Dependencies`
+
+| Tool | Previous version | New version |
+| --------- | ---------------- | ----------- |
+| bbmap | 39.01 | 39.06 |
+| bowtie2 | 2.4.4 | 2.5.2 |
+| bracken | 2.7 | 2.9 |
+| diamond | 2.0.15 | 2.1.8 |
+| ganon | 1.5.1 | 2.0.0 |
+| kraken2 | 2.1.2 | 2.1.3 |
+| krona | 2.8 | 2.8.1 |
+| megan | 6.24.20 | 6.25.9 |
+| metaphlan | 4.0.6 | 4.1.1 |
+| minimap2 | 2.24 | 2.28 |
+| motus | 3.0.3 | 3.1.0 |
+| multiqc | 1.21 | 1.25 |
+| samtools | 1.17 | 1.20 |
+
+### `Deprecated`
+
 ## v1.1.8 - Augmented Akita Patch [2024-06-20]

 ### `Added`
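
The changelog entry for #466 above introduces a `db_type` column. The sketch below illustrates one way such a column could be interpreted. It assumes the three values used in this commit (`short`, `long`, `short;long`) and treats a missing or empty value as `short;long`, matching the default in `assets/schema_database.json`. The functions `load_databases` and `databases_for` are hypothetical and do not reflect the pipeline's own Nextflow implementation.

```python
import csv
import io

# Values and default as they appear in this commit's schema change.
ALLOWED_DB_TYPES = {"short", "long", "short;long"}
DEFAULT_DB_TYPE = "short;long"


def load_databases(sheet_text):
    """Parse a database sheet, filling in and checking db_type (hypothetical)."""
    rows = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        # A sheet without the column, or with an empty cell, gets the default.
        db_type = row.get("db_type") or DEFAULT_DB_TYPE
        if db_type not in ALLOWED_DB_TYPES:
            raise ValueError(f"Invalid db_type {db_type!r} for {row['db_name']}")
        row["db_type"] = db_type
        rows.append(row)
    return rows


def databases_for(rows, read_type):
    """Return names of databases whose db_type includes read_type."""
    return [r["db_name"] for r in rows if read_type in r["db_type"].split(";")]


sheet = (
    "tool,db_name,db_params,db_type,db_path\n"
    "motus,db1_mOTU,,short,db_mOTU\n"
    "motus,db2_mOTU,,long,db_mOTU\n"
    "motus,db3_mOTU,,short;long,db_mOTU\n"
)
rows = load_databases(sheet)
short_dbs = databases_for(rows, "short")
long_dbs = databases_for(rows, "long")
```

With the CI sheet from this commit, `short_dbs` holds `db1_mOTU` and `db3_mOTU`, and `long_dbs` holds `db2_mOTU` and `db3_mOTU`. A sheet in the older four-column layout would have every database treated as `short;long` under the assumed default.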

CITATIONS.md

Lines changed: 12 additions & 0 deletions
@@ -30,14 +30,26 @@

 > Schubert, M., Lindgreen, S., & Orlando, L. (2016). AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Research Notes, 9, 88. https://doi.org/10.1186/s13104-016-1900-2

+- [Nonpareil](https://doi.org/10.1128/mSystems.00039-18)
+
+- Rodriguez-R, L. M., Gunturu, S., Tiedje, J. M., Cole, J. R., & Konstantinidis, K. T. (2018). Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity. mSystems, 3(3). https://doi.org/10.1128/mSystems.00039-18
+
 - [Porechop](https://github.com/rrwick/Porechop)

 > Wick, R. R., Judd, L. M., Gorrie, C. L., & Holt, K. E. (2017). Completing bacterial genome assemblies with multiplex MinION sequencing. Microbial Genomics, 3(10), e000132. https://doi.org/10.1099/mgen.0.000132

+- [Porechop_ABI](https://github.com/bonsai-team/Porechop_ABI)
+
+> Bonenfant, Q., Noé, L., & Touzet, H. (2023). Porechop_ABI: discovering unknown adapters in Oxford Nanopore Technology sequencing reads for downstream trimming. Bioinformatics Advances, 3(1):vbac085. https://10.1093/bioadv/vbac085
+
 - [Filtlong](https://github.com/rrwick/Filtlong)

 > Wick R (2021) Filtlong, URL: https://github.com/rrwick/Filtlong

+- [nanoq](https://github.com/esteinig/nanoq)
+
+> Steinig, E., & Coin, L. (2022). Nanoq: ultra-fast quality control for nanopore reads. Journal of Open Source Software, 7(69). https://doi.org/10.21105/joss.02991
+
 - [BBTools](http://sourceforge.net/projects/bbmap/)

 > Bushnell B. (2022) BBMap, URL: http://sourceforge.net/projects/bbmap/

README.md

Lines changed: 4 additions & 4 deletions
@@ -29,11 +29,11 @@

 1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) or [`falco`](https://github.com/smithlabcode/falco) as an alternative option)
 2. Performs optional read pre-processing
-- Adapter clipping and merging (short-read: [fastp](https://github.com/OpenGene/fastp), [AdapterRemoval2](https://github.com/MikkelSchubert/adapterremoval); long-read: [porechop](https://github.com/rrwick/Porechop))
-- Low complexity and quality filtering (short-read: [bbduk](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/), [PRINSEQ++](https://github.com/Adrian-Cantu/PRINSEQ-plus-plus); long-read: [Filtlong](https://github.com/rrwick/Filtlong))
+- Adapter clipping and merging (short-read: [fastp](https://github.com/OpenGene/fastp), [AdapterRemoval2](https://github.com/MikkelSchubert/adapterremoval); long-read: [porechop](https://github.com/rrwick/Porechop), [Porechop_ABI](https://github.com/bonsai-team/Porechop_ABI))
+- Low complexity and quality filtering (short-read: [bbduk](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/), [PRINSEQ++](https://github.com/Adrian-Cantu/PRINSEQ-plus-plus); long-read: [Filtlong](https://github.com/rrwick/Filtlong)), [Nanoq](https://github.com/esteinig/nanoq)
 - Host-read removal (short-read: [BowTie2](http://bowtie-bio.sourceforge.net/bowtie2/); long-read: [Minimap2](https://github.com/lh3/minimap2))
 - Run merging
-3. Supports statistics for host-read removal ([Samtools](http://www.htslib.org/))
+3. Supports statistics metagenome coverage estimation ([Nonpareil](https://nonpareil.readthedocs.io/en/latest/)) and for host-read removal ([Samtools](http://www.htslib.org/))
 4. Performs taxonomic classification and/or profiling using one or more of:
 - [Kraken2](https://ccb.jhu.edu/software/kraken2/)
 - [MetaPhlAn](https://huttenhower.sph.harvard.edu/metaphlan/)
@@ -73,7 +73,7 @@ Additionally, you will need a database sheet that looks as follows:

 `databases.csv`:

-```
+```csv
 tool,db_name,db_params,db_path
 kraken2,db2,--quick,/<path>/<to>/kraken2/testdb-kraken2.tar.gz
 metaphlan,db1,,/<path>/<to>/metaphlan/metaphlan_database/

assets/multiqc_config.yml

Lines changed: 128 additions & 33 deletions
@@ -1,8 +1,8 @@
 report_comment: >

-This report has been generated by the <a href="https://github.com/nf-core/taxprofiler/releases/tag/1.1.8" target="_blank">nf-core/taxprofiler</a>
+This report has been generated by the <a href="https://github.com/nf-core/taxprofiler/releases/tag/1.2" target="_blank">nf-core/taxprofiler</a>
 analysis pipeline. For information about how to interpret these results, please see the
-<a href="https://nf-co.re/taxprofiler/1.1.8/docs/output" target="_blank">documentation</a>.
+<a href="https://nf-co.re/taxprofiler/1.2/docs/output" target="_blank">documentation</a>.

 report_section_order:
 "nf-core-taxprofiler-methods-description":
@@ -11,6 +11,48 @@ report_section_order:
 order: -1001
 "nf-core-taxprofiler-summary":
 order: -1002
+general_stats":
+order: 1000
+fastqc:
+order: 900
+fastqc-1:
+order: 800
+fastp:
+order: 750
+adapterremoval:
+order: 700
+nonpareil:
+order: 600
+bbduk:
+order: 550
+prinseqplusplus:
+order: 500
+porechop:
+order: 450
+porechop_abi:
+order: 400
+filtlong:
+order: 350
+nanoq:
+order: 300
+bowtie2:
+order: 200
+samtools:
+order: 100
+kraken:
+order: 90
+bracken:
+order: 80
+centrifuge:
+order: 70
+malt:
+order: 60
+diamond:
+order: 50
+kaiju:
+order: 40
+motus:
+order: 30

 export_plots: true
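
The `report_section_order` entries added in this hunk assign each section an integer `order`. A minimal sketch of how those numbers might translate into a report layout follows, assuming MultiQC places sections with larger `order` values nearer the top (worth confirming against the MultiQC documentation for the version in use). The dictionary holds a subset of the values from the hunk; `preview_layout` is a hypothetical helper, not a MultiQC function.

```python
# A subset of the anchor -> order pairs from the hunk above.
section_order = {
    "nf-core-taxprofiler-summary": -1002,
    "fastqc": 900,
    "nonpareil": 600,
    "nanoq": 300,
    "kraken": 90,
    "motus": 30,
}


def preview_layout(order_by_anchor):
    """Return anchors sorted from the highest to the lowest order value."""
    return sorted(order_by_anchor, key=order_by_anchor.get, reverse=True)


layout = preview_layout(section_order)
```

Under that assumption the read QC and preprocessing sections come first, the profilers follow, and the negative-order pipeline summary lands at the end.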

@@ -21,14 +63,15 @@ custom_logo_title: "nf-core/taxprofiler"

 run_modules:
 - fastqc
-- adapterRemoval
+- adapterremoval
 - fastp
+- nonpareil
 - bbduk
 - prinseqplusplus
 - porechop
 - filtlong
+- nanoq
 - bowtie2
-- minimap2
 - samtools
 - kraken
 - kaiju
@@ -39,11 +82,16 @@ run_modules:

 sp:
 diamond:
-fn_re: ".*.diamond.log$"
+fn: "*.diamond.log"
 fastqc/data:
 fn_re: ".*(fastqc|falco)_data.txt$"
 fastqc/zip:
 fn: "*_fastqc.zip"
+nonpareil:
+fn: "nonpareil_all_samples.json"
+filtlong:
+contents: Scoring long reads
+contents_re: " "

 top_modules:
 - "fastqc":
@@ -60,13 +108,23 @@ top_modules:
 path_filters_exclude:
 - "*raw*"
 extra: "If used in this run, Falco is a drop-in replacement for FastQC producing the same output, written by Guilherme de Sena Brandine and Andrew D. Smith."
-- "fastp"
-- "adapterRemoval"
+- nonpareil
 - "porechop":
+name: "Porechop"
+anchor: "porechop"
+target: "Porechop"
+path_filters:
+- "*porechop.log"
 extra: "ℹ️: if you get the error message 'Error - was not able to plot data.' this means that porechop did not detect any adapters and therefore no statistics generated."
-- "bbduk"
-- "prinseqplusplus"
-- "filtlong"
+- "porechop":
+name: "Porechop_ABI"
+anchor: "porechop_abi"
+target: "Porechop_ABI"
+doi: "10.1093/bioadv/vbac085"
+info: "find and remove adapters from Oxford Nanopore reads."
+path_filters:
+- "*porechop_abi.log"
+extra: "ℹ️: if you get the error message 'Error - was not able to plot data.' this means that porechop_abi did not detect any adapters and therefore no statistics generated."
 - "bowtie2":
 name: "bowtie2"
 - "samtools":
@@ -95,12 +153,11 @@ top_modules:
 - "*.centrifuge.txt"
 - "malt":
 name: "MALT"
-- "diamond"
 - "kaiju":
 name: "Kaiju"
-- "motus"

-#It is not possible to set placement for custom kraken and centrifuge columns.
+# It is not possible to set placement for custom kraken
+# and centrifuge columns.

 table_columns_placement:
 FastQC / Falco (pre-Trimming):
@@ -130,16 +187,33 @@ table_columns_placement:
 percent_aligned: 370
 percent_collapsed: 380
 percent_discarded: 390
+nonpareil:
+nonpareil_R: 400
+nonpareil_LR: 410
+nonpareil_kappa: 420
+nonpareil_C: 430
+nonpareil_diversity: 440
 Porechop:
-Input Reads: 400
-Start Trimmed: 410
-Start Trimmed Percent: 420
-End Trimmed: 430
-End Trimmed Percent: 440
-Middle Split: 450
-Middle Split Percent: 460
+Input Reads: 500
+Start Trimmed: 510
+Start Trimmed Percent: 520
+End Trimmed: 530
+End Trimmed Percent: 540
+Middle Split: 550
+Middle Split Percent: 560
+Porechop_ABI:
+Input Reads: 500
+Start Trimmed: 510
+Start Trimmed Percent: 520
+End Trimmed: 530
+End Trimmed Percent: 540
+Middle Split: 550
+Middle Split Percent: 560
 Filtlong:
-Target bases: 500
+Target bases: 600
+nanoq:
+Reads: 700
+Read N50: 710
 BBDuk:
 Input reads: 800
 Total Removed bases percent: 810
@@ -203,6 +277,24 @@ table_columns_visible:
 percent_duplicates: False
 percent_gc: False
 percent_fails: False
+Adapter Removal:
+aligned_total: True
+percent_aligned: True
+percent_collapsed: True
+percent_discarded: False
+fastp:
+pct_adapter: True
+pct_surviving: True
+pct_duplication: False
+after_filtering_gc_content: False
+after_filtering_q30_rate: False
+after_filtering_q30_bases: False
+nonpareil:
+nonpareil_R: false
+nonpareil_LR: false
+nonpareil_kappa: true
+nonpareil_C: true
+nonpareil_diversity: true
 porechop:
 Input reads: False
 Start Trimmed:
@@ -211,20 +303,19 @@ table_columns_visible:
 End Trimmed Percent: True
 Middle Split: False
 Middle Split Percent: True
-fastp:
-pct_adapter: True
-pct_surviving: True
-pct_duplication: False
-after_filtering_gc_content: False
-after_filtering_q30_rate: False
-after_filtering_q30_bases: False
+porechop_abi:
+Input reads: False
+Start Trimmed:
+Start Trimmed Percent: True
+End Trimmed: False
+End Trimmed Percent: True
+Middle Split: False
+Middle Split Percent: True
 Filtlong:
 Target bases: True
-Adapter Removal:
-aligned_total: True
-percent_aligned: True
-percent_collapsed: True
-percent_discarded: False
+nanoq:
+ReadN50: True
+Reads: True
 BBDuk:
 Input reads: False
 Total Removed bases Percent: False
@@ -276,6 +367,10 @@ extra_fn_clean_exts:
 - ".bbduk"
 - ".unmapped"
 - "_filtered"
+- "porechop"
+- "porechop_abi"
+- "_processed"
+- ".diamond"
 - type: remove
 pattern: "_falco"

assets/schema_database.json

Lines changed: 6 additions & 0 deletions
@@ -39,6 +39,12 @@
 "errorMessage": "Invalid database db_params entry. No quotes allowed.",
 "meta": ["db_params"]
 },
+"db_type": {
+"type": "string",
+"enum": ["short", "long", "short;long"],
+"default": "short;long",
+"meta": ["db_type"]
+},
 "db_path": {
 "type": "string",
 "exists": true,
