POC using module classes #1

Closed
wants to merge 26 commits into from
26 commits
a409347
install class expanded subworkflows
mirpedrol Aug 28, 2024
8b6c879
use class subworkflows
mirpedrol Aug 28, 2024
a357e15
remove old - not needed - modules and subworkflows
mirpedrol Aug 28, 2024
4f6cbbc
update all subworkflows
mirpedrol Nov 7, 2024
0b6bdab
Merge pull request #2 from mirpedrol/update-class-subworkflows
mirpedrol Nov 7, 2024
57c4e82
move evaluation subworkflow to an independent workflow
mirpedrol Nov 7, 2024
e2b92c1
nextflow language server fixes and channel combination fixes
mirpedrol Nov 11, 2024
3476d1f
remove toolsheet
mirpedrol Nov 11, 2024
e4dd23f
add parameters to provide tool arguments and save results per tool+ar…
mirpedrol Nov 12, 2024
8969f58
update methodsDescriptionText() according to the v3 template
mirpedrol Nov 12, 2024
e6534cc
fix Nf language server errors and make evaluation workflow work
mirpedrol Nov 18, 2024
056d141
read previous downstream samplesheets and start updating schemas
mirpedrol Nov 18, 2024
be314b4
update to nf-schema and read existing samplesheet to generate downstr…
mirpedrol Nov 18, 2024
79a714f
more fixes to make downstream samplesheet work + start adding params …
mirpedrol Nov 19, 2024
0ce16ed
either alignment or guidetree + treealign
mirpedrol Nov 19, 2024
d5810a8
not compress files + fixes to make the evaluation workflow work
mirpedrol Nov 20, 2024
f0e8e0a
add EXTRACT_STRUCTURES swf and fix downstream samplesheet
mirpedrol Nov 26, 2024
529d22b
remove reference_genome_options from schema
mirpedrol Nov 26, 2024
88e1a9c
Merge pull request #3 from mirpedrol/evaluation-to-workflow
mirpedrol Nov 26, 2024
7677fe2
update tcoffee/alncompare
mirpedrol Nov 27, 2024
4371fb7
update kalign/align
mirpedrol Nov 27, 2024
55e7b75
update mafft
mirpedrol Nov 27, 2024
9bae6c1
update modules
mirpedrol Nov 27, 2024
8acb2c9
update all modules
mirpedrol Nov 27, 2024
86ad6e7
allow compressed msa for evaluation
mirpedrol Nov 27, 2024
4300779
Merge pull request #4 from mirpedrol/update-modules
mirpedrol Nov 27, 2024
3 changes: 2 additions & 1 deletion .nf-core.yml
@@ -2,4 +2,5 @@ repository_type: pipeline
nf_core_version: "2.14.1"
lint:
multiqc_config: False
files_exist: conf/igenomes.config
files_exist:
- conf/igenomes.config
19 changes: 0 additions & 19 deletions README.md
@@ -54,24 +54,6 @@ Each row represents a set of sequences (in this case the seatoxin and toxin prot
> [!NOTE]
> The only required input is the id column and either fasta or structures.

#### 2. TOOLSHEET

Each line of the toolsheet defines a combination of guide tree and multiple sequence aligner to run with the respective arguments to be used.

It should look at follows:

`toolsheet.csv`:

```csv
tree,args_tree,aligner,args_aligner,
FAMSA, -gt upgma -medoidtree, FAMSA,
, ,TCOFFEE,
FAMSA,,REGRESSIVE,
```

> [!NOTE]
> The only required input is aligner.

#### 3. RUN THE PIPELINE

Now, you can run the pipeline using:
@@ -80,7 +62,6 @@ Now, you can run the pipeline using:
nextflow run nf-core/multiplesequencealign \
-profile test \
--input samplesheet.csv \
--tools toolsheet.csv \
--outdir outdir
```

67 changes: 67 additions & 0 deletions assets/schema_evaluate.json
@@ -0,0 +1,67 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://raw.githubusercontent.com/nf-core/multiplesequencealign/master/assets/schema_evaluate.json",
"title": "nf-core/multiplesequencealign pipeline - schema for the evaluation workflow",
"description": "Schema for the evaluation workflow",
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Sample name must be provided and cannot contain spaces",
"meta": ["id"]
},
"alignment": {
"type": "string",
"description": "the alignment tool from params.alignment",
"meta": ["alignment"]
},
"alignment_args": {
"type": "string",
"description": "the alignment arguments from params.alignment_args",
"meta": ["alignment_args"]
},
"guidetree": {
"type": "string",
"description": "the guidetree tool from params.guidetree",
"meta": ["guidetree"]
},
"guidetree_args": {
"type": "string",
"description": "the guidetree arguments from params.guidetree_args",
"meta": ["guidetree_args"]
},
"treealign": {
"type": "string",
"description": "the treealign tool from params.treealign",
"meta": ["treealign"]
},
"treealign_args": {
"type": "string",
"description": "the treealign arguments from params.treealign_args",
"meta": ["treealign_args"]
},
"msa": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.aln(\\.gz)?$",
"description": "aln file containing the MSA",
"errorMessage": "Must end with .aln or .aln.gz",
"meta": ["msa"]
},
"reference": {
"type": "string",
"format": "file-path",
"meta": ["reference"]
},
"structures": {
"type": "string",
"format": "file-path",
"meta": ["structures"]
}
},
"anyOf": [{ "required": ["id", "msa"] }, { "required": ["id", "structures"] }]
}
}
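The row-level rules in `assets/schema_evaluate.json` can be read as a small checklist. The following is a minimal Python sketch of those rules, not the validator the pipeline uses (the commit list says the pipeline moved to nf-schema). The function name `check_evaluate_row` is hypothetical, and the sketch assumes an empty samplesheet cell is represented as a missing key.

```python
import re

# Patterns copied from the schema above.
ID_PATTERN = re.compile(r"^\S+$")
MSA_PATTERN = re.compile(r"^\S+\.aln(\.gz)?$")


def check_evaluate_row(row: dict) -> list:
    """Return the problems found in one samplesheet row (empty list if none).

    Hypothetical helper: it mirrors the `pattern` and `anyOf` rules only and
    ignores `format: file-path`, which would need file-system access.
    """
    errors = []
    # anyOf: the row needs `id` plus at least one of `msa` or `structures`.
    if not ("id" in row and ("msa" in row or "structures" in row)):
        errors.append("row needs 'id' and at least one of 'msa' or 'structures'")
    if "id" in row and not ID_PATTERN.search(row["id"]):
        errors.append("Sample name must be provided and cannot contain spaces")
    if "msa" in row and not MSA_PATTERN.search(row["msa"]):
        errors.append("Must end with .aln or .aln.gz")
    return errors
```

A row such as `{"id": "seatoxin", "msa": "seatoxin.aln.gz"}` passes, while a row with only `id` fails the `anyOf` rule.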
16 changes: 10 additions & 6 deletions assets/schema_input.json
@@ -1,5 +1,5 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://raw.githubusercontent.com/nf-core/multiplesequencealign/master/assets/schema_input.json",
"title": "nf-core/multiplesequencealign pipeline - params.input schema",
"description": "Schema for the file provided with params.input",
@@ -15,20 +15,24 @@
},
"fasta": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.f(ast)?a$",
"errorMessage": "fasta file. Must end with .fa or .fasta"
},
"reference": {
"type": "string"
"type": "string",
"format": "file-path"
},
"structures": {
"type": "string"
"dependencies": {
"type": "string",
"format": "file-path"
},
"template": {
"type": "string"
"type": "string",
"format": "file-path"
}
},
"required": ["id"],
"anyOf": [{ "required": ["fasta"] }, { "required": ["structures"] }]
"anyOf": [{ "required": ["fasta"] }, { "required": ["dependencies"] }]
}
}
27 changes: 27 additions & 0 deletions assets/schema_stats.json
@@ -0,0 +1,27 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://raw.githubusercontent.com/nf-core/multiplesequencealign/master/assets/schema_stats.json",
"title": "nf-core/multiplesequencealign pipeline - schema",
"description": "Schema for the stats file",
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Sample name must be provided and cannot contain spaces",
"meta": ["id"]
},
"stats": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.csv$",
"description": "dsv file containing the stats of the input sequences.",
"errorMessage": "Must end with .csv",
"meta": ["stats"]
}
},
"required": ["id", "stats"]
}
}
31 changes: 0 additions & 31 deletions assets/schema_tools.json

This file was deleted.

34 changes: 17 additions & 17 deletions conf/base.config
@@ -10,9 +10,9 @@

process {

cpus = { check_max( 1 * task.attempt, 'cpus' ) }
memory = { check_max( 6.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
cpus = { 1 }
memory = { 6.GB }
time = { 4.h }

errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }
maxRetries = 1
@@ -25,30 +25,30 @@ process {
// adding in your local modules too.
// See https://www.nextflow.io/docs/latest/config.html#config-process-selectors
withLabel:process_single {
cpus = { check_max( 1 , 'cpus' ) }
memory = { check_max( 6.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
cpus = { 1 }
memory = { 6.GB }
time = { 4.h }
}
withLabel:process_low {
cpus = { check_max( 2 * task.attempt, 'cpus' ) }
memory = { check_max( 12.GB * task.attempt, 'memory' ) }
time = { check_max( 4.h * task.attempt, 'time' ) }
cpus = { 2 }
memory = { 12.GB }
time = { 4.h }
}
withLabel:process_medium {
cpus = { check_max( 6 * task.attempt, 'cpus' ) }
memory = { check_max( 36.GB * task.attempt, 'memory' ) }
time = { check_max( 8.h * task.attempt, 'time' ) }
cpus = { 6 }
memory = { 36.GB }
time = { 8.h }
}
withLabel:process_high {
cpus = { check_max( 12 * task.attempt, 'cpus' ) }
memory = { check_max( 72.GB * task.attempt, 'memory' ) }
time = { check_max( 16.h * task.attempt, 'time' ) }
cpus = { 12 }
memory = { 72.GB }
time = { 16.h }
}
withLabel:process_long {
time = { check_max( 20.h * task.attempt, 'time' ) }
time = { 20.h }
}
withLabel:process_high_memory {
memory = { check_max( 200.GB * task.attempt, 'memory' ) }
memory = { 200.GB }
}
withLabel:error_ignore {
errorStrategy = 'ignore'
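
As far as the visible hunks show, the `conf/base.config` change replaces requests that grew with `task.attempt` and were capped by `check_max` with constant values, so a retried task asks for the same resources as its first attempt. The Python sketch below models that difference for the default 6 GB memory request. The function names and the 128 GB ceiling are hypothetical, and `check_max` is approximated as a simple cap.

```python
def scaled_request(base: float, attempt: int, ceiling: float) -> float:
    """Old style: grow the request with the attempt number, then cap it."""
    return min(base * attempt, ceiling)


def fixed_request(base: float, attempt: int) -> float:
    """New style: the closure returns a constant, so the attempt is ignored."""
    return base


# Hypothetical ceiling, standing in for whatever maximum the old config capped at.
HYPOTHETICAL_MAX_MEMORY_GB = 128.0

# maxRetries = 1 in the config above, so a task runs as attempt 1 or attempt 2.
old = [scaled_request(6.0, a, HYPOTHETICAL_MAX_MEMORY_GB) for a in (1, 2)]
new = [fixed_request(6.0, a) for a in (1, 2)]
```

Under these assumptions `old` doubles on the retry and `new` stays flat, which matters for tasks retried after an out-of-memory exit.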