Commit 8d66ebd

suhrig, pinin4fjords, and claude authored
plastid metagene_generate, make_wiggle, psite (#9482)
* plastid metagene_generate, make_wiggle, psite

* pair BAM and BAI files

* pair bam and p_offsets

* omit optional arguments

* metagene generate accepts various input formats

* add meta

* do not remove variable headers from output files

* warning about hard-coded version

* make lint happy

* make lint happy #2

* plastid/make_wiggle: nf-core standards compliance

- Add mapping_rule val input (enum: fiveprime, threeprime, center, fiveprime_variable)
- Move output_format to ext.args (optional arg per nf-core standards)
- Add validation: error if p_offsets missing with fiveprime_variable
- Remove hardcoded --fiveprime_variable
- Update meta.yml with mapping_rule input and enum
- Update tests with mapping_rule input

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* adapt meta.yml to new parameters

* plastid: consolidate test snapshots and fix reproducibility

- Consolidate multiple snapshot assertions into single snapshots per test
- Remove snapshots of empty stub files (just check existence)
- Exclude non-reproducible PNG from psite snapshots (matplotlib drift)
- Format metagene_generate command across multiple lines

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* plastid/make_wiggle: remove tracks from snapshot

Wig files have non-reproducible md5sums across environments.
Content is already validated via getText().contains('track').

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* plastid/psite: remove non-reproducible outputs from snapshot

metagene_profiles.txt and p_offsets.txt have non-reproducible md5sums.
Content is already validated via getText().contains() checks.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Jonathan Manning <jonathan.manning@seqera.io>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 69614d4 commit 8d66ebd
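
The commit message mentions a new check for make_wiggle: the run is rejected when mapping_rule is 'fiveprime_variable' but no p_offsets file is supplied. The module does this in Groovy inside the process. The shell sketch below only illustrates the same decision logic outside Nextflow. The function name `validate_mapping_rule` is hypothetical, and the extra enum membership check is an assumption added for illustration (the module documents the allowed values in meta.yml but does not enforce them at runtime).

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the module's pre-flight check; not part of the commit.
# Usage: validate_mapping_rule <mapping_rule> [p_offsets_file]
validate_mapping_rule() {
    local rule="$1" offsets="${2:-}"

    # Assumption: accept only the four values listed in the meta.yml enum.
    case "$rule" in
        fiveprime|threeprime|center|fiveprime_variable) ;;
        *)
            echo "unknown mapping_rule '$rule'" >&2
            return 2
            ;;
    esac

    # fiveprime_variable needs a per-read-length offset table.
    if [ "$rule" = "fiveprime_variable" ] && [ -z "$offsets" ]; then
        echo "p_offsets file is required when using mapping_rule 'fiveprime_variable'" >&2
        return 1
    fi
    return 0
}

# Demo calls; the file name is a placeholder and is not opened.
validate_mapping_rule center && echo "center: ok"
validate_mapping_rule fiveprime_variable 2>/dev/null || echo "fiveprime_variable without offsets: rejected"
validate_mapping_rule fiveprime_variable sample_p_offsets.txt && echo "fiveprime_variable with offsets: ok"
```

The sketch tests only whether an offsets argument was given, not whether the file exists, which matches the `!p_offsets` test in the process where an empty list `[]` stands for "no file".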

15 files changed: +699 -0 lines changed

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
channels:
  - conda-forge
  - bioconda
dependencies:
  - plastid=0.6.1
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
process PLASTID_MAKE_WIGGLE {
    tag "$meta.id"
    label "process_single"

    // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions.
    conda "${moduleDir}/environment.yml"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/plastid:0.6.1--py39had3e4b6_2':
        'biocontainers/plastid:0.6.1--py39had3e4b6_2' }"

    input:
    tuple val(meta), path(bam), path(bam_index), path(p_offsets)
    val(mapping_rule)

    output:
    tuple val(meta), path("*.{wig,bedgraph}"), emit: tracks
    path "versions.yml", emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    if (mapping_rule == 'fiveprime_variable' && !p_offsets) {
        error "p_offsets file is required when using mapping_rule 'fiveprime_variable'"
    }
    def prefix = task.ext.prefix ?: "${meta.id}"
    def args = task.ext.args ?: ""
    def offset_arg = mapping_rule == 'fiveprime_variable' ? "--offset $p_offsets" : ""
    def extension = args.contains('--output_format bedgraph') ? "bedgraph" : "wig"
    def VERSION = "0.6.1" // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
    """
    make_wiggle \\
        --count_files "$bam" \\
        $offset_arg \\
        --${mapping_rule} \\
        -o "$prefix" \\
        $args

    if [ "$extension" = "bedgraph" ]; then
        for FILE in *.wig; do
            mv "\$FILE" "\${FILE%.wig}.bedgraph"
        done
    fi

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        plastid: $VERSION
    END_VERSIONS
    """

    stub:
    if (mapping_rule == 'fiveprime_variable' && !p_offsets) {
        error "p_offsets file is required when using mapping_rule 'fiveprime_variable'"
    }
    def prefix = task.ext.prefix ?: "${meta.id}"
    def args = task.ext.args ?: ""
    def extension = args.contains('--output_format bedgraph') ? "bedgraph" : "wig"
    def VERSION = "0.6.1" // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
    """
    touch ${prefix}_fw.${extension}
    touch ${prefix}_rc.${extension}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        plastid: $VERSION
    END_VERSIONS
    """
}
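
The rename step in the PLASTID_MAKE_WIGGLE script block relies on the shell suffix-stripping expansion `${FILE%.wig}`. The following standalone sketch shows that step in isolation. The function name `rename_wig_to_bedgraph` and the `demo_*` placeholder files are hypothetical and only stand in for real make_wiggle output. The `[ -e ... ]` guard is an addition that the module does not have.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the module): rename every *.wig file in a
# directory to *.bedgraph, as the script block does after make_wiggle has run.
rename_wig_to_bedgraph() {
    local dir="$1" file
    for file in "$dir"/*.wig; do
        # Without a match the glob stays literal, so skip non-existent paths.
        [ -e "$file" ] || continue
        # ${file%.wig} strips the shortest trailing ".wig" match.
        mv "$file" "${file%.wig}.bedgraph"
    done
}

# Demo with empty placeholder files standing in for make_wiggle output.
workdir="$(mktemp -d)"
touch "$workdir/demo_fw.wig" "$workdir/demo_rc.wig"
rename_wig_to_bedgraph "$workdir"
ls "$workdir"
```

In the module itself the loop has no such guard, so if no `.wig` file were produced, `mv` would be called on the literal pattern `*.wig` and the task would fail. That may well be the desired behaviour for a pipeline step, since a missing track is an error.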
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json
name: "plastid_make_wiggle"
description: Create wiggle or bedGraph files from alignment files after applying
  a read mapping rule (e.g. to map ribosome-protected footprints at their
  P-sites), for visualization in a genome browser
keywords:
  - genomics
  - riboseq
  - psite
  - wiggle
  - bedgraph
tools:
  - "plastid":
      description: "Nucleotide-resolution analysis of next-generation sequencing and
        genomics data"
      homepage: "https://github.com/joshuagryphon/plastid"
      documentation: "https://plastid.readthedocs.io/en/latest/"
      tool_dev_url: "https://github.com/joshuagryphon/plastid"
      doi: "10.1186/s12864-016-3278-x"
      licence: ["BSD-3-clause"]
      identifier: biotools:plastid

input:
  - - meta:
        type: map
        description: |
          Groovy Map containing sample information
          e.g. `[ id:'sample1', single_end:false ]`
    - bam:
        type: file
        description: Genome BAM file
        pattern: "*.bam"
        ontologies:
          - edam: http://edamontology.org/format_2572 # BAM
    - bam_index:
        type: file
        description: Genome BAM index file
        pattern: "*.bai"
        ontologies:
          - edam: http://edamontology.org/format_3327 # BAI
    - p_offsets:
        type: file
        description: |
          Selected p-site offset for each read length (output from plastid_psite).
          Required when mapping_rule is 'fiveprime_variable', otherwise pass empty list [].
        pattern: "*_p_offsets.txt"
        ontologies: []
  - mapping_rule:
      type: string
      description: |
        Read mapping rule. Use 'fiveprime_variable' with p_offsets file to map reads to P-sites.
      enum: ["fiveprime", "threeprime", "center", "fiveprime_variable"]
output:
  tracks:
    - - meta:
          type: map
          description: |
            Groovy Map containing sample information
            e.g. `[ id:'sample1', single_end:false ]`
      - "*.{wig,bedgraph}":
          type: file
          description: wig/bedgraph tracks for forward and reverse strands
          pattern: "*.{wig,bedgraph}"
          ontologies:
            - edam: http://edamontology.org/format_3005 # wig
            - edam: http://edamontology.org/format_3583 # bedgraph
  versions:
    - versions.yml:
        type: file
        description: File containing software versions
        pattern: "versions.yml"
        ontologies:
          - edam: http://edamontology.org/format_3750 # YAML
authors:
  - "@suhrig"
maintainers:
  - "@suhrig"
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
// nf-core modules test plastid
nextflow_process {

    name "Test Process PLASTID_MAKE_WIGGLE"
    script "../main.nf"
    process "PLASTID_MAKE_WIGGLE"

    tag "modules"
    tag "modules_nfcore"
    tag "plastid"
    tag "plastid/make_wiggle"

    test("human chr20 bam - fiveprime_variable") {

        when {
            process {
                """
                input[0] = [
                    [ id:'SRX11780887' ], // meta map
                    file(params.modules_testdata_base_path + "genomics/homo_sapiens/riboseq_expression/aligned_reads/SRX11780887_chr20.bam", checkIfExists: true),
                    file(params.modules_testdata_base_path + "genomics/homo_sapiens/riboseq_expression/aligned_reads/SRX11780887_chr20.bam.bai", checkIfExists: true),
                    file(params.modules_testdata_base_path + "genomics/homo_sapiens/riboseq_expression/plastid/SRX11780887_p_offsets.txt", checkIfExists: true)
                ]
                input[1] = 'fiveprime_variable'
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert path(process.out.tracks.get(0).get(1).get(0)).getText().contains('track') },
                { assert snapshot(process.out.versions).match() }
            )
        }
    }

    test("human chr20 bam - fiveprime_variable - stub") {

        options "-stub"

        when {
            process {
                """
                input[0] = [
                    [ id:'SRX11780887' ], // meta map
                    file(params.modules_testdata_base_path + "genomics/homo_sapiens/riboseq_expression/aligned_reads/SRX11780887_chr20.bam", checkIfExists: true),
                    file(params.modules_testdata_base_path + "genomics/homo_sapiens/riboseq_expression/aligned_reads/SRX11780887_chr20.bam.bai", checkIfExists: true),
                    file(params.modules_testdata_base_path + "genomics/homo_sapiens/riboseq_expression/plastid/SRX11780887_p_offsets.txt", checkIfExists: true)
                ]
                input[1] = 'fiveprime_variable'
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert file(process.out.tracks.get(0).get(1).get(0)).exists() },
                { assert snapshot(process.out.versions).match() }
            )
        }
    }

}
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
{
    "human chr20 bam - fiveprime_variable": {
        "content": [
            [
                "versions.yml:md5,4ecd455575ab11a0ad86dc5e0a6c6959"
            ]
        ],
        "meta": {
            "nf-test": "0.9.2",
            "nextflow": "25.10.0"
        },
        "timestamp": "2025-12-10T12:36:00.352824"
    },
    "human chr20 bam - fiveprime_variable - stub": {
        "content": [
            [
                "versions.yml:md5,4ecd455575ab11a0ad86dc5e0a6c6959"
            ]
        ],
        "meta": {
            "nf-test": "0.9.2",
            "nextflow": "25.10.0"
        },
        "timestamp": "2025-12-10T12:26:10.353895"
    }
}
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
channels:
  - conda-forge
  - bioconda
dependencies:
  - plastid=0.6.1
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
process PLASTID_METAGENE_GENERATE {
    tag "$annotation"
    label "process_low"

    // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions.
    conda "${moduleDir}/environment.yml"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/plastid:0.6.1--py39had3e4b6_2':
        'biocontainers/plastid:0.6.1--py39had3e4b6_2' }"

    input:
    tuple val(meta), path(annotation)

    output:
    tuple val(meta), path("*_rois.txt"), emit: rois_txt
    tuple val(meta), path("*_rois.bed"), emit: rois_bed
    path "versions.yml", emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def VERSION = "0.6.1" // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
    """
    metagene generate \\
        "${annotation.baseName}" \\
        --annotation_files "$annotation" \\
        $args

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        plastid: $VERSION
    END_VERSIONS
    """

    stub:
    def VERSION = "0.6.1" // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
    """
    touch ${annotation.baseName}_rois.txt
    touch ${annotation.baseName}_rois.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        plastid: $VERSION
    END_VERSIONS
    """
}
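
PLASTID_METAGENE_GENERATE names its outputs after `annotation.baseName`, which in Nextflow is the file name with only its last extension removed. The shell sketch below predicts the resulting ROI file names for a given annotation path. The function name `roi_outputs` and the example path are hypothetical, and the parameter expansions are an approximation of `baseName`, not the module's own code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: predict the ROI file names for a given annotation file.
roi_outputs() {
    local annotation="$1" name base
    name="${annotation##*/}"   # drop any leading directory
    base="${name%.*}"          # drop the last extension only
    printf '%s_rois.txt\n%s_rois.bed\n' "$base" "$base"
}

# Example path is a placeholder; the file does not need to exist.
roi_outputs "reference/genes.gtf"
# prints:
# genes_rois.txt
# genes_rois.bed
```

Because only the last extension is removed, a compressed annotation such as `genes.gtf.gz` would yield `genes.gtf_rois.txt` and `genes.gtf_rois.bed`; both names still match the `*_rois.txt` and `*_rois.bed` output globs.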
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json
name: "plastid_metagene_generate"
description: Compute a metagene profile of read alignments, counts, or
  quantitative data over one or more regions of interest, optionally
  applying a mapping rule
keywords:
  - genomics
  - riboseq
  - psite
tools:
  - "plastid":
      description: "Nucleotide-resolution analysis of next-generation sequencing and genomics data"
      homepage: "https://github.com/joshuagryphon/plastid"
      documentation: "https://plastid.readthedocs.io/en/latest/"
      tool_dev_url: "https://github.com/joshuagryphon/plastid"
      doi: "10.1186/s12864-016-3278-x"
      licence: ["BSD-3-clause"]
      identifier: biotools:plastid

input:
  - - meta:
        type: map
        description: |
          Map containing reference information for the reference genome annotation file
          e.g. `[ id:'Ensembl human v.111' ]`
    - annotation:
        type: file
        description: annotation file of reference genome (BED, bigBed, GTF, GFF3)
        pattern: "*.{bed,bigbed,gtf,gff3}"
        ontologies:
          - edam: http://edamontology.org/format_3003 # BED
          - edam: http://edamontology.org/format_3004 # bigBed
          - edam: http://edamontology.org/format_2306 # GTF
          - edam: http://edamontology.org/format_1975 # GFF3
output:
  rois_txt:
    - - meta:
          type: map
          description: |
            Map containing reference information for the reference genome annotation file
            e.g. `[ id:'Ensembl human v.111' ]`
      - "*_rois.txt":
          type: file
          description: Tab-delimited text file describing the maximal spanning
            window for each gene
          pattern: "*_rois.txt"
          ontologies: []
  rois_bed:
    - - meta:
          type: map
          description: |
            Map containing reference information for the reference genome annotation file
            e.g. `[ id:'Ensembl human v.111' ]`
      - "*_rois.bed":
          type: file
          description: |
            Maximal spanning windows in BED format for visualization in a genome
            browser. The thickly-rendered portion of a window indicates its landmark.
          pattern: "*_rois.bed"
          ontologies:
            - edam: http://edamontology.org/format_3003 # BED
  versions:
    - versions.yml:
        type: file
        description: File containing software versions
        pattern: "versions.yml"
        ontologies:
          - edam: http://edamontology.org/format_3750 # YAML
authors:
  - "@suhrig"
maintainers:
  - "@suhrig"
