Update to Nextclade v3 and Enhancement of Coverage and Depth Analysis #7
…improved workflow efficiency
- Introduce new scripts for calculating percent coverage (calc_percent_cov.py), merging BAM coverage (merge_bam_coverage.py), and related utilities to enhance coverage analysis and result integration.
- Add Nextflow modules for segment coverage analysis, merging BAM coverage results, and other related tasks to streamline workflow processes.
- Update nextflow.config and subworkflow modules for assembly typing and input checking to align with new scripts and enhance overall workflow efficiency and accuracy.
updated version to 2.0.0
Is
worrisome? This prints out when I ran the workflow using your test profile. Also, you probably want to remove something from your config file because there's this warning:
I reviewed the code changes and didn't see any issues and I ran a test run which produced the expected output.
Changes and modifications proposed in this Pull Request:
Nextclade Modules Updates:
Updated the NEXTCLADE_VARIABLES, NEXTCLADE_DATASETGET, and NEXTCLADE_RUN modules to Nextclade version 3.0 to leverage the latest enhancements and bug fixes.
New Modules for Segment Coverage Analysis found in the assembly_typing_clade_variables subworkflow:
IRMA_SEGMENT_COVERAGE: Calculates reference length, sequence length, and percent coverage for each segment for a given sample. Output files are placed in the ‘irma_segment_coverage’ directory.
MERGE_COVERAGE_RESULTS: Aggregates coverage data across all samples into a single file, merged_coverage_results.tsv, placed in the ‘irma_segment_coverage’ directory.
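For reviewers, the per-segment calculation described above can be sketched roughly as follows. This is a minimal illustration, not the contents of calc_percent_cov.py: the function names, the output column names, and the choice to exclude `N` and gap characters from the sequence length are all assumptions.

```python
def sequence_length(seq: str, ignore: str = "Nn-") -> int:
    """Count bases in a consensus sequence, skipping characters in `ignore`.

    Excluding N and gap characters is an assumed convention, not confirmed by the PR.
    """
    return sum(1 for base in seq if base not in ignore)


def percent_coverage(ref_len: int, seq_len: int) -> float:
    """Return the sequence length as a percentage of the reference length."""
    if ref_len <= 0:
        raise ValueError("reference length must be positive")
    return round(100.0 * seq_len / ref_len, 2)


def segment_coverage_row(sample: str, segment: str, ref_seq: str, consensus_seq: str) -> dict:
    """Build one per-segment result row (column names are hypothetical)."""
    ref_len = len(ref_seq)
    seq_len = sequence_length(consensus_seq)
    return {
        "sample": sample,
        "segment": segment,
        "reference_length": ref_len,
        "sequence_length": seq_len,
        "percent_coverage": percent_coverage(ref_len, seq_len),
    }
```

Under these assumptions, a consensus that covers half of the reference with called bases would be reported at 50 percent coverage.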
New Modules for Read Mapping Analysis found in the assembly_typing_clade_variables subworkflow:
SAMTOOLS_MAPPED_READS: Calculates the number of reads mapped and mean depth for each segment for a given sample. Output files are placed in the ‘samtools_mapped_reads’ directory.
MERGE_BAM_RESULTS: Aggregates depth analysis data across all samples into a single file, merged_bam_results.tsv, placed in the ‘samtools_mapped_reads’ directory.
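As a rough illustration of the mean depth calculation, the sketch below averages depth per segment from three-column, tab-separated text in the style of `samtools depth` output (reference name, position, depth). The PR does not state which samtools subcommand the module actually runs, so the input format and the function name `mean_depth_per_segment` are assumptions.

```python
from collections import defaultdict


def mean_depth_per_segment(depth_lines):
    """Average the depth column per reference name.

    Expects lines of the form "<segment>\t<position>\t<depth>".
    The mean is taken over the positions present in the input only.
    """
    totals = defaultdict(int)
    positions = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        segment, _pos, depth = line.split("\t")[:3]
        totals[segment] += int(depth)
        positions[segment] += 1
    return {segment: totals[segment] / positions[segment] for segment in totals}
```

Note that `samtools depth` omits zero-depth positions unless it is run with `-a`, so a mean computed this way depends on how the depth file was produced.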
Report Generation found in the assembly_typing_clade_variables subworkflow:
MERGE_BAM_COVERAGE_RESULTS: This module compiles a comprehensive report from the merged_bam_results.tsv and merged_coverage_results.tsv files. The final report, merged_bam_coverage_results.tsv, is placed in the ‘SUMMARY_REPORTS’ directory.
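The report step amounts to joining the two merged tables on a shared key. The sketch below shows one way to do that with the standard library; it is not the contents of merge_bam_coverage.py, and the key columns (`sample`, `segment`) and the left-join behaviour are assumptions.

```python
import csv
import io


def read_tsv(text: str) -> list:
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def merge_on_keys(left: list, right: list, keys=("sample", "segment")) -> list:
    """Left-join `right` onto `left` using the given key columns.

    Rows in `left` without a match in `right` are kept unchanged.
    """
    index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        combined = dict(row)
        # Add the non-key columns from the matching right-hand row, if any.
        combined.update({k: v for k, v in match.items() if k not in keys})
        merged.append(combined)
    return merged
```

For example, passing the parsed rows of merged_bam_results.tsv as `left` and those of merged_coverage_results.tsv as `right` would yield one combined row per sample and segment, assuming both files carry the key columns.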
Configuration Fixes:
Corrected the manifest version within nextflow.config to resolve the "Wrong version printed at runtime" issue.
Additional Notes:
The documentation has been updated to reflect these changes.