Releases: xiaoli-dong/pathogenseq
pathogenseq v1.1.3
Software version update
- bakta (1.9.2 -> 1.9.4)
- ncbi-amrfinderplus (3.11.18 -> 3.12.8)
- tbprofiler (5.0.1 -> 6.2.1)
- AMRFinderPlus database issue: Pathogenseq recently failed because of a schema change in the AMRFinderPlus database. The database was updated to the latest version every time the amrfinderplus program ran, while the program itself had not been updated to match the new schema. To resolve this, I've added a database path option to the program. When a specific database path is provided, AMRFinderPlus no longer attempts to download the database automatically, which gives us better control over which database version is used.
- TBProfiler update (v5.0.1 → v6.2.1): After the update to version 6.2.1, TBProfiler no longer generates the JSON file the pipeline expected, which caused an error stating that the tbprofiler_illumina.json file was missing. I've updated the TBProfiler collate module in Pathogenseq v1.1.3 to fix this issue, and the error message should no longer appear.
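As a rough illustration of the AMRFinderPlus database fix above, the sketch below builds an `amrfinder` command line that pins the database when a path is supplied. The helper name `build_amrfinder_cmd` and all file paths are hypothetical and are not taken from the pipeline; `--nucleotide`, `--database`, and `--output` are AMRFinderPlus command line options.

```python
from typing import List, Optional


def build_amrfinder_cmd(contigs: str, output: str, db_path: Optional[str] = None) -> List[str]:
    """Build an amrfinder command (hypothetical helper, not pipeline code)."""
    cmd = ["amrfinder", "--nucleotide", contigs, "--output", output]
    if db_path:
        # Pin the database directory so the run uses a known database version.
        cmd += ["--database", db_path]
    return cmd


# Placeholder paths, for illustration only.
pinned = build_amrfinder_cmd("sample1.fasta", "sample1.amr.tsv", "/path/to/amrfinderplus_db")
default = build_amrfinder_cmd("sample1.fasta", "sample1.amr.tsv")
print(" ".join(pinned))
```

Keeping the database directory fixed alongside a fixed program version is what avoids the schema mismatch: neither changes unless you change it deliberately.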
pathogenseq v1.1.0
- Fixed a bug in the Polypolish and POLCA short-read polishing step. Input reads and contigs were mis-associated during mapping, so reads could be mapped to contigs from other samples.
- Fixed a bug in the long-read assembly: the pipeline crashed when no short reads were available.
- Updated the versions of the following tools:
- samtools (1.16.1 -> 1.19.2)
- fastqc (0.11.9 -> 0.12.1)
- mlst (2.19.0 -> 2.23.0)
- bakta (1.6.1 -> 1.9.2)
- nanoplot (1.41.0 -> 1.41.6)
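The Polypolish and POLCA fix above comes down to pairing each sample's reads with that same sample's contigs. Below is a minimal sketch of keyed pairing, assuming reads and contigs are each tracked by sample id. The function and file names are hypothetical, and this is a Python analogy rather than the pipeline's Nextflow code.

```python
from typing import Dict, List, Tuple


def pair_by_sample(reads: Dict[str, str], contigs: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Pair reads with contigs that share a sample id (illustrative only)."""
    # Join on the sample id instead of relying on list order, which can
    # differ between two independently produced collections.
    shared = sorted(reads.keys() & contigs.keys())
    return [(sid, reads[sid], contigs[sid]) for sid in shared]


# Inputs arrive in different orders, and s3 has no short reads at all.
reads = {"s2": "s2.fastq.gz", "s1": "s1.fastq.gz"}
contigs = {"s1": "s1.contigs.fasta", "s2": "s2.contigs.fasta", "s3": "s3.contigs.fasta"}
pairs = pair_by_sample(reads, contigs)
```

Pairing by position would have matched the s2 reads with the s1 contigs here; joining on the key avoids that.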
pathogenseq v1.0.4
Added a de-hosting module using hostile. De-hosting is enabled by default. You can disable or enable it using the following command line options:
- "--skip_illumina_dehost true"
- "--skip_nanopore_dehost true"
Added command line options that set the minimum amount of data, in base pairs, required to proceed to the assembly step:
- "--min_tbp_for_assembly_illumina", default value is 1000000 bp
- "--min_tbp_for_assembly_nanopore", default value is 1000000 bp
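To make the minimum-data check concrete, here is a small sketch that sums the bases in a FASTQ stream and compares the total with a threshold. It assumes plain four-line FASTQ records and a "greater than or equal" comparison; the function names are hypothetical, and the pipeline's own implementation may differ.

```python
import io
from typing import TextIO


def total_bases(fastq: TextIO) -> int:
    """Sum sequence lengths in a four-line-per-record FASTQ stream."""
    total = 0
    for i, line in enumerate(fastq):
        if i % 4 == 1:  # the second line of each record holds the bases
            total += len(line.strip())
    return total


def enough_data(fastq: TextIO, min_tbp: int = 1_000_000) -> bool:
    """Return True when the stream meets the minimum total base pairs."""
    # Assumption: a sample exactly at the threshold is allowed to proceed.
    return total_bases(fastq) >= min_tbp


# Two tiny records with 4 + 3 bases, checked against a toy threshold.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACG\n+\nIII\n")
print(enough_data(demo, min_tbp=5))
```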
Added pathogen-specific tools, which you can disable or enable using command line options:
- emmtyper for Streptococcus pyogenes emm typing. You can disable or enable the function through the command line option "--skip_emmtyper false". By default, the function is disabled.
- tb-profiler for Mycobacterium tuberculosis whole genome analysis. You can disable or enable the function through the command line option "--skip_tbprofiler false". By default, the function is disabled.
- pneumocat for Streptococcus pneumoniae capsular typing. You can disable or enable the function through the command line option "--skip_pneumocat false". By default, the function is disabled.
- GBS-SBG tools for Group B Streptococcus serotyping. You can disable or enable the function through the command line option "--skip_gbssbg false". By default, the function is disabled.
When there is not enough input data, some of the tools included in the pipeline (dnaapler, shovill, pneumocat, tbprofiler) produce an error instead of exiting cleanly, which caused the whole pipeline to abort. I changed the error handling strategy to ignore: when one of these tools fails, the pipeline issues a warning message and proceeds to the next sample or the next step instead of failing as a whole.
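The ignore strategy described above can be pictured with a short Python analogy: each sample is processed independently, and a failure is reported as a warning instead of stopping the run. The names `run_ignoring_failures` and `fake_tool` are hypothetical stand-ins; the pipeline itself relies on its workflow engine's error handling, not on this code.

```python
import warnings
from typing import Callable, Iterable, List


def run_ignoring_failures(samples: Iterable[str], tool: Callable[[str], str]) -> List[str]:
    """Run tool on each sample; warn and skip on failure (illustrative only)."""
    results = []
    for sample in samples:
        try:
            results.append(tool(sample))
        except Exception as err:
            # Report the failure and carry on instead of aborting everything.
            warnings.warn(f"{sample}: tool failed ({err}); skipping")
    return results


def fake_tool(sample: str) -> str:
    # Hypothetical stand-in for a tool that fails on too little input data.
    if sample == "low_depth":
        raise RuntimeError("not enough reads")
    return f"{sample}.done"


done = run_ignoring_failures(["s1", "low_depth", "s2"], fake_tool)
```

The trade-off is that a skipped sample leaves no output for that step, so the warnings are worth reviewing after a run.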
pathogenseq v1.0.3
- Assembly stats are now generated with assembly-stats instead of "seqkit stats"
- In the read stats table, the first column now holds the sample id instead of the sequence file name
- Added the contig_report.txt files produced by mob_typer, which describe the assignment of each contig to the chromosome or to a particular plasmid grouping
pathogenseq v1.0.2
- Reformatted the output sample id: removed the "_T" suffix
- Added the shovill assembler, which supports downsampling
pathogenseq v1.0.1
- Added assembled contig depth information by mapping qc reads to contigs
- Changed the samplesheet from seven columns to five columns, removed "genome_size" and "fast5" columns
- Changed the basecaller_mode value from "fast|hac|sup" to a model value allowed by medaka, such as "r1041_e82_400bps_sup_v4.2.0"
- Updated the nextflow_schema files to make the pipeline command line options more readable