Releases: BuysDB/SingleCellMultiOmics
v0.1.22: Merge pull request #242 from BuysDB/chic_se
Chic single end read support and custom taps contig support
SingleCellMultiOmics 0.1.20
Merge pull request #231 from BuysDB/bwdif
Added script to generate bw diff
SingleCellMultiOmics 0.1.19
This version contains a fix for a bug which caused QC-failed reads to sometimes get lost when multi-threading is enabled. Valid reads were not affected.
Early 2021
Contains many updates from 2020 and Jan 2021
SingleCellMultiOmics 0.1.17
Version 0.1.17 of singlecellmultiomics
SCMO 0.1.13
Version bump.
SCMO 0.1.12
All barcode files now start at index 1
BamTabulator: show available tags and pysam attributes
Added -max_handles setting to bamSplitByTag.py
Index bams using multiple threads
Added overseq count table generator
Extended help and added blacklist argument
Added functions to read and write blacklist to bam file header
Added timeout args
Write blacklisted regions to bam file
Change default strand inversion value
Use -p flag for samtools merge to prevent PN: dupes
Fall back on calling samtools binary, it works better with -c -p flags
Added gitignore folders
Added XA filter to bamFilter
Check validity of --contexts_to_capture
Fixed taps strand parameter bug, set taps strand default to R
Fix out of range error when using contexts_to_capture and molecule is on start of contig
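The samtools-related entries in this release (multi-threaded indexing, and falling back on the samtools binary with the -c -p flags when merging) can be sketched as follows. This is a minimal illustration that only builds the command lines; the helper names merge_command and index_command are hypothetical and are not part of SingleCellMultiOmics. It assumes a samtools binary that supports `merge -c -p` and `index -@`.

```python
import shlex
from typing import List, Sequence


def merge_command(output_bam: str, input_bams: Sequence[str], threads: int = 1) -> List[str]:
    """Build a `samtools merge` call (hypothetical helper).

    -c combines @RG header lines with colliding IDs,
    -p does the same for @PG header lines, which avoids duplicated PN: entries.
    """
    if not input_bams:
        raise ValueError("at least one input bam is required")
    return ["samtools", "merge", "-c", "-p", "-@", str(threads), output_bam, *input_bams]


def index_command(bam_path: str, threads: int = 1) -> List[str]:
    """Build a `samtools index` call that uses several threads (hypothetical helper)."""
    return ["samtools", "index", "-@", str(threads), bam_path]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    print(shlex.join(merge_command("merged.bam", ["a.bam", "b.bam"], threads=4)))
    print(shlex.join(index_command("merged.bam", threads=4)))
```

The resulting lists can be passed to subprocess.run(cmd, check=True) once the paths point at real files.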
SCMO 0.1.11
This release contains a bugfix which resolves molecules being under or over counted when using bamToCountTable.py --dedup
BuysDB (49):
Added 3bp context profiler
Added annotated chic molecule tagging method
Added consensus options to multiprocessing
Added context extraction method
Added covariate extraction methods
Added covariate_key generator
Added customisable offset for wig export
Added custom methylation contexts
Added DS-methylation extraction script
Added no-qcfail flag
Added prob_to_phred method
Added --r1only to chic workflow
Added recalibration functions
Added simple mutation profiler
Added -tagthreads parameter to control how many threads are used for tagging
Add options and tests to only count R1, or R2
Always use both mates
Check for min_mq not being defined
Check if all files are indexed
Cleaned up code and added examples
Clean up handles to prevent memory leaks
Clean up labels
Clip output confidences to 0-62 phred range
Close plots to reduce memory footprint
Create plots and aggregate by strand and mate
Fixed tag descriptions
Fix phred score calculation
Improved handling when passing over deletions. Refactoring
**Never count half counts on properly formatted bam files**
Optimisations: do not perform pileups in single bp binsize mode, perform pruning in thread.
Parameter passing fixes, and dealing with some globals
Properly import the indexing function
Pseudoread super call
Raise error when wrong fasta file is supplied
Refactoring
Removed DS check, added verbosity flag, fixed indentation
Removed required reference path
Renamed bamFileTabulator in Readme to bamTabulator
Revert to mean normalisation when median fails
Run on a single node for slurm (-N 1)
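Several entries in the list above concern Phred scores (a prob_to_phred method, a fixed Phred calculation, and clipping confidences to the 0-62 range). The sketch below shows the standard conversion Q = -10 * log10(p_error) followed by clipping. It is not the library's implementation: the function name error_prob_to_phred, its signature, the rounding, and the assumption that the input is an error probability (rather than a probability of being correct) are all illustrative choices.

```python
import math

MAX_PHRED = 62  # upper bound of the clipping range named in the release notes


def error_prob_to_phred(p_error: float, max_phred: int = MAX_PHRED) -> int:
    """Convert an error probability to a Phred score clipped to [0, max_phred].

    Hypothetical helper; assumes p_error is the probability that a call is wrong.
    """
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must be between 0 and 1")
    if p_error == 0.0:
        # log10(0) is undefined, so a zero error probability maps to the cap.
        return max_phred
    phred = -10.0 * math.log10(p_error)
    # Round to the nearest integer, then clip to the allowed range.
    return int(min(max(round(phred), 0), max_phred))


if __name__ == "__main__":
    for p in (1.0, 0.1, 0.001, 1e-9, 0.0):
        print(p, error_prob_to_phred(p))
```

Clipping matters because very small error probabilities would otherwise give scores outside the range that is to be stored.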
Spring
SCMO v0.1.9
Summary:
BuysDB (331):
Updated download URL
Added get_samples_from_bam function
Added get_sample_to_read_group_dict functions
Added sample extraction script
Moved extract_samples method to function which can be unit tested
Added tests for extract_samples
Clean up all files created during testing. (Some .bai files remained)
Removed tf requirement
bugfix: no newline after map index rows
Added scartrace module to Molecule
Added scartrace module to Fragment
Import submodules
Added scartrace to bamtagmultiome
scartrace: Check if read is mapped before looking into the alignment
Added allele cache flag
Bamfilter: fixed header formatting
Use cigar in deduplication, fixed #84
Added test for #84
Use --no_umi_cigar_processing to disable the new behaviour
Fixed test case
Fixed test case file
Simplified sorted_bam handle
Removed use of getPairGenomicLocations and allowed fragments to decide their span.
Added get_safe_span() method to fragment which reports the span excluding primers
Added region parameters to AlleleResolver to reduce memory footprint
Added documentation
Also use region parameters when fetching from cache
Added pileup module
Added check_eject_every=None option to MoleculeIterator
Updated example
Fixed module reference
Version increment
Extract base-calls for fragments mapping to multiple contigs
Use the contig of the random primer in IVT deduplication
Pass kwargs to pysam pileup and set higher max_depth
Variant masking tool now runs for multiple contigs in parallel and will not crash when the VCF does not match the fasta file completely
Reading the vcf using 4 threads per process
Added support for non-properly paired reads
Added resolve_unproperly_paired_reads to bamtagmultiome
Set more decompression threads and fixed description
Set program ID tag in PG header line
The BI tag was used to identify the cell index, but it clashes with GATK. It is now changed to lowercase bi.
Added forwards compat
Added --slurm flag to submission.py.
Set job name
Fixed BI tag compat
Automatically convert BI to bi tag
Fixed bug accessing tag dict
Started work on slurm/sge/local wrapper
submission.py is now slurm compatible, added API to submit and hold jobs
Added scheduler selection argument to bamtagmultiome
fixed import
Removed references to args
Return job_id
Added slurm wrapper for snakemake
Added description to iterator class inputs
Fixed typo
Added legacy scripts
Made legacy scripts PEP8 compliant
Added job_name argument to submit_job
job_alias is now optional
Changed passed arg
Parse scientific notated locations during bed parsing
Use chromosome index in job script name
Perform explicit cd to working dir
Set job name of final job
Fixed job_name
Display id of last job
Show job ids of intermediate jobs
Use after: in slurm dependency submission
Addition to previous
Check for hold being None
Strip hold input
Use afterok instead of after
Strip job ids
Use one UUID for a single bamtagmultiome run
Show holding command
Concatenate all job ids in one dependency command
Use : as job separator
Changed argument order
Pass None to API when hold is empty
Set job name when using CLI
Swapped prefix and hard job name
job_alias
Added utf8 header
Generate random job name if not specified
Prefix job for sge compat
Typo fixes
Job name is now properly set when supplied. File names are timestamped if not specified.
Demux.py: create unique glue job name
Added script to match bam file with bqsr report
More descriptive error message when autodetection fails
Added memory management parameters for molecule iteration
Added parameters to Molecule to cap the amount of associated fragments
Fixed? the SLURM wrapper for snakemake workflows
Added slurm wrapper to setup
Parse job runtime from resources
Added SLURM command example to scmo_workflow.py
Added MUTECT2 workflow
Tweaked resources
Added first pass variant calling
sge and slurm wrapper now use the same API calls
Report job id
Set correct index name
Fixes #101
Extraction
Added germline variant filters
Some syntax fixes
Added germline filter message and header
SNV filter
Added extra uuid4
Write intermediate results
Added -filterMP flag to bamToCountTable
Double dash
Fixed tests
Added threads to bamcnv
Updated test cases with blacklist argument
Added CS2 demux without hexamer
Set class name
Added CELSeq2_c8_u6_NH to strat loader
Added test case for CELSEQ demux. Fixed hexamer setting of NH.
Fall back on using qsub when sbatch is not available
Added workflow for SCMO (not featurecounts) celseq2 analysis
Fixed exon gtf script name in description
Added capture_locations argument
Added hash function to SingleEndTranscript (speed benefit)
Re-ordered demux methods
Added genomic plot class
Added bamFeatures module
Indentation fix
Fixed broken indent
Updates for chic
Added script to split bamfile by tag
Added skip_contig option to bamtagmultiome
Added demux tests
Added compat for already demultiplexed index
Fixed cell-readcount plot
Added get_contig_size to bamprocessing utils
Fast multi-processing count table generation
FeatureCountsFullLengthFragment fragment class added
Fixed variable declaration
Added linting script
Added full length featurecounts dedup option to bamtagmultiome (fl_feature_counts)
Removed unused imports
Allow pysam.FastaFile as argument
Added method to reset axis of a contig
Swapped dictionary indexing
Added key_tags argument
Allow pysam handle
Scale axis and despine
Fixed ax reference
Added dedup option
Added genome coverage plot to library stats
Added bam_is_processed_by_program function
Autodetect which bam file should be used if not supplied
Added more arguments to configure memory limits
Don't use multiprocessing when one thread is requested
Added variant extraction to workflow
Added cn clustermap
Added more comments and only check sample when read is used
Make sure the contigs are in the correct order
Added live counting function
Added lowess count correction
Added script for extracting and plotting cn
Added progress indication
Fix print statement
Removed incorrect argument
Added missing carriage return
Bugfix: Check if gc matrix needs to be computed
Added max_fragment_size threshold
Added option to set a single read group sample id per library
Added option to allow shift in cycle
Write rejection reason tag
Added parameter to expose setting to allow cycle shift
Added read group format setting to bamtagmultiome
Added allow_cycle_shift to bamtagmultiome
allow_cycle_shift=False by default
Added test case and updated other test cases
Added overflow support to MoleculeIterator
Raise overflow error when too many fragments are being associated with a molecule
Added association limit parameters
Added callback function to MoleculeIterator to monitor progress and state
Added performance logging methods to bamtagmultiome
Correctly handle yield_invalid flag for overflow reads
Added yield_overflow parameter to MoleculeIterator
Added --no_overflow parameter to bamtagmultiome
Optimized ordering of progress indication and shows percentage deleted reads
Added verbosity settings
Added integrity status files and testing. Fixes #65
Added input_is_sorted argument
Refactored read group code
Added script to convert read group format of bam file
Made bamtagmultiome use the new read group protocol
Added get_read_group_from_read function to bamprocessing
Prevent duplicate program IDs
Added get_read_group_format function
Refactoring
Demux.py is now twice as fast.
Bugfix: pass keyword arguments in all Fragment classes
Fix kwargs
Set variant key to include ref and alt base
Added variants module
Added variant wrapper class which can be pickled
Start of postprocessing module
Added fast_compression flags to multiple functions
Added test case for writing with faster compression
Formatting
Added prototype bamtagmultiome script which uses multiple CPUs and automatically blacklists regions (scCHiC only for now)
Added more command line accessible arguments
Added functions to combine overlapping ranges
Added function to clip a list of regions between set boundaries
Added function to generate overlapping ranges excluding blacklisted regions
Added test case for blacklisted binning
Added blacklist option to bamtagmultiome_multi
Added statsmodels dependency
Minor tweaks
Bugfix: assume average GC for a region with only Ns in the reference sequence
Bugfixes
Added min_mapping_qual and debug_job_bin_bed arguments
Added min_mapping_qual to molecule iterator
Bugfix, always define total_com...
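The scheduler-related entries in the list above (strip job ids, use afterok instead of after, use : as job separator, concatenate all job ids in one dependency command, pass None when hold is empty) can be illustrated with a short sketch. It builds an sbatch command line using the `--dependency=afterok:<id>:<id>` syntax that SLURM accepts. The function names dependency_argument and sbatch_command are hypothetical and do not reflect the actual API of submission.py.

```python
from typing import Iterable, List, Optional


def dependency_argument(hold: Optional[Iterable[str]]) -> Optional[str]:
    """Return the value for sbatch --dependency, or None if there is nothing to wait for.

    Hypothetical helper: job ids are stripped of whitespace, empty ids are dropped,
    and the remaining ids are joined with ':' into a single afterok dependency.
    """
    if hold is None:
        return None
    job_ids = [str(job_id).strip() for job_id in hold]
    job_ids = [job_id for job_id in job_ids if job_id]
    if not job_ids:
        return None
    return "afterok:" + ":".join(job_ids)


def sbatch_command(script: str, job_name: str,
                   hold: Optional[Iterable[str]] = None) -> List[str]:
    """Build an sbatch call that waits for all held jobs to finish successfully."""
    command = ["sbatch", f"--job-name={job_name}"]
    dependency = dependency_argument(hold)
    if dependency is not None:
        command.append(f"--dependency={dependency}")
    command.append(script)
    return command


if __name__ == "__main__":
    # Job ids as they might be read from scheduler output, including stray whitespace.
    print(sbatch_command("merge.sh", "merge_final", hold=[" 1201 ", "1202\n"]))
```

With afterok the dependent job starts only if every listed job exits successfully, whereas after only waits for the listed jobs to start or be cancelled.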
Crow
Added:
Script to split bam by cluster
Feature counts compatibility with bamtagmultiome
CHIC+T experimental
Added nla_no_overhang method to bamtagmultiome
Added CellReadCount plot to libraryStatistics.py
Added tensorflow based consensus caller
Bugfixes:
Try sorting at multiple locations
All submodules now have an entry in the docs
Library statistics: fixed bug where tagged.bam was not detected
And updated version name.