Releases: PlantandFoodResearch/TEFingerprint
Alpha-0.3.2
v0.3.2
- alpha
- changes
- rename cluster classes and methods IDBCAN and SIDBCAN to DBICAN and SDBICAN respectively
- minor changes for compatibility with python 3.4
- added Makefile
- bug fixes
- simplify and fix travis CI
- documentation
- add MIT licence
- improve method docs
- add graphics to method docs #74
- document requirements
Alpha-0.3.1
v0.3.1
- alpha
- new features
- maximum count proportion #115
- added 'max_count_proportion' field to output (was already used for determining colour)
- renamed '--no-colour' switch to '--no-max-count-proportion'
- changes
- package version now stored in single location 'version.py'
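Keeping the version in one file usually means other tooling reads it from there instead of repeating the number. A minimal sketch of that idea follows; the `__version__` variable name and the file contents are assumptions for illustration, since the actual layout of 'version.py' is not shown in these notes.

```python
import re


def parse_version(source):
    """Extract a version string from the text of a version module.

    Assumes (hypothetically) a line of the form: __version__ = '0.3.1'
    """
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", source, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


# Hypothetical contents of a version.py file
example_source = "__version__ = '0.3.1'\n"
print(parse_version(example_source))
```

Parsing the text rather than importing the package avoids needing the package's dependencies installed at build time.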
Alpha-0.3.0
- new features
- informative reads
- option to include soft-clipped tips in place of their mate informative read
- option to exclude full-length informative reads completely
- output formats
- extract-informative temporary files #106
- temporary files are now written to the same location as the output by default
- option to not remove these files
- option to write them to a temp location provided by the operating system
- changes
- added Biopython as a dependency to support block-gzip compression
- refactor submodules to hide application specific code
- refactor clustering classes/methods to use new names
- non-hierarchical method named IDBCAN (Interval Density Based Clustering of Applications with Noise)
- conservative-hierarchical method named SIDBCAN (Splitting-IDBCAN)
- aggressive-hierarchical method named SIDBCAN-aggressive and deprecated
- bug fixes
- documentation
- named clustering algorithms loosely based on DBSCAN
- non-hierarchical method named IDBCAN (Interval Density Based Clustering of Applications with Noise)
- hierarchical method named SIDBCAN (Splitting-IDBCAN)
- many corrections to documentation of clustering algorithms #99
- specify tabix command to index block-gzipped output
Alpha-0.2.0
- new features
- new 'conservative' clustering method #91
- changes
- new 'conservative' clustering method used by default #91
- simplified clustering/splitting method selection with flag '--splitting-method'
- changed library imports to include loci, fingerprint, fingerprintio and cluster
- tidied cluster.py submodule code
- bug fixes
- documentation
- updated method.rst with new clustering method
- updated usage.rst with new clustering method
- updated CLI help text
- updated cluster.py docstrings with new method
- added metadata for PyPI
Alpha-0.1.4
Substantial changes to code base in order to accommodate new features.
Command-line-tool names and arguments have changed - see the documentation.
Core clustering algorithms are unchanged and will identify the same clusters/features as before.
- Renamed tools
- remove 'tef' wrapper as it is likely to have a name collision with something
- 'tef fingerprint/compare' are combined into single tool 'tefingerprint' #63
- 'tef preprocess' is now 'tef-extract-informative'
- 'tef filter-gff' is now 'tef-filter-gff' (note: this tool's behavior has changed substantially to handle the new gff output)
- Refactored modules
- core sub-module 'loci.py' re-written to be more flexible #82
- single data structure for representing collections of loci
- arbitrary string length limit removed (use of python strings in place of numpy strings)
- split tool logic into separate sub-module 'fingerprint.py'
- New Features
- tefingerprint:
- trim buffered clusters to extent of read tips
- count n most common elements per sample in each bin #63 #81
- use gff annotation for tagging known elements #80
- join paired clusters using gff annotation #78
- output files (gff, csv) are optionally pipe-able
- output files (gff, csv) contain more detailed data
- can read (annotation gff) and write files compressed with gzip or bz2 #86
- escape special characters in gff files with percent encodings
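To illustrate what percent-encoding of GFF special characters looks like, here is a minimal sketch. The set of reserved characters is an assumption based on the GFF3 convention for the attribute column; these notes do not say exactly which characters the tool escapes, and `escape_gff_value` is a hypothetical helper, not part of the package.

```python
from urllib.parse import unquote

# Assumption: characters with reserved meaning in a GFF3 attribute column
RESERVED = ";=&,%\t\n\r"


def escape_gff_value(value):
    """Replace each reserved character with its %XX percent encoding."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in RESERVED else ch for ch in value
    )


raw = "Gypsy;family=Ty3,copy 1"
escaped = escape_gff_value(raw)
print(escaped)
# Standard URL-style decoding restores the original text
print(unquote(escaped) == raw)
```

Escaping '%' itself keeps the encoding reversible, so a decoded value always matches the original.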
- tef-filter-gff:
- changed to handle new gff output #84
- use of '--any' and '--all' contexts for combining filters
- unix style wild cards for matching multiple fields
- read and write gz and bz2 compressed files #86
- escape special characters with percent encodings
- read gff from standard in
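The combination of "any"/"all" contexts with unix style wild cards can be sketched as follows. This is an illustration of the general idea only: the function name, the filter representation, and the attribute names are hypothetical, and the real filter syntax of 'tef-filter-gff' is described in the project documentation rather than here.

```python
from fnmatch import fnmatchcase


def feature_passes(attributes, filters, mode="all"):
    """Check GFF-style attributes against wildcard filters.

    attributes: dict mapping field name to value (both strings)
    filters: list of (field_pattern, value_pattern) pairs
    mode: "all" requires every filter to match, "any" requires at least one
    """
    results = (
        any(
            fnmatchcase(field, field_pattern) and fnmatchcase(value, value_pattern)
            for field, value in attributes.items()
        )
        for field_pattern, value_pattern in filters
    )
    combine = all if mode == "all" else any
    return combine(results)


# Hypothetical feature attributes and filters
attributes = {"sample_0_count": "12", "sample_1_count": "0", "category": "Gypsy"}
filters = [("sample_*_count", "0"), ("category", "Copia")]

print(feature_passes(attributes, filters, mode="any"))
print(feature_passes(attributes, filters, mode="all"))
```

A wildcard in the field pattern lets one filter apply across several fields at once, here every per-sample count.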
Alpha-0.1.3
- Performance improvements:
- Fixes
- Documentation
- Split readme into multiple documents
- Switched to .rst for documentation
- Added description of methods
Alpha-0.1.2
Alpha-0.1.1
- Minor fixes for cluster support calculation #53
- Default selection of starting epsilon value now matches Campello et al 2015
- Update terminology to reflect that used in Campello et al 2015
- Tests
Alpha-0.1.0
- Pre-Processing
- Corrections for reverse-complemented reads when extracting dangler reads
- Inclusion of soft-clipped sections from the outer end of proper-mapped pairs
- Separated mapping reads to repeats from pre-processing script
- Fingerprinting/Compare
- Quality score filtering options
- New output format options
- Other
- Bump version to 0.1.0
- Change status to alpha
- Updated documentation
- Additional tests
Tabular Data Structures
Improvements:
- Nicer data structures for use as a library #42
- Easy inter-op with Numpy, Pandas and Plotting libraries in python #42
- Improved buffering of comparative bins
- Multi-process pipelines return results to parent process rather than printing directly to file
- GFF output is sorted by chromosome, start position, stop position #32
- Added better examples to readme #40
- improved testing of fingerprint and compare data methods #6
- Compare output is no longer nested, contains normalised read counts #4, and is coloured by read count proportions #43
- filter_gff program is simplified (no longer deals with nested gff)
- renamed module to 'tefingerprint' and CLI to 'tef' #3
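The GFF sort order mentioned above (chromosome, then start position, then stop position) amounts to sorting on a three-part key. A minimal sketch, using made-up records and plain tuples rather than the package's own data structures:

```python
# Hypothetical features as (chromosome, start, stop) tuples
records = [
    ("chr2", 100, 200),
    ("chr1", 500, 650),
    ("chr1", 500, 600),
    ("chr1", 20, 90),
]

# Sort by chromosome name, then start, then stop
sorted_records = sorted(records, key=lambda r: (r[0], r[1], r[2]))

for chromosome, start, stop in sorted_records:
    print(chromosome, start, stop)
```

Note that chromosome names compare as text in this sketch, so 'chr10' sorts before 'chr2'; whether the tool orders names this way is not stated in these notes.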