Releases · PlantandFoodResearch/TEFingerprint

31 May 02:04

timothymillar

v0.3.2

3e81275

Alpha-0.3.2 Latest

v0.3.2

alpha
changes
- rename cluster classes and methods IDBCAN and SIDBCAN to DBICAN and SDBICAN respectively
- minor changes for compatibility with python 3.4
- added Makefile
bug fixes
- simplify and fix travis CI
documentation
- add MIT licence
- improve method docs
- add graphics to method docs #74
- document requirements

21 Sep 03:56

timothymillar

v0.3.1

4ad7c8e

Alpha-0.3.1

v0.3.1

alpha
new features
- maximum count proportion #115
  - added 'max_count_proportion' field to output (was already used for determining colour)
  - renamed '--no-colour' switch to '--no-max-count-proportion'
changes
- package version now stored in single location 'version.py'
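
One common way to make use of a single version file is for the packaging script to parse it as text rather than import the package. The following is a minimal sketch of that idea, assuming 'version.py' contains an assignment of the form `__version__ = '...'`. The variable name and the helper `read_version` are illustrative assumptions and are not taken from the TEFingerprint source.

```python
import re

# Assumed (hypothetical) layout of version.py: a single assignment such as
#     __version__ = '0.3.1'
VERSION_PATTERN = re.compile(
    r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", re.MULTILINE
)


def read_version(source_text):
    """Extract the version string from the text of a version.py file."""
    match = VERSION_PATTERN.search(source_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


if __name__ == "__main__":
    # In a packaging script the text would be read from the file itself.
    print(read_version("__version__ = '0.3.1'\n"))
```

Parsing the file as text avoids importing the package (and its dependencies) at install time.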

15 Jul 22:15

timothymillar

v0.3.0

15b49b6

Alpha-0.3.0

new features
- informative reads
  - option to include soft-clipped tips in place of their mate informative read
  - option to exclude full-length informative reads completely
- output formats
  - outputs with the .gz extension are compressed with block-gzip rather than regular gzip #104
  - option to output tab-delimited plain text files suitable for tabix indexing when block-gzipped #103
- extract-informative temporary files #106
  - temporary files are now written to the same location as the output by default
  - option to not remove these files
  - option to write them to a temp location provided by the operating system
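
The three temporary-file behaviours listed above (next to the output by default, optionally kept, optionally in an operating-system temp location) can be sketched with the standard library. This is only an illustration: the function name `scratch_directory` and its parameters are hypothetical and do not reflect the actual options of the extract-informative tool.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_directory(output_path, use_system_tmp=False, keep=False):
    """Yield a directory for intermediate files (hypothetical helper).

    By default the directory is created beside the output file.
    """
    if use_system_tmp:
        parent = None  # let tempfile choose the OS temp location
    else:
        parent = os.path.dirname(os.path.abspath(output_path))
    path = tempfile.mkdtemp(prefix="tef_tmp_", dir=parent)
    try:
        yield path
    finally:
        if not keep:
            # Remove intermediate files unless the caller asked to keep them.
            shutil.rmtree(path, ignore_errors=True)
```

Writing beside the output keeps large intermediate files on the same file system as the result, which can matter when the system temp location is small.
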
changes
- added Biopython as a dependency to support block-gzip compression
- refactor submodules to hide application specific code
- refactor clustering classes/methods to use new names
  - non-hierarchical method named IDBCAN (Interval Density Based Clustering of Applications with Noise)
  - conservative-hierarchical method named SIDBCAN (Splitting-IDBCAN)
  - aggressive-hierarchical method named SIDBCAN-aggressive and deprecated
bug fixes
- python 3 shebang lines #98
- add MANIFEST.in to allow for github install with pip #89
documentation
- named clustering algorithms loosely based on DBSCAN
  - non-hierarchical method named IDBCAN (Interval Density Based Clustering of Applications with Noise)
  - hierarchical method named SIDBCAN (Splitting-IDBCAN)
- many corrections to documentation of clustering algorithms #99
- specify tabix command to index block-gzipped output
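
The clustering methods named in this release are described only as loosely based on DBSCAN. As a rough illustration of density-based clustering on one-dimensional positions, the sketch below treats any run of `min_points` consecutive sorted points spanning at most `epsilon` as dense and merges dense runs that overlap. This is an assumption made for illustration, not the TEFingerprint implementation; the function name and parameters are hypothetical.

```python
def cluster_1d(positions, min_points, epsilon):
    """Return (start, stop) intervals covering dense runs of positions."""
    points = sorted(positions)
    clusters = []
    for i in range(len(points) - min_points + 1):
        start = points[i]
        stop = points[i + min_points - 1]
        if stop - start > epsilon:
            continue  # these min_points points are too spread out
        if clusters and start <= clusters[-1][1]:
            # Overlaps the previous dense run: extend that cluster.
            clusters[-1][1] = max(clusters[-1][1], stop)
        else:
            clusters.append([start, stop])
    return [tuple(cluster) for cluster in clusters]


if __name__ == "__main__":
    tips = [1, 2, 3, 4, 20, 40, 41, 42, 43, 44, 100]
    print(cluster_1d(tips, min_points=3, epsilon=5))
```

Points that never fall inside a dense run (20 and 100 in the example input) are left out of every interval, which corresponds to the "noise" in the method names.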

09 Feb 02:01

timothymillar

v0.2.0

f412792

Alpha-0.2.0

new features
- new 'conservative' clustering method #91
changes
- new 'conservative' clustering method used by default #91
- simplified clustering/splitting method selection with flag '--splitting-method'
- changed library imports to include loci, fingerprint, fingerprintio and cluster
- tidied cluster.py submodule code
bug fixes
- fixed miscalculation of initial cluster support #90
- fixed miscalculation of parent vs child support #95
- fixed name collision between sub module cluster and function loci.cluster
documentation
- updated method.rst with new clustering method
- updated usage.rst with new clustering method
- updated CLI help text
- updated cluster.py docstrings with new method
- added metadata for PyPI

09 Jan 03:07

timothymillar

v0.1.4

f8373bc

Alpha-0.1.4

Substantial changes to code base in order to accommodate new features.
Command-line-tool names and arguments have changed - see the documentation.
Core clustering algorithms are unchanged and will identify the same clusters/features as before.

Renamed tools
- remove tef wrapper as it is likely to have a name collision with something
  - tef fingerprint/compare are combined into single tool tefingerprint #63
  - tef preprocess is now tef-extract-informative
  - tef filter-gff is now tef-filter-gff (note: this tool's behavior has changed substantially to handle the new gff output)
Refactored modules
- core sub-module loci.py re-written to be more flexible #82
  - single data structure for representing collections of loci
  - arbitrary string length limit removed (use of python strings in place of numpy strings)
  - split tool logic into separate sub-module fingerprint.py
New Features
- tefingerprint
  - trim buffered clusters to extent of read tips
  - count n most common elements per sample in each bin #63 #81
  - use gff annotation for tagging known elements #80
  - join paired clusters using gff annotation #78
  - output files (gff, csv) are optionally pipe-able
  - output files (gff, csv) contain more detailed data
  - can read (annotation gff) and write files compressed with gzip or bz2 #86
  - escape special characters in gff files with percent encodings
- tef-filter-gff
  - changed to handle new gff output #84
  - use of --any and --all contexts for combining filters
  - unix style wild cards for matching multiple fields
  - read and write gz and bz2 compressed files #86
  - escape special characters with percent encodings
  - read gff from standard in
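
Both tools above escape special characters in gff files with percent encodings. A minimal sketch of that kind of escaping for attribute values is shown below, assuming the reserved set from the GFF3 specification (control characters, `%`, `;`, `=`, `&` and `,`). The function names are hypothetical, and the exact set of characters escaped by TEFingerprint is not stated in these notes.

```python
import re
from urllib.parse import unquote

# Characters assumed to need escaping inside a GFF3 attribute value:
# control characters (including tab and newline), '%', ';', '=', '&', ','.
_RESERVED = re.compile(r"[\x00-\x1f\x7f%;=&,]")


def escape_gff_value(value):
    """Replace reserved characters with %XX percent encodings."""
    return _RESERVED.sub(lambda m: "%{:02X}".format(ord(m.group(0))), value)


def unescape_gff_value(value):
    """Decode %XX sequences back to the original characters."""
    return unquote(value)


if __name__ == "__main__":
    raw = "LTR;Gypsy=1,2"
    encoded = escape_gff_value(raw)
    print(encoded)
    print(unescape_gff_value(encoded) == raw)
```

Escaping only the reserved characters, rather than everything non-alphanumeric, keeps the attribute column readable while still being safe to split on `;` and `=`.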

03 Aug 01:22

timothymillar

v0.1.3

a4d61e4

Alpha-0.1.3

Performance improvements:
- Significantly reduced memory usage and improved speed for preprocess #66 #70
- Reduced memory usage for fingerprint/compare io operations #68
- Reduced memory usage for filter-gff #67
Fixes
- Corrected feature-csv output on large arrays #58 #72
Documentation
- Split readme into multiple documents
- Switched to .rst for documentation
- Added description of methods

11 Jul 02:23

timothymillar

v0.1.2

50d371d

Alpha-0.1.2

Faster output of CSV files (#58)
Tidied up CLI arguments (#55 #62)
- Replace underscores with dash
- Use of flags for Boolean arguments
- Updated Readme with changes
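
The two CLI conventions described above (dashes in place of underscores, and flags for Boolean arguments) look roughly like this with `argparse`. The option names `--min-reads` and `--keep-temp` and the program name are made up for illustration and are not the tool's actual arguments.

```python
import argparse


def build_parser():
    """Build an example parser using dashed names and Boolean flags."""
    parser = argparse.ArgumentParser(prog="example-tool")
    # Dashed option name; argparse exposes it as args.min_reads.
    parser.add_argument("--min-reads", type=int, default=5,
                        help="hypothetical integer option")
    # Boolean flag: present means True, absent means False.
    parser.add_argument("--keep-temp", action="store_true",
                        help="hypothetical Boolean flag")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--min-reads", "10", "--keep-temp"])
    print(args.min_reads, args.keep_temp)
```

A flag avoids asking the user to type a literal `True` or `False` value on the command line.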
Tests
- Integration tests for fingerprint, compare and filter-gff
- Further tests for preprocess

27 Jun 21:54

timothymillar

v0.1.1

4f4c876

Alpha-0.1.1

Minor fixes for cluster support calculation #53
Default selection of starting epsilon value now matches Campello et al 2015
Update terminology to reflect that used in Campello et al 2015
Tests

13 Jun 02:02

timothymillar

v0.1.0

5757511

Alpha-0.1.0

Pre-Processing
- Corrections for reverse-complemented reads when extracting dangler reads
- Inclusion of soft-clipped sections from the outer end of proper-mapped pairs
- Separated mapping reads to repeats from pre-processing script
Fingerprinting/Compare
- Quality score filtering options
- New output format options
Other
- Bump version to 0.1.0
- Change status to alpha
- Updated documentation
- Additional tests

13 Jun 01:36

timothymillar

v0.0.3

721e0b7

Tabular Data Structures Pre-release

Improvements:

- Nicer data structures for use as a library #42
- Easy inter-op with Numpy, Pandas and plotting libraries in python #42
- Improved buffering of comparative bins
- Multi-process pipelines return results to the parent process rather than printing directly to file
- GFF output is sorted by chromosome, start position, stop position #32
- Added better examples to readme #40
- Improved testing of fingerprint and compare data methods #6
- Compare output is no longer nested, contains normalised read counts #4, and is coloured by read count proportions #43
- filter_gff program is simplified (no longer deals with nested gff)
- Renamed module to tefingerprint and CLI to tef #3
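
The GFF sort order mentioned above (chromosome, then start, then stop) can be sketched as a key function over tab-delimited lines. This assumes standard GFF columns, with the sequence name in column 1 and start and end in columns 4 and 5, and a plain lexicographic order for chromosome names. The function names are hypothetical and this is not the project's own code.

```python
def gff_sort_key(line):
    """Sort key: (chromosome, start, stop), with positions as integers."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[3]), int(fields[4])


def sort_gff(lines):
    """Return header lines first, then records in sorted order."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines
               if line.strip() and not line.startswith("#")]
    return header + sorted(records, key=gff_sort_key)


if __name__ == "__main__":
    example = [
        "##gff-version 3\n",
        "chr2\t.\tTE\t5\t9\t.\t+\t.\tID=c\n",
        "chr1\t.\tTE\t100\t200\t.\t+\t.\tID=b\n",
        "chr1\t.\tTE\t20\t30\t.\t-\t.\tID=a\n",
    ]
    print("".join(sort_gff(example)), end="")
```

Converting positions to integers matters: compared as strings, "100" would sort before "20".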
