All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning.
- A bug with handling of the "Parent" attribute for features with multiple parents (see #85).
- New `NamedIndex` class for storing and retrieving features by ID (see #82).
- New CLI command `tag pep2nuc` for transforming feature coordinates from peptide space to nucleotide space (see #83).
- New `tag.select.merge` function to efficiently merge sorted feature streams (see #68).
- New `tag.locus` module for efficiently parsing locus coordinates from multiple annotation streams (see #71).
- New experimental `tag.bae` module for evaluating bacterial annotation (see #73, #76).
- Minor updates to compensate for a couple years' worth of neglect (see #64).
- Refactored `GFF3Reader` to better support processing of unsorted GFF3 data (see #65).
- Implemented finer control of how and when output separator directives (`###`) are printed to GFF3 output (see #68, #75).
- Reorganized test suite code and data (see #70).
- Refactored the `Feature` and `Score` APIs with more sane default and static constructors (see #74).
- The `GFF3Writer` class now behaves as expected when `.retainids` is set to `True` (see #79).
- Missing `extent` query from the index implementation.
- Aliased `index.keys()` to `index.seqids`.
- Script `tag sum` to provide very basic summaries of genomic GFF3 files.
- Pseudo-features for better handling and sorting of top-level multi-features.
- A new `primary_transcript` filter as a generalization of the `primary_mrna` function.
- A new function to query features for NCBI `GeneID` values.
- A new function to traverse a feature and all of its children to collect all attribute values associated with a given key.
- Bug with non-protein coding genes and the `tag.mrna.primary` filter (now `primary_mrna` in the `tag.transcript` module).
- Bug with how the GFF3 writer handles multi-feature IDs.
- Refactored the `mrna` module, extended it, and renamed it to `transcript` to reflect its new and broader scope.
- Range overlap queries accidentally left out of the previous release.
- New convenience functions in the `Range` and `Feature` classes for range and point overlap queries.
- An index class for efficient in-memory access of sequence features.
- Module for mRNA handling, with a function for selecting the primary mRNA from a gene or other feature.
- New CLI command `tag pmrna`.
- A new `Score` class for internal handling of feature scores. Not yet included in the API, and may not ever be.
- Modules focused on classes / data structures now support more concise imports (for example, `from tag import Feature` and `tag.Feature` are now supported and preferred over `from tag.feature import Feature` and `tag.feature.Feature`).
- Resolved a bug with the GFF3Writer failing to print `##FASTA` directives before writing sequences to output.
- CLI implemented using `entry_points` instead of a dedicated script.
- Entry type inference is now correct, achieved by inheriting from `object`.
- Basic data structures
    - Range
    - Comment
    - Directive
    - Sequence
    - Feature
- Annotation I/O
    - GFF3Reader
    - GFF3Writer
- Composable generator functions for streaming annotation processing
- A command line interface through the `tag` script
- Package scaffolding
    - README
    - documentation
    - license
    - changelog
    - various config files