This document outlines the changes made to the project with each release.
- Added `thin` and `random_subset` options to the `nremover()` function. `thin` removes loci within `thin` bases of the nearest locus. `random_subset` randomly subsets the loci using an integer or proportion.
- Changed the `unlinked` option to `unlinked_only` for clarity.
- Added functionality to filter out linked SNPs using CHROM and POS fields from VCF file.
- Made the Sankey plot function more modular and dynamic for easier maintainability.
- Fixed spacing in output printed to STDOUT.
- Fixed bug where CHROM VCF field had strings cut off at 10 characters.
- Fixed copy method for use with pysam VariantHeader objects.
- Performance improvements for VCF files.
- Load and write VCF file in chunks of loci to improve memory consumption.
- New output directory structure for better organization.
- VCF file attributes are now written to an HDF5 file instead of all being loaded into memory.
- Increased usage of numpy to improve VCF IO.
- Added AF INFO field when converting PHYLIP or STRUCTURE files to VCF format.
- VCF file reading uses pysam instead of cyvcf2 now.
- Fixed bug with `search_threshold` plots where the x-axis values would be sorted as strings instead of integers.
- Fixed bugs where sampleIDs were out of order for VCF files.
- Ensured correct order for all objects.
- Fixed bugs when subsetting with popmaps files.
- Fixes to documentation.
- Fix for VCF FORMAT field being in the wrong order.
- Band-aid fix for incorrect order of sampleIDs in VCF files.
- Reads and writes PHYLIP, STRUCTURE, and VCF files.
- Loads data into GenotypeData object.
- Filters DNA sequence alignments using NRemover2.
- Filters by minor allele frequency, monomorphic, and non-biallelic sites.
- Filters with global (whole columns) and per-population, per-locus missing data thresholds.
- Makes informative plots.
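
To illustrate the `thin` and `random_subset` options mentioned in the entries above, here is a minimal sketch of one way such filters could behave. It is not the project's implementation: the helper names `thin_loci` and `random_subset_loci` are hypothetical, and the sketch assumes sorted positions on a single chromosome, a greedy left-to-right pass, and that "within `thin` bases" means a distance of at most `thin`.

```python
import random


def thin_loci(positions, thin):
    """Return indices of loci kept after thinning (hypothetical helper).

    Assumes `positions` is sorted and belongs to a single chromosome.
    A locus is dropped when it lies at most `thin` bases after the
    most recently kept locus.
    """
    kept = []
    last_pos = None
    for idx, pos in enumerate(positions):
        if last_pos is None or pos - last_pos > thin:
            kept.append(idx)
            last_pos = pos
    return kept


def random_subset_loci(n_loci, subset, seed=None):
    """Return sorted indices of a random subset of loci (hypothetical helper).

    `subset` is interpreted as a proportion if it is a float in (0, 1],
    otherwise as an integer count (capped at `n_loci`).
    """
    rng = random.Random(seed)
    if isinstance(subset, float):
        if not 0.0 < subset <= 1.0:
            raise ValueError("proportion must be in (0, 1]")
        k = max(1, int(n_loci * subset))
    else:
        k = min(int(subset), n_loci)
    return sorted(rng.sample(range(n_loci), k))


# Loci at 105 and 121 fall within 10 bases of a kept locus and are dropped.
print(thin_loci([100, 105, 120, 121, 200], thin=10))  # [0, 2, 4]

# Keep half of ten loci; the seed only makes the draw repeatable.
print(random_subset_loci(10, 0.5, seed=1))
```

Returning indices rather than positions makes it straightforward to apply the same selection to the genotype columns of an alignment.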