Feature/reference alleles #14

Merged (22 commits) on Feb 27, 2024

@luissian (Member) commented Feb 5, 2024

This feature is implemented by adding 2 classes that are used to obtain the candidate alleles:

  • one that builds the distance matrix (using Mash)
  • one that performs the clustering (using the Leiden algorithm)

In this PR, reference alleles are selected by choosing one allele from each cluster. Clustering is done with the Leiden algorithm, which takes as input the distance matrix created by the Mash program. The resolution parameter controls how the Leiden algorithm groups neighboring alleles. The resolution (default or user-defined value) is increased by 0.025 if there is any cluster whose list of alleles is not the same as the one obtained with BLAST.
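The resolution loop and the per-cluster selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `cluster_fn` and `is_consistent` are hypothetical callables standing in for the Leiden clustering step and the BLAST comparison, `max_steps` is an added safety bound, and picking the medoid (the allele with the smallest summed distance to the rest of its cluster) is an assumed selection rule, since the description only says that one allele is chosen per cluster.

```python
from typing import Callable, Dict, List, Tuple

Clusters = Dict[int, List[str]]  # cluster id -> allele ids in that cluster


def escalate_resolution(
    cluster_fn: Callable[[float], Clusters],     # hypothetical: runs the clustering
    is_consistent: Callable[[Clusters], bool],   # hypothetical: BLAST-based check
    resolution: float,
    step: float = 0.025,                         # increment named in the description
    max_steps: int = 40,                         # assumption: bound so the loop ends
) -> Tuple[float, Clusters]:
    """Re-cluster with a growing resolution until the check passes."""
    for i in range(max_steps + 1):
        # Compute from the start value to avoid accumulating float error.
        current = round(resolution + i * step, 6)
        clusters = cluster_fn(current)
        if is_consistent(clusters):
            return current, clusters
    raise RuntimeError("no consistent clustering found within max_steps")


def pick_references(
    clusters: Clusters, dist: Dict[Tuple[str, str], float]
) -> Dict[int, str]:
    """Pick one allele per cluster: the medoid (an assumed rule)."""

    def d(a: str, b: str) -> float:
        # Distances are symmetric and stored once per unordered pair.
        if a == b:
            return 0.0
        return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

    refs = {}
    for cid, members in clusters.items():
        refs[cid] = min(members, key=lambda a: sum(d(a, b) for b in members))
    return refs
```

In a real run, `cluster_fn` would wrap the Leiden call on the Mash distance matrix and `is_consistent` would compare each cluster with the BLAST result; both are left abstract here on purpose.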
Generated files are:

  • reference alleles
  • summary cluster
  • evaluation
  • a statistics graphic showing the number of loci against the number of clusters. For example, a point at x = 3, y = 8 means that there are 3 loci whose alleles are all grouped in 8 clusters.
    (image: num_genes_per_allele)
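The counts behind such a graphic can be sketched as a simple tally. The function name and the locus names below are hypothetical and only illustrate the idea; this is not the plotting code from this PR.

```python
from collections import Counter
from typing import Dict


def loci_per_cluster_count(clusters_per_locus: Dict[str, int]) -> Dict[int, int]:
    """Map each cluster count to the number of loci that have that many clusters."""
    return dict(Counter(clusters_per_locus.values()))


# Hypothetical input: number of clusters found for each locus.
example = {"locus_a": 8, "locus_b": 8, "locus_c": 8, "locus_d": 2}
summary = loci_per_cluster_count(example)
print(summary)
```

Here 3 loci have their alleles grouped in 8 clusters and 1 locus has 2 clusters, which is the kind of pair the graphic plots.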

taranis/distance.py (review thread: outdated, resolved)
taranis/clustering.py (2 review threads: outdated, resolved)
@@ -26,6 +26,6 @@ jobs:
run: |
source $CONDA/etc/profile.d/conda.sh
conda activate taranis_env
poetry install
python -m pip install .
taranis analyze-schema -i test/MLST_listeria -o analyze_schema_test --cpus 1 --output-allele-annot --remove-no-cds --remove-duplicated --remove-subset
Member

Please add a test for the reference_alleles functionality.

Member Author

I want to finish coding first, as I would like to start using pytest for testing.

setup.py (review thread: resolved)
@luissian luissian marked this pull request as ready for review February 17, 2024 19:44
@saramonzon saramonzon merged commit 9efcc58 into BU-ISCIII:develop Feb 27, 2024
3 checks passed