
Version 0.3.0

@luispedro luispedro released this 27 Apr 04:14
· 44 commits to main since this release

The main changes are updates to the Resfinder and ARG-ANNOT mappings.

Detailed changes

Handling gene clusters & reverse complements in resfinder

  • Resfinder has gene clusters which can't be passed through RGI using 'contig' mode.
  • Gene clusters were identified and were manually assigned ARO numbers.
  • A separate file with manual curation for gene clusters and reverse complements (RCs) was created. Their AROs are updated after the RGI results are concatenated with the genes missing from the RGI results.
  • 40 gene clusters are present.
  • 9 genes in reverse complement form are also present.
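
To illustrate what "reverse complement form" means for a coding sequence, here is a minimal, standard-library-only sketch. The function name `reverse_complement` and the toy sequence are illustrative and are not taken from argNorm's code. A sequence stored this way no longer begins with its start codon until it is flipped back.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Toy coding sequence (hypothetical): start codon, one codon, stop codon.
forward = "ATGAAATAA"
stored_as_rc = reverse_complement(forward)

# The stored form does not start with ATG; flipping it restores the original.
restored = reverse_complement(stored_as_rc)
```

Applying the function twice returns the original sequence, which is why an entry stored in this form can be recovered before annotation.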

Using amino acid file for argannot & resfinder rather than nucleotide file

  • ARG-ANNOT and Resfinder consist of coding sequences. Previously these were passed to RGI in contig mode, which is not appropriate for coding sequences. Now the amino acid versions of ARG-ANNOT and Resfinder are run through RGI in protein mode.
  • The ARG-ANNOT AA file is available online. The Resfinder AA file is generated using Biopython.
  • One-to-many ARO mappings in ARG-ANNOT, such as NG_047831:101-955 mapping to both Erm(K) and almG, are eliminated now that protein mode is used.
  • A total of 10 ARO mappings changed in ARG-ANNOT.
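
The nucleotide-to-amino-acid step can be sketched as follows. The release generates the Resfinder AA file with Biopython. To keep the example dependency-free, this sketch swaps in a plain standard-genetic-code lookup, which corresponds roughly to Biopython's `Seq.translate(to_stop=True)`. The function name `translate_cds` and the toy sequence are illustrative assumptions, not argNorm code.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases become 'X'; trailing bases that do
    not form a full codon are ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Toy example (hypothetical sequence): Met-Lys followed by a stop codon.
protein = translate_cds("ATGAAATAA")
```

Each coding sequence yields a single protein, and that protein is what gets passed to RGI in protein mode.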

argnorm.lib: Making argNorm more usable as a library

  • Introduce argnorm.lib module
  • Users can import the map_to_aro function from argnorm.lib. The function takes a gene name as input, maps the gene to the ARO and returns a pronto term object with the ARO mapping.
  • The get_aro_mapping_table function, previously within the BaseNormalizer class, has also been moved to lib.py to give users the ability to access the mapping tables being used for normalization.
  • With the introduction of lib.py, users can access core mapping utilities through argnorm.lib, drug categorization through argnorm.drug_categorization, and the traditional normalizers through argnorm.normalizers.
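
A rough sketch of how the library entry point might be used is shown below. The notes above say only that `map_to_aro` takes a gene name and returns a pronto term. The second `database` argument, the handling of a `None` result, and the helper names `lookup_aro` and `describe_term` are assumptions made for illustration, so check the argNorm documentation for the actual signature.

```python
def lookup_aro(gene: str, database: str):
    """Map a gene name to its ARO term via argnorm.lib.

    ASSUMPTION: map_to_aro is called as map_to_aro(gene, database); the
    release notes mention only the gene-name argument.
    """
    # Imported lazily so this module still loads where argnorm is not installed.
    from argnorm.lib import map_to_aro
    return map_to_aro(gene, database)


def describe_term(term) -> str:
    """Format a pronto term (which exposes .id and .name) for display."""
    if term is None:  # ASSUMPTION: an unmapped gene may yield None
        return "no ARO mapping"
    return f"{term.id}: {term.name}"


# Hypothetical usage; replace the placeholders with a real gene name and database:
# print(describe_term(lookup_aro("<gene name>", "<database>")))
```

Keeping the formatting helper separate from the lookup lets the display code be exercised without argNorm or its ontology data installed.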