Pure Python port and interface to ABRicate, a tool for mass screening of contigs for antimicrobial resistance or virulence genes.
ABRicate is a Perl command-line tool wrapping BLAST+ to perform screening of contigs for antimicrobial resistance or virulence genes. It comes bundled with multiple databases: NCBI, CARD, ARG-ANNOT, Resfinder, MEGARES, EcOH, PlasmidFinder, Ecoli_VF and VFDB.
pyabricate is a pure-Python, batteries-included port of ABRicate. Instead of
calling the BLAST+ binaries, it runs BLAST+ through the NCBI C++ Toolkit
interface wrapped in pyncbitk. It bundles the ABRicate databases so that no
additional data or dependencies are needed.
This project is supported on Python 3.7 and later.
PyABRicate can be installed directly from PyPI, which hosts some pure-Python wheels that also bundle the ABRicate databases.
$ pip install pyabricate
The command line of the original abricate script can be executed with a
similar interface from a shell, and produces the same sort of table output:
$ pyabricate assembly.fa --mincov 50 --minid 50 --db ncbi
assembly.fa LGJG01000041 35416 35844 - fosB-251804940 1-429/429 =============== 0/0 100.00 100.00 ncbi NG_047889.1 FosB family fosfomycin resistance bacillithiol transferase FOSFOMYCIN
assembly.fa LGJG01000040 190796 191281 + dfrC 1-486/486 =============== 0/0 100.00 99.59 ncbi NG_047752.1 trimethoprim-resistant dihydrofolate reductase DfrC TRIMETHOPRIM
assembly.fa LGJG01000038 62786 64543 - blaR1 1-1758/1758 =============== 0/0 100.00 92.83 ncbi NG_047539.1 beta-lactam sensor/signal transducer BlaR1 BETA-LACTAM
assembly.fa LGJG01000038 64650 65495 + blaZ 1-846/846 =============== 0/0 100.00 96.81 ncbi NG_055999.1 penicillin-hydrolyzing class A beta-lactamase BlaZ BETA-LACTAM
assembly.fa LGJG01000038 62416 62796 - blaI_of_Z 1-381/381 =============== 0/0 100.00 95.28 ncbi NG_047499.1 penicillinase repressor BlaI BETA-LACTAM
However, pyabricate also features an API which can be used to programmatically
annotate any sequence:
import pyabricate

database = pyabricate.Database.from_name("ncbi")
abricate = pyabricate.ResistanceGeneFinder(database, min_coverage=50, min_identity=50)

sequence = "ATATTA..."  # sequence in string format
for hit in abricate.find_genes(sequence):
    print(
        hit.gene.name,           # resistance / virulence gene
        hit.alignment[0].start,  # start coordinate in query sequence
        hit.alignment[0].stop,   # stop coordinate in query sequence
        hit.percent_coverage,
        hit.percent_identity
    )

The returned Hit objects contain all the information needed to build the
table output in an object-oriented interface. ResistanceGeneFinder.find_genes
accepts sequences as Python strings, which can be loaded with any other
library such as Biopython.
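As a rough sketch of that workflow, the snippet below reads contigs from a FASTA file and passes each sequence string to a finder, then formats a few fields of every hit as a tab-separated row. To stay dependency-free it uses a small standard-library FASTA reader in place of Biopython. The helpers `read_fasta`, `annotate_records` and `format_row` are hypothetical names that are not part of pyabricate. Only `find_genes` and the Hit attributes shown in the example above are taken from this README, and the selected columns are an illustrative subset of the table output.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            # The identifier is the first whitespace-delimited token.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def annotate_records(finder, handle: TextIO):
    """Yield (contig identifier, hit) pairs for every record in the handle.

    `finder` is assumed to expose find_genes(sequence: str), as
    pyabricate.ResistanceGeneFinder does in the example above.
    """
    for name, sequence in read_fasta(handle):
        for hit in finder.find_genes(sequence):
            yield name, hit


def format_row(contig: str, hit) -> str:
    """Format a subset of the hit fields as one tab-separated row."""
    first = hit.alignment[0]  # same access pattern as the README example
    return "\t".join([
        contig,
        str(first.start),
        str(first.stop),
        hit.gene.name,
        f"{hit.percent_coverage:.2f}",
        f"{hit.percent_identity:.2f}",
    ])


# Intended use with pyabricate (not executed here):
#   import pyabricate
#   database = pyabricate.Database.from_name("ncbi")
#   finder = pyabricate.ResistanceGeneFinder(database, min_coverage=50, min_identity=50)
#   with open("assembly.fa") as handle:
#       for contig, hit in annotate_records(finder, handle):
#           print(format_row(contig, hit))
```

Passing the finder in as an argument keeps the helpers independent of how the database is loaded. A Biopython-based variant would only need to replace `read_fasta` with a loop over `Bio.SeqIO.parse(handle, "fasta")` that yields `record.id` and `str(record.seq)`.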
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker to report it or ask a question. If you are filing a bug, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
Contributions are more than welcome! See CONTRIBUTING.md for more details.
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
This library is provided under the GNU General Public License 3.0 or later.
ABRicate was developed by Torsten Seemann and
is redistributed under the terms of the GNU General Public License 2.0,
see vendor/abricate/LICENSE.
This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original ABRicate authors. It was developed by Martin Larralde during his PhD at the Leiden University Medical Center in the Zeller team.