This repository contains a simple pipeline to prepare and format the dataset of genomic junctions from the E. coli ST131 collection described in our paper.
To run the pipeline, you need Snakemake (tested on v9.11) and Conda installed. You also need the PanGraph (v1.2.1) binary available in your PATH.
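Before launching the pipeline, it can help to confirm that the three tools are reachable from your shell. The snippet below is a minimal sketch, not part of the repository: the helper name `check_tools` is hypothetical, and the executable names `snakemake`, `conda` and `pangraph` are assumed to match your installation. It only checks that each command is on PATH and does not verify versions.

```shell
# Hypothetical helper (not part of this repository): report which of the
# given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names for the tools listed in the requirements.
check_tools snakemake conda pangraph || echo "Install the missing tools before running the pipeline."
```

If a tool is reported as missing, install it or add its location to PATH, then rerun the check.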
Run the pipeline simply with:
snakemake --use-conda --cores <num_cores> all
You can explore junctions visually with marimo by running:
marimo run explore/view_junctions.py