Data analysis code for Yu et al. (2023)
Python >= 3.10
numpy
scipy
matplotlib
seaborn
tslearn
scikit-learn
Clone the repository, create and activate a new environment (recommended), run pip install -r requirements.txt, download the data files, and run the Jupyter notebooks.
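Before opening the notebooks, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch, not part of the repository: the function name check_environment and the module list are assumptions for illustration (note that scikit-learn is imported as sklearn).

```python
import sys
from importlib.util import find_spec

# Import names for the packages listed in the requirements.
# scikit-learn installs under the import name "sklearn".
REQUIRED_MODULES = ["numpy", "scipy", "matplotlib", "seaborn", "tslearn", "sklearn"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 10)):
    """Return a dict mapping each requirement to True if it is satisfied.

    Hypothetical helper for illustration; it only checks that each module
    can be located, not that its version is compatible.
    """
    report = {"python": sys.version_info >= min_python}
    for name in modules:
        report[name] = find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Any entry reported as MISSING suggests that pip install -r requirements.txt did not complete in the active environment.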
DTW_Analysis.ipynb contains the script for the dynamic time warping analysis of protein translocation unidirectionality.
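For readers unfamiliar with dynamic time warping, here is a minimal, self-contained sketch of a DTW distance in pure Python. It is not the notebook's code, which presumably relies on tslearn from the requirements list. The toy signals and the comparison against a time-reversed template are assumptions chosen only to illustrate how DTW can distinguish the direction of a signal.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW between two 1-D sequences.

    Uses squared pointwise differences and returns the square root of the
    accumulated cost along the optimal warping path.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return math.sqrt(cost[n][m])


# Toy data (hypothetical): a template and a time-stretched copy of it.
template = [0.0, 1.0, 2.0, 3.0, 4.0]
event = [0.0, 0.0, 1.0, 2.0, 2.0, 3.0, 4.0]

forward = dtw_distance(event, template)         # aligned with the template
backward = dtw_distance(event, template[::-1])  # aligned with its reversal
print(f"forward={forward:.3f}, backward={backward:.3f}")
```

In this toy case the event warps onto the forward template at zero cost, while the reversed template yields a larger distance.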
GBC_Analysis.ipynb contains the script for machine learning classification and discrimination of protein translocation events via a Gradient Boosting Classifier (GBC) model.
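As a rough illustration of the classification step, the sketch below trains scikit-learn's GradientBoostingClassifier on synthetic data. It is not the notebook's pipeline: the three features, the two classes, and all hyperparameters are assumptions, and the real analysis uses the labeled data files rather than random numbers.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for per-event features (hypothetical, not real data):
# two classes drawn from well-separated Gaussians in three dimensions.
n_per_class = 200
class_a = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 3))
class_b = rng.normal(loc=3.0, scale=1.0, size=(n_per_class, 3))
X = np.vstack([class_a, class_b])
y = np.concatenate([np.zeros(n_per_class, dtype=int),
                    np.ones(n_per_class, dtype=int)])

# Hold out a stratified test set so accuracy is measured on unseen events.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = GradientBoostingClassifier(random_state=0)  # library defaults otherwise
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"held-out accuracy: {accuracy:.3f}")
```

Once trained on labeled events, the same fitted model could be applied to unlabeled events with clf.predict or clf.predict_proba.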
This analysis was written and performed by Ali Fallahi and Amr Makhamreh (Department of Bioengineering, Northeastern University). The pypore package was originally written by Jacob Schreiber and Kevin Karplus. We modified it slightly to make it compatible with Python 3 and our data file types.
To replicate the analysis, download the labeled data and the unlabeled data. All data for this work is located here.
Please cite this work as:
Yu, L., Kang, X., Li, F. et al. Unidirectional single-file transport of full-length proteins through a nanopore. Nat Biotechnol (2023). https://doi.org/10.1038/s41587-022-01598-3