*(formerly Tourtellotte)*
Bioinformatician | Research Software Engineer | Open Science Advocate
Current Position: Bioinformatician III in the Grunwald Lab @ RNA Therapeutics Institute, UMass Chan Medical School
Focus: Production-quality computational pipelines for image processing and analysis of single-molecule tracking experiments; multi-omic data integration
Transforming a proof of concept into a robust, FAIR-compliant, production-ready pipeline for single-molecule tracking of mRNA transport through nuclear pore complexes.
Technical Highlights
- Checkpoint/resume system for HPC clusters
- Statistical rigor: LRT-based outlier detection, AIC model selection
- Particle tracking via CNN (U-Net++) + GLRT
Impact: Open-source release supporting reproducible nuclear transport research
Primary: Python (PyTorch, scikit-learn, pandas), R (tidyverse, Shiny, Quarto)
Also: Shell scripting, SAS, SQL
- Machine Learning: Neural networks (U-Net++, CNNs), likelihood estimation, GLRT frameworks
- Image Analysis: Single-molecule tracking, particle detection, image registration
- Genomics: RNA-seq, Hi-C, DamID, multi-omics integration
- Statistics: Bayesian inference, robust statistics, survival analysis, epidemiological modeling
- Architecture: Object-oriented design, modular pipelines, separation of concerns
- Testing: Unit/integration/regression tests, continuous integration
- HPC: LSF/SLURM job submission, distributed computing, checkpoint/resume systems
- Open Science: FAIR principles, reproducible research, comprehensive documentation
- Structural Biology: Chimera, Rosetta (protein-ligand modeling)
- Geospatial: LIDAR data processing
- Medical Imaging: fNIRS (functional near-infrared spectroscopy)
- Visualization: ggplot2, matplotlib, Shiny dashboards, Quarto reports
PhD - Bioinformatics and Computational Biology, Joint PhD Program, WPI & UMass Chan Medical School
MPH - Biostatistics & Epidemiology, Tufts University Medical School
MAT - Secondary School Mathematics, Marlboro College Graduate School
BA - Mathematics & Philosophy, Boston College
Philosophy: Interdisciplinary training enables creative problem-solving and "right tool for the job" pragmatism.
Open-source educational platform for teaching molecular biology through hands-on model building.
- Role: Author, Lead developer (GitHub Pages site, layout design, manual and guide creation)
- Tools: Quarto, Adobe Creative Suite
- Impact: Accessible STEM education resource
Massachusetts Department of Public Health (MDPH)
- Immunization surveillance using EHR data (MIIS) - SAS
- Lyme disease surveillance in endemic regions (MAVEN) - SAS, R
- Policy analysis for traveling animal exhibits - public health risk assessment
- Bulk RNA-seq
- Hi-C (chromatin conformation)
- DamID (chromatin accessibility)
- Multi-omic data integration
- Single-molecule fluorescence microscopy
- Particle detection & tracking
- Image registration & segmentation
- U-Net++ neural networks
- EHR data (electronic health records)
- Laboratory surveillance data
- Epidemiological case data
- Immunization registries
- LIDAR (geospatial)
- fNIRS (neuroimaging)
- Protein structures (3D modeling)
Reproducibility First:
- Version control (Git/GitHub)
- Comprehensive documentation
- Automated testing
- Containerization when appropriate
Open Science:
- FAIR data principles
- Open-source licensing
- Complete methodology documentation
- Peer-reviewed citations for all methods
Communication:
- Dynamic visualizations (Shiny, Quarto)
- Clear documentation for technical and non-technical audiences
- Emphasis on interpretability
My research is grounded in the principles of Open Science, with a strong emphasis on producing computationally reproducible results and reusable, open-source code.
Experience working with a broad range of data types has led me to a pragmatic, "right tool for the job" approach:
- For data analysis and visualization, R is my "go-to."
- For specialized tasks such as building machine learning pipelines, I leverage the power of the tensor via Python.
- The "multilingual" modalities of Quarto and Shiny enable me to create dynamic presentations and exploratory tools that can be used to communicate findings.
In parallel with the "right tool for the job" mentality flows an undercurrent of "right data for the question":
- Work within the Grunwald Lab has a strong focus on metadata and data provenance, both of which are key to having the right data with an eye on rigor and reproducibility.
- PhD bioinformatics and computational biology: coursework and rotations within a top engineering school (WPI) and dissertation research within a highly respected biomedical research institute (UMass Chan) exposed me to many data types, their organization, and the level of granularity each can support.
- MPH biostatistics and epidemiology: a background in experimental design and methodologies adds to the "particular set of skills" I bring to aligning data with the research question at hand.
- MAT: a focus on assessment design and implementation, particularly when it comes to developing an understanding of tools through training, allows me to bridge the gap between intent and design.
Collectively, these elements are supported by my passion for data visualization and science communication and fueled by an intense curiosity.
© 2026 Jocelyn Petitto, licensed under CC BY 4.0
Last updated: January 2026
