jcpetitto/README.md

Jocelyn Petitto*, PhD, MPH, MAT

*(formerly Tourtellotte)

Bioinformatician | Research Software Engineer | Open Science Advocate

Current Position: Bioinformatician III in the Grunwald Lab @ RNA Therapeutics Institute, UMass Chan Medical School
Focus: Production-quality computational pipelines for image processing and analysis of single-molecule tracking experiments; multi-omic data integration

Current Project(s)

Tracking mRNA translocation through the NPC (Image Processing and Analysis Pipeline)

Transforming a proof of concept into a robust, FAIR-compliant, production-ready pipeline for single-molecule tracking of mRNA transport through nuclear pore complexes.

Technical Highlights

  • Checkpoint/resume system for HPC clusters
  • Statistical rigor: LRT-based outlier detection, AIC model selection
  • Particle tracking via CNN (U-Net++) + GLRT

Impact: Open-source release supporting reproducible nuclear transport research
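
To illustrate the checkpoint/resume idea from the highlights above, here is a minimal sketch. It is not the pipeline's actual code: the function name `run_with_checkpoint`, the JSON checkpoint format, and the per-item granularity are all assumptions chosen for the example. The point it shows is that progress is saved after each unit of work, so a job that an HPC scheduler kills can be resubmitted and will skip what is already done.

```python
import json
import os
import tempfile
from pathlib import Path


def run_with_checkpoint(items, process, checkpoint_path):
    """Process items in order, saving progress so a rerun skips finished work.

    Hypothetical helper for illustration only; results must be JSON-serializable.
    """
    path = Path(checkpoint_path)
    # Resume: load results saved by an earlier (possibly interrupted) run.
    done = json.loads(path.read_text()) if path.exists() else {}
    for item in items:
        key = str(item)
        if key in done:
            continue  # already processed in a previous run
        done[key] = process(item)
        # Write to a temporary file, then rename it over the checkpoint,
        # so a job killed mid-write never leaves a truncated file behind.
        fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        with os.fdopen(fd, "w") as fh:
            json.dump(done, fh)
        os.replace(tmp, path)
    return done
```

Calling the function a second time with the same checkpoint path only runs `process` on items that are not yet recorded, which is the behavior a resubmitted cluster job would rely on.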


💻 Technical Expertise

Languages & Frameworks

Primary: Python (PyTorch, scikit-learn, pandas), R (tidyverse, Shiny, Quarto)
Also: Shell scripting, SAS, SQL

Computational Methods

  • Machine Learning: Neural networks (U-Net++, CNNs), likelihood estimation, GLRT frameworks
  • Image Analysis: Single-molecule tracking, particle detection, image registration
  • Genomics: RNA-seq, Hi-C, DamID, multi-omics integration
  • Statistics: Bayesian inference, robust statistics, survival analysis, epidemiological modeling

Software Engineering

  • Architecture: Object-oriented design, modular pipelines, separation of concerns
  • Testing: Unit/integration/regression tests, continuous integration
  • HPC: LSF/SLURM job submission, distributed computing, checkpoint/resume systems
  • Open Science: FAIR principles, reproducible research, comprehensive documentation

Specialized Tools

  • Structural Biology: Chimera, Rosetta (protein-ligand modeling)
  • Geospatial: LIDAR data processing
  • Medical Imaging: fNIRS (functional near-infrared spectroscopy)
  • Visualization: ggplot2, matplotlib, Shiny dashboards, Quarto reports

Academic Background

PhD - Bioinformatics and Computational Biology, Joint PhD Program, WPI & UMass Chan Medical School
MPH - Biostatistics & Epidemiology, Tufts University Medical School
MAT - Secondary School Mathematics, Marlboro College Graduate School
BA - Mathematics & Philosophy, Boston College

Philosophy: Interdisciplinary training enables creative problem-solving and "right tool for the job" pragmatism.


Featured Work

Open-source educational platform for teaching molecular biology through hands-on model building.

  • Role: Author, Lead developer (GitHub Pages site, layout design, manual and guide creation)
  • Tools: Quarto, Adobe Creative Suite
  • Impact: Accessible STEM education resource

Public Health Data Science

Massachusetts Department of Public Health (MDPH)

  • Immunization surveillance using EHR data (MIIS) - SAS
  • Lyme disease surveillance in endemic regions (MAVEN) - SAS, R
  • Policy analysis for traveling animal exhibits - public health risk assessment

Data Experience

NGS & Genomics

  • Bulk RNA-seq
  • Hi-C (chromatin conformation)
  • DamID (chromatin accessibility)
  • Multi-omic data integration

Imaging & Tracking

  • Single-molecule fluorescence microscopy
  • Particle detection & tracking
  • Image registration & segmentation
  • U-Net++ neural networks

Clinical & Surveillance

  • EHR data (electronic health records)
  • Laboratory surveillance data
  • Epidemiological case data
  • Immunization registries

Specialized

  • LIDAR (geospatial)
  • fNIRS (neuroimaging)
  • Protein structures (3D modeling)

Development Practices

Reproducibility First:

  • Version control (Git/GitHub)
  • Comprehensive documentation
  • Automated testing
  • Containerization when appropriate

Open Science:

  • FAIR data principles
  • Open-source licensing
  • Complete methodology documentation
  • Peer-reviewed citations for all methods

Communication:

  • Dynamic visualizations (Shiny, Quarto)
  • Clear documentation for technical and non-technical audiences
  • Emphasis on interpretability

Philosophy

My research is grounded in the principles of Open Science, with a strong emphasis on producing computationally reproducible results and reusable, open-source code.

Experience working with a broad range of data types has led me to a pragmatic, "right tool for the job" approach:

  • For data analysis and visualization, R is my "go-to"
  • For specialized tasks such as building machine learning pipelines, I leverage the power of the tensor via Python
  • The "multilingual" modalities of Quarto and Shiny enable me to create dynamic presentations and exploratory tools for communicating findings

In parallel with the "right tool for the job" mentality flows an undercurrent of "right data for the question":

  • Work within the Grunwald Lab has a strong focus on metadata and data provenance, which are key to having the right data with an eye on rigor and reproducibility
  • PhD in bioinformatics and computational biology: coursework and rotations at a top engineering school (WPI) and dissertation research at a highly respected biomedical research institute (UMass Chan) exposed me to many data types, their organization, and the level of granularity each can support
  • MPH in biostatistics and epidemiology: a background in experimental design and methodologies adds to the "particular set of skills" I bring to aligning data with the research question at hand
  • MAT: a focus on assessment design and implementation, particularly when it comes to developing understanding of tools through training, allows me to bridge the gap between intent and design

Collectively, these elements are supported by my passion for data visualization and science communication and fueled by an intense curiosity.


© 2026 Jocelyn Petitto, licensed under CC BY 4.0

Last updated: January 2026

Pinned

  1. NPC_mRNA_tracking (Public)

    Python

  2. u2os_profiling (Public)

    Multi-omic approach to examining genome organization juxtaposed with drug target data

    R