A complete pipeline for automated nematode (C. elegans) counting in fecundity assays using deep learning-based computer vision.
This repository provides a reproducible workflow for automatically counting nematode offspring in microscopy images, replacing manual counting procedures. The system processes raw microscopy images through preprocessing, object detection, and quantification stages to produce accurate nematode counts.
- Automated Preprocessing: Detects and crops petri dishes from raw microscopy images
- Deep Learning Detection: Uses optimized YOLO models for accurate nematode detection
- Robust Optimization: 5-fold cross-validation for confidence threshold selection
- Complete Pipeline: From raw images to final experimental data integration
- Biology-Friendly: Designed for researchers with basic Python knowledge
Option A: Using Conda (Recommended)
# Clone the repository
git clone https://github.com/gibson-lab/nematode-fecundity-computer-vision-counting.git
cd nematode-fecundity-computer-vision-counting
# Create conda environment
conda env create -f environment.yml
conda activate nematode-counting

Option B: Using pip
# Clone the repository
git clone https://github.com/gibson-lab/nematode-fecundity-computer-vision-counting.git
cd nematode-fecundity-computer-vision-counting
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

Option A: Interactive Notebooks (Recommended for beginners)
# Start Jupyter notebook server
jupyter notebook
# Follow the demo notebooks in order:
notebooks/01_preprocessing_demo.ipynb # Learn image preprocessing
notebooks/02_pretrained_model_comparison.ipynb # Compare YOLO architectures
notebooks/03_model_finetuning.ipynb # Hyperparameter optimization
notebooks/04_inference_demo.ipynb               # Run nematode counting

Option B: Command Line Interface
# 1. Preprocess raw TIFF images
python src/preprocessing.py --input data/raw_images --output data/processed_images
# 2. Run inference on processed images
python src/inference.py --input data/processed_images --output results/counts.csv --model models/finetuned/best_model.pt
# 3. Process experimental dataset with image mapping
python -c "
from src.inference import run_inference_on_experimental_data
run_inference_on_experimental_data(
model_path='models/finetuned/best_model.pt',
experimental_csv='data/experiment_data/HW_fecundity.csv',
image_mapping_csv='data/experiment_data/image_mapping.csv',
image_root_dir='data/processed_images',
output_csv='results/final_counts.csv'
)
"nematode-fecundity-computer-vision-counting/
├── src/ # Core Python modules
│ ├── preprocessing.py # Image preprocessing and petri dish detection
│ ├── training.py # YOLO model training and hyperparameter optimization
│ ├── evaluation.py # Model evaluation and metrics computation
│ ├── inference.py # Batch inference and experimental data integration
│ └── utils.py # Utility functions and helpers
├── notebooks/ # Interactive demonstration notebooks
│ ├── 01_preprocessing_demo.ipynb # Image preprocessing tutorial
│ ├── 02_pretrained_model_comparison.ipynb # YOLO architecture comparison
│ ├── 03_model_finetuning.ipynb # Hyperparameter optimization analysis
│ ├── 04_inference_demo.ipynb # Nematode counting and production workflow
│ └── README.md # Notebooks usage guide
├── data/ # Sample datasets and documentation
│ ├── sample_raw_images/ # Example raw TIFF images (20 samples)
│ ├── sample_processed_images/ # Corresponding processed PNG images
│ ├── annotations_full/ # Complete annotated dataset (331 images)
│ │ ├── train/ # Training split (231 images)
│ │ ├── valid/ # Validation split (50 images)
│ │ └── test/ # Test split (50 images)
│ ├── experiment_data/ # Experimental metadata
│ │ ├── HW_fecundity.csv # Complete experimental data
│ │ └── image_mapping.csv # Image-to-experiment mapping
│ └── README.md # Data format documentation
├── models/ # Model weights and configuration
│ ├── finetuned/ # Best trained models
│ │ ├── best_model.pt # Best overall model (MAE: 0.95)
│ │ └── best_model_config.json # Model configuration and parameters
│ └── pretrained/ # Base YOLO models
│ ├── yolo11n.pt # YOLOv11n base model
│ └── yolov8l.pt # YOLOv8-L base model
├── tests/ # Automated test suite
│ ├── test_preprocessing.py # Preprocessing module tests
│ ├── test_inference.py # Inference module tests
│ ├── test_evaluation.py # Evaluation module tests
│ ├── test_training.py # Training module tests
│ └── test_utils.py # Utilities module tests
├── Configuration files
│ ├── requirements.txt # Python dependencies
│ ├── environment.yml # Conda environment
│ ├── pytest.ini # Test configuration
│ └── run_tests.py # Custom test runner
└── Documentation
├── README.md # This file
├── DESIGN_DOCUMENT.md # Technical architecture and context
├── PROGRESS.md # Development progress tracking
└── LICENSE # MIT license
- Input: Raw TIFF images (2592×1944 pixels)
- Process: Detect petri dishes using Hough Circle Transform
- Output: Cropped and standardized PNG images (1528×1528 pixels)
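The idea behind the dish detection can be sketched with OpenCV's Hough Circle Transform. This is an illustrative simplification, not the exact implementation in src/preprocessing.py, and the radius and detector parameter values below are assumptions:

import cv2
import numpy as np

def detect_and_crop_dish(tiff_path, target_size=1528):
    # Load and smooth the image so the circle detector sees a clean dish edge
    img = cv2.imread(tiff_path, cv2.IMREAD_GRAYSCALE)
    blurred = cv2.medianBlur(img, 5)

    # Look for one large circle (the petri dish rim)
    circles = cv2.HoughCircles(
        blurred, cv2.HOUGH_GRADIENT, dp=2, minDist=1000,
        param1=100, param2=50, minRadius=600, maxRadius=1000,
    )
    if circles is None:
        raise ValueError(f"No petri dish detected in {tiff_path}")

    x, y, r = np.round(circles[0, 0]).astype(int)
    # Crop a square around the detected dish, clipped to the image bounds
    x0, y0 = max(x - r, 0), max(y - r, 0)
    crop = img[y0:y0 + 2 * r, x0:x0 + 2 * r]
    return cv2.resize(crop, (target_size, target_size))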
- Dataset: 331 manually annotated images
- Models: Compared 8 YOLO variants (v8-v12, L and X sizes)
- Best Model: YOLOv11-L with optimized hyperparameters
- Performance: 94.6% precision, 92.0% recall, MAE of 0.95
- Confidence Optimization: 5-fold cross-validation
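Conceptually, the threshold search sweeps candidate confidence values and keeps the one with the lowest counting error averaged over folds. A minimal sketch of that idea (not the exact procedure in src/training.py), assuming you already have per-image detection confidences and manual ground-truth counts:

import numpy as np
from sklearn.model_selection import KFold

def pick_confidence_threshold(pred_confidences, true_counts,
                              thresholds=np.arange(0.05, 0.95, 0.05)):
    """pred_confidences: one array of detection confidences per image.
    true_counts: manual count per image."""
    true_counts = np.asarray(true_counts)
    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    mean_mae = []
    for t in thresholds:
        fold_maes = []
        for _, val_idx in kf.split(true_counts):
            # Count detections above the threshold and compare to manual counts
            counts = np.array([(pred_confidences[i] >= t).sum() for i in val_idx])
            fold_maes.append(np.abs(counts - true_counts[val_idx]).mean())
        mean_mae.append(np.mean(fold_maes))
    return thresholds[int(np.argmin(mean_mae))]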
- Batch Processing: Automated counting across full datasets
- Data Integration: Merge counts with experimental metadata
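Under the hood, the data-integration step is essentially a join between per-image counts and the experimental tables. A hedged sketch of that join (column names here are hypothetical placeholders; see data/README.md and run_inference_on_experimental_data for the actual schema):

import pandas as pd

# Column names below are hypothetical placeholders, not the real schema.
counts = pd.read_csv("results/counts.csv")                          # image_name, count
mapping = pd.read_csv("data/experiment_data/image_mapping.csv")     # image_name, plate_id
experiment = pd.read_csv("data/experiment_data/HW_fecundity.csv")   # plate_id, treatment, ...

merged = (
    counts
    .merge(mapping, on="image_name", how="left")
    .merge(experiment, on="plate_id", how="left")
)
merged.to_csv("results/final_counts.csv", index=False)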
- Minimum: 8GB RAM, CPU-only processing
- Recommended: 16GB RAM, GPU with 8GB VRAM
- Petri dish detection: Clear, circular dish boundaries for automatic detection
- Lighting: Consistent illumination without harsh shadows or reflections
- Format: TIFF (preferred) or high-quality PNG images
- Resolution: Minimum 1000×1000 pixels for reliable detection
- Background: Specimens should be clearly distinguishable from dish background
- Focus: Sharp focus across the entire petri dish area
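A quick way to screen images against these requirements before preprocessing (a sketch assuming Pillow is installed; adjust the input path to your own layout):

from pathlib import Path
from PIL import Image

MIN_SIDE = 1000  # minimum width/height in pixels for reliable detection

for path in sorted(Path("data/raw_images").rglob("*")):
    if path.suffix.lower() not in {".tif", ".tiff", ".png"}:
        continue
    with Image.open(path) as img:
        width, height = img.size
    if min(width, height) < MIN_SIDE:
        print(f"WARNING: {path} is {width}x{height}, below the {MIN_SIDE} px minimum")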
your_project/
├── raw_images/
│ ├── experiment1/
│ │ ├── Image0001_21-12-13_12-35-52.tiff
│ │ ├── Image0002_21-12-13_12-36-15.tiff
│ │ └── ...
│ └── experiment2/
│ ├── Image0050_21-12-14_09-15-30.tiff
│ └── ...
├── processed_images/
│ ├── experiment1/
│ └── experiment2/
├── experimental_data.csv # Links images to experimental conditions
└── image_mapping.csv # Maps experimental units to specific images
When to retrain:
- Different organism species (not C. elegans)
- Significantly different imaging setup or conditions
- Different petri dish types or sizes
- Different specimen density ranges
- Poor performance on your data (MAE > 2.0)
Retraining process:

1. Collect annotations: 200-500 manually annotated images minimum
   - Use Roboflow, CVAT, or similar annotation tools
   - Export in YOLO format
   - Ensure diverse conditions (different densities, lighting, etc.)

2. Prepare data structure:

   your_annotations/
   ├── train/
   │   ├── images/   # 70% of annotated images
   │   └── labels/   # Corresponding YOLO format labels
   ├── valid/
   │   ├── images/   # 20% of annotated images
   │   └── labels/
   └── test/
       ├── images/   # 10% of annotated images
       └── labels/

3. Create data.yaml:

   path: /path/to/your_annotations
   train: train/images
   val: valid/images
   test: test/images
   nc: 1
   names: ['nematode']  # or your organism name

4. Train and evaluate: Follow the training examples above (a minimal sketch is also shown below).
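If you need a starting point for step 4, here is a minimal training sketch using the Ultralytics Python API directly. The base-weights path and hyperparameter values are illustrative, not the tuned settings shipped with the repository; notebooks/03_model_finetuning.ipynb covers the full workflow.

from ultralytics import YOLO

# Start from one of the provided base weights (illustrative choice; any YOLO
# checkpoint, e.g. models/finetuned/best_model.pt, can be used instead).
model = YOLO("models/pretrained/yolo11n.pt")

model.train(
    data="your_annotations/data.yaml",  # the data.yaml created in step 3
    epochs=100,                         # illustrative, not a tuned value
    imgsz=1536,                         # close to the 1528 px processed images
    batch=4,                            # reduce if you run out of GPU memory
)

# Evaluate on the held-out test split defined in data.yaml
metrics = model.val(split="test")
print(metrics.box.map50)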
Preprocessing parameters (adjust in crop_petri_dish_region):
- cutoff: 0.7-0.9 (higher for cleaner images, lower for noisy images)
- min_radius/max_radius: Adjust based on your petri dish size in pixels
- target_size: Keep at 1528×1528 for compatibility with provided models
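As an illustration, a hedged call might look like the following; the exact signature of crop_petri_dish_region may differ, so check src/preprocessing.py before copying:

import cv2
from src.preprocessing import crop_petri_dish_region

image = cv2.imread("data/raw_images/experiment1/Image0001_21-12-13_12-35-52.tiff")

# Parameter names follow the list above; values are illustrative.
cropped = crop_petri_dish_region(
    image,
    cutoff=0.8,        # lower (~0.7) for noisy images, higher (~0.9) for clean ones
    min_radius=600,    # approximate dish radius range in pixels for your setup
    max_radius=1000,
    target_size=1528,  # keep at 1528 for compatibility with the provided models
)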
Inference parameters:
- confidence_threshold: Use threshold optimization to find the optimal value
- batch_size: Adjust based on GPU memory (1-8 typical range)
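If you want to experiment with these parameters outside src/inference.py, they map directly onto the Ultralytics predict call. The threshold and batch values below are illustrative; in practice, use your optimized threshold:

from ultralytics import YOLO

model = YOLO("models/finetuned/best_model.pt")

results = model.predict(
    source="data/processed_images",  # folder of processed PNGs
    conf=0.25,                       # confidence threshold -- replace with the optimized value
    batch=4,                         # lower this if you hit GPU memory limits
)
counts = {r.path: len(r.boxes) for r in results}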
Common adjustments by imaging setup:
- High magnification: May need smaller min_radius/max_radius
- Low contrast: Lower cutoff value (0.7-0.8)
- Variable lighting: Consider retraining with diverse lighting conditions
- Different organisms: Likely need full retraining with species-specific annotations
If you use this code in your research, please cite:
[Citation information will be added upon publication]

This project is licensed under the MIT License - see the LICENSE file for details.
- The Gibson Lab at University of Virginia
- Roboflow platform for annotation tools
- Ultralytics for the YOLO framework
Note: This repository contains sample data for demonstration. For the complete dataset used in our study, please contact the authors.