Nematode Fecundity Computer Vision Counting

A complete pipeline for automated nematode (C. elegans) counting in fecundity assays using deep learning-based computer vision.

Overview

This repository provides a reproducible workflow for automatically counting nematode offspring in microscopy images, replacing manual counting procedures. The system processes raw microscopy images through preprocessing, object detection, and quantification stages to produce accurate nematode counts.

Key Features

  • Automated Preprocessing: Detects and crops petri dishes from raw microscopy images
  • Deep Learning Detection: Uses optimized YOLO models for accurate nematode detection
  • Robust Optimization: 5-fold cross-validation for confidence threshold selection
  • Complete Pipeline: From raw images to final experimental data integration
  • Biology-Friendly: Designed for researchers with basic Python knowledge

Quick Start

1. Installation

Option A: Using Conda (Recommended)

# Clone the repository
git clone https://github.com/gibson-lab/nematode-fecundity-computer-vision-counting.git
cd nematode-fecundity-computer-vision-counting

# Create conda environment
conda env create -f environment.yml
conda activate nematode-counting

Option B: Using pip

# Clone the repository
git clone https://github.com/gibson-lab/nematode-fecundity-computer-vision-counting.git
cd nematode-fecundity-computer-vision-counting

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Run the Pipeline

Option A: Interactive Notebooks (Recommended for beginners)

# Start Jupyter notebook server
jupyter notebook

# Follow the demo notebooks in order:
notebooks/01_preprocessing_demo.ipynb               # Learn image preprocessing
notebooks/02_pretrained_model_comparison.ipynb      # Compare YOLO architectures
notebooks/03_model_finetuning.ipynb                 # Hyperparameter optimization
notebooks/04_inference_demo.ipynb                   # Run nematode counting

Option B: Command Line Interface

# 1. Preprocess raw TIFF images
python src/preprocessing.py --input data/raw_images --output data/processed_images

# 2. Run inference on processed images
python src/inference.py --input data/processed_images --output results/counts.csv --model models/finetuned/best_model.pt

# 3. Process experimental dataset with image mapping
python -c "
from src.inference import run_inference_on_experimental_data
run_inference_on_experimental_data(
    model_path='models/finetuned/best_model.pt',
    experimental_csv='data/experiment_data/HW_fecundity.csv',
    image_mapping_csv='data/experiment_data/image_mapping.csv',
    image_root_dir='data/processed_images',
    output_csv='results/final_counts.csv'
)
"

Project Structure

nematode-fecundity-computer-vision-counting/
├── src/                              # Core Python modules
│   ├── preprocessing.py              # Image preprocessing and petri dish detection
│   ├── training.py                   # YOLO model training and hyperparameter optimization
│   ├── evaluation.py                 # Model evaluation and metrics computation
│   ├── inference.py                  # Batch inference and experimental data integration
│   └── utils.py                      # Utility functions and helpers
├── notebooks/                        # Interactive demonstration notebooks
│   ├── 01_preprocessing_demo.ipynb   # Image preprocessing tutorial
│   ├── 02_pretrained_model_comparison.ipynb  # YOLO architecture comparison
│   ├── 03_model_finetuning.ipynb     # Hyperparameter optimization analysis
│   ├── 04_inference_demo.ipynb       # Nematode counting and production workflow
│   └── README.md                     # Notebooks usage guide
├── data/                             # Sample datasets and documentation
│   ├── sample_raw_images/            # Example raw TIFF images (20 samples)
│   ├── sample_processed_images/      # Corresponding processed PNG images
│   ├── annotations_full/             # Complete annotated dataset (331 images)
│   │   ├── train/                    # Training split (231 images)
│   │   ├── valid/                    # Validation split (50 images)
│   │   └── test/                     # Test split (50 images)
│   ├── experiment_data/              # Experimental metadata
│   │   ├── HW_fecundity.csv         # Complete experimental data
│   │   └── image_mapping.csv        # Image-to-experiment mapping
│   └── README.md                     # Data format documentation
├── models/                           # Model weights and configuration
│   ├── finetuned/                    # Best trained models
│   │   ├── best_model.pt            # Best overall model (MAE: 0.95)
│   │   └── best_model_config.json   # Model configuration and parameters
│   └── pretrained/                   # Base YOLO models
│       ├── yolo11n.pt               # YOLOv11n base model
│       └── yolov8l.pt               # YOLOv8-L base model
├── tests/                            # Automated test suite
│   ├── test_preprocessing.py         # Preprocessing module tests
│   ├── test_inference.py             # Inference module tests
│   ├── test_evaluation.py            # Evaluation module tests
│   ├── test_training.py              # Training module tests
│   └── test_utils.py                 # Utilities module tests        
├── Configuration files
│   ├── requirements.txt              # Python dependencies
│   ├── environment.yml               # Conda environment
│   ├── pytest.ini                    # Test configuration
│   └── run_tests.py                  # Custom test runner
└── Documentation
    ├── README.md                     # This file
    ├── DESIGN_DOCUMENT.md            # Technical architecture and context
    ├── PROGRESS.md                   # Development progress tracking
    └── LICENSE                       # MIT license

Method Overview

1. Image Preprocessing

  • Input: Raw TIFF images (2592×1944 pixels)
  • Process: Detect petri dishes using Hough Circle Transform
  • Output: Cropped and standardized PNG images (1528×1528 pixels)
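
The dish-detection step can be reproduced with OpenCV alone. The sketch below is a simplified stand-in for src/preprocessing.py, assuming a single dish per image; the blur and Hough parameter values are illustrative, not the pipeline's tuned settings.

# Simplified dish detection and cropping (illustrative parameters).
import cv2
import numpy as np

img = cv2.imread("data/raw_images/example.tiff")           # 2592x1944 raw image
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)

# Search for one large circle roughly the size of a petri dish.
circles = cv2.HoughCircles(
    gray, cv2.HOUGH_GRADIENT, dp=1.5, minDist=1000,
    param1=100, param2=50, minRadius=700, maxRadius=1000,
)

if circles is not None:
    x, y, r = np.round(circles[0, 0]).astype(int)
    crop = img[max(y - r, 0):y + r, max(x - r, 0):x + r]   # square around the dish
    crop = cv2.resize(crop, (1528, 1528))                  # standardized output size
    cv2.imwrite("data/processed_images/example.png", crop)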

2. Object Detection Training

  • Dataset: 331 manually annotated images
  • Models: Compared 8 YOLO variants (v8-v12, L and X sizes)
  • Best Model: YOLOv11-L with optimized hyperparameters
  • Performance: 94.6% precision, 92.0% recall, MAE of 0.95
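
Once you have ground-truth counts, the count-level MAE reported above takes only a few lines to reproduce. This sketch assumes the annotated splits contain images/ and labels/ subfolders in YOLO format (as in the retraining layout below) and PNG images; adjust the extension to match your export. The conf=0.5 value is a placeholder, not the optimized threshold.

# Per-image count error (MAE) on an annotated split; folder layout assumed.
from pathlib import Path
from ultralytics import YOLO

model = YOLO("models/finetuned/best_model.pt")
img_dir = Path("data/annotations_full/test/images")
lbl_dir = Path("data/annotations_full/test/labels")

errors = []
for img in sorted(img_dir.glob("*.png")):
    # One YOLO label line per annotated worm, so line count == true count.
    truth = len((lbl_dir / f"{img.stem}.txt").read_text().splitlines())
    pred = len(model.predict(str(img), conf=0.5, verbose=False)[0].boxes)
    errors.append(abs(pred - truth))

print(f"count MAE: {sum(errors) / len(errors):.2f}")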

3. Inference and Integration

  • Confidence Optimization: 5-fold cross-validation
  • Batch Processing: Automated counting across full datasets
  • Data Integration: Merge counts with experimental metadata
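
The sketch below illustrates the idea behind the 5-fold threshold selection, not the repository's exact implementation: run inference once at a permissive threshold, keep each detection's confidence score, then score every candidate threshold per fold by count MAE. The directory layout and PNG extension are assumptions.

# Sketch of 5-fold confidence-threshold selection by count MAE (illustrative).
import numpy as np
from pathlib import Path
from sklearn.model_selection import KFold
from ultralytics import YOLO

model = YOLO("models/finetuned/best_model.pt")
img_dir = Path("data/annotations_full/valid/images")
lbl_dir = Path("data/annotations_full/valid/labels")
images = sorted(img_dir.glob("*.png"))
# One YOLO label line per annotated worm, so line count == true count.
truth = [len((lbl_dir / f"{p.stem}.txt").read_text().splitlines()) for p in images]

# Predict once at a low threshold and keep each detection's score, so every
# candidate threshold can be evaluated without re-running the model.
scores = [model.predict(str(p), conf=0.01, verbose=False)[0].boxes.conf.cpu().numpy()
          for p in images]

thresholds = np.arange(0.05, 0.95, 0.05)
best = []
for _, fold_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(images):
    maes = [np.mean([abs(int((scores[i] >= t).sum()) - truth[i]) for i in fold_idx])
            for t in thresholds]
    best.append(thresholds[int(np.argmin(maes))])

print("per-fold optima:", best, "| mean threshold:", float(np.mean(best)))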

Hardware Requirements

  • Minimum: 8GB RAM, CPU-only processing
  • Recommended: 16GB RAM, GPU with 8GB VRAM

Adapting for Your Data

Image Requirements

  • Dish boundaries: Clear, circular petri dish edges so the dish can be located automatically
  • Lighting: Consistent illumination without harsh shadows or reflections
  • Format: TIFF (preferred) or high-quality PNG images
  • Resolution: Minimum 1000×1000 pixels for reliable detection
  • Background: Specimens should be clearly distinguishable from dish background
  • Focus: Sharp focus across the entire petri dish area
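
A quick pre-flight check can catch images that violate these requirements before they reach the pipeline. This Pillow-based sketch is our own suggestion, not part of src/; it verifies only format and resolution, so lighting and focus still need visual inspection.

# Pre-flight check for format and minimum resolution (Pillow-based sketch).
from pathlib import Path
from PIL import Image

for path in sorted(Path("data/raw_images").glob("*.tif*")):   # .tif and .tiff
    with Image.open(path) as im:
        if min(im.size) < 1000:
            print(f"{path.name}: {im.size[0]}x{im.size[1]} is below the 1000x1000 minimum")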

Recommended Data Organization

your_project/
├── raw_images/
│   ├── experiment1/
│   │   ├── Image0001_21-12-13_12-35-52.tiff
│   │   ├── Image0002_21-12-13_12-36-15.tiff
│   │   └── ...
│   └── experiment2/
│       ├── Image0050_21-12-14_09-15-30.tiff
│       └── ...
├── processed_images/
│   ├── experiment1/
│   └── experiment2/
├── experimental_data.csv        # Links images to experimental conditions
└── image_mapping.csv           # Maps experimental units to specific images

Model Retraining Guidelines

When to retrain:

  • Different organism species (not C. elegans)
  • Significantly different imaging setup or conditions
  • Different petri dish types or sizes
  • Different specimen density ranges
  • Poor performance on your data (MAE > 2.0)

Retraining process:

  1. Collect annotations: 200-500 manually annotated images minimum

    • Use Roboflow, CVAT, or similar annotation tools
    • Export in YOLO format
    • Ensure diverse conditions (different densities, lighting, etc.)
  2. Prepare data structure:

    your_annotations/
    ├── train/
    │   ├── images/     # 70% of annotated images
    │   └── labels/     # Corresponding YOLO format labels
    ├── valid/
    │   ├── images/     # 20% of annotated images
    │   └── labels/
    └── test/
        ├── images/     # 10% of annotated images
        └── labels/
    
  3. Create data.yaml:

    path: /path/to/your_annotations
    train: train/images
    val: valid/images
    test: test/images
    nc: 1
    names: ['nematode']  # or your organism name
  4. Train and evaluate: Follow the workflow in notebooks/03_model_finetuning.ipynb; a minimal sketch follows this list
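
A minimal retraining sketch using the Ultralytics API is shown below; the hyperparameters are generic starting points, not the tuned values from notebooks/03_model_finetuning.ipynb.

# Minimal retraining sketch; hyperparameters are generic starting points.
from ultralytics import YOLO

model = YOLO("yolo11l.pt")                    # pretrained base weights
model.train(
    data="your_annotations/data.yaml",        # the file from step 3
    epochs=100,
    imgsz=1024,                               # raise if small specimens are missed
    batch=4,                                  # scale to your GPU memory
)
metrics = model.val(data="your_annotations/data.yaml", split="test")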

Parameter Tuning for Different Conditions

Preprocessing parameters (adjust in crop_petri_dish_region):

  • cutoff: 0.7-0.9 (higher for cleaner images, lower for noisy images)
  • min_radius/max_radius: Adjust based on your petri dish size in pixels
  • target_size: Keep at 1528×1528 for compatibility with provided models
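
As a usage sketch: the keyword names below come from the list above, but the argument order and exact signature are assumptions, so check src/preprocessing.py before copying.

# Hypothetical call; verify the real signature in src/preprocessing.py first.
from src.preprocessing import crop_petri_dish_region

cropped = crop_petri_dish_region(
    "data/raw_images/example.tiff",
    cutoff=0.8,               # drop toward 0.7 for noisy or low-contrast images
    min_radius=700,           # dish radius bounds in pixels for your setup
    max_radius=1000,
    target_size=1528,         # keep 1528 (i.e., 1528x1528) for the provided models
)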

Inference parameters:

  • confidence_threshold: Select it with the 5-fold threshold optimization described above rather than guessing a fixed value
  • batch_size: Adjust based on GPU memory (1-8 typical range)
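
For example, a sketch using the Ultralytics predict API; the conf value is a placeholder for your optimized threshold:

# Illustrative batch inference; substitute your optimized confidence threshold.
from ultralytics import YOLO

model = YOLO("models/finetuned/best_model.pt")
results = model.predict(
    "data/processed_images",   # a directory is handled image by image
    conf=0.40,                 # placeholder: use the cross-validated threshold
    batch=4,                   # reduce if you hit GPU out-of-memory errors
)
counts = {r.path: len(r.boxes) for r in results}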

Common adjustments by imaging setup:

  • High magnification: May need smaller min_radius/max_radius
  • Low contrast: Lower cutoff value (0.7-0.8)
  • Variable lighting: Consider retraining with diverse lighting conditions
  • Different organisms: Likely need full retraining with species-specific annotations

Citation

If you use this code in your research, please cite:

[Citation information will be added upon publication]

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • The Gibson Lab at University of Virginia
  • Roboflow platform for annotation tools
  • Ultralytics for the YOLO framework

Note: This repository contains sample data for demonstration. For the complete dataset used in our study, please contact the authors.