Bird Detection Pipeline

A machine learning pipeline for detecting and annotating birds in aerial imagery, developed through a collaboration between the Bureau of Ocean Energy Management (BOEM) and the University of Florida.

Project Structure

project_root/
│
├── src/                      # Source code for the ML pipeline
│   ├── __init__.py
│   ├── data_ingestion.py    # Data loading and preparation
│   ├── data_processing.py   # Data preprocessing and transformations
│   ├── model_training.py    # Model training functionality
│   ├── pipeline_evaluation.py # Pipeline and model evaluation metrics
│   ├── model_deployment.py  # Model deployment utilities
│   ├── monitoring.py        # Monitoring and logging functionality
│   ├── reporting.py         # Report generation for pipeline results
│   ├── pre_annotation_prediction.py  # Pre-annotation model predictions
│   └── annotation/          # Annotation-related functionality
│       ├── __init__.py
│       └── pipeline.py      # Annotation pipeline implementation
│
├── tests/                   # Test files for each component
│   ├── test_data_ingestion.py
│   ├── test_data_processing.py
│   ├── test_model_training.py
│   ├── test_pipeline_evaluation.py
│   ├── test_model_deployment.py
│   ├── test_monitoring.py
│   └── test_reporting.py
│
├── conf/                    # Configuration files
│   └── config.yaml         # Main configuration file
│
├── main.py                 # Main entry point for the pipeline
├── run_ml_workflow.sh      # Script to run pipeline in Serenity container
├── requirements.txt        # Project dependencies
├── .gitignore             # Git ignore file
├── CONTRIBUTING.md        # Contributing guidelines
├── LICENSE                # Project license
└── README.md             # This file

Components

Source Code (src/)

  • data_ingestion.py: Handles data loading and initial preparation
  • data_processing.py: Implements data preprocessing and transformations
  • model_training.py: Contains model training logic
  • pipeline_evaluation.py: Evaluates pipeline performance and model metrics
  • model_deployment.py: Manages model deployment
  • monitoring.py: Provides monitoring and logging capabilities
  • reporting.py: Generates reports for pipeline results
  • annotation/: Contains annotation-related functionality
    • pipeline.py: Implements the annotation pipeline

Tests (tests/)

Contains test files corresponding to each component in src/. Uses pytest for testing.
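
For example, a minimal smoke test for the data ingestion component might look like this (this sketch only assumes the module is importable; it is not the repository's actual test):

# tests/test_data_ingestion.py — illustrative smoke test
from src import data_ingestion

def test_module_importable():
    # The module should load without raising on import
    assert data_ingestion is not None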

Configuration (conf/)

Contains YAML configuration files managed by Hydra:

  • config.yaml: Main configuration file defining pipeline parameters

Installation

  1. Clone the repository:
git clone https://github.com/weecology/BOEM.git
cd BOEM
  2. Install dependencies:
pip install -r requirements.txt
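
Installing inside a virtual environment first keeps the project's dependencies isolated; this uses standard Python tooling and is optional:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt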

Usage

Running the Pipeline

Using the Serenity container:

./run_ml_workflow.sh your-branch-name

Or directly with Python:

python main.py
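
Because configuration is managed by Hydra, individual options can be overridden on the command line with Hydra's dotted-key syntax. The keys below match the sample configuration later in this README and may differ from the actual config:

python main.py model.parameters.batch_size=64 model.parameters.learning_rate=0.01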

Running Tests

pytest tests/
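
To run a single component's tests, point pytest at the corresponding file (standard pytest usage; -v enables verbose output):

pytest tests/test_data_ingestion.py -v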

Configuration

The pipeline uses Hydra for configuration management. Main configuration options are defined in conf/config.yaml.

Example configuration:

data:
  input_dir: "path/to/input"
  output_dir: "path/to/output"

model:
  type: "classification"
  parameters:
    learning_rate: 0.001
    batch_size: 32

pipeline:
  steps:
    - data_ingestion
    - data_processing
    - model_training
    - evaluation
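
A minimal sketch of how main.py might consume this configuration through Hydra's decorator API; the step-dispatch loop is illustrative, while config_path and config_name follow the conf/config.yaml layout above:

import hydra
from omegaconf import DictConfig

@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
    # cfg mirrors the YAML structure, e.g. cfg.model.parameters.learning_rate
    print(f"Reading input from {cfg.data.input_dir}")
    for step in cfg.pipeline.steps:
        # Dispatch each configured step to the matching module in src/
        print(f"Running step: {step}")

if __name__ == "__main__":
    main()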

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Dependencies

Key dependencies include:

  • Hydra
  • PyTorch
  • NumPy
  • Pandas
  • Pytest

See requirements.txt for a complete list.

Development

Code Organization

  • Each component is a separate module in the src/ directory
  • Tests mirror the source code structure in the tests/ directory
  • Configuration is managed through Hydra
  • Monitoring and logging are integrated throughout the pipeline using Comet (see the sketch below)
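
A minimal sketch of how Comet experiment tracking is typically wired in; the project name, parameter, and metric are placeholders, and this repository's actual Comet setup may differ:

from comet_ml import Experiment

# Requires a Comet API key (e.g. COMET_API_KEY in the environment)
experiment = Experiment(project_name="boem-bird-detection")
experiment.log_parameter("batch_size", 32)
experiment.log_metric("val_accuracy", 0.91, step=1)
experiment.end()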

Testing

  • Tests are written using pytest
  • Each component has its own test file
  • Run tests with pytest tests/

Adding New Components

  1. Create a new module in src/
  2. Add a corresponding test file in tests/ (skeletons for both are sketched below)
  3. Update configuration in conf/config.yaml
  4. Update main.py to integrate the new component
  5. Create a branch and push your changes to the remote repository
  6. Create a pull request to merge your changes into the main branch
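
For steps 1 and 2, a new component and its test might start from skeletons like these; the module and function names are hypothetical placeholders:

# src/new_component.py — hypothetical skeleton
def run(cfg):
    # Execute this component using its node of the Hydra config
    raise NotImplementedError

# tests/test_new_component.py — hypothetical skeleton
import pytest

from src import new_component

def test_run_raises_until_implemented():
    # Placeholder assertion until real behavior is implemented
    with pytest.raises(NotImplementedError):
        new_component.run(cfg=None)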
