gstinoco/mGFD_EcoRisk_Simulator

mGFD EcoRisk Simulator 🌊

Hybrid methodology for numerical simulation and ecological risk classification

2D advection–diffusion model + machine learning to map risk zones (Low, Medium, High)

🌟 Overview

This repository implements a hybrid methodology to assess ecological risk associated with contaminants in water bodies (e.g., rivers and channels), integrating:

  • Numerical simulation of transport (2D advection–diffusion model with decay) to generate concentration fields over the domain.
  • Feature engineering from results (spatial, temporal, and hydrodynamic).
  • Machine learning to classify risk into three levels: Low (0), Medium (1), High (2).
  • Visualizations (maps, confusion matrices, feature importance, and metrics dashboards).
  • Scenario-based execution to build diverse and reproducible datasets.

🔧 Key capabilities

  • 🧮 Numerical model: explicit finite-difference scheme with stability checks (CFL and diffusion).
  • 🔬 Scenarios: multiple predefined scenarios (velocities/discharges and source positions) configurable via YAML.
  • 🤖 Risk classification: Random Forest, SVM, Gradient Boosting, Logistic Regression; cross-validation and hyperparameter tuning.
  • 🎨 Reports and plots: comparative dashboards, detailed confusion matrices, and feature-importance ranking.
  • 💾 Export: outputs in NPY and CSV for interoperability (Excel/R/MATLAB/Python).

✨ Features

🧮 Numerical simulation (contaminant transport)

  • Solves the 2D advection–diffusion equation with first-order decay.
  • Supports Dirichlet, Neumann, and mixed boundary conditions.
  • Models sources with configurable location, intensity, and duration.
  • Stores time history and final state for analysis and training.
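As an illustration of a source with configurable location, intensity, and duration, the sketch below builds the source term S(x, y, t) on a regular grid. The function name `source_field`, the circular footprint, and all parameter names are assumptions made for this example; the solver in `src/numerical_model/` may define sources differently.

```python
import numpy as np

def source_field(x, y, t, x0, y0, radius, intensity, t_start, t_end):
    """Return S(x, y, t): `intensity` inside a disc while the source is active, else 0."""
    X, Y = np.meshgrid(x, y, indexing="ij")  # arrays of shape (len(x), len(y))
    active = t_start <= t <= t_end           # duration window (hypothetical parameters)
    inside = (X - x0) ** 2 + (Y - y0) ** 2 <= radius ** 2
    return np.where(inside & active, intensity, 0.0)

x = np.linspace(0.0, 10.0, 11)
y = np.linspace(0.0, 4.0, 5)
params = dict(x0=2.0, y0=2.0, radius=1.0, intensity=3.0, t_start=0.0, t_end=10.0)

S_on = source_field(x, y, t=5.0, **params)    # source is injecting
S_off = source_field(x, y, t=20.0, **params)  # source has stopped
print(S_on.shape, S_on.sum(), S_off.sum())
```

Keeping the source as a function of time makes it easy to evaluate it once per time step inside the solver loop.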

🤖 Machine learning (risk classifier)

  • Supports multiple algorithms and selects the best via cross-validation.
  • Hyperparameter tuning with GridSearchCV.
  • Two feature sets:
    • Fundamental (8): base parameters (source, velocities, position, normalized time).
    • Complete (16): fundamental + derived variables (distances, travel times, Péclet numbers, etc.).
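The derived variables in the complete set can be pictured with a short sketch. The definitions below (Euclidean distance to the source, travel time as distance over flow speed, and a Péclet number based on that distance) are common choices and are assumptions here; the exact formulas and feature names used in `src/ml_model/data_preprocessing.py` are not stated in this README.

```python
import numpy as np

def derived_features(x, y, x_src, y_src, u, v, D, eps=1e-12):
    """Compute illustrative derived features for points (x, y)."""
    distance = np.hypot(x - x_src, y - y_src)    # distance to the source [m]
    speed = np.hypot(u, v)                       # flow speed [m/s]
    travel_time = distance / max(speed, eps)     # advective travel time [s]
    peclet = speed * distance / max(D, eps)      # Pe = U L / D (dimensionless)
    return {"distance": distance, "travel_time": travel_time, "peclet": peclet}

feats = derived_features(
    x=np.array([3.0, 6.0]), y=np.array([4.0, 8.0]),
    x_src=0.0, y_src=0.0, u=0.5, v=0.0, D=0.1,
)
print(feats)
```

A large Péclet number indicates that advection dominates diffusion at that point, which is one reason such a feature may help the classifier.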

🎥 Visualization and analysis

  • GIFs and snapshots for spatio-temporal evolution of concentration and risk.
  • Model comparison plots and metrics dashboards.
  • Confusion matrix (absolute and normalized) for detailed inspection.
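To make the difference between the absolute and normalized confusion matrices concrete, here is a minimal NumPy sketch. The helper names are hypothetical and the labels are made up for illustration; the project's own plots may be produced with other tools.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes=3):
    """Absolute confusion matrix: rows are true classes, columns are predictions."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulate one count per (true, pred) pair
    return cm

def normalize_rows(cm):
    """Row-normalized matrix: each row sums to 1 (rows with no samples stay 0)."""
    totals = cm.sum(axis=1, keepdims=True)
    out = np.zeros(cm.shape, dtype=float)
    return np.divide(cm, totals, out=out, where=totals > 0)

y_true = np.array([0, 0, 1, 1, 2, 2])  # illustrative labels: Low=0, Medium=1, High=2
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_counts(y_true, y_pred)
cm_norm = normalize_rows(cm)
print(cm)
print(cm_norm)
```

The normalized form reads as per-class recall along the diagonal, which is easier to compare when the three risk classes are imbalanced.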

📦 Installation & Setup

💻 System requirements

Component Minimum Recommended
Python 3.8+ 3.9+
RAM 8 GB 16 GB+ (if exporting full history)
CPU 4 cores 8+ cores
Storage 2 GB 10 GB+ (datasets + results)
OS Windows/Linux/macOS Linux (better for large batches)

📋 Dependencies

# Scientific computing
numpy>=1.21.0
pandas>=1.3.0
scipy>=1.7.0

# Machine learning
scikit-learn>=1.0.0
joblib>=1.1.0

# Visualization
matplotlib>=3.5.0
seaborn>=0.11.0

# Utilities
PyYAML>=6.0
tqdm>=4.62.0

Quick Installation

# Method 1: direct install
git clone https://github.com/gstinoco/mGFD_EcoRisk_Simulator.git
cd mGFD_EcoRisk_Simulator
pip install -r requirements.txt

# Method 2: virtual environment (recommended)
python -m venv contaminant_env
source contaminant_env/bin/activate  # Windows: contaminant_env\Scripts\activate
pip install -r requirements.txt

✅ Quick sanity check

python -c "import numpy, pandas, sklearn, matplotlib, seaborn, yaml; print('OK')"
python main.py --help

🚀 Quick Start

1) Install: pip install -r requirements.txt
2) Run: python main.py --complete
3) Check outputs:
  • Simulations: data/simulations/
  • Dataset: data/processed/
  • Metrics / model: data/results/
  • Visualizations: data/visualizations/ (or docs/ for demos)

⚡ Typical flows (CLI)

# Full pipeline (complete features = default)
python main.py --complete

# Full pipeline using only fundamental features (8)
python main.py --complete --fundamental-features

# Simulate a specific scenario
python main.py --simulate --scenario baseline_left_center

# Simulate all scenarios defined in config/parameters.yaml
python main.py --simulate --all-scenarios

# Preprocess, train, and visualize (separately)
python main.py --preprocess
python main.py --train
python main.py --visualize

# Generate GIFs and snapshots
python main.py --create-videos
python main.py --create-snapshots --snapshots-count 6

📖 Usage Guide

Practical workflows for simulation, preprocessing, training, and visual analysis

🧮 Simulation (YAML → concentration fields)

Step What it does
1) Configure Edit config/parameters.yaml (domain, physics, source, boundaries, scenarios).
2) Run
python main.py --simulate --scenario baseline_lower
3) Outputs NPY/CSV are saved to data/simulations/<scenario>/.

🏭 Scenario-based generation (synthetic dataset)

Step What it does
1) Run batch
python main.py --simulate --all-scenarios
2) Dataset Scenarios create diversity (source position, flow/discharge).

⚙️ Preprocessing (simulations → feature matrix)

Step What it does
1) Run
python main.py --preprocess
python main.py --preprocess --fundamental-features
2) Outputs data/processed/X_train_*.npy, X_test_*.npy, y_train_*.npy, y_test_*.npy
and feature_names_*.txt

🤖 Training (features → risk model)

Step What it does
1) Train
python main.py --train
python main.py --train --fundamental-features
2) Evaluation Computes per-model metrics and saves the best classifier.
3) Outputs data/results/ (metrics CSV and classifier PKL).
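The training step (tune each candidate, compare by cross-validation, keep the best) can be sketched with scikit-learn as follows. The synthetic data, the two candidate models, and the small parameter grids are placeholders chosen for this example; they are not the grids or data used by `main.py --train`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for the feature matrix (8 columns) and 0/1/2 risk labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 8))
y = np.digitize(X[:, 0] + 0.5 * X[:, 1], bins=[-0.5, 0.5])

# Hypothetical candidates with deliberately tiny grids.
candidates = {
    "RandomForest": (RandomForestClassifier(random_state=0), {"n_estimators": [10, 30]}),
    "LogisticRegression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0]}),
}

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

best_name = max(results, key=lambda n: results[n][0])
print(best_name, results[best_name])
```

Using the same `StratifiedKFold` splitter for every candidate keeps the comparison fair, since all models are scored on identical folds.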

🎨 Visualization (results → plots and dashboards)

Step What it does
1) Visualize
python main.py --visualize
python main.py --visualize --fundamental-features
2) Outputs data/visualizations/ with PNGs (model comparison, confusion matrix, feature importance, dashboard).

🎥 Visualizations

🖼️ Demos (GIF)

Contaminant evolution (concentration)

Concentration Evolution

Ecological risk evolution

Risk Evolution

📊 Dashboards & metrics (PNG)

All Models Metrics Dashboard

Confusion Matrix

Feature Importance

📸 Snapshots (example)

Snapshot



⚙️ API Documentation

This project is primarily used as a command-line tool (CLI) via main.py.

Entry point Command Purpose
main.py python main.py --complete Runs the full flow (simulation → dataset → training → visualization)
main.py python main.py --simulate [--scenario NAME | --all-scenarios] Simulates one named scenario or all scenarios defined in config/parameters.yaml
main.py python main.py --preprocess [--fundamental-features] Builds the ML dataset
main.py python main.py --train [--fundamental-features] Trains models and saves the best one
main.py python main.py --visualize [--fundamental-features] Generates plots/dashboards from results
main.py python main.py --create-videos Generates time-evolution GIFs
main.py python main.py --create-snapshots --snapshots-count N Generates snapshots at selected times

For full help:

python main.py --help

🗄️ Data Formats

💾 Simulation (NPY / CSV)

Each scenario stores (at minimum) the following in data/simulations/<scenario>/:

  • final_concentration.npy: final field C(x,y,t_final)
  • concentration_history.npy: time history (can be large)
  • x_coordinates.npy, y_coordinates.npy: spatial axes
  • times.npy: time vector
  • parameters.yaml: effective parameters used (base + scenario overrides)
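A scenario folder can be read back with `numpy.load`. The sketch below writes placeholder arrays to a temporary folder so that it runs on its own; the `load_scenario` helper, the array shapes, and the (x, y) axis order are assumptions for illustration, and real outputs should be inspected before relying on them.

```python
import tempfile
from pathlib import Path

import numpy as np

def load_scenario(folder):
    """Load the small per-scenario arrays listed above into a dict."""
    folder = Path(folder)
    names = ["final_concentration", "x_coordinates", "y_coordinates", "times"]
    return {name: np.load(folder / f"{name}.npy") for name in names}

# Placeholder data standing in for data/simulations/<scenario>/.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    np.save(tmp / "x_coordinates.npy", np.linspace(0.0, 1.0, 4))
    np.save(tmp / "y_coordinates.npy", np.linspace(0.0, 1.0, 3))
    np.save(tmp / "times.npy", np.array([0.0, 0.5, 1.0]))
    np.save(tmp / "final_concentration.npy", np.zeros((4, 3)))  # assumed (nx, ny)
    data = load_scenario(tmp)

print({name: arr.shape for name, arr in data.items()})
```

For `concentration_history.npy`, which can be large, `np.load(path, mmap_mode="r")` reads slices on demand instead of loading the whole array into memory.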

If output.export_csv: true in config/parameters.yaml, it also exports:

  • final_concentration.csv
  • coordinates.csv
  • times.csv
  • concentration_history.csv (only if output.csv_include_history: true, can be very large)

🤖 ML dataset (NPY)

In data/processed/ it stores (with compatibility suffixes):

  • Complete features (16): *_complete.npy and feature_names_complete.txt
  • Fundamental features (8): *_fundamental.npy and feature_names_fundamental.txt

Targets:

  • y_*: risk labels 0/1/2 for (Low/Medium/High).
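One simple way to obtain 0/1/2 labels from a concentration field is thresholding. The two thresholds below (1.0 and 5.0 mg/L) and the helper name `risk_labels` are purely hypothetical; this README does not state how the project assigns risk levels.

```python
import numpy as np

def risk_labels(concentration, low_medium=1.0, medium_high=5.0):
    """Map concentrations to 0 (Low), 1 (Medium), 2 (High) using two thresholds."""
    # np.digitize returns the index i such that bins[i-1] <= value < bins[i].
    return np.digitize(concentration, bins=[low_medium, medium_high])

labels = risk_labels(np.array([0.2, 1.0, 4.9, 5.0, 12.0]))
print(labels)  # [0 1 1 2 2]
```

Values equal to a threshold fall into the higher class, which is the more cautious choice for a risk map.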

📂 Project Architecture

.
├─ config/
│  └─ parameters.yaml           # Model, ML, visualization, and scenario parameters
├─ data/
│  ├─ simulations/              # Per-scenario outputs (NPY/CSV + parameters)
│  ├─ processed/                # ML matrices (X/y + feature names)
│  └─ results/                  # Metrics and trained models (CSV/PKL)
├─ docs/
│  ├─ images/                   # Dashboards and snapshots
│  ├─ videos/                   # Concentration and risk GIFs
│  └─ logo/                     # Logos
├─ src/
│  ├─ numerical_model/          # Advection–diffusion equation (FD)
│  ├─ ml_model/                 # Preprocessing and risk classifier
│  └─ visualization/            # Plots, dashboards, GIFs and snapshots
└─ main.py                      # CLI and full-flow orchestration

Core components:

  • src/numerical_model/advection_diffusion.py: 2D transport solver.
  • src/ml_model/data_preprocessing.py: feature extraction + risk labels.
  • src/ml_model/risk_classifier.py: model training/evaluation/persistence.
  • src/visualization/visualization.py: visualization and export.

📚 Mathematical Model

📚 Governing equation

Contaminant transport is modeled with the 2D advection–diffusion equation with decay:

$$ \frac{\partial C}{\partial t} = -u \frac{\partial C}{\partial x} - v \frac{\partial C}{\partial y} + D \left( \frac{\partial^2 C}{\partial x^2} + \frac{\partial^2 C}{\partial y^2} \right) + S - k C $$

Where:

  • C: contaminant concentration [mg/L]
  • u, v: advection velocities [m/s]
  • D: diffusion coefficient [m²/s]
  • S: source (injection) [mg/(L·s)]
  • k: decay rate [1/s]
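A single explicit time step for this equation can be sketched in NumPy. The version below assumes the usual convention that a positive u carries contaminant in the +x direction, uses first-order upwind differences for advection and central differences for diffusion, and leaves boundary nodes untouched. These discretization choices and the name `explicit_step` are assumptions; the scheme in `src/numerical_model/advection_diffusion.py` may differ.

```python
import numpy as np

def explicit_step(C, u, v, D, k, S, dx, dy, dt):
    """Advance C (shape (nx, ny)) by one forward-Euler step on interior nodes."""
    C_new = C.copy()
    c = C[1:-1, 1:-1]
    S_int = np.broadcast_to(S, C.shape)[1:-1, 1:-1]  # S may be a scalar or an array

    # First-order upwind: difference toward the side the flow comes from.
    dCdx = (c - C[:-2, 1:-1]) / dx if u >= 0 else (C[2:, 1:-1] - c) / dx
    dCdy = (c - C[1:-1, :-2]) / dy if v >= 0 else (C[1:-1, 2:] - c) / dy

    # Second-order central differences for the Laplacian.
    lap = ((C[2:, 1:-1] - 2.0 * c + C[:-2, 1:-1]) / dx**2
           + (C[1:-1, 2:] - 2.0 * c + C[1:-1, :-2]) / dy**2)

    C_new[1:-1, 1:-1] = c + dt * (-u * dCdx - v * dCdy + D * lap + S_int - k * c)
    return C_new

# A uniform field has no gradients, so only the decay term acts on it.
C0 = np.ones((6, 5))
C1 = explicit_step(C0, u=0.5, v=0.0, D=0.01, k=0.1, S=0.0, dx=1.0, dy=1.0, dt=0.1)
print(C1[1:-1, 1:-1])
```

With a uniform initial field the interior simply decays by the factor (1 - k·dt), which is a quick sanity check for any implementation of the update.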

🚩 Boundary conditions

Configurable in config/parameters.yaml as:

  • Dirichlet: fixed concentration at the boundary.
  • Neumann: fixed gradient/flux (open outflow).
  • Mixed: per-side combination.
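The three options can be illustrated with a helper that sets each side of the grid independently. The `conditions` dictionary layout, the side names, and the one-sided difference used for the Neumann case are assumptions for this sketch and do not reflect the actual keys in `config/parameters.yaml`.

```python
import numpy as np

def apply_boundaries(C, conditions, dx, dy):
    """Return a copy of C (shape (nx, ny)) with per-side boundary values applied.

    conditions maps a side name to (kind, value):
      ("dirichlet", c0) fixes the concentration to c0,
      ("neumann", g) sets the derivative along the axis to g (g = 0 is open outflow).
    """
    # For each side: boundary index, neighbouring interior index, sign, grid spacing.
    sides = {
        "left": ((0, slice(None)), (1, slice(None)), -1.0, dx),
        "right": ((-1, slice(None)), (-2, slice(None)), 1.0, dx),
        "bottom": ((slice(None), 0), (slice(None), 1), -1.0, dy),
        "top": ((slice(None), -1), (slice(None), -2), 1.0, dy),
    }
    C = C.copy()
    for side, (kind, value) in conditions.items():
        edge, inner, sign, h = sides[side]
        if kind == "dirichlet":
            C[edge] = value
        elif kind == "neumann":
            C[edge] = C[inner] + sign * value * h
        else:
            raise ValueError(f"unknown boundary kind: {kind}")
    return C

# Mixed example: fixed inflow concentration on the left, zero-gradient outflow on the right.
C = np.arange(12, dtype=float).reshape(4, 3)
C_bc = apply_boundaries(C, {"left": ("dirichlet", 2.0), "right": ("neumann", 0.0)}, dx=1.0, dy=1.0)
print(C_bc)
```

Sides that are not listed in `conditions` are left unchanged, so a mixed configuration is just a dictionary with different kinds per side.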

⚠️ Numerical stability (explicit scheme)

The solver prints typical checks:

  • CFL condition for advection.
  • Stability condition for diffusion.

If violated, adjust dt, dx, dy, or physical parameters in the configuration.
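The two checks can be reproduced by hand before launching a run. The sketch below uses the commonly quoted limits for explicit schemes (advective CFL number at most 1, diffusion number at most 1/2) together with a more conservative combined bound on dt that applies to a first-order upwind plus central-difference scheme. The function names and the assumption about the scheme are illustrative; the solver's own thresholds may differ.

```python
def stability_numbers(u, v, D, dx, dy, dt):
    """Return (CFL number, diffusion number) for a 2D explicit scheme."""
    cfl = abs(u) * dt / dx + abs(v) * dt / dy
    diffusion = D * dt * (1.0 / dx**2 + 1.0 / dy**2)
    return cfl, diffusion

def conservative_dt(u, v, D, dx, dy):
    """Time step satisfying cfl + 2 * diffusion <= 1, which implies both limits."""
    rate = abs(u) / dx + abs(v) / dy + 2.0 * D * (1.0 / dx**2 + 1.0 / dy**2)
    return float("inf") if rate == 0.0 else 1.0 / rate

cfl, diffusion = stability_numbers(u=0.5, v=0.1, D=0.01, dx=0.1, dy=0.1, dt=0.05)
dt_max = conservative_dt(u=0.5, v=0.1, D=0.01, dx=0.1, dy=0.1)
print(f"CFL = {cfl:.3f} (limit 1), diffusion = {diffusion:.3f} (limit 0.5), dt_max = {dt_max:.3f}")
```

Halving dx roughly quadruples the diffusion number, so refining the grid usually forces a much smaller dt.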


🗄️ Dataset Structure

This project can operate as:

  • A dataset generator (scenario-driven) from simulations.
  • An ML pipeline for risk classification using previously generated datasets.

Main layout:

data/
├─ simulations/
│  ├─ baseline_left_center/
│  ├─ baseline_lower/
│  └─ baseline_upper/
├─ processed/
│  ├─ X_train_complete.npy
│  ├─ X_test_complete.npy
│  ├─ y_train_complete.npy
│  ├─ y_test_complete.npy
│  ├─ feature_names_complete.txt
│  ├─ X_train_fundamental.npy
│  ├─ X_test_fundamental.npy
│  ├─ y_train_fundamental.npy
│  ├─ y_test_fundamental.npy
│  └─ feature_names_fundamental.txt
└─ results/
   ├─ all_models_metrics_report.csv
   ├─ all_models_metrics_report (fundamental features).csv
   ├─ risk_classifier_model.pkl
   └─ risk_classifier_model (fundamental features).pkl

📈 Performance Benchmarks

🏆 Reported metrics (example)

Reference results (files in data/results/):

Feature set Best model (Accuracy) File
Complete (16) GradientBoosting (0.9997) all_models_metrics_report.csv
Fundamental (8) GradientBoosting (0.9893) all_models_metrics_report (fundamental features).csv

Note: Results depend on configuration, sampling, and the available scenarios.


🤝 Contributing

🌟 Contribute to the Project

Bug reports, feature requests, and pull requests are welcome

Issues Pull Requests

🐛 Bug Reports

  1. Search existing issues: Check if the bug has already been reported
  2. Create a detailed report: Include steps to reproduce and expected vs actual behavior
  3. Provide context: Operating system, Python version, and relevant parameters (scenario, feature set, and any changes to config/parameters.yaml)

💡 Feature Requests

  1. Describe the feature: Clear and concise description of the proposed functionality
  2. Justify the need: Explain how it benefits research, reproducibility, or usability
  3. Provide examples: Use cases, expected inputs/outputs, and acceptance criteria

💻 Code Contributions

git clone https://github.com/gstinoco/mGFD_EcoRisk_Simulator.git
cd mGFD_EcoRisk_Simulator

python -m venv dev_env
source dev_env/bin/activate  # On Windows: dev_env\Scripts\activate
pip install -r requirements.txt

git checkout -b feature/your-feature-name

🧑‍🔬 Research Team

🌟 Meet the Team

Researchers and graduate students advancing meshless computational methods

👥 Main Researchers

  • Dr. Gerardo Tinoco Guerrero 🇲🇽: Numerical Methods & Computational Mathematics. Company: SIIIA MATH. University: UMSNH. ORCID: 0000-0003-3119-770X.
  • Dr. Francisco Javier Domínguez Mota 🇲🇽: Applied Mathematics & Finite Difference Methods. Company: SIIIA MATH. University: UMSNH. ORCID: 0000-0001-6837-172X.
  • Dr. José Alberto Guzmán Torres 🇲🇽: Engineering Applications & Artificial Intelligence. Company: SIIIA MATH. University: UMSNH. ORCID: 0000-0002-9309-9390.
  • Dr. Heriberto Árias Rojas 🇲🇽: Engineering Applications. Company: SIIIA MATH. University: UMSNH. ORCID: 0000-0002-7641-8310.

🎓 Ph.D. Research Students

  • Gabriela Pedraza-Jiménez: Ph.D. Research Student. University: UMSNH.
  • Eli Chagolla-Inzunza: Ph.D. Research Student. University: UMSNH.

🎓 M.Sc. Research Students

  • Jorge L. González-Figueroa: M.Sc. Research Student. University: UMSNH.
  • Christopher N. Magaña-Barocio: M.Sc. Research Student. University: UMSNH.

🎓 Undergraduate Research Students

  • Maria Goretti Fraga-Lopez: Undergraduate Research Student. University: UMSNH.

🏭 Industry Partners Supporting Innovation

Collaboration between academia and industry to accelerate real-world impact

🏭 SIIIA MATH

Soluciones de Ingeniería, México

🎯 Focus areas:

  • Mathematical modeling & simulation
  • AI/ML engineering solutions
  • Technology transfer and applied R&D

Contact


📚 Scientific References

📚 Core Publications (GFD / mGFD Background)

  1. Tinoco-Guerrero, G., Domínguez-Mota, F. J., Guzmán-Torres, J. A., & Tinoco-Ruiz, J. G. (2022). "Numerical Solution of Diffusion Equation using a Method of Lines and Generalized Finite Differences." Revista Internacional de Métodos Numéricos para Cálculo y Diseño en Ingeniería, 38(2). DOI: 10.23967/j.rimni.2022.06.003

🏆 Project Highlights

  • Simulation-to-risk pipeline: 2D advection–diffusion concentration fields are turned into features and classified as Low, Medium, or High risk
  • Scenario-driven datasets: predefined scenarios in config/parameters.yaml vary source position and flow/discharge
  • Two feature sets: Fundamental (8) and Complete (16), each with its own metrics report and saved classifier

📝 Citation & License

If you use this software in your research, please cite:

@software{tinoco2025mGFD_cloudgenerator,
  title={mGFD CloudGenerator 2.0: Web platform for generating 2D unstructured point clouds},
  author={Tinoco-Guerrero, Gerardo and 
          Domínguez-Mota, Francisco Javier and 
          Guzmán-Torres, José Alberto and
          Arias-Rojas, Heriberto},
  year={2025},
  institution={Universidad Michoacana de San Nicolás de Hidalgo},
  organization={SIIIA MATH: Soluciones en ingeniería},
  url={https://github.com/gstinoco/mGFD_EcoRisk_Simulator},
  version={2.0},
  note={Web-based preprocessing tool for meshless mGFD workflows: image-to-contour extraction, multi-region handling, point-cloud generation (regular/Poisson), node classification, and region-constrained neighbor analysis}
}

📄 License

This project is licensed under the MIT License - see the full license text below:

MIT License

Copyright (c) 2025 Gerardo Tinoco-Guerrero, Francisco Javier Domínguez-Mota, 
                   José Alberto Guzmán-Torres, Heriberto Árias Rojas

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Academic Use: This software is developed for research and educational purposes. Commercial use is permitted under the MIT License terms.


🙏 Acknowledgments

❤️ Special Thanks

We extend our gratitude to the institutions and partners supporting this research and open-source development

🏛️ Institutional Support

🎓 Universidad Michoacana de San Nicolás de Hidalgo (UMSNH)
Academic institution, Mexico

Website Type: University Support: Infrastructure

Key support
  • Academic foundation and research infrastructure
  • Scientific training and supervision environment
🏛️ Secretariat of Science, Humanities, Technology and Innovation (SECIHTI)
State Secretariat, Mexico

Website Type: Government Support: Funding and Innovation

Key support
  • Support for science and technology initiatives
  • Funding and innovation promotion
🌿 Centre Internacional de Mètodes Numèrics en Enginyeria (CIMNE)
Industry, Spain

Website Type: Research Center Support: Collaboration

Key support
  • International collaboration in numerical methods
  • Computational engineering research environment
🏭 SIIIA MATH: Soluciones en Ingeniería
Industry, México

Website Type: Industry Partner Support: Technology Transfer

Key support
  • Industry-driven applied research and development
  • Technology transfer and practical engineering impact

🏛️ Research Centers & Collaborations

🌿 Aula CIMNE-Morelia
Research collaboration space

Website Area: Numerical Methods Collaboration: Applied Computing

Collaboration highlights
  • Numerical methods and computational engineering environment
  • Academic–industry collaboration and training activities
🎓 UMSNH
Academic collaboration

Website Type: University Support: Research Infrastructure

Collaboration highlights
  • Institutional infrastructure supporting research and training
  • Graduate formation and supervision for scientific computing

💻 Technology Communities

📦 Framework 👥 Community ⭐ Contribution
OpenCV OpenCV Community Computer vision and image processing
Flask Flask Development Team Web framework
NumPy NumPy Community Array computing foundation
SciPy SciPy Community Numerical algorithms
Matplotlib Matplotlib Community Scientific visualization
Shapely Shapely Development Team Computational geometry

📧 Contact & Support

Contact channels, technical support, and collaboration opportunities

Issues Email

Primary Contact
Research group coordination

Dr. Gerardo Tinoco Guerrero
Morelia, Michoacán, México

Email Company: SIIIA MATH University: UMSNH
Technical Support
Bug reports, questions, and collaboration requests

Open an Issue Send Email Request Collaboration

  • Issues for bugs and feature requests
  • Email for technical inquiries
  • Collaboration for partnerships and joint projects
Collaboration Opportunities
Research and engineering partnerships

🧮 Meshless Methods
mGFD discretizations, boundary handling, point cloud quality
📐 Computational Geometry
polygon processing, hole handling, robust point-in-region tests
🖼️ Computer Vision
segmentation workflows, contour extraction from images
🌐 Scientific Web Tools
reproducible preprocessing platforms for simulation pipelines
🌊 CFD / Engineering
node generation for complex domains and multi-region problems
Student Opportunities
Projects and training in scientific computing

  • Graduate Programs: research opportunities with the team
  • Undergraduate Projects: thesis topics in computational engineering
  • Internships: scientific computing, numerical methods, and applied modeling
Institutional Affiliations

SIIIA MATH UMSNH Research Group

💬 FAQ

Which image formats are supported?
PNG, JPG/JPEG, GIF, and BMP. Maximum request size is 16 MB.
Where are outputs saved when running locally?
Simulation outputs are written to data/simulations/<scenario>/, the ML dataset to data/processed/, metrics and trained models to data/results/, and plots to data/visualizations/.
What is the expected CSV format for CloudGenerator?
Contours: x,y,region (region is optional, but recommended for multi-region). Clouds: x,y,region,classification.
Can I use this in commercial projects?
Yes. The project is released under the MIT License.
How should I cite this work?
Use the BibTeX entry in the Citation section and the referenced DOI in Scientific References.

Advancing meshless methods through open-source collaboration


If this project helps your research, please consider giving it a star.
