A high-performance, AI-driven framework for seismic signal reconstruction, denoising, and geophysical data enhancement. Developed in December 2025 with modern deep learning architectures and production-grade engineering practices.
Promethium: Illuminating hidden signals within seismic noise.
- Overview
- Key Features
- Architectural Overview
- Repository Structure
- Technology Stack
- Installation and Setup
- Quick Start
- Usage Examples
- AI/ML and Data Engineering Highlights
- Performance and Benchmarking
- Configuration
- Development Guide
- Contributing
- License and Non-Commercial Use
- Citation
- Support and Contact
Promethium is an enterprise-grade framework designed to address the critical challenges of seismic data recovery, reconstruction, and enhancement. Initiated in December 2025, it combines modern signal processing techniques with deep learning models to deliver high-quality seismic data reconstruction. The system brings together recent advances in deep learning (including transformer architectures, physics-informed neural networks, and neural operators) with robust, production-ready data engineering practices.
Promethium serves professionals and researchers across multiple geophysical and seismological domains:
- Exploration Geophysics: Enhancing subsurface imaging for oil, gas, and mineral exploration through improved seismic reflection and refraction data quality.
- Reservoir Characterization: Enabling high-fidelity seismic attribute analysis for reservoir property estimation and fluid identification.
- Earthquake Seismology: Supporting earthquake monitoring networks with robust signal reconstruction for accurate source characterization.
- Microseismic Monitoring: Processing low signal-to-noise ratio microseismic events for hydraulic fracturing monitoring and induced seismicity analysis.
- Engineering Seismology: Providing enhanced ground motion records for seismic hazard assessment and structural engineering applications.
The framework addresses fundamental data quality challenges inherent in seismic acquisition, illustrated with a short sketch after this list:
- Missing Trace Reconstruction: Interpolating gaps caused by acquisition geometry constraints, equipment failures, or access limitations.
- Noise Attenuation: Suppressing coherent and incoherent noise while preserving signal integrity and phase characteristics.
- Signal Enhancement: Improving signal-to-noise ratios through adaptive filtering and AI-driven denoising algorithms.
- Data Regularization: Converting irregularly sampled data to regular grids suitable for downstream processing workflows.
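To make the missing-trace problem concrete, here is a minimal NumPy sketch that fabricates a toy gather and knocks out a few traces; the `mask_traces` helper and all shapes are hypothetical illustrations, not part of the Promethium API:

```python
import numpy as np

def mask_traces(gather: np.ndarray, dead: list[int]) -> np.ndarray:
    """Zero out selected traces to mimic acquisition gaps (hypothetical helper)."""
    masked = gather.copy()
    masked[dead, :] = 0.0
    return masked

rng = np.random.default_rng(seed=0)
n_traces, n_samples = 64, 512
t = np.arange(n_samples) / 500.0                   # 2 ms sampling interval
# Toy gather: a 30 Hz "reflector" shared across traces plus random noise
gather = np.sin(2 * np.pi * 30.0 * t)[None, :] + 0.1 * rng.standard_normal((n_traces, n_samples))

observed = mask_traces(gather, dead=[10, 15, 23])  # the gaps a model would fill
print("live traces:", int(np.count_nonzero(observed.any(axis=1))), "of", n_traces)
```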
- Multi-Format Support: Native reading and writing of industry-standard formats including SEG-Y (Rev 0, 1, 2), SEG-2, miniSEED, SAC, and GCF.
- Metadata Preservation: Complete header parsing and preservation throughout processing workflows.
- Streaming Ingestion: Memory-efficient streaming for large-scale datasets exceeding available RAM.
- Quality Control: Automated detection of trace anomalies, timing errors, and format inconsistencies.
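Because Promethium's format support builds on ObsPy (see the Technology Stack below), the underlying miniSEED/SAC handling can be previewed with plain ObsPy calls; only the file path below is a placeholder:

```python
from obspy import read

# ObsPy auto-detects most seismic formats (miniSEED, SAC, SEG-Y, GCF, ...)
stream = read("path/to/station.mseed")  # placeholder path
for trace in stream:
    print(trace.id, trace.stats.sampling_rate, trace.stats.npts)
```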
- Adaptive Filtering: Time-varying and spatially-varying filter design incorporating local signal characteristics.
- Spectral Analysis: Multi-taper spectral estimation, spectrogram computation, and coherence analysis.
- Time-Frequency Transforms: Continuous and discrete wavelet transforms, S-transform, and matching pursuit decomposition.
- Deconvolution: Predictive deconvolution, spiking deconvolution, and minimum-phase wavelet estimation.
- Velocity Analysis: Semblance analysis, velocity spectrum computation, and NMO correction.
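As a flavor of the filtering layer, the sketch below builds a zero-phase Butterworth band-pass with plain SciPy; Promethium's own `bandpass_filter` (used in the Quick Start) exposes similar parameters, but the implementation here is purely illustrative:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(trace: np.ndarray, fs: float, low: float, high: float, order: int = 4) -> np.ndarray:
    """Zero-phase Butterworth band-pass (illustrative, not the library internals)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, trace)  # forward-backward filtering preserves phase

fs = 500.0                                   # sampling rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)
trace = np.sin(2 * np.pi * 30 * t) + np.sin(2 * np.pi * 120 * t)  # signal + out-of-band energy
clean = bandpass(trace, fs=fs, low=5.0, high=80.0)
```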
- U-Net Architectures: Encoder-decoder networks with skip connections optimized for seismic trace reconstruction, including Attention U-Net and Residual U-Net variants.
- Variational Autoencoders: Probabilistic generative models for uncertainty-aware reconstruction with latent-space regularization.
- Generative Adversarial Networks: Adversarial training for high-fidelity missing-data synthesis using GAN architectures adapted for scientific data.
- Physics-Informed Neural Networks (PINNs): Incorporating wave equation constraints into network training for physically consistent reconstructions (a loss sketch follows the physics list below).
- Transformer Models: Attention-based architectures, including Vision Transformers and Swin Transformers, for capturing long-range spatial dependencies in seismic gathers.
- Neural Operators: Fourier Neural Operators (FNO) and DeepONet frameworks for learning solution operators of seismic wave propagation.
- Wave Equation Constraints: Embedding acoustic and elastic wave equation residuals into loss functions.
- Velocity Model Integration: Conditioning reconstruction on prior velocity model information.
- Travel-Time Consistency: Enforcing moveout relationships in reconstructed gathers.
- Amplitude Variation with Offset: Preserving AVO/AVA characteristics through physics-aware training.
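The wave-equation constraint amounts to penalizing the PDE residual alongside the data misfit. A minimal PyTorch sketch, assuming a network `u(x, t)` and a constant acoustic velocity `c` (the weighting `lam` and all signatures are hypothetical, not Promethium's actual training code):

```python
import torch

def acoustic_residual(net, x: torch.Tensor, t: torch.Tensor, c: float) -> torch.Tensor:
    """Residual of the 1-D acoustic wave equation u_tt - c^2 * u_xx (illustrative)."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    u = net(torch.stack([x, t], dim=-1))
    u_x, u_t = torch.autograd.grad(u.sum(), (x, t), create_graph=True)
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    u_tt = torch.autograd.grad(u_t.sum(), t, create_graph=True)[0]
    return u_tt - (c ** 2) * u_xx

def pinn_loss(net, x_d, t_d, u_obs, x_c, t_c, c=2000.0, lam=0.1):
    """Data misfit plus weighted PDE residual at collocation points (x_c, t_c)."""
    u_pred = net(torch.stack([x_d, t_d], dim=-1)).squeeze(-1)
    data_term = torch.mean((u_pred - u_obs) ** 2)
    pde_term = torch.mean(acoustic_residual(net, x_c, t_c, c) ** 2)
    return data_term + lam * pde_term
```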
- Distributed Processing: Horizontal scaling across compute clusters using task queues and worker pools.
- Batch Orchestration: Pipeline-based processing of large survey datasets with checkpoint and resume capabilities.
- Data Versioning: Immutable data storage with complete lineage tracking and reproducibility.
- Storage Backends: Support for local filesystems, object storage (S3-compatible), and distributed filesystems.
- Interactive Visualization: Real-time rendering of seismic traces, gathers, and sections with customizable color palettes.
- Job Management: Comprehensive interface for submitting, monitoring, and managing processing jobs.
- Configuration UI: Form-based configuration of processing parameters with validation and presets.
- Result Comparison: Side-by-side visualization of input and reconstructed data with difference displays.
- Export Functionality: Download of processed data, reports, and visualizations in multiple formats.
- Containerized Deployment: Production-ready Docker images with multi-stage builds for minimal footprint.
- Orchestration Ready: Kubernetes manifests and Helm charts for cloud-native deployment.
- API-First Design: RESTful API enabling integration with existing geophysical workflows and third-party applications.
- Extensibility: Plugin architecture for custom format readers, processing algorithms, and ML models.
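As an illustration of the API-first design, the job-submission endpoint shown later in the Usage Examples can be driven from any HTTP client; a short Python sketch using `requests` (the token and IDs are placeholders):

```python
import requests

API = "http://localhost:8000/api/v1"
headers = {"Authorization": "Bearer <token>"}  # placeholder token

resp = requests.post(
    f"{API}/jobs",
    headers=headers,
    json={
        "type": "reconstruction",
        "input_dataset_id": "uuid-of-dataset",  # placeholder ID
        "model_id": "unet-v2-reconstruction",
        "parameters": {"missing_trace_strategy": "auto_detect", "output_format": "segy"},
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```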
Promethium implements a modular, layered architecture designed for maintainability, scalability, and extensibility. The system combines established industry practice with current research to deliver production-grade seismic data processing capabilities.
```mermaid
flowchart TB
subgraph Frontend["Angular Frontend"]
FE[Visualization / Job Management / Configuration / Monitoring]
end
subgraph API["FastAPI Backend"]
BE[Authentication / Request Validation / Job Submission / Results]
end
subgraph DataLayer["Data Layer"]
PG[(PostgreSQL<br/>Metadata, Jobs, Users)]
RD[(Redis<br/>Task Queue, Caching)]
OS[(Object Storage<br/>Raw Data, Models, Results)]
end
subgraph Workers["Celery Worker Pool"]
WK[Distributed Task Execution / GPU Workloads]
end
subgraph Core["Promethium Core Library"]
IO[I/O Module<br/>Format R/W]
SIG[Signal Module<br/>Processing]
ML[ML Module<br/>Models, Training]
WF[Workflows<br/>Pipelines]
VAL[Validation<br/>QC Checks]
UTL[Utilities<br/>Logging, Config]
end
Frontend -->|REST API / HTTPS| API
API --> PG
API --> RD
API --> OS
RD --> Workers
Workers --> Core
Core --> OS
IO --- SIG
SIG --- ML
WF --- VAL
VAL --- UTL
```
For comprehensive architectural documentation including component interactions, data flows, and deployment topologies, refer to docs/architecture.md.
```
promethium/
├── src/
│ └── promethium/
│ ├── __init__.py
│ ├── core/ # Core utilities, configuration, exceptions
│ │ ├── __init__.py
│ │ ├── config.py
│ │ ├── exceptions.py
│ │ └── logging.py
│ ├── io/ # Data ingestion and export
│ │ ├── __init__.py
│ │ ├── segy.py
│ │ ├── miniseed.py
│ │ ├── sac.py
│ │ └── formats.py
│ ├── signal/ # Signal processing algorithms
│ │ ├── __init__.py
│ │ ├── filtering.py
│ │ ├── transforms.py
│ │ ├── deconvolution.py
│ │ └── spectral.py
│ ├── ml/ # Machine learning models and training
│ │ ├── __init__.py
│ │ ├── models/
│ │ │ ├── unet.py
│ │ │ ├── autoencoder.py
│ │ │ ├── gan.py
│ │ │ └── pinn.py
│ │ ├── training.py
│ │ ├── inference.py
│ │ └── metrics.py
│ ├── api/ # FastAPI backend application
│ │ ├── __init__.py
│ │ ├── main.py
│ │ ├── routers/
│ │ ├── models/
│ │ ├── services/
│ │ └── dependencies.py
│ └── workflows/ # Pipeline orchestration
│ ├── __init__.py
│ ├── pipelines.py
│ └── tasks.py
├── frontend/ # Angular web application
│ ├── src/
│ │ ├── app/
│ │ ├── assets/
│ │ └── environments/
│ ├── angular.json
│ ├── package.json
│ └── tsconfig.json
├── config/ # Configuration files
│ ├── default.yaml
│ ├── production.yaml
│ └── development.yaml
├── docker/ # Docker configurations
│ ├── Dockerfile.backend
│ ├── Dockerfile.frontend
│ ├── Dockerfile.worker
│ └── docker-compose.yml
├── tests/ # Test suites
│ ├── unit/
│ ├── integration/
│ └── e2e/
├── docs/ # Documentation
│ ├── overview.md
│ ├── architecture.md
│ ├── user-guide.md
│ └── ...
├── notebooks/ # Jupyter notebooks for exploration
│ ├── demo_reconstruction.ipynb
│ └── model_training.ipynb
├── assets/ # Static assets
│ └── branding/
│ └── promethium-logo.png
├── scripts/ # Utility scripts
│ ├── setup_db.py
│ └── generate_docs.py
├── README.md
├── CONTRIBUTING.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
├── CHANGELOG.md
├── CITATION.md
├── SUPPORT.md
├── GOVERNANCE.md
├── LICENSE
├── pyproject.toml
└── .gitignore
```
| Directory | Purpose |
|---|---|
| `src/promethium/core/` | Core utilities including configuration management, custom exception hierarchy, and structured logging. |
| `src/promethium/io/` | Format-specific readers and writers for seismic data formats with metadata handling. |
| `src/promethium/signal/` | Signal processing implementations including filtering, spectral analysis, and transforms. |
| `src/promethium/ml/` | Machine learning model definitions, training loops, inference pipelines, and evaluation metrics. |
| `src/promethium/api/` | FastAPI application with routers, request/response models, and business logic services. |
| `src/promethium/workflows/` | High-level pipeline definitions and Celery task implementations. |
| `frontend/` | Angular single-page application with components, services, and state management. |
| `config/` | YAML configuration files for different deployment environments. |
| `docker/` | Dockerfiles and orchestration configurations for containerized deployment. |
| `tests/` | Comprehensive test suites organized by testing scope. |
| `docs/` | Technical documentation in Markdown format. |
| `notebooks/` | Interactive Jupyter notebooks for experimentation and demonstration. |
| `assets/` | Static assets including branding materials and sample data. |
| `scripts/` | Administrative and utility scripts for development and deployment. |
| Component | Technology | Purpose |
|---|---|---|
| Runtime | Python 3.10+ | Core application runtime |
| Web Framework | FastAPI | Asynchronous REST API with automatic OpenAPI documentation |
| Task Queue | Celery | Distributed task execution for compute-intensive operations |
| Message Broker | Redis | Task queue backend and result caching |
| Database | PostgreSQL | Persistent storage for metadata, jobs, and user management |
| ORM | SQLAlchemy | Database abstraction and query building |
| Migrations | Alembic | Database schema version control and migrations |
| Authentication | python-jose, passlib | JWT-based authentication and password hashing |
| Validation | Pydantic | Request/response validation and serialization |
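To show how these pieces fit together, here is a hedged sketch of a FastAPI route with Pydantic validation; the schema and route are illustrative guesses at the shape of the `api` module, not code lifted from it:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Promethium API (illustrative sketch)")

class JobRequest(BaseModel):
    """Validated request body; field names mirror the curl example in Usage Examples."""
    type: str
    input_dataset_id: str
    model_id: str
    parameters: dict = {}

@app.post("/api/v1/jobs")
async def submit_job(job: JobRequest) -> dict:
    # The real service would enqueue a Celery task and persist job metadata here.
    return {"status": "queued", "job_type": job.type}
```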
| Component | Technology | Purpose |
|---|---|---|
| Deep Learning | PyTorch | Neural network definition and training |
| Scientific Computing | NumPy, SciPy | Numerical operations and signal processing |
| Seismic Processing | ObsPy | Seismic data handling and format support |
| Data Structures | xarray | Multi-dimensional labeled array operations |
| Data Loading | PyTorch DataLoader | Efficient batched data loading with prefetching |
| Model Serving | TorchServe (optional) | Production model serving infrastructure |
| Experiment Tracking | MLflow | Model versioning, metrics tracking, and artifact storage |
| Component | Technology | Purpose |
|---|---|---|
| Framework | Angular 17+ | Single-page application framework |
| Language | TypeScript | Type-safe JavaScript development |
| State Management | NgRx | Reactive state management with Redux pattern |
| Reactive Extensions | RxJS | Reactive programming for asynchronous operations |
| UI Components | Angular Material | Material Design component library |
| HTTP Client | Angular HttpClient | API communication with interceptors |
| Visualization | D3.js, Plotly | Interactive seismic data visualization |
| Build System | Angular CLI | Development server, building, and testing |
| Component | Technology | Purpose |
|---|---|---|
| Containerization | Docker | Application containerization and isolation |
| Orchestration | Docker Compose | Multi-container local orchestration |
| CI/CD | GitHub Actions | Automated testing, building, and deployment |
| Code Quality | Black, Ruff, ESLint, Prettier | Code formatting and linting |
| Testing | pytest, Karma, Jasmine | Unit and integration testing frameworks |
| Documentation | MkDocs (optional) | Documentation site generation |
Promethium is a multi-language framework: the Python package is available now, the Scala artifact is published to Maven Central, and R and Julia packages are planned.
Install Promethium directly from PyPI:
```bash
pip install promethium-seismic==1.0.4
```

PyPI Package: https://pypi.org/project/promethium-seismic/

Optional extras:

```bash
# Visualization support
pip install promethium-seismic[viz]==1.0.4

# Server components (FastAPI, Celery, Redis)
pip install promethium-seismic[server]==1.0.4

# All optional dependencies
pip install promethium-seismic[all]==1.0.4

# Development dependencies
pip install promethium-seismic[dev]==1.0.4
```

To install from source:

```bash
git clone https://github.com/olaflaitinen/promethium.git
cd promethium
pip install -e ".[dev]"
```

The R implementation will be available as promethiumR.
Target CRAN Package: https://CRAN.R-project.org/package=promethiumR
```r
# Coming soon
install.packages("promethiumR")
library(promethiumR)
```

The Julia implementation will be available as Promethium.jl.
Target Julia Package: https://juliahub.com/ui/Packages/Promethium
```julia
# Coming soon
using Pkg
Pkg.add("Promethium")
using Promethium
```

The Scala implementation is available under the Maven coordinates io.github.olaflaitinen:promethium-scala.
Maven Central: https://central.sonatype.com/artifact/io.github.olaflaitinen/promethium-scala_2.13
```scala
// Add to build.sbt
libraryDependencies += "io.github.olaflaitinen" %% "promethium-scala" % "1.0.4"
```

For detailed package distribution and publication information, see docs/distribution.md.
For detailed information on the mathematical models, algorithms, and methodologies used in Promethium, please refer to the Math & Methodology Guide.
Promethium supports offline and source-based usage for Kaggle competitions and notebook environments.
Quick References:
- Kaggle Integration Guide (Full Documentation)
- Source Import Example
Attach the Promethium Source Dataset to your notebook:
```python
import sys
sys.path.append("/kaggle/input/promethium-source")
import promethium
from promethium import read_segy, SeismicRecoveryPipeline
# Load seismic data
data = read_segy("/kaggle/input/seismic-dataset/survey.sgy")
# Create and run reconstruction pipeline
pipeline = SeismicRecoveryPipeline.from_preset("unet_denoise_v1")
result = pipeline.run(data)
# Evaluate reconstruction quality
metrics = promethium.evaluate_reconstruction(data.values, result)
print(metrics)
```

Key considerations for notebook environments:
- GPU acceleration is automatically enabled when available
- The core library works on CPU-only environments
- Use `/kaggle/input/...` paths for Kaggle datasets
- Use `/content/...` paths for Colab uploaded files
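The GPU auto-detection mentioned above follows the standard PyTorch pattern; this check runs unchanged in any Kaggle or Colab cell:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on {device}")
```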
For full development or server deployment, ensure the following software is installed:
- Python: Version 3.10 or higher
- Node.js: Version 20 or higher (for frontend development)
- Docker: Version 24 or higher (for containerized deployment)
- Docker Compose: Version 2.20 or higher
```bash
git clone https://github.com/olaflaitinen/promethium.git
cd promethium
```

The recommended approach for running the full Promethium server stack is Docker Compose:
```bash
# Copy environment template and configure
cp .env.example .env
# Edit .env with your configuration
# Build and start all services
docker compose -f docker/docker-compose.yml up --build -d
# Verify services are running
docker compose -f docker/docker-compose.yml ps
```

The following services will be available:
| Service | URL | Description |
|---|---|---|
| Frontend | http://localhost:4200 | Angular web application |
| Backend API | http://localhost:8000 | FastAPI REST API |
| API Documentation | http://localhost:8000/docs | Interactive OpenAPI documentation |
For development without Docker:
```bash
# Create and activate virtual environment
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/macOS
source .venv/bin/activate
# Install in editable mode with dev dependencies
pip install -e ".[dev,server]"
# Run tests
pytest tests/ -v
# Start the backend API server
uvicorn promethium.api.main:app --reload --host 0.0.0.0 --port 8000
```

This section provides a minimal end-to-end workflow to verify your Promethium installation.
Using Docker Compose:
```bash
docker compose -f docker/docker-compose.yml up -d
```

Open your browser and navigate to http://localhost:4200.
- Navigate to the Data section in the sidebar.
- Click Upload and select a SEG-Y file from the `assets/sample_data/` directory.
- Wait for the upload and initial validation to complete.
- Navigate to the Jobs section.
- Click New Job and select Reconstruction as the job type.
- Select your uploaded dataset as the input.
- Choose a reconstruction model (e.g., `unet-v2-noise-reduction`).
- Configure parameters or use defaults.
- Click Submit.
- The job will appear in the Jobs list with status updates.
- Click on the job to view detailed progress and logs.
- Once the job completes, navigate to Results.
- Select the completed job to view reconstructed data.
- Use the comparison view to see input versus output.
- Export results in your preferred format.
```python
from promethium.io import read_segy
from promethium.signal import bandpass_filter
from promethium.ml import load_model, reconstruct
# Load seismic data
data = read_segy("path/to/survey.sgy")
# Apply preprocessing
filtered = bandpass_filter(data, low_freq=5.0, high_freq=80.0)
# Load reconstruction model
model = load_model("unet-v2-reconstruction")
# Perform reconstruction
reconstructed = reconstruct(model, filtered, missing_traces=[10, 15, 23])
# Save results
reconstructed.to_segy("path/to/reconstructed.sgy")
```

Submit a reconstruction job via the REST API:

```bash
curl -X POST "http://localhost:8000/api/v1/jobs" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"type": "reconstruction",
"input_dataset_id": "uuid-of-dataset",
"model_id": "unet-v2-reconstruction",
"parameters": {
"missing_trace_strategy": "auto_detect",
"output_format": "segy"
}
  }'
```

Check job status:

```bash
curl -X GET "http://localhost:8000/api/v1/jobs/<job-id>" \
  -H "Authorization: Bearer <token>"
```

Download the results:

```bash
curl -X GET "http://localhost:8000/api/v1/jobs/<job-id>/results" \
  -H "Authorization: Bearer <token>" \
  -o reconstructed_data.sgy
```

- Data Upload: Use the drag-and-drop interface to upload SEG-Y or miniSEED files.
- Quality Control: Review automated QC reports highlighting trace anomalies.
- Job Configuration: Use the guided wizard to configure reconstruction parameters.
- Visualization: Interactive trace viewer with zoom, pan, and color scale controls.
- Comparison: Synchronized side-by-side view of original and reconstructed data.
- Export: Download processed data with full header preservation.
Promethium includes a comprehensive suite of 15 Jupyter notebooks for learning and experimentation. All notebooks are located in the notebooks/ directory.
```bash
pip install promethium-seismic==1.0.4
jupyter notebook notebooks/
```

- Start with 01_quickstart_basic_usage for a minimal working example
- Continue to 02_data_ingestion to understand data loading
- Explore 03_signal_processing for preprocessing techniques
- Move to 05_deep_learning_unet for ML-based reconstruction
- Use 08_kaggle_and_colab for cloud deployment
See docs/notebooks-overview.md for detailed documentation.
| Model Family | Variants | Use Case |
|---|---|---|
| U-Net | Standard, Attention U-Net, Residual U-Net | General reconstruction, denoising |
| Autoencoder | VAE, Denoising AE, Sparse AE | Feature extraction, compression |
| GAN | Pix2Pix, SRGAN-adapted | High-fidelity reconstruction |
| PINN | Wave-constrained, Velocity-informed | Physics-consistent reconstruction |
| Transformer | Vision Transformer, Swin Transformer | Long-range dependency modeling |
- Data Preparation: Convert seismic data to training-ready format with configurable patch extraction.
- Augmentation: Apply domain-specific augmentations including noise injection, trace masking, and amplitude scaling.
- Training: Distributed training with mixed precision, gradient accumulation, and early stopping.
- Validation: Continuous validation with seismic-specific metrics.
- Checkpointing: Model versioning with MLflow integration.
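A condensed sketch of the training pattern described above, combining mixed precision with gradient accumulation in plain PyTorch (model, loader, and hyperparameters are placeholders; the actual `training.py` may differ):

```python
import torch

def train_epoch(model, loader, optimizer, device="cuda", accum_steps=4):
    """One epoch of mixed-precision training with gradient accumulation (illustrative)."""
    scaler = torch.cuda.amp.GradScaler()
    loss_fn = torch.nn.MSELoss()
    model.train()
    optimizer.zero_grad()
    for step, (masked, target) in enumerate(loader):
        masked, target = masked.to(device), target.to(device)
        with torch.cuda.amp.autocast():
            loss = loss_fn(model(masked), target) / accum_steps  # scale for accumulation
        scaler.scale(loss).backward()
        if (step + 1) % accum_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
```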
- Signal-to-Noise Ratio (SNR): Improvement in SNR after reconstruction.
- Structural Similarity Index (SSIM): Perceptual quality measure.
- Mean Squared Error (MSE): Pixel-wise reconstruction error.
- Coherence Preservation: Cross-correlation of reconstructed versus reference.
- Spectral Fidelity: Frequency content preservation analysis.
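Two of these metrics have compact definitions; an illustrative NumPy version of SNR and MSE (the exact formulas Promethium reports are assumptions here):

```python
import numpy as np

def snr_db(reference: np.ndarray, estimate: np.ndarray) -> float:
    """SNR in dB, treating (estimate - reference) as the noise term."""
    noise = estimate - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a - b) ** 2))

# SNR improvement = snr_db(reference, reconstructed) - snr_db(reference, degraded)
```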
For detailed ML pipeline documentation, see docs/ml-pipelines.md.
For data engineering patterns and best practices, see docs/data-engineering.md.
| Operation | Target Throughput | Notes |
|---|---|---|
| SEG-Y Ingestion | 500 MB/s | SSD storage, streaming mode |
| Trace Filtering | 10,000 traces/s | Single CPU core |
| U-Net Inference | 100 gathers/s | NVIDIA A100 GPU |
| Reconstruction Job | < 5 min for 10 GB | Full pipeline, GPU-enabled |
Promethium includes a comprehensive benchmarking suite for performance evaluation:
```bash
# Run full benchmark suite
python -m promethium.benchmarks.run_all

# Run specific benchmarks
python -m promethium.benchmarks.io_throughput
python -m promethium.benchmarks.ml_inference
```

For detailed benchmarking methodology and result interpretation, see docs/benchmarking.md.
Promethium uses a hierarchical configuration system with the following precedence (highest to lowest):
- Environment variables
- Command-line arguments
- Environment-specific configuration files (`config/{environment}.yaml`)
- Default configuration (`config/default.yaml`)
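A hedged sketch of how such a precedence chain can be resolved (command-line handling omitted; the real loader lives in `src/promethium/core/config.py` and may differ):

```python
import os
import yaml

def load_config(environment: str = "development") -> dict:
    """Merge default.yaml <- {environment}.yaml <- PROMETHIUM_* env vars (illustrative)."""
    with open("config/default.yaml") as f:
        config = yaml.safe_load(f) or {}
    env_file = f"config/{environment}.yaml"
    if os.path.exists(env_file):
        with open(env_file) as f:
            config.update(yaml.safe_load(f) or {})  # shallow merge for brevity
    # Environment variables take the highest precedence, e.g. PROMETHIUM_DATABASE_URL
    for key, value in os.environ.items():
        if key.startswith("PROMETHIUM_"):
            config[key.removeprefix("PROMETHIUM_").lower()] = value
    return config
```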
| Category | Description | Configuration File Section |
|---|---|---|
| Database | PostgreSQL connection parameters | database.* |
| Redis | Redis connection and pool settings | redis.* |
| Storage | Data storage paths and backends | storage.* |
| ML | Model paths, inference settings | ml.* |
| API | Server settings, CORS, rate limiting | api.* |
| Workers | Celery worker configuration | workers.* |
Essential environment variables:
```bash
# Database
PROMETHIUM_DATABASE_URL=postgresql://user:password@localhost:5432/promethium
# Redis
PROMETHIUM_REDIS_URL=redis://localhost:6379/0
# Security
PROMETHIUM_SECRET_KEY=your-secret-key-here
PROMETHIUM_JWT_ALGORITHM=HS256
# Storage
PROMETHIUM_DATA_DIR=/data/promethium
PROMETHIUM_MODEL_DIR=/models
```

For comprehensive configuration documentation, see docs/configuration.md.
Promethium enforces consistent code style through automated tooling:
Python:
- Formatter: Black
- Linter: Ruff
- Type Checking: mypy
TypeScript:
- Formatter: Prettier
- Linter: ESLint
- Strict mode enabled
```bash
# Backend tests
pytest tests/ -v --cov=src/promethium

# Frontend tests
cd frontend && npm test

# End-to-end tests
pytest tests/e2e/ -v
```

Install pre-commit hooks to ensure code quality:
```bash
pip install pre-commit
pre-commit install
```

For detailed development workflows, environment setup, and contribution guidelines, see docs/developer-guide.md.
Contributions to Promethium are welcome and appreciated. The project accepts contributions in the following areas:
- Bug reports and feature requests via GitHub Issues
- Code contributions via Pull Requests
- Documentation improvements
- Test coverage enhancements
- Performance optimizations
Before contributing, please review:
- CONTRIBUTING.md for contribution guidelines
- CODE_OF_CONDUCT.md for community standards
- GOVERNANCE.md for project governance
Promethium is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
This license permits:
- Sharing and adapting the material for non-commercial purposes
- Attribution must be given to the original creators
This license prohibits:
- Commercial use without explicit permission
- Sublicensing
For the complete license text, see LICENSE.
For commercial licensing inquiries, please contact the maintainers.
If you use Promethium in academic research, please cite it appropriately. See CITATION.md for recommended citation formats and BibTeX entries.
For support options, community resources, and contact information, see SUPPORT.md.
- Issue Tracker: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: docs/
Promethium - Advancing seismic data science through intelligent reconstruction.
