Open-source SRE lab environment for learning and demonstrating observability, monitoring, and incident response workflows. Includes a full Prometheus + Grafana + Loki + Promtail stack orchestrated with Docker Compose, ready for extension to Kubernetes and alert automation.

pmoise1981/sre-lab


🧠 SRE Lab – Observability & Monitoring Stack

This project sets up a hands-on Site Reliability Engineering (SRE) lab that demonstrates how to monitor applications and infrastructure using a modern open-source observability stack.

It includes preconfigured services for metrics, logs, and dashboards, designed to mirror production-grade reliability workflows.


🧩 Features

  • Prometheus – collects and stores time-series metrics from monitored services
  • Grafana – visualizes metrics and builds alert dashboards
  • Loki & Promtail – centralized log aggregation and querying
  • Docker Compose – orchestrates the multi-container setup locally
  • Environment Variables (.env) – configurable ports, data paths, and credentials
  • Modular design – easily extendable to Kubernetes, Alertmanager, or Slack alerting

⚙️ Architecture Overview

                ┌────────────┐
                │  Promtail  │───► Logs ───► Loki
                └────────────┘
                       │
                       ▼
┌────────────┐    ┌────────────┐    ┌────────────┐
│ App (Flask)│───►│ Prometheus │───►│  Grafana   │
└────────────┘    └────────────┘    └────────────┘
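
The App → Prometheus arrow works by scraping: Prometheus periodically fetches an HTTP endpoint (conventionally `/metrics`) and parses the plain-text exposition format it returns. The sketch below renders that format using only the standard library. The metric name `app_requests_total`, the `path` label, and the helper names are hypothetical and not taken from this repository; a real Flask app would more likely use a Prometheus client library.

```python
def escape_label(value):
    """Escape a label value as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(counts):
    """Render per-path request counters as Prometheus exposition text.

    `counts` maps a request path to the number of requests seen.
    The metric name below is an illustrative assumption.
    """
    lines = [
        "# HELP app_requests_total Total HTTP requests handled, by path.",
        "# TYPE app_requests_total counter",
    ]
    for path, value in sorted(counts.items()):
        label = escape_label(path)
        lines.append(f'app_requests_total{{path="{label}"}} {value}')
    # The format is line-oriented, so the last line also ends with a newline.
    return "\n".join(lines) + "\n"


# What a /metrics response body could look like for two known routes.
print(render_metrics({"/": 3, "/health": 1}), end="")
```

Serving this string from a `/metrics` route with a `text/plain` content type is enough for Prometheus to scrape it, provided the app is listed as a scrape target in the Prometheus configuration.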

🚀 Getting Started

1️⃣ Clone the Repository

git clone https://github.com/pmoise1981/sre-lab.git
cd sre-lab

2️⃣ Create Environment File

Copy the example environment file:

cp .env.example .env
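
Before starting the stack, it can help to confirm that the copied `.env` has every value filled in. The sketch below parses simple `KEY=VALUE` lines and reports required keys that are missing or empty. The variable names (`GRAFANA_PORT`, `PROMETHEUS_PORT`, `GRAFANA_ADMIN_PASSWORD`) are hypothetical placeholders; check `.env.example` for the names this repository actually uses.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]


# Hypothetical file contents; the real variable names may differ.
EXAMPLE = """\
# Ports
GRAFANA_PORT=3000
PROMETHEUS_PORT=9090
GRAFANA_ADMIN_PASSWORD=
"""

REQUIRED = ["GRAFANA_PORT", "PROMETHEUS_PORT", "GRAFANA_ADMIN_PASSWORD"]
print(missing_keys(parse_env(EXAMPLE), REQUIRED))
```

To check a real file, pass its contents instead of `EXAMPLE`, for example `parse_env(open(".env").read())`. This parser handles only the plain `KEY=VALUE` form and is not a full implementation of the Docker Compose env-file syntax.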

3️⃣ Start the Stack

docker compose up -d

Grafana will be available at: http://localhost:3000
Prometheus will be available at: http://localhost:9090
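
Containers can take a few seconds to become ready after `docker compose up -d`. A minimal readiness check, assuming the ports shown above, might poll Grafana's `/api/health` endpoint and Prometheus's `/-/ready` endpoint until both respond. The function names, retry counts, and delays are illustrative choices, not part of this repository.

```python
import time
import urllib.request

# URLs assume the ports shown above; adjust them if .env overrides the ports.
ENDPOINTS = {
    "grafana": "http://localhost:3000/api/health",
    "prometheus": "http://localhost:9090/-/ready",
}


def http_ok(url, timeout=2.0):
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses
        # (urllib.error.URLError is a subclass of OSError).
        return False


def wait_for(endpoints, probe=http_ok, attempts=30, delay=1.0):
    """Poll each endpoint until it is up; return the names still down."""
    pending = dict(endpoints)
    for _ in range(attempts):
        pending = {name: url for name, url in pending.items() if not probe(url)}
        if not pending:
            break
        time.sleep(delay)
    return sorted(pending)


# Example call (not executed here, since it needs the stack to be running):
#   still_down = wait_for(ENDPOINTS)
#   print("all services ready" if not still_down else f"not ready: {still_down}")
```

The `probe` parameter is injectable so the polling logic can be exercised without a running stack.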


📊 Prebuilt Dashboards

  • System Metrics Dashboard: CPU, memory, disk usage
  • Container Health Dashboard: Uptime, restart count, latency
  • Application Metrics (optional): Integrates with Flask or FastAPI exporters

🧱 Tech Stack

Prometheus · Grafana · Loki · Promtail · Docker Compose · Linux · .env


🧩 Future Enhancements

  • Add Alertmanager + Slack/Email alerts
  • Add Service-Level Indicators (SLIs) and Service-Level Objectives (SLOs)
  • Integrate OpenTelemetry exporters for tracing
  • Add Kubernetes manifests for production-grade orchestration
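
As a rough illustration of the planned SLI/SLO item, the sketch below computes an availability SLI (good requests divided by total requests) and the fraction of error budget left under a target SLO. The function names and the 99% target are assumptions for illustration only; the repository does not define any SLOs yet.

```python
def availability_sli(good, total):
    """Fraction of requests that succeeded; treat no traffic as fully available."""
    if total == 0:
        return 1.0
    return good / total


def error_budget_remaining(good, total, slo_target):
    """Fraction of the error budget still unspent (negative if overspent).

    The budget is the number of failures the SLO tolerates:
    (1 - slo_target) * total.
    """
    allowed_failures = (1.0 - slo_target) * total
    actual_failures = total - good
    if allowed_failures == 0:
        return 1.0 if actual_failures == 0 else 0.0
    return 1.0 - actual_failures / allowed_failures


# Hypothetical window: 1000 requests, 995 succeeded, 99% availability target.
sli = availability_sli(995, 1000)
budget = error_budget_remaining(995, 1000, 0.99)
print(f"SLI={sli:.3f}, error budget remaining={budget:.0%}")
```

In a Prometheus-based setup, the `good` and `total` counts would typically come from counter metrics evaluated over the SLO window rather than from hard-coded numbers.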

👨🏾‍💻 Author

Pierre Moise
Site Reliability & DevOps Engineer | Observability, CI/CD, Cloud Automation
📎 GitHub
