A curated list of Site Reliability and Production Engineering Tools
Updated Mar 18, 2026
Lightweight, self-contained Linux® server monitoring tool
A Simple Monitoring Dashboard for Docker Swarm Cluster
Dashboard for Docker Swarm Cluster
📊 Analyze and monitor Microsoft Intune Management Extension logs on Windows for real-time insights and error detection.
Advanced stealth web data collection framework for security
Utility to test and wipe hard disks and SSDs
Data Center workload and software optimizations for Intel hardware.
Awesome Uptime Monitoring
Identify unused resources on Google Cloud Platform through Prometheus metrics
A collection of scripts that extend EventSentry's functionality.
Network-Based Intrusion Detection System - dev/deployment
My Artificial Intelligence Log Sentinel for Postfix and beyond...
Command line client for interacting with checkson.io
Real-time log file monitoring with pattern highlighting and desktop notifications. Cross-platform Rust CLI tool with regex matching and file rotation support.
Wazuh integration to send alerts to Keep (open-source alert management and AIOps platform)
🤖 Simplify IT operations with Wuhr AI Ops, an AI-driven platform for real-time monitoring, log analysis, and seamless CI/CD management.
🖥️ Monitor RAM and CPU usage in Proxmox for hosts, LXC, and QEMU/KVM VMs with clear visuals and detailed metrics for better resource management.
🌐 Explore VandCloud, a cross-platform app to browse, test, and monitor APIs and services with real-time status updates.