Monitoring Setup

This guide explains how to set up monitoring for masque-vpn using Prometheus and Grafana.

Prerequisites

Prometheus server
Grafana server
masque-vpn server with metrics enabled

Architecture

The monitoring setup consists of:

masque-vpn server - Exposes metrics on the /metrics endpoint
Prometheus - Scrapes and stores metrics
Grafana - Visualizes metrics with dashboards

Configuration

Enable Metrics in masque-vpn

Metrics are enabled by default in the server configuration:

[metrics]
enabled = true
listen_addr = "127.0.0.1:9090"

Note: In the current implementation, metrics are served on the same port as the API server (8080) at the /metrics endpoint.

Prometheus Configuration

Add the following to your prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'masque-vpn'
    static_configs:
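      # API server port. If Prometheus runs inside the Docker Compose stack,
      # target the service name instead, e.g. 'masque-vpn-server:8080'.
      - targets: ['localhost:8080']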
    metrics_path: '/metrics'
    scrape_interval: 10s
    scrape_timeout: 5s
    
  # If running multiple servers
  - job_name: 'masque-vpn-cluster'
    static_configs:
      - targets: 
        - 'server1:8080'
        - 'server2:8080'
        - 'server3:8080'
    metrics_path: '/metrics'

Docker Compose Setup

For easy deployment, use the provided docker-compose.yml:

version: '3.8'
services:
  masque-vpn-server:
    build: ./vpn_server
    ports:
      - "4433:4433"  # MASQUE VPN
      - "8080:8080"  # API + Metrics
    volumes:
      - ./cert:/app/cert:ro
      
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - ./monitoring/grafana/dashboards:/var/lib/grafana/dashboards:ro

Grafana Dashboard

Import Dashboard

Open Grafana at http://localhost:3000
Log in with admin/admin
Go to "+" → Import
Upload grafana/masque-vpn-dashboard.json

Key Panels

The dashboard includes:

Connection Overview: Active connections, total connections
Traffic Analysis: Bytes/packets sent and received
Performance Metrics: Packet processing latency, MASQUE request duration
Resource Utilization: IP pool usage, TUN interface status
Error Monitoring: Error rates by type, packet drops
MASQUE Protocol: Request success rates, QUIC stream statistics

Custom Queries

Example queries for custom panels:

Active Connections:

masque_vpn_active_connections

Packet Processing Latency (95th percentile):

histogram_quantile(0.95, rate(masque_vpn_packet_processing_duration_seconds_bucket[5m]))

IP Pool Utilization:

(masque_vpn_ip_pool_used / masque_vpn_ip_pool_total) * 100

Error Rate:

rate(masque_vpn_errors_total[5m])
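
The 95th-percentile query above relies on histogram_quantile, which estimates a quantile from cumulative bucket counts rather than from raw observations. The following is a minimal Python sketch of that estimation, assuming positive bucket bounds and linear interpolation inside the selected bucket. The bucket bounds and counts in the example are made up for illustration and are not taken from masque-vpn.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs, in the shape
    of a histogram's `_bucket` series; one bound must be +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, ending with +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the target rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # The quantile lies beyond the last finite bound, so report that bound.
        return buckets[-2][0]
    # The lowest bucket is assumed to start at 0 (bounds are positive here).
    lower, prev_count = (0.0, 0.0) if i == 0 else buckets[i - 1]
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


# Hypothetical latency buckets in seconds: (upper bound, cumulative count).
example_buckets = [(0.005, 10), (0.01, 60), (0.025, 90), (0.05, 98), (math.inf, 100)]
p95 = histogram_quantile(0.95, example_buckets)
```

In PromQL the inputs are per-second bucket rates rather than raw counts, but the interpolation step works the same way. The result can never be more precise than the bucket boundaries allow, which is worth keeping in mind when choosing alert thresholds.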

Metrics Endpoint

The metrics endpoint is available at:

http://localhost:8080/metrics

Testing Metrics

Verify metrics are working:

# Check if metrics endpoint responds
curl http://localhost:8080/metrics

# Check specific metric
curl http://localhost:8080/metrics | grep masque_vpn_active_connections

Sample Metrics Output

# HELP masque_vpn_active_connections Current number of active connections
# TYPE masque_vpn_active_connections gauge
masque_vpn_active_connections 3

# HELP masque_vpn_total_connections Total connections since startup
# TYPE masque_vpn_total_connections counter
masque_vpn_total_connections 15

# HELP masque_vpn_ip_pool_total Total IP addresses in pool
# TYPE masque_vpn_ip_pool_total gauge
masque_vpn_ip_pool_total 254
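
The curl checks above can also be scripted. The sketch below parses the sample output shown here into a dictionary. It only handles simple lines of the form "name value" (label values containing spaces would need a real parser), and the helper names are illustrative rather than part of masque-vpn.

```python
from urllib.request import urlopen

SAMPLE = """\
# HELP masque_vpn_active_connections Current number of active connections
# TYPE masque_vpn_active_connections gauge
masque_vpn_active_connections 3

# HELP masque_vpn_total_connections Total connections since startup
# TYPE masque_vpn_total_connections counter
masque_vpn_total_connections 15

# HELP masque_vpn_ip_pool_total Total IP addresses in pool
# TYPE masque_vpn_ip_pool_total gauge
masque_vpn_ip_pool_total 254
"""


def parse_metrics(text):
    """Map each sample name to its float value, skipping comments and blanks."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        samples[fields[0]] = float(fields[1])
    return samples


def fetch_metrics(url="http://localhost:8080/metrics", timeout=5):
    """Download the raw metrics text (not called here; needs a running server)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


samples = parse_metrics(SAMPLE)
```

Against a running server, parse_metrics(fetch_metrics()) returns the same kind of dictionary, so a missing key is a quick way to spot a metric that is not being exported.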

Alerting

Prometheus Alerting Rules

Create alerts.yml and list it under rule_files in prometheus.yml so that Prometheus loads it:

groups:
- name: masque-vpn.rules
  rules:
  - alert: MASQUEVPNHighErrorRate
    expr: rate(masque_vpn_errors_total[5m]) > 0.1
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "High error rate in MASQUE VPN server"
      description: "Error rate is {{ $value }} errors per second"
      
  - alert: MASQUEVPNIPPoolExhaustion
    expr: (masque_vpn_ip_pool_available / masque_vpn_ip_pool_total) < 0.1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "MASQUE VPN IP pool nearly exhausted"
      description: "Only {{ $value | humanizePercentage }} of IP addresses available"
      
  - alert: MASQUEVPNHighLatency
    expr: histogram_quantile(0.95, rate(masque_vpn_packet_processing_duration_seconds_bucket[5m])) > 0.1
    for: 3m
    labels:
      severity: warning
    annotations:
      summary: "High packet processing latency"
      description: "95th percentile latency is {{ $value }}s"
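
Each rule above has a for: clause, which delays firing until the expression has been true for the whole duration. The following Python sketch is a simplified model of that behaviour, not Prometheus code: it ignores features such as keep_firing_for and tracks a single series. The evaluation timestamps are hypothetical and use the 15s evaluation_interval from the Prometheus configuration in this guide.

```python
def alert_states(evaluations, hold_seconds):
    """Return (timestamp, state) pairs for one alert rule.

    `evaluations` is a time-ordered list of (timestamp_seconds, condition_true).
    """
    states = []
    active_since = None  # first evaluation at which the condition became true
    for ts, condition_true in evaluations:
        if not condition_true:
            # Any false evaluation resets the alert and its timer.
            active_since = None
            state = "inactive"
        else:
            if active_since is None:
                active_since = ts
            held = ts - active_since
            state = "firing" if held >= hold_seconds else "pending"
        states.append((ts, state))
    return states


# Condition true for five evaluations 15s apart, then false; `for: 1m` is 60s.
timeline = [(0, True), (15, True), (30, True), (45, True), (60, True), (75, False)]
result = alert_states(timeline, hold_seconds=60)
```

In this model the alert stays pending from 0s to 45s, fires at 60s, and returns to inactive at 75s. A short spike that clears before the for: duration elapses never fires, which is why the more critical IP pool rule uses the shortest duration.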

Grafana Alerts

Configure alerts in Grafana:

Go to Alerting → Alert Rules
Create new rule
Set query and conditions
Configure notification channels (Slack, email, etc.; newer Grafana versions call these contact points)

Troubleshooting

Metrics Not Available

Check if metrics are enabled in server config
Verify server is running and accessible
Check firewall rules for port 8080
Review server logs for errors

Prometheus Connection Issues

Verify target configuration in prometheus.yml
Check Prometheus logs: docker compose logs prometheus (or docker logs followed by the container name when not using Compose)
Verify network connectivity: curl http://server:8080/metrics
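
Prometheus also reports the health of each scrape target through its HTTP API at /api/v1/targets, which can help narrow down the checks above. The sketch below lists targets that are not up. It assumes Prometheus is reachable at http://localhost:9090 as in the Docker Compose setup, and the canned payload is a trimmed, hypothetical response used only to illustrate the fields being read.

```python
import json
from urllib.request import urlopen


def fetch_targets(prometheus_url="http://localhost:9090", timeout=5):
    """Fetch target status (not called here; needs a running Prometheus)."""
    with urlopen(f"{prometheus_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_targets(payload):
    """Return (scrape_url, last_error) for every active target not reporting 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl"), t.get("lastError", ""))
        for t in targets
        if t.get("health") != "up"
    ]


# Hypothetical response, trimmed to the fields used above.
example_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://server1:8080/metrics", "health": "up", "lastError": ""},
            {
                "scrapeUrl": "http://server2:8080/metrics",
                "health": "down",
                "lastError": "connection refused",
            },
        ]
    },
}
problems = unhealthy_targets(example_payload)
```

With a live server, unhealthy_targets(fetch_targets()) returns an empty list when every target is being scraped successfully. The same information is shown on the Targets page of the Prometheus web UI.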

Grafana Dashboard Issues

Check data source configuration
Verify Prometheus is collecting metrics
Check query syntax in panels
Review Grafana logs for errors
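
To tell a Grafana problem from a Prometheus problem, it can help to run a panel's query against Prometheus directly through its HTTP API at /api/v1/query. This is a minimal sketch, assuming Prometheus at http://localhost:9090 and an instant-vector result. The function names and the canned response are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(promql, prometheus_url="http://localhost:9090"):
    """Build the instant-query URL for a PromQL expression."""
    return f"{prometheus_url}/api/v1/query?{urlencode({'query': promql})}"


def instant_values(payload):
    """Extract (labels, value) pairs from an instant-vector query response."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "string"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def run_query(promql, prometheus_url="http://localhost:9090", timeout=5):
    """Run a query (not called here; needs a running Prometheus)."""
    with urlopen(query_url(promql, prometheus_url), timeout=timeout) as resp:
        return instant_values(json.load(resp))


# Hypothetical response for the query `masque_vpn_active_connections`.
example_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "masque_vpn_active_connections", "job": "masque-vpn"},
                "value": [1700000000.0, "3"],
            }
        ],
    },
}
values = instant_values(example_response)
```

An empty result list for a query that works in neither place suggests the metric is not being scraped. A non-empty result here combined with an empty panel points at the Grafana data source or the panel settings.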

Performance Considerations

Metrics Collection Impact

Metrics collection has minimal performance impact
Scrape interval of 10-15 seconds is recommended
Avoid very frequent scraping (< 5 seconds)

Storage Requirements

Prometheus storage grows with number of metrics and retention period
Default retention is 15 days
Consider using remote storage for long-term retention
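
A rough disk estimate follows the rule of thumb from the Prometheus documentation: retention time in seconds, times ingested samples per second, times bytes per sample (typically 1 to 2 bytes after compression). The sketch below applies it. The figure of 500 series per server is a placeholder assumption, not a measured value for masque-vpn.

```python
def estimate_disk_bytes(
    series_per_target,
    targets,
    scrape_interval_s,
    retention_days,
    bytes_per_sample=2.0,
):
    """Estimate Prometheus disk usage in bytes for a fixed set of targets."""
    # Every series yields one sample per scrape.
    samples_per_second = series_per_target * targets / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


# Assumed workload: 3 servers, 500 series each, 10s scrapes, 15-day retention.
estimate = estimate_disk_bytes(500, 3, 10, 15)
print(f"{estimate / 1e9:.2f} GB")
```

Under these assumptions the estimate is about 0.39 GB. Halving the scrape interval or doubling the retention period doubles the figure, so the actual series count (visible in the Prometheus TSDB status page) is the number worth measuring first.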

High Availability

For production deployments:

Run multiple Prometheus instances
Use Prometheus federation
Configure Grafana with multiple data sources
Set up Alertmanager clustering

Educational Use

For research and educational purposes:

Monitor protocol behavior under different conditions
Analyze performance characteristics
Study the impact of network conditions on VPN performance
Compare metrics before and after configuration changes

See Metrics Reference for detailed metric descriptions.
