
scale-sniff

Detect wasteful or under-provisioned Kubernetes autoscaling in on-prem environments

scale-sniff analyzes CPU utilization against pod counts over time to identify inefficient scaling patterns in on-prem Kubernetes clusters. It pinpoints over-provisioned deployments, under-utilized resources, and misconfigured HPA policies so you can optimize cost and performance.

Why Use scale-sniff?

  • Reduce costs by identifying over-provisioned resources
  • Improve performance by detecting under-provisioned workloads
  • Optimize HPA configurations with data-driven insights
  • On-prem focused: built for environments without managed cloud autoscaling

Contributions welcome! Please open a GitHub issue for:

  • 🐛 Bug reports
  • 💡 Feature requests
  • ❓ Questions & discussions
  • 📝 General feedback

🔍 How It Works

  • Discovers Services → finds the corresponding Deployments via pod label selectors
  • Maps apps to Pods:
    • Service → selects Pods via spec.selector
    • Pods ← owned by Deployment ← scaled by HPA
  • Fetches metrics from Prometheus (see the example queries below):
    • CPU: from cAdvisor (built into the kubelet)
    • Pod count: from kube-state-metrics
  • Analyzes efficiency:
    • Compares actual CPU usage against the HPA target using the google/gemma-2-2b-it model
    • Flags over-provisioning (many pods, low CPU) or under-provisioning (high CPU, few pods)
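
The exact queries are internal to the tool, but given the metric sources and the default 5m rate range vector, the PromQL it relies on conceptually looks like this (the namespace label matcher is illustrative):

# CPU usage per pod, from cAdvisor, averaged over the rate range vector
sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="default"}[5m]))

# Count of ready pods, from kube-state-metrics
sum(kube_pod_status_ready{namespace="default", condition="true"})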

📊 Requirements

  Component                      Purpose
  kube-state-metrics             Exposes kube_pod_status_ready for pod counts
  cAdvisor (built into kubelet)  Provides container_cpu_usage_seconds_total for CPU usage
  Prometheus                     Scrapes and stores the metrics
  HPA                            Enables the autoscaling analysis
  HuggingFace token              Required by the analyze step

Visit the Hugging Face token documentation for more details.
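
To sanity-check that both metric sources are flowing into Prometheus before running an analysis, you can query its HTTP API directly (assuming it is reachable on localhost:9090, as in the demo below); each call should return a non-empty result set:

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=kube_pod_status_ready'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=container_cpu_usage_seconds_total'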

🚀 Quick Start

go build -o scale-sniff ./cmd/cli

# Show help
./scale-sniff -help

Usage: ./scale-sniff [options]

Options:
  -config-path string
        Configuration file path (config.yaml should be under that!) (default "./internal/config")
  -duration string
        Analysis window (default "24h")
  -h    Show help
  -help
        Show help
  -hf-model string
        HuggingFace model override (default "google/gemma-2-2b-it")
  -hf-token string
        HuggingFace API token
  -hf-url string
        HuggingFace API URL (default "https://router.huggingface.co/v1/chat/completions")
  -namespace string
        Kubernetes namespace (default "default")
  -prometheus-base-url string
        Prometheus base URL
  -range-vector string
        Rate range vector (default "5m")
  -step string
        Sample resolution (default "5m")

# Analyze a namespace
./scale-sniff analyze --namespace prod

# Custom time window
./scale-sniff analyze --namespace dev --duration 15m --step 30s
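
Because the analyze step needs both a Prometheus endpoint and a HuggingFace token, a complete invocation typically looks like this (flag values are placeholders; the token is assumed to be exported as in the demo below):

./scale-sniff analyze \
  --namespace prod \
  --prometheus-base-url http://localhost:9090 \
  --hf-token $HUGGINGFACE_TOKEN \
  --duration 6h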

🎬 Demo Setup

Prerequisites

k3d (or any local Kubernetes cluster)
kubectl
go (for building the CLI)

Apply the manifests in order from the demo directory:

# Create a local cluster (optional if you are using an existing cluster)
k3d cluster create demo --agents 1

kubectl apply -f nginx-deployment.yaml

kubectl apply -f prometheus-rbac.yaml

kubectl apply -f prometheus-config.yaml

kubectl apply -f prometheus-deployment.yaml

kubectl apply -f ksm-rbac.yaml   

kubectl apply -f kube-state-metrics.yaml

kubectl get pods
kubectl get services

# Or use the validation script
chmod +x validate.sh
./validate.sh
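
For the CPU metrics to show up at all, Prometheus must scrape the kubelet's cAdvisor endpoint. The demo's prometheus-config.yaml is assumed to contain a job along these lines; this is a sketch of the standard pattern, not the literal file contents:

scrape_configs:
  - job_name: kubelet-cadvisor
    scheme: https
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
      - role: node   # one target per node, addressed via the kubelet port
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      insecure_skip_verify: true   # demo-only shortcut; verify certs in production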

Create HPA

kubectl autoscale deployment nginx-demo --cpu-percent=10 --min=1 --max=10
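
Note that the HPA can only compute CPU utilization if the nginx-demo pods declare a CPU request, so the demo Deployment is assumed to set resources.requests.cpu. The kubectl autoscale command above is equivalent to a declarative manifest like this (autoscaling/v2 shown):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 10   # matches --cpu-percent=10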

Forward the Prometheus port

kubectl port-forward service/prometheus-service 9090:9090

Export HuggingFace Token

export HUGGINGFACE_TOKEN=<your-token>

Run scale-sniff

Run the binary from the project directory:

./scale-sniff
⏳ nginx-service: fetching
⏳ prometheus-service: fetching
⏳ nginx-service: analyzing

📊 Final Report:
❓ prometheus-service: Inconclusive — No HPA configured — autoscaling not enabled
✅ nginx-service: Efficient

Clean up

k3d cluster delete demo
