A prometheus ephemeral storage metric exporter for pods, containers, nodes, and volumes.
This project was created to address lack of monitoring in Kubernetes
This project does not monitor CSI backed ephemeral storage ex. Generic ephemeral volumes
helm repo add k8s-ephemeral-storage-metrics https://jmcgrath207.github.io/k8s-ephemeral-storage-metrics/chart
helm repo update
helm upgrade --install my-deployment k8s-ephemeral-storage-metrics/k8s-ephemeral-storage-metrics
Key | Type | Default | Description |
---|---|---|---|
affinity | object | {} |
|
containerSecurityContext.allowPrivilegeEscalation | bool | false |
|
containerSecurityContext.capabilities.drop[0] | string | "ALL" |
|
containerSecurityContext.privileged | bool | false |
|
containerSecurityContext.readOnlyRootFilesystem | bool | false |
|
containerSecurityContext.runAsNonRoot | bool | true |
|
deploy_type | string | "Deployment" |
Set as Deployment for single controller to query all nodes or Daemonset |
dev | object | {"enabled":false,"grow":{"image":"ghcr.io/jmcgrath207/k8s-ephemeral-storage-grow-test:latest","imagePullPolicy":"IfNotPresent"},"shrink":{"image":"ghcr.io/jmcgrath207/k8s-ephemeral-storage-shrink-test:latest","imagePullPolicy":"IfNotPresent"}} |
For local development or testing that will deploy grow and shrink pods and debug service |
image.imagePullPolicy | string | "IfNotPresent" |
|
image.imagePullSecrets | list | [] |
|
image.repository | string | "ghcr.io/jmcgrath207/k8s-ephemeral-storage-metrics" |
|
image.tag | string | "1.14.0" |
|
interval | int | 15 |
Polling node rate for exporter |
kubelet | object | {"insecure":false,"readOnlyPort":0,"scrape":false} |
Scrape metrics through kubelet instead of kube api |
log_level | string | "info" |
|
max_node_concurrency | int | 10 |
Max number of concurrent query requests to the kubernetes API. |
metrics | object | {"adjusted_polling_rate":false,"ephemeral_storage_container_limit_percentage":true,"ephemeral_storage_container_volume_limit_percentage":true,"ephemeral_storage_container_volume_usage":true,"ephemeral_storage_node_available":true,"ephemeral_storage_node_capacity":true,"ephemeral_storage_node_percentage":true,"ephemeral_storage_pod_usage":true,"port":9100} |
Set metrics you want to enable |
metrics.adjusted_polling_rate | bool | false |
Create the ephemeral_storage_adjusted_polling_rate metrics to report Adjusted Poll Rate in milliseconds. Typically used for testing. |
metrics.ephemeral_storage_container_limit_percentage | bool | true |
Percentage of ephemeral storage used by a container in a pod |
metrics.ephemeral_storage_container_volume_limit_percentage | bool | true |
Percentage of ephemeral storage used by a container's volume in a pod |
metrics.ephemeral_storage_container_volume_usage | bool | true |
Current ephemeral storage used by a container's volume in a pod |
metrics.ephemeral_storage_node_available | bool | true |
Available ephemeral storage for a node |
metrics.ephemeral_storage_node_capacity | bool | true |
Capacity of ephemeral storage for a node |
metrics.ephemeral_storage_node_percentage | bool | true |
Percentage of ephemeral storage used on a node |
metrics.ephemeral_storage_pod_usage | bool | true |
Current ephemeral byte usage of pod |
metrics.port | int | 9100 |
Adjust the metric port as needed (default 9100) |
nodeSelector | object | {} |
|
podAnnotations | object | {} |
|
podSecurityContext.runAsNonRoot | bool | true |
|
podSecurityContext.seccompProfile.type | string | "RuntimeDefault" |
|
pprof | bool | false |
Enable Pprof |
prometheus.enable | bool | true |
|
prometheus.release | string | "kube-prometheus-stack" |
|
prometheus.rules.enable | bool | false |
Create PrometheusRules firing alerts when out of ephemeral storage |
prometheus.rules.labels | object | {"severity":"warning"} |
What additional labels to set on alerts |
prometheus.rules.predictFilledHours | int | 12 |
How many hours in the future to predict filling up of a volume |
serviceMonitor | object | {"additionalLabels":{},"enable":true,"metricRelabelings":[],"podTargetLabels":[],"relabelings":[],"targetLabels":[]} |
Configure the Service Monitor |
serviceMonitor.additionalLabels | object | {} |
Add labels to the ServiceMonitor.Spec |
serviceMonitor.metricRelabelings | list | [] |
Set metricRelabelings as per https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.RelabelConfig |
serviceMonitor.podTargetLabels | list | [] |
Set podTargetLabels as per https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.ServiceMonitorSpec |
serviceMonitor.relabelings | list | [] |
Set relabelings as per https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.RelabelConfig |
serviceMonitor.targetLabels | list | [] |
Set targetLabels as per https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.ServiceMonitorSpec |
tolerations | list | [] |
To prevent from multiple kind of alerts being fired for a single container or
emptyDir volume when both prometheus.enable
and prometheus.rules.enable
are
on, add the following inhibition
rules
to your Alert Manager config:
- source_matchers:
- alertname="EphemeralStorageVolumeFilledUp"
target_matchers:
- severity="warning"
- alertname="EphemeralStorageVolumeFillingUp"
equal:
- pod_namespace
- pod_name
- volume_name
- source_matchers:
- alertname="ContainerEphemeralStorageUsageAtLimit"
target_matchers:
- severity="warning"
- alertname="ContainerEphemeralStorageUsageReachingLimit"
equal:
- pod_namespace
- pod_name
- exported_container
make new_minikube
make deploy_local
make deploy_debug
Then connect to localhost:30002
with delve or your IDE.
make deploy_e2e
make deploy_e2e_debug
Then run a debug against deployment_test.go
This project is licensed under the MIT License. See the LICENSE
file for more details.