Description
Component(s)
exporter/prometheus
What happened?
Description
When the collector receives a SIGHUP signal, it reloads the current configuration and calls the Shutdown method. The Shutdown method of the prometheusexporter, however, only closes the listener, not the http.Server. As a result, the old server no longer accepts new connections, but existing connections can stay open indefinitely. If a client such as Prometheus keeps its connection alive between scrapes, its requests continue to be served by the old instance, so the metrics stop receiving updates until they disappear. This is especially difficult to debug because requests made with curl or wget open a new connection and are served by the new instance, which reports completely different metrics.
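The difference between closing only the listener and shutting down the http.Server can be reproduced outside the collector. The following is a minimal sketch using only the Go standard library; it is not the prometheusexporter code, and the helper names (startServer, get, closeListenerOnly, gracefulShutdown, servedByOldServer) are hypothetical. It assumes the client reuses its keep-alive connection for the second request, which is the default behavior of http.Transport.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// startServer serves a fixed body on a random loopback port.
func startServer(body string) (*http.Server, net.Listener, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, nil, err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, body)
	})}
	go srv.Serve(ln) // returns once the listener is closed
	return srv, ln, nil
}

// get fetches url and returns the full response body, so the
// connection can go back into the client's keep-alive pool.
func get(c *http.Client, url string) (string, error) {
	resp, err := c.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

// closeListenerOnly mimics the behavior described in this issue.
func closeListenerOnly(_ *http.Server, ln net.Listener) error {
	return ln.Close()
}

// gracefulShutdown closes the listener and all idle connections.
func gracefulShutdown(srv *http.Server, _ net.Listener) error {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return srv.Shutdown(ctx)
}

// servedByOldServer reports whether a keep-alive client still gets
// answers from the old server after stop has been called.
func servedByOldServer(stop func(*http.Server, net.Listener) error) (bool, error) {
	srv, ln, err := startServer("old metrics")
	if err != nil {
		return false, err
	}
	defer srv.Close() // clean up any connection that is still open

	client := &http.Client{Transport: &http.Transport{}, Timeout: 2 * time.Second}
	defer client.CloseIdleConnections()
	url := "http://" + ln.Addr().String() + "/metrics"

	// The first request establishes the keep-alive connection.
	if _, err := get(client, url); err != nil {
		return false, err
	}
	if err := stop(srv, ln); err != nil {
		return false, err
	}
	// The second request plays the role of the next scrape.
	body, err := get(client, url)
	return err == nil && body == "old metrics", nil
}

func main() {
	stale, err := servedByOldServer(closeListenerOnly)
	fmt.Println("listener closed only, still served by old server:", stale, err)
	stale, err = servedByOldServer(gracefulShutdown)
	fmt.Println("http.Server.Shutdown, still served by old server:", stale, err)
}
```

With only the listener closed, the second request should still be answered over the existing connection. After http.Server.Shutdown the idle connection is closed, so the client has to reconnect, which in the collector would reach the new instance.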
Steps to Reproduce
- Set up otel-collector with the prometheus exporter
- Set up Prometheus to scrape this instance every 60s
- Send a SIGHUP signal to the otel-collector
- No metric updates are recorded by Prometheus after the SIGHUP
Expected Result
- The old server must be shut down, so that Prometheus reconnects and receives the latest metrics
Actual Result
- The old HTTP connection stays open and metrics go stale in Prometheus
I will provide a Pull Request for this issue
Collector version
v0.109.0/v0.110.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response