Useful metrics #106
Comments
For DB connectivity we have spans for the time taken by SQL persistence operations (per operation). The Prometheus mapper then collects these into histograms. I would argue that we also need quantiles (and should therefore use Prometheus Summaries instead of Histograms), so I can add those. Counters (number of connections, number of fn invocations, ...) are supported in Prometheus, but I'm not sure whether OpenTracing has a concept for them (I'm still a bit ignorant when it comes to OpenTracing).
I don't mind using raw Prometheus (or something wrapped around it) if it means we can get counters out for useful things.
I don't think OpenTracing concerns itself with metrics/gauges, and retconning numbers from the events is a bad idea. I assume we'll need to generate Prometheus metrics from internal gauges/counters alongside the event metrics.
Note: #114 adds a few of the mentioned metrics.
Operationally, there are some obvious things to measure per flow node. These should be exposed via /metrics if they aren't already:
DB connectivity:
One upper limit on how many concurrent stage operations we can sustain per second is (max pool connections) / <sql query span>.
Executor connectivity:
Error counts: