Instrument RustyVault with Prometheus #76
base: main
use prometheus_client::metrics::counter::Counter;
use prometheus_client::metrics::family::Family;
use prometheus_client::metrics::histogram::{linear_buckets, Histogram};
use prometheus_client::registry::Registry;
This file should also be formatted with cargo fmt.
Formatted with cargo fmt.
src/metrics/system_metrics.rs
Outdated
}

pub async fn start_collecting(self: Arc<Self>) {
    let mut interval = time::interval(Duration::from_secs(5));
Can the collection interval be set in the configuration file?
Decision item: use a single interval for all system metrics?
The current sysinfo crate only provides separate refresh calls for CPU, memory, and process metrics; network and disk metrics cannot be refreshed individually.
Decision: Interval configuration is supported through the configuration file, but currently limited to a single interval.
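To illustrate the decision above, here is a minimal sketch of a collector that runs every refresh on one shared interval. It swaps the PR's tokio interval for std::thread::sleep so that it needs no external crates; SystemMetricsSketch, collect_for, and the passes counter are hypothetical names, not taken from the PR.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the PR's system metrics collector.
pub struct SystemMetricsSketch {
    /// Number of completed collection passes.
    pub passes: AtomicU64,
}

impl SystemMetricsSketch {
    pub fn new() -> Self {
        Self { passes: AtomicU64::new(0) }
    }

    /// One collection pass. A real collector would refresh CPU, memory,
    /// disk, and network readings here, all on the same schedule.
    pub fn collect_metrics(&self) {
        self.passes.fetch_add(1, Ordering::Relaxed);
    }

    /// Runs `ticks` passes, sleeping `interval` after each one.
    /// The bounded loop (instead of an endless one) is only there to
    /// keep the sketch easy to exercise.
    pub fn collect_for(self: Arc<Self>, interval: Duration, ticks: u32) {
        for _ in 0..ticks {
            self.collect_metrics();
            thread::sleep(interval);
        }
    }
}
```

With a single interval, slow-changing and fast-changing metrics share one refresh rate. That is simple to configure, at the cost of not being able to tune them separately.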
Method::GET => MetricsMethod::GET,
Method::POST => MetricsMethod::POST,
Method::PUT => MetricsMethod::PUT,
Method::DELETE => MetricsMethod::DELETE,
Here, LIST is missing.
LIST added.
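For reference, a mapping that includes LIST could look roughly like the sketch below. It matches on the method name as a string so that it stays dependency-free; the PR itself matches on the HTTP library's Method type, and the metrics_method function name is hypothetical.

```rust
/// Label values for the HTTP method dimension. The variant names follow
/// the diff above; LIST and OTHER come from the review discussion and
/// the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum MetricsMethod {
    GET,
    POST,
    PUT,
    DELETE,
    LIST,
    OTHER,
}

/// Maps a method name to its label. HTTP method names are
/// case-sensitive, so no case folding is done here.
pub fn metrics_method(method: &str) -> MetricsMethod {
    match method {
        "GET" => MetricsMethod::GET,
        "POST" => MetricsMethod::POST,
        "PUT" => MetricsMethod::PUT,
        "DELETE" => MetricsMethod::DELETE,
        "LIST" => MetricsMethod::LIST,
        // Anything else shares one bucket to keep label cardinality bounded.
        _ => MetricsMethod::OTHER,
    }
}
```

Folding unknown methods into OTHER keeps the number of label combinations small, which matters because every distinct label set becomes its own time series.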
@@ -36,6 +36,8 @@ pub struct Config {
    pub daemon_user: String,
    #[serde(default)]
    pub daemon_group: String,
    #[serde(default)]
    pub collection_interval: u64,
The collection_interval has no default value.
Default value added through fn default_collection_interval() -> u64.
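A rough, dependency-free sketch of the fallback behaviour such a default function provides is shown below. The value 5 is an assumption that mirrors the interval hard-coded in the earlier diff; ConfigSketch and with_interval are hypothetical names. In the real struct the function would presumably be wired up with serde's `#[serde(default = "default_collection_interval")]` field attribute. A bare `#[serde(default)]` on a u64 yields 0 when the key is omitted, and a zero period is, as far as I know, rejected by tokio's time::interval.

```rust
/// Assumed default, in seconds. It mirrors the previously hard-coded
/// value and is not confirmed by the PR.
pub fn default_collection_interval() -> u64 {
    5
}

/// Hypothetical, trimmed-down stand-in for the `Config` struct.
#[derive(Debug, Clone, PartialEq)]
pub struct ConfigSketch {
    pub daemon_user: String,
    pub daemon_group: String,
    pub collection_interval: u64,
}

impl Default for ConfigSketch {
    fn default() -> Self {
        Self {
            daemon_user: String::new(),
            daemon_group: String::new(),
            collection_interval: default_collection_interval(),
        }
    }
}

/// Applies an optional value parsed from the configuration file and
/// falls back to the default when the key is absent.
pub fn with_interval(parsed: Option<u64>) -> ConfigSketch {
    ConfigSketch {
        collection_interval: parsed.unwrap_or_else(default_collection_interval),
        ..ConfigSketch::default()
    }
}
```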
Instrument RustyVault with Prometheus
Design
To monitor the system's performance effectively, I applied both the USE and RED methods for metrics collection in RustyVault.
USE Method (Utilization, Saturation, Errors):
Track resource utilization and detect bottlenecks. Metrics related to system resources have been added to ensure the system's health is continuously monitored:
CPU Utilization: Measures the percentage of CPU usage by the RustyVault service.
Memory Utilization: Tracks memory usage, including total, free, and cached memory.
Disk I/O Saturation: Monitors disk read/write speed and detects potential bottlenecks.
Network I/O Saturation: Tracks the amount of data sent and received.
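The utilization figures above are point-in-time values, so they map naturally onto gauges. As a minimal illustration, the sketch below stores a floating-point reading in an AtomicU64 by keeping its bit pattern, so that a background collector and the /metrics handler can share it without a lock. F64Gauge and utilization_percent are hypothetical names, and this is not the prometheus_client implementation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// A lock-free gauge that keeps an f64 as raw bits in an AtomicU64.
/// The all-zero default bit pattern decodes to 0.0.
#[derive(Default)]
pub struct F64Gauge {
    bits: AtomicU64,
}

impl F64Gauge {
    /// Overwrites the gauge with the latest reading.
    pub fn set(&self, value: f64) {
        self.bits.store(value.to_bits(), Ordering::Relaxed);
    }

    /// Returns the most recently stored reading.
    pub fn get(&self) -> f64 {
        f64::from_bits(self.bits.load(Ordering::Relaxed))
    }
}

/// Utilization as a percentage; returns 0.0 when `total` is zero to
/// avoid dividing by zero.
pub fn utilization_percent(used: u64, total: u64) -> f64 {
    if total == 0 {
        0.0
    } else {
        used as f64 / total as f64 * 100.0
    }
}
```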
RED Method (Rate, Errors, Duration)
Track the behavior of requests within the application:
Rate: We implemented requests_total to track the rate of requests coming into the system. This allows us to monitor the overall throughput.
Errors: The errors_total counter tracks the number of failed requests and helps monitor the system's error rate.
Duration: Using request_duration_seconds, we measure the time taken to process each request, enabling us to analyze latency and potential performance issues.
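The three RED signals can be pictured with a small in-memory sketch. It uses plain HashMaps instead of the Prometheus client types, represents the method label as a string for brevity, and treats any status of 400 or above as an error. That threshold, along with the RedSketch type and its record and mean_duration_seconds methods, is an assumption and not taken from the PR.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Label set for one series; simplified from the PR's `HttpLabel`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct HttpLabel {
    pub path: String,
    pub method: String,
    pub status: u16,
}

/// Hypothetical in-memory store for the three RED signals.
#[derive(Default)]
pub struct RedSketch {
    pub requests_total: HashMap<HttpLabel, u64>,
    pub errors_total: HashMap<HttpLabel, u64>,
    pub duration_sum_seconds: HashMap<HttpLabel, f64>,
}

impl RedSketch {
    /// Records one finished request under its label set.
    pub fn record(&mut self, label: HttpLabel, elapsed: Duration) {
        // Rate: every request increments the request counter.
        *self.requests_total.entry(label.clone()).or_insert(0) += 1;
        // Errors: assumed here to mean any 4xx or 5xx response.
        if label.status >= 400 {
            *self.errors_total.entry(label.clone()).or_insert(0) += 1;
        }
        // Duration: accumulate processing time in seconds.
        *self.duration_sum_seconds.entry(label).or_insert(0.0) += elapsed.as_secs_f64();
    }

    /// Mean processing time for one label set, if it has been seen.
    pub fn mean_duration_seconds(&self, label: &HttpLabel) -> Option<f64> {
        let count = *self.requests_total.get(label)?;
        let sum = *self.duration_sum_seconds.get(label)?;
        Some(sum / count as f64)
    }
}
```

In a middleware, the elapsed time would typically be taken from an Instant captured before the handler runs and read once the response is available.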
Implemented Metrics
<Gauge, AtomicU64>
<Gauge, AtomicU64>
<Gauge, AtomicU64>
<Gauge, AtomicU64>
<Gauge, AtomicU64>
<Gauge, AtomicU64>
<Gauge, AtomicU64>
<Gauge, AtomicU64>
struct HttpLabel {path:String, method:MetricsMethod, status:u16}
Family<HttpLabel, Counter>
Family<HttpLabel, Histogram>
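To make the histogram entry more concrete, here is a small sketch of how a bucketed duration histogram behaves. It is a simplified stand-in written against the standard library only; linear_bounds and HistogramSketch are hypothetical names and do not reproduce the prometheus_client types.

```rust
/// Builds `length` upper bounds, starting at `start` and spaced `width` apart.
pub fn linear_bounds(start: f64, width: f64, length: usize) -> Vec<f64> {
    (0..length).map(|i| start + width * i as f64).collect()
}

/// Hypothetical bucketed histogram for request durations.
pub struct HistogramSketch {
    bounds: Vec<f64>,
    // One slot per upper bound, plus a final slot for the +Inf bucket.
    counts: Vec<u64>,
    sum: f64,
}

impl HistogramSketch {
    pub fn new(bounds: Vec<f64>) -> Self {
        let counts = vec![0; bounds.len() + 1];
        Self { bounds, counts, sum: 0.0 }
    }

    /// Adds one observation to the first bucket whose upper bound is
    /// greater than or equal to the value.
    pub fn observe(&mut self, seconds: f64) {
        let idx = self
            .bounds
            .iter()
            .position(|upper| seconds <= *upper)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
        self.sum += seconds;
    }

    /// Cumulative counts per upper bound, the form in which Prometheus
    /// exposes histogram buckets; the last bound is +Inf.
    pub fn cumulative(&self) -> Vec<(f64, u64)> {
        let mut running = 0u64;
        self.counts
            .iter()
            .enumerate()
            .map(|(i, count)| {
                running += *count;
                (self.bounds.get(i).copied().unwrap_or(f64::INFINITY), running)
            })
            .collect()
    }

    /// Sum of all observed values.
    pub fn sum(&self) -> f64 {
        self.sum
    }
}
```

Bucket bounds are a trade-off: narrow buckets give better latency resolution, but every bucket is a separate series for each label combination.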
Changes
Added MetricsManager in manager.rs to store the Prometheus Registry, system metrics (system_metrics), and HTTP API metrics (http_metrics).
Integrated metrics_manager into the server in src/cli/command/server.rs by inserting it into app_data.
init_metrics_service in metrics.rs sets up the /metrics service by configuring a route in the ServiceConfig, and associates the /metrics route with metrics_handler, which handles GET requests and responds with Prometheus metrics in text format.
System Metrics Collection:
Added the SystemMetrics struct in system_metrics.rs to gather CPU, memory, load, and disk metrics using the sysinfo crate.
Added the collect_metrics function to collect and store system information.
Called the start_collecting method in server.block_on to periodically collect system metrics.
HTTP Middleware:
Added MetricsMiddleware in middleware.rs as a function middleware to capture HTTP request metrics.
Updated src/cli/command/server.rs to apply the middleware using .wrap(from_fn(metrics_middleware)).
Added the MetricsMethod enum, tracking GET, POST, PUT, DELETE, and categorizing others as OTHER.
HTTP Metrics:
Added the HttpMetrics struct in http_metrics.rs to handle HTTP request counting and duration observation.
Added a requests counter and a histogram for request durations.
Added increment_request_count and observe_duration for tracking requests and their durations, labeled by HTTP method and path.
Testing Steps
Use curl to visit http://localhost:<PORT>/metrics.
Send requests to endpoints such as /login and /register.
Check that requests_total and request_duration_seconds increment appropriately.
Check that errors_total increments accordingly for failed requests.
Add the /metrics endpoint to the Prometheus configuration.
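When inspecting the /metrics output with curl, it helps to know roughly what the text format looks like. The sketch below renders one labeled counter in the Prometheus text exposition style, using only the standard library. The render_counter and escape_label_value functions are hypothetical, and the real endpoint relies on the client library's encoder, whose exact output may differ in detail.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Escapes backslashes, double quotes, and newlines in a label value.
pub fn escape_label_value(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Renders a counter with (path, method, status) labels as text.
/// A BTreeMap keeps the sample order deterministic.
pub fn render_counter(
    name: &str,
    help: &str,
    samples: &BTreeMap<(String, String, u16), u64>,
) -> String {
    let mut out = String::new();
    // Writing to a String cannot fail, so the results are ignored.
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} counter");
    for ((path, method, status), value) in samples {
        let _ = writeln!(
            out,
            "{name}{{path=\"{}\",method=\"{}\",status=\"{}\"}} {value}",
            escape_label_value(path),
            escape_label_value(method),
            status
        );
    }
    out
}
```

Each distinct combination of path, method, and status appears as its own line, which makes it easy to confirm that a request to a given endpoint incremented the expected series.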