Aperture v0.8.0
Changelog
List of Aperture PRs merged since the 0.7.0 release. For the full list of changes, see the list of changes.
Revamp workload and flux meter metrics and labels (#843)
Description of change
- New label `attribute_found` in FluxMeter to denote whether the attribute on which the flux meter is based was found in the access log/span.
- Removed label `decision_type` on summary `workload_latency_ms`, since it is now emitted only if a response was received.
- New counter `workload_requests_total` to measure the workload decisions count, since the summary does not take into account the scenarios where a response is not received, e.g. rejects or connection resets.
- A new column `response_received` on OLAP Flow events to denote the case when a response is not received.
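The split between the counter and the summary can be illustrated with a minimal sketch. The `WorkloadMetrics` class and its `observe` method are hypothetical stand-ins written for this example; they are not Aperture's metrics code, and only the metric names in the comments come from the notes above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WorkloadMetrics:
    """Hypothetical in-memory stand-in for the two workload metrics."""

    requests_total: int = 0  # models workload_requests_total
    latencies_ms: List[float] = field(default_factory=list)  # models workload_latency_ms

    def observe(self, response_received: bool, latency_ms: Optional[float] = None) -> None:
        # Every decision is counted, including rejects and connection resets.
        self.requests_total += 1
        # The latency summary only sees flows where a response came back.
        if response_received and latency_ms is not None:
            self.latencies_ms.append(latency_ms)


metrics = WorkloadMetrics()
metrics.observe(response_received=True, latency_ms=12.5)
metrics.observe(response_received=False)  # e.g. a rejected request
print(metrics.requests_total, len(metrics.latencies_ms))  # prints "2 1"
```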
Ignore negative workload latency (#839)
Issue
- Workload latency in case of Envoy is calculated as: `workload_latency = response_latency - aperture_latency`
- Workload latency can become negative in case of a connection reset:
  - If the connection is aborted by the client or the server, Envoy immediately terminates the connection for the other endpoint.
  - In the access log, the status code is set to 0 and `response_latency` is set to zero.
  - If the Authz call to the Aperture Agent had succeeded for this request, then `aperture_latency` is greater than zero.
Fix
- Ignore negative workload latency, i.e. don't populate the workload latency column.
- Publish Prometheus metrics for flux-meter or workload latency only if the metric column is found.
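The calculation and the fix can be sketched as follows. This is a minimal illustration: the function name `workload_latency_ms` and the choice of `None` to model an unpopulated column are assumptions made for this example, not the actual implementation.

```python
from typing import Optional


def workload_latency_ms(response_latency_ms: float, aperture_latency_ms: float) -> Optional[float]:
    """Return the workload latency, or None when it should not be populated."""
    latency = response_latency_ms - aperture_latency_ms
    # On a connection reset the response latency is logged as zero while the
    # Aperture latency is positive, so the difference goes negative.
    if latency < 0:
        return None  # assumption: None stands for "column not populated"
    return latency
```

A caller following the second fix item would then publish the latency metric only when the returned value is not `None`.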
TickInfo in LoadDecision (#836)
Description of change
- Put `TickInfo` in `LoadDecision` to re-trigger fill-rate evaluation at the Agent.
Re-structure protos (#831)
Fix telemetry labels propagation (#835)
Description of change
This fixes a regression introduced in #828.
Dynamic Telemetry Flow Labels were added before labels filtering, which
led them to be incorrectly filtered out.
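The ordering problem can be illustrated with a small sketch. Everything here is hypothetical: the function names, the allow-list style of filtering, and the label keys are assumptions chosen for the example, and adding the dynamic labels after filtering is only one way the ordering could be corrected.

```python
from typing import Dict, Set


def filter_labels(labels: Dict[str, str], allowed: Set[str]) -> Dict[str, str]:
    """Keep only labels whose key is on the (assumed) allow list."""
    return {key: value for key, value in labels.items() if key in allowed}


def build_labels_before_fix(base: Dict[str, str], dynamic: Dict[str, str], allowed: Set[str]) -> Dict[str, str]:
    # Dynamic labels are merged first, so the filter drops them.
    return filter_labels({**base, **dynamic}, allowed)


def build_labels_after_fix(base: Dict[str, str], dynamic: Dict[str, str], allowed: Set[str]) -> Dict[str, str]:
    # Filter the static labels first, then add the dynamic ones untouched.
    labels = filter_labels(base, allowed)
    labels.update(dynamic)
    return labels


base = {"http.method": "GET", "internal.debug": "1"}  # hypothetical label keys
dynamic = {"user_tier": "gold"}  # hypothetical dynamic flow label
allowed = {"http.method"}
```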
Bump OTEL to 0.63.0 (#834)
Description of change
Bumps OTEL and FN OTEL to 0.63.0. This removes the Istio 1.15 compatibility hack, as it is now included in upstream OTEL.
Response status in telemetry (#828)
Description of change
This introduces the `aperture.response_status` column in telemetry. It mirrors the implementation of the `response_status` label for metrics.
This also extends the above logic to treat `1xx`, `2xx`, and `3xx` codes as OK, instead of only `2xx` codes.
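The extended classification can be sketched as below. This is an illustration only: the function name is hypothetical, and the `"Error"` value, along with treating every code outside 100-399 as an error, is an assumption, since the notes above only state which codes count as OK.

```python
def response_status(status_code: int) -> str:
    """Classify an HTTP status code for the response status column."""
    # 1xx, 2xx and 3xx responses are all treated as OK.
    if 100 <= status_code < 400:
        return "OK"
    # Assumption: anything else is reported as an error.
    return "Error"
```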
Besides this, some cleanup is done:
- The above logic is moved from `FluxMeter` to the OTEL package. This changes the `FluxMeter` interface!
- A lot of logic is moved from `metricsprocessor` to `metricsprocessor/internal` for better visibility and easier separation of functions which are called directly in `metricsprocessor` and helpers.
- The above made creating unit tests much easier, so this PR also includes some.
Ref: fluxninja/cloud#6788
Dry run mode for Load Actuator (#826)
Description of change
- Dry run mode for Load Actuator. No traffic can get dropped due to this Load Actuator in this mode. Useful for observing the behavior of Load Actuator without any disruptions.
- Load Actuator has a new Pass through mode.
- Default to Pass through mode in case the multiplier is invalid, and also when there is no decision available at the Agent, including during initialization.
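The fallback rules above can be sketched as a small mode selector. The names `Mode` and `select_mode` are hypothetical, and treating a NaN, infinite, or negative multiplier as invalid is an assumption made for this example; the actual validity check is not described in these notes.

```python
import math
from enum import Enum
from typing import Optional


class Mode(Enum):
    PASS_THROUGH = "pass_through"  # no traffic is dropped by the actuator
    ENFORCE = "enforce"  # the load multiplier is applied


def select_mode(multiplier: Optional[float], dry_run: bool) -> Mode:
    """Pick the actuator mode from the latest decision (None means no decision yet)."""
    if dry_run:
        return Mode.PASS_THROUGH
    # No decision available at the Agent, e.g. during initialization.
    if multiplier is None:
        return Mode.PASS_THROUGH
    # Assumed validity check: reject NaN, infinite and negative multipliers.
    if math.isnan(multiplier) or math.isinf(multiplier) or multiplier < 0:
        return Mode.PASS_THROUGH
    return Mode.ENFORCE
```

Defaulting to pass through in every uncertain case means a missing or malformed decision can never cause traffic to be dropped.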
Rollup based on metrics (#821)
Closes: GH-515
docs: playground doc updates (#819)
Description of change
- Moved demo_app to playground
- Added more details to playground documentation
- Bump Istio and other tools