tracing
is a set of crates used for logging and tracing of rust
binaries. It is primarily composed of the tracing
crate, which is the
core way to interact with the framework, and the tracing-subscriber
crate,
which provides an API to compose systems that consume tracing data.
The tracing docs say it best:
In asynchronous systems like Tokio, interpreting traditional log messages can often be quite challenging. Since individual tasks are multiplexed on the same thread, associated events and log lines are intermixed making it difficult to trace the logic flow. tracing expands upon logging-style diagnostics by allowing libraries and applications to record structured events with additional information about temporality and causality — unlike a log message, a span in tracing has a beginning and end time, may be entered and exited by the flow of execution, and may exist within a nested tree of similar spans. In addition, tracing spans are structured, with the ability to record typed data as well as textual messages.
The most basic way to use tracing
is to use it like other logging crates like log
,
which additional semi-structured capabilities.
Materialize binaries are setup such that events are printed on stderr.
For example:
tracing::info!("processed {} items", items.len()")
produces:
2022-02-15T18:40:14.289898Z INFO my::module: processed 100 items
(See this page for how the colors will look)
You also also add semi-structured data to the event, for example:
tracing::debug!("processed {} items", items.len(), ?enum_value")
produces:
2022-02-15T18:40:14.289898Z DEBUG my::module: processed 100 items _enum_value_=EnumVariant
because ?enum_value
means record enum
as its Debug
impl. This is referred to as
"field recording", see recording fields for more options and examples.
This just touches the surface of what you can do with the event macros,
see tracing
and tracing-subscriber
for more info!
You may have noticed that log events have level
s. In materialize binaries, we only print
events that are level INFO
or more urgent. You can control this level by using the
log_filter
system variable to control the level at a per-crate, per-module, and
even per-span granularity.
For example:
ALTER SYSTEM SET log_filter='my_interesting_module::sub_module=trace,error';
logs ALL events in the specific module, but only error
events for everything else. See [EnvFilter
] for more examples.
Like all system variables, it is also possible to set a log filter default when initially invoking environmentd
:
./bin/environmentd -- --system-parameter-default=log_filter=my_interesting_module::sub_module=trace,error
Note that this value is passed through to all compute and storage instances when running locally. In the future, it is possible to add per-service filtering, if deemed useful.
Please see the tracing docs for more information about spans
Span
s are what makes tracing
a "tracing" crate, and not just a logging crate. Spans are like events,
but they have a start and and end, and are organized into a tree of parents and children. There are
many tools that can analyze and visualize spans to help you determine where Materialize is spending
its time.
There are macros like the event macros for creating them:
let span tracing::info_span!("my_span", ?field);
Critically, these spans do very nothing till they are entered, which turns on the span. When a span is entered, spans that are created on the same thread inherit the entered span as a parent. For example, in:
let s1 = tracing::debug_span!("my_span").entered();
let s2 = tracing::debug_span!("my_span2");
the s2
span as s1
as a parent.
NOTE: entered()
and enter()
do not work with async code! See the next 2 sections
for a better way to do this!
Previous examples may feel heavyweight to operate, but tracing
provides the tracing::instrument
attribute macro
to make it easier. This macro is applied to functions (or the implementations of trait methods) (it works with async
functions as well, and #[async_trait]
), and automatically creates and enters a span each time a function is called.
This makes it easy to make a span tree based on your call graph.
An example:
#[tracing::instrument]
fn my_sub_function(field_that_will_be_recorded: String) { ... }
#[tracing::instrument(skip(big_argument))]
fn my_function(big_argument: MyState) {
...
my_sub_function(big_argument.thing);
...
my_sub_function(big_argument.two);
}
Both the spans for the my_sub_function
calls will be childen of the my_function
span!
See tracing::instrument
for more info.
Please see here for more advice on how to use spans with async code
While #[tracing::instrument]
works fine with async
code. Span::enter
can be problematic, explained in the link above.
If #[tracing::instrument]
is not an option, then import tracing::Instrument
and call async functions like this:
let s = info_span!("important_span");
let ret = thing.call_async_method().instrument(s).await;
By default in materialize binaries, spans are NOT logged to stderr. However, their fields ARE inherited
let e = tracing::debug_span!("my_span", ?enum_value").entered();
tracing::debug!("processed {} items", items.len());
produces:
2022-02-15T18:40:14.289898Z DEBUG my::module: processed 100 items _enum_value_=EnumVariant
While events are clearly useful for logging, so far in this document, its unclear what spans offer. While there are many crates that do interesting things with tracing information, the primary way we use spans in materialize binaries is to collect OpenTelemetry data. In Materialize binaries, tracing spans can be automatically collected and exported to analysis tools with no additional work required at callsites
- For local setup, use the instructions here: https://github.com/MaterializeInc/materialize/tree/main/misc/monitoring to setup a local OpenTelemetry collector (Tempo) and UI (Grafana) to view traces.
OpenTelemetry UI's like the Grafana/Tempo one in the local setup are extraordinarly powerful tools for debugging. They allow you to visualize and inspect the exact control flow graph of your service, including the latency of each operation, and contextual recorded fields. See Best Practices for more information.
OpenTelemetry allows us to associate tracing
spans from distributed services with each other, allowing us to not only visualize
control-flow within a single service, but across services that communicate with each other.
See (TODO(guswynn): fill this in) for a demo on how this works.
TODO(guswynn): clean this up when https://github.com/MaterializeInc/materialize/issues/13019 is done.
This is important to ensure that complex interactions between services can be debugged.
Materialize has lots of queues, both in-process (typically tokio::sync::mpsc
channels), and cross-process (grpc).
If you want to trace work being enqueued and then processed when dequeued:
Prefer moving a tracing::Span
through the queue, like so:
#[tracing::instrument(skip(self))]
fn create_sources(&mut self, cs: CreateSources) {
self.commands_tx.send(
InstrumentedCreateSources {
cs,
span: tracing::Span::current(),
}
)
}
async fn process_work(&mut self) {
loop {
match self.commands_rx.recv().await {
InstrumentedCreateSources { cs, span } => {
// Note that this will typically be done in the most general way possible for
// Command/Response patterns
let create_sources_span = tracing::debug_span!(parent: &span, "process_create_sources");
self.process_create_sources(cs).instrument(create_sources_span).await;
}
_ => ...
}
}
}
This will improve how the trace visualization will look (the sender span will be the parent of the child and wont close till the child is done), and will reduce load on collectors.
Use mz_ore::tracing::OpenTelemetryContext::obtain
to pass a context from the sending service to the receiving service,
and mz_ore::tracing::OpenTelemetryContext::attach_as_parent()
on the receiving side. See the docs for more information.
ERROR
:- events that needs to be triaged, or something that is fatal to the binary starting correctly. Note that error events usually show up red on trace visualizations.
WARN
:- events that could represent future failures, or non-standard situations. Note that error events usually show up yellow on trace visualizations.
INFO
:- events with information used commonly enough that it should regularly show up in the stderr log.
DEBUG
:- most spans that track the flow of commands and requests across services.
- events with information used to debug sub-components.
TRACE
:- exceedingly verbose information intended only for local debugging/tracing
To setup trace collection locally, start the ../../misc/monitoring
composition. It will spin up Tempo to
automatically start storing traces, and Grafana to visualize them.
Then start ./bin/environmentd --monitoring
.
By default, ./bin/environmentd
will only emit INFO
-level spans. The filter is controlled through the
opentelemetry_filter
system variable and can be toggled dynamically:
> psql -U mz_system -h localhost -p 6877
mz_system=> ALTER SYSTEM SET opentelemetry_filter="debug";
Or on startup:
./bin/environmentd --reset --monitoring -- --system-parameter-default='opentelemetry_filter=debug'
More details on the filter syntax can be found in Notion.
It's often valuable to see the traces associated with specific queries. This can be done by setting the session variable:
SET emit_trace_id_notice = true;
Then each subsequent query will emit a NOTICE containing the trace ID for its execution:
materialize=> SELECT 1;
NOTICE: trace id: 65e87c160063307a9d1221f78ae55cf8
This trace ID can then be plugged into Grafana's TraceQL lookup for an exact match: