@@ -3,6 +3,65 @@ Pyro Caml is a profiler for OCaml that works with
 [Pyroscope](https://pyroscope.io/) for statistical continuous profiling purely
 in user space.
 
+# How it works
+## Architecture
+Pyro Caml works by generating samples consisting of OCaml callstacks within the
+instrumented program. These samples are then written to a ring buffer via the
+[OCaml Runtime Events tracing
+system](https://ocaml.org/manual/5.3/runtime-tracing.html) introduced in OCaml 5.
+Finally, the Pyro Caml program, which is written in Rust in order to use
+[pyroscope-rs](https://github.com/grafana/pyroscope-rs), uses ocaml-rs to read
+from this ring buffer, process the callstacks, and then send the resulting
+profile with metadata to a Pyroscope instance.
+
+## Collecting and processing samples
+Pyro Caml generates samples in one of two ways: either via
+[Memprof](https://ocaml.org/manual/5.3/api/Gc.Memprof.html) or by explicitly
+emitting a sample. Memprof passes a callstack as a callback argument for each
+allocation it samples, and that callstack is used to produce the sample in that
+case. These callstacks are combined with metadata indicating when the sample was
+taken and which domain it was generated from to form a sample. If the
+resulting sample is too large to fit in a single runtime event, which has a
+1024 byte payload limit, it is broken up into smaller
+parts.
+
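The splitting step can be sketched as follows. This is a minimal illustration in Rust (the collector's language), not Pyro Caml's actual wire format; in practice the split happens on the instrumented OCaml side. The 4-byte part header, `HEADER_LEN`, and `split_sample` are hypothetical. Only the 1024 byte limit comes from the text above.

```rust
/// Per-event payload limit, taken from the 1024 byte figure in the text.
const MAX_EVENT_PAYLOAD: usize = 1024;
/// Hypothetical header: part index (u16) followed by total part count (u16).
const HEADER_LEN: usize = 4;

/// Splits an encoded sample into parts that each fit in one runtime event.
/// Every part carries its index and the total count so it can be recombined.
fn split_sample(encoded: &[u8]) -> Vec<Vec<u8>> {
    let body_len = MAX_EVENT_PAYLOAD - HEADER_LEN;
    // An empty sample still produces one (header-only) part.
    let chunks: Vec<&[u8]> = if encoded.is_empty() {
        vec![encoded]
    } else {
        encoded.chunks(body_len).collect()
    };
    let total = chunks.len() as u16;
    chunks
        .iter()
        .enumerate()
        .map(|(index, chunk)| {
            let mut part = Vec::with_capacity(HEADER_LEN + chunk.len());
            part.extend_from_slice(&(index as u16).to_le_bytes());
            part.extend_from_slice(&total.to_le_bytes());
            part.extend_from_slice(chunk);
            part
        })
        .collect()
}
```

A sample that already fits in one event yields a single part, so the same path handles both small and large samples.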
+On the collector side, Pyro Caml reads these samples at regular intervals
+determined by the sample rate. If it receives any sample parts, it recombines
+them into a whole sample. We then choose a single sample from each domain, in
+order to form a complete picture of the instrumented program's callstack at a
+single moment in time.
+
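A rough sketch of the recombination step, assuming each part is tagged with its domain, a per-sample id, its index, and the total number of parts. The `SamplePart` type, its field names, and the choice to drop incomplete samples are all assumptions made for illustration, not the collector's real data model.

```rust
use std::collections::BTreeMap;

/// One part of a split sample. All field names here are hypothetical.
#[derive(Debug, Clone)]
struct SamplePart {
    domain: u32,
    sample_id: u64,
    index: u16,
    total: u16,
    body: Vec<u8>,
}

/// Recombines parts into whole sample payloads keyed by (domain, sample_id).
/// Samples with missing or duplicated parts are dropped (an assumption).
fn reassemble(parts: Vec<SamplePart>) -> BTreeMap<(u32, u64), Vec<u8>> {
    // Group the parts that belong to the same sample.
    let mut groups: BTreeMap<(u32, u64), Vec<SamplePart>> = BTreeMap::new();
    for part in parts {
        groups
            .entry((part.domain, part.sample_id))
            .or_default()
            .push(part);
    }
    let mut whole = BTreeMap::new();
    for (key, mut group) in groups {
        // Parts may arrive out of order, so sort by index before joining.
        group.sort_by_key(|p| p.index);
        let total = group[0].total as usize;
        let complete = group.len() == total
            && group.iter().enumerate().all(|(i, p)| p.index as usize == i);
        if complete {
            whole.insert(key, group.into_iter().flat_map(|p| p.body).collect());
        }
    }
    whole
}
```

Keying on the domain as well as the sample id keeps parts from different domains from being interleaved into one payload.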
+Notably, we are reading samples that may not all occur at a single moment in
+time. This means we cannot use the samples as is: Memprof samples will be
+weighted towards where the program allocates most, and manually emitted samples
+will be weighted towards where the sample is emitted. To deal with this, we try
+to generate as many samples as we can without introducing significant overhead.
+This means that for a given sample interval we have many possible samples to
+choose from, which allows us to choose a sample timestamped sufficiently close
+to the single point in time we want to generate a complete callstack for. The
+downside is that for programs that don't emit many samples, we lose accuracy for
+function calls that last less than the sample interval.
+
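The selection described above could look roughly like this: for each domain, keep the sample whose timestamp is closest to the instant being reconstructed. The `Sample` type, its fields, the nanosecond unit, and `pick_per_domain` are hypothetical names for this sketch, and the real collector may apply a different closeness rule.

```rust
use std::collections::BTreeMap;

/// A whole sample after recombination. Field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct Sample {
    domain: u32,
    timestamp_ns: u64,
    callstack: Vec<String>,
}

/// For each domain, picks the sample timestamped closest to `target_ns`.
/// On a tie, the sample seen first is kept.
fn pick_per_domain(samples: &[Sample], target_ns: u64) -> BTreeMap<u32, Sample> {
    let mut best: BTreeMap<u32, Sample> = BTreeMap::new();
    for sample in samples {
        let distance = sample.timestamp_ns.abs_diff(target_ns);
        let closer = match best.get(&sample.domain) {
            Some(current) => distance < current.timestamp_ns.abs_diff(target_ns),
            None => true,
        };
        if closer {
            best.insert(sample.domain, sample.clone());
        }
    }
    best
}
```

The more samples each domain emits within an interval, the smaller the gap between the chosen timestamp and the target, which is why a high sample volume improves accuracy.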
+Consider this example program, which we are sampling at a rate of 100 times a
+second (the default for Pyro Caml):
+
+![example program](./images/d1.png)
+
+Say we're sampling at time `t+10`. No matter where in the sample interval
+`[(t+0),(t+10)]` we generate a sample, it will include `func_a` in the callstack,
+as the duration of `func_a` is greater than the sample interval. Assuming `func_c`
+allocates sufficient memory and the Memprof sample rate is high enough, or it
+explicitly emits a sample, it will generate a sample, meaning that this function
+will also be included in the callstack. This means that although `func_c`'s
+duration is shorter than `func_b`'s, we have generated a callstack that is
+identical to a snapshot of the callstack at a single instant in time.
+
+If `func_c` did not generate a sample, and `func_b` did, our callstack differs
+from the callstack at a single instant in time, resulting in a less accurate
+sample. Formal testing still needs to be done, but we've found that most OCaml
+programs allocate enough that this case rarely happens, and functions that
+rarely allocate can be modified to explicitly emit samples.
+
 # How to use
 Pyro Caml consists of three parts: the instrumentation library, the profiler,
 and a helpful PPX. The instrumentation and PPX libraries have no dependencies on