Commit b6260fa

update readme
1 parent 348f659 commit b6260fa

1 file changed

+59
-0
lines changed

README.md

Lines changed: 59 additions & 0 deletions
@@ -3,6 +3,65 @@ Pyro Caml is a profiler for OCaml that works with
[Pyroscope](https://pyroscope.io/) for statistical continuous profiling purely
in user space.

# How it works
## Architecture
Pyro Caml works by generating samples consisting of OCaml callstacks within the
instrumented program. These samples are then written to a ring buffer via the
[OCaml Runtime Events tracing
system](https://ocaml.org/manual/5.3/runtime-tracing.html) introduced in OCaml 5.
Finally, the Pyro Caml program, which is written in Rust in order to use
[pyroscope-rs](https://github.com/grafana/pyroscope-rs), uses ocaml-rs to read
from this ring buffer, process the callstacks, and then send the resulting
profile with metadata to a Pyroscope instance.

## Collecting and processing samples
Pyro Caml generates samples in one of two ways: via
[Memprof](https://ocaml.org/manual/5.3/api/Gc.Memprof.html) or by explicitly
emitting a sample. Memprof passes a callstack as a callback argument for each
allocation it samples, and that callstack is used to produce the sample in that
case. Each callstack is combined with metadata indicating when the sample was
taken and which domain it came from to form a sample. If the resulting sample
is too large to fit in a single runtime event, which has a 1024 byte payload
limit, it is broken up into smaller parts.

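As a rough illustration of the splitting step, here is a minimal Rust sketch. It
is not Pyro Caml's actual encoding: only the 1024 byte limit comes from the text
above, while `split_sample`, `HEADER_LEN`, and the 8 byte header layout (sample
id, part index, part count) are assumptions made for this example. Calling
`split_sample(42, &body)` with a 2500 byte `body` would yield three parts under
these assumptions.

```rust
/// Payload limit for one runtime event, as stated in the text above.
const MAX_PAYLOAD: usize = 1024;
/// Hypothetical per-part header: sample id (u32), part index (u16), part count (u16).
const HEADER_LEN: usize = 8;

/// Split one serialized sample into event payloads of at most MAX_PAYLOAD bytes.
/// Assumes the sample needs fewer than 65536 parts, so the counts fit in a u16.
fn split_sample(sample_id: u32, body: &[u8]) -> Vec<Vec<u8>> {
    let chunk_len = MAX_PAYLOAD - HEADER_LEN;
    // An empty body still produces one header-only part.
    let chunks: Vec<&[u8]> = if body.is_empty() {
        vec![body]
    } else {
        body.chunks(chunk_len).collect()
    };
    let total = chunks.len() as u16;
    chunks
        .iter()
        .enumerate()
        .map(|(index, chunk)| {
            // Each part carries enough information to be recombined later.
            let mut part = Vec::with_capacity(HEADER_LEN + chunk.len());
            part.extend_from_slice(&sample_id.to_le_bytes());
            part.extend_from_slice(&(index as u16).to_le_bytes());
            part.extend_from_slice(&total.to_le_bytes());
            part.extend_from_slice(chunk);
            part
        })
        .collect()
}
```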
On the collector side, Pyro Caml reads these samples at regular intervals
determined by the sample rate. If it receives any sample parts, it recombines
them into a whole sample. We then choose a single sample from each domain, in
order to form a complete picture of the instrumented program's callstacks at a
single moment in time.

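The recombination step could look roughly like the Rust sketch below. The `Part`
fields and the `Reassembler` type are hypothetical names chosen for this
example, not the collector's real types. The sketch assumes each part announces
its sample id, its index, and the total number of parts, and that parts may
arrive in any order. `push` returns `None` until the last missing part arrives,
then returns the whole sample body.

```rust
use std::collections::HashMap;

/// One received part of a sample; all fields are hypothetical.
struct Part {
    sample_id: u32,
    index: u16,
    total: u16,
    bytes: Vec<u8>,
}

/// Collects parts and yields a whole sample once every part has arrived.
#[derive(Default)]
struct Reassembler {
    pending: HashMap<u32, Vec<Option<Vec<u8>>>>,
}

impl Reassembler {
    /// Returns the full sample body when `part` completes its sample.
    fn push(&mut self, part: Part) -> Option<Vec<u8>> {
        let slots = self
            .pending
            .entry(part.sample_id)
            .or_insert_with(|| vec![None; part.total as usize]);
        // Ignore parts whose index does not fit the announced part count.
        let slot = slots.get_mut(part.index as usize)?;
        *slot = Some(part.bytes);
        if slots.iter().any(|s| s.is_none()) {
            return None;
        }
        // Every slot is filled: concatenate the parts in index order.
        let done = self.pending.remove(&part.sample_id)?;
        Some(done.into_iter().flatten().flatten().collect())
    }
}
```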
Notably, the samples we read may not all have been taken at a single moment in
time. This means we cannot use the samples as is: Memprof samples will be
weighted towards where the program allocates most, and manually emitted samples
will be weighted towards where the sample is emitted. To deal with this, we try
to generate as many samples as we can without introducing significant overhead.
This means that for a given sample interval we have many possible samples to
choose from, which allows us to choose a sample timestamped sufficiently close
to the single point in time we want to generate a complete callstack for. The
downside is that for programs that don't emit many samples, we lose accuracy for
function calls that last less than the sample interval.

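One way to picture the selection step is the Rust sketch below, which keeps, for
each domain, the sample whose timestamp is nearest to the target instant. This
is an assumption about how "sufficiently close" might be decided, not a
description of the collector's actual policy. The `Sample` fields and
`pick_per_domain` are hypothetical names for this example.

```rust
use std::collections::HashMap;

/// A whole sample; the field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct Sample {
    domain: u32,
    timestamp_ns: u64,
    callstack: Vec<String>,
}

/// For each domain, keep the sample whose timestamp is nearest to `target_ns`.
/// On a tie, the sample seen first is kept.
fn pick_per_domain(samples: &[Sample], target_ns: u64) -> HashMap<u32, &Sample> {
    let mut best: HashMap<u32, &Sample> = HashMap::new();
    for sample in samples {
        let distance = sample.timestamp_ns.abs_diff(target_ns);
        // Replace the current choice only when this sample is strictly closer.
        let closer = best
            .get(&sample.domain)
            .map_or(true, |current| distance < current.timestamp_ns.abs_diff(target_ns));
        if closer {
            best.insert(sample.domain, sample);
        }
    }
    best
}
```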
Consider this example program, which we are sampling at a rate of 100 times a
second (the default for Pyro Caml):

![example program](./images/d1.png)

Say we're sampling at time `t+10`. No matter where in the sample interval
`[(t+0),(t+10)]` we generate a sample, it will include `func_a` in the
callstack, as the duration of `func_a` is greater than the sample interval.
Assuming `func_c` either allocates enough memory with a high enough Memprof
sample rate, or explicitly emits a sample, it will generate a sample, so
`func_c` will also be included in the callstack. This means that although
`func_c`'s duration is shorter than `func_b`'s, we have generated a callstack
that is identical to a snapshot of the callstack at a single instant in time.

If `func_c` did not generate a sample, and `func_b` did, our callstack differs
from the callstack at a single instant in time, resulting in a less accurate
sample. Formal testing still needs to be done, but we've found that most OCaml
programs allocate enough that this case rarely happens, and functions that
rarely allocate can be modified to explicitly emit samples.

# How to use
Pyro Caml consists of three parts: the instrumentation library, the profiler,
and a helpful PPX. The instrumentation and PPX libraries have no dependencies on
