feat(observability): add component-probes feature for bpftrace component-level CPU attribution #24860
connoryy wants to merge 19 commits into vectordotdev:master
…attribution Adds an opt-in Cargo feature that lets external bpftrace scripts attribute CPU time to individual Vector components. A 4 KiB shared-memory array (VECTOR_COMPONENT_LABELS) is indexed by tid % 4096. On span enter the component's group-id is written; on span exit it is cleared. Both operations are single Relaxed atomic byte stores (~2 ns, no syscall). A separate uprobe symbol (vector_register_component) fires once per component at startup so bpftrace can build the id → name mapping and resolve the array's runtime address. Disabled by default. Runtime code is gated to Linux.
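The mechanism described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `LABELS`, `on_span_enter`, `on_span_exit`, and `current_label` are hypothetical, and the thread id is passed in as an argument instead of being read from the OS so that the sketch stays std-only.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Number of one-byte slots; 4096 bytes matches the 4 KiB array in the description.
const SLOTS: usize = 4096;

// Const item used only to initialize the array (atomics are not `Copy`).
const EMPTY: AtomicU8 = AtomicU8::new(0);

/// Hypothetical stand-in for VECTOR_COMPONENT_LABELS. 0 means "no component active".
static LABELS: [AtomicU8; SLOTS] = [EMPTY; SLOTS];

/// Slot for a thread id, indexed by `tid % 4096`.
fn slot(tid: u32) -> &'static AtomicU8 {
    &LABELS[tid as usize % SLOTS]
}

/// On span enter: a single Relaxed byte store of the component's group id.
fn on_span_enter(tid: u32, group_id: u8) {
    slot(tid).store(group_id, Ordering::Relaxed);
}

/// On span exit: a single Relaxed byte store that clears the slot.
fn on_span_exit(tid: u32) {
    slot(tid).store(0, Ordering::Relaxed);
}

/// What an external sampler would observe for this thread id.
fn current_label(tid: u32) -> u8 {
    slot(tid).load(Ordering::Relaxed)
}
```

In this sketch, two thread ids that are equal modulo 4096 share a slot, so a reader cannot tell them apart.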
…ing to be maintained in bpftrace
thomasqueirozb left a comment
Thanks for this! This looks like a nice addition. Can you add a guide to website/content/en/guides/advanced/? That would make it easier for us to review and test this, and it would also document the feature for users who want to use it.
Check failure (Code scanning / check-spelling): Unrecognized Spelling, reported five times on the guide excerpt below.

Replace `/path/to/vector` with your binary path:

```bpf
...
    if ($addr != 0) {
        $group_id = *(uint32 *)$addr;
        if ($group_id != 0) {
            @stacks[@names[$group_id], ustack()] = count();
...
```

This aggregates component-labeled stack traces directly in bpftrace. Start
bpftrace before Vector so it catches the registration uprobes during startup.

If `ustack()` is not available in your environment, replace the `@stacks`
line with a `printf` to emit raw labeled samples that can be joined with
stack traces from other tools like `perf`:

```bpf
...
```
Summary
Adds an opt-in component-probes Cargo feature that enables external bpftrace scripts to attribute CPU samples to individual Vector components by ID.
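To make the attribution step concrete, here is a rough Rust model of what a sampler does with each label it reads. It is only an illustration of the idea, not the bpftrace script or any code from this PR: `attribute_samples` and the `<unregistered>` bucket are hypothetical, and the id 0 is assumed to mean "no component active".

```rust
use std::collections::HashMap;

/// Count samples per component name.
///
/// `names` models the group-id -> name map built from registration events;
/// `sampled_ids` models the label value read at each sampling tick.
fn attribute_samples(names: &HashMap<u32, String>, sampled_ids: &[u32]) -> HashMap<String, u64> {
    let mut counts = HashMap::new();
    for &id in sampled_ids {
        // 0 is assumed to mean the thread was not inside any component span.
        if id == 0 {
            continue;
        }
        // Hypothetical choice: bucket ids that were never registered together.
        let name = names
            .get(&id)
            .cloned()
            .unwrap_or_else(|| "<unregistered>".to_string());
        *counts.entry(name).or_insert(0) += 1;
    }
    counts
}
```

With a fixed sampling rate, each count is proportional to the CPU time observed in that component.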
Vector configuration
How did you test this PR?
Unit tests for the component_probes module (label stability, uniqueness across threads, store/clear cycle). Tested end-to-end locally with a bpftrace script that attaches to the two uprobes (vector_register_thread, vector_register_component), builds the tid -> address and group_id -> name maps, and samples at 997 Hz. I can also provide a reference .bt script if requested.
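The properties named above (label stability, uniqueness across threads, store/clear cycle) could be exercised along these lines. This is a sketch under assumptions, not the module's actual tests: it assumes a per-thread `u32` label whose address is what gets registered, and every name here (`LABEL`, `label_addr`, `store_label`, `clear_label`, `load_label`, `addrs_of_live_threads`, `all_unique`) is hypothetical.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

thread_local! {
    // Hypothetical per-thread label; 0 means "no component active".
    static LABEL: AtomicU32 = const { AtomicU32::new(0) };
}

/// Address of the calling thread's label (what a tid -> address map would hold).
fn label_addr() -> usize {
    LABEL.with(|l| l as *const AtomicU32 as usize)
}

fn store_label(group_id: u32) {
    LABEL.with(|l| l.store(group_id, Ordering::Relaxed));
}

fn clear_label() {
    store_label(0);
}

fn load_label() -> u32 {
    LABEL.with(|l| l.load(Ordering::Relaxed))
}

/// Collect label addresses from `n` threads that are all alive at the same time.
/// The barrier keeps every thread (and its thread-local storage) alive until
/// all of them have taken their address, so no storage can be reused.
fn addrs_of_live_threads(n: usize) -> Vec<usize> {
    let barrier = Arc::new(Barrier::new(n));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                let addr = label_addr();
                barrier.wait();
                addr
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

/// True if no address appears twice.
fn all_unique(addrs: &[usize]) -> bool {
    addrs.iter().copied().collect::<HashSet<_>>().len() == addrs.len()
}
```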
Change Type
Is this a breaking change?
Does this PR include user facing changes?
…no-changelog label to this PR.
References
Relevant issue: #24851