Detect system info + available BPF features on startup #70

patnebe · 2024-09-17T06:00:12Z

Context

Early detection of system info and BPF features to ensure the required runtime dependencies are met.
This information will also provide a better UX through improved error reporting on startup.

Changes

Add a new BPF program to detect what BPF features are available on the host.
Check that procfs and tracefs are mounted.
Check support for tracepoints, perf_events, and a few bpf helpers.
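The mount checks listed above could look roughly like the sketch below. It is only an illustration, not the PR's code: `PROCFS_PATH` is the name used in the diff, while `TRACEFS_PATHS`, `any_path_exists`, and the concrete mount points are assumptions (tracefs is commonly mounted at `/sys/kernel/tracing`, with a legacy mount under debugfs). Note that a path existing is a cheap heuristic and does not prove that the expected filesystem type is mounted there.

```rust
use std::path::Path;

// Assumed default mount points; a host may mount these filesystems elsewhere.
const PROCFS_PATH: &str = "/proc";
const TRACEFS_PATHS: [&str; 2] = ["/sys/kernel/tracing", "/sys/kernel/debug/tracing"];

/// Returns true if at least one of the candidate paths exists.
/// (Hypothetical helper, introduced only for this sketch.)
pub fn any_path_exists(candidates: &[&str]) -> bool {
    candidates.iter().any(|p| Path::new(p).exists())
}

/// Heuristic check that procfs is available.
pub fn procfs_mount_detected() -> bool {
    any_path_exists(&[PROCFS_PATH])
}

/// Heuristic check that tracefs is available at one of the usual locations.
pub fn tracefs_mount_detected() -> bool {
    any_path_exists(&TRACEFS_PATHS)
}
```

Keeping the path list as a parameter of the helper makes the logic testable without depending on what the host actually has mounted.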

Test Plan

Manual tests + Unit(? / integration?) tests.

javierhonduco

Looks very good overall!! Great progress so far!

We might want to look into the kernel test errors. Another thing we could do is move all this code, which is pretty much self-contained (and potentially reusable), to a top-level crate, similar to how lightswitch-proto is implemented. What do you think?

javierhonduco · 2024-09-23T14:20:19Z

src/system_info.rs

+}
+
+fn tracefs_mount_detected() -> bool {
+    return Path::new(PROCFS_PATH).exists();


nice! very clean

javierhonduco · 2024-09-23T14:21:16Z

src/system_info.rs

+
+fn get_trace_sched_event_id(trace_event: &str) -> Result<u32> {
+    if !tracefs_mount_detected() {
+        return Err(anyhow!("Failed to detect tracefs"));


We should probably have custom errors. I appreciate that currently the project abuses anyhow, and we should change this, but let's try to make these errors into their own variants of an enum. Happy to send some hints if you need them!

Fair point. Added a few custom errors to this file
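For illustration, custom errors along the lines discussed here might be shaped like the following. This is a sketch under assumptions: the enum name `SystemInfoError` and its variants are hypothetical (the PR's actual variants are not shown in this thread); only the message strings are borrowed from the diff. The impls are written by hand to keep the example dependency-free; the thiserror crate is a common way to derive them instead.

```rust
use std::fmt;

/// Hypothetical error type for system detection; variant names are
/// illustrative and not taken from the PR.
#[derive(Debug, PartialEq, Eq)]
pub enum SystemInfoError {
    /// tracefs could not be found at any expected location.
    TracefsNotDetected,
    /// The id file of a trace event could not be read.
    EventIdRead { event: String },
    /// The id file was read but did not contain a valid number.
    EventIdParse { event: String, raw: String },
}

impl fmt::Display for SystemInfoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::TracefsNotDetected => write!(f, "Failed to detect tracefs"),
            Self::EventIdRead { event } => write!(f, "Failed to read event={} id", event),
            Self::EventIdParse { event, raw } => {
                write!(f, "Failed to parse event={} id from {:?}", event, raw)
            }
        }
    }
}

impl std::error::Error for SystemInfoError {}
```

Callers can then match on the variant instead of inspecting an error string, which is the main benefit over a blanket `anyhow!` message.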

javierhonduco · 2024-09-23T14:22:36Z

src/system_info.rs

+                err
+            )),
+        },
+        Err(_) => Err(anyhow!("Failed to read event={} id", trace_event)),


Using or_else + ? might simplify the code
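A sketch of what this suggestion could look like, using `map_err` + `?` (the combination the follow-up commit message mentions) in place of nested `match` arms. The function name `read_sched_event_id`, the `tracefs_root` parameter, and the `String` error type are assumptions made to keep the example self-contained; the `events/<subsystem>/<event>/id` layout is the standard tracefs one.

```rust
use std::fs;
use std::path::Path;

/// Reads `<tracefs_root>/events/sched/<trace_event>/id` and parses it as a u32.
/// (Hypothetical signature; the error type is a plain String for brevity.)
pub fn read_sched_event_id(tracefs_root: &Path, trace_event: &str) -> Result<u32, String> {
    let id_path = tracefs_root
        .join("events")
        .join("sched")
        .join(trace_event)
        .join("id");

    // Convert the I/O error into our error type and return early on failure.
    let raw = fs::read_to_string(&id_path)
        .map_err(|err| format!("Failed to read event={} id: {}", trace_event, err))?;

    // The id file ends with a newline, so trim before parsing.
    raw.trim()
        .parse::<u32>()
        .map_err(|err| format!("Failed to parse event={} id: {}", trace_event, err))
}
```

Each fallible step becomes a single expression, so the happy path reads top to bottom without nesting.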

javierhonduco · 2024-09-23T14:23:43Z

src/system_info.rs

+}
+
+fn tracepoints_detected() -> bool {
+    let mut tracepoints_supported = true;


style nit: no need for this variable, feel free to just return either true or false directly

javierhonduco · 2024-09-23T14:24:49Z

src/system_info.rs

+        return tracepoints_supported;
+    }
+
+    if unsafe { close(fd) } != 0 {


We can look into wrapping this into a type that on drop() attempts to close this. The Drop trait is akin to RAII in C++
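A minimal sketch of such a wrapper, assuming a hypothetical `FdGuard` type (the name is not from the PR). To avoid a dependency on the libc crate, the actual close is delegated to the standard library's `OwnedFd`, which closes the descriptor when dropped; a hand-written `close()` call in `drop` would work similarly.

```rust
use std::os::fd::{FromRawFd, OwnedFd, RawFd};

/// Hypothetical guard that owns a raw file descriptor and closes it on drop.
pub struct FdGuard(RawFd);

impl FdGuard {
    /// # Safety
    /// `fd` must be an open descriptor that nothing else will close.
    pub unsafe fn new(fd: RawFd) -> Self {
        FdGuard(fd)
    }

    /// Borrow the raw descriptor, e.g. to pass it to a syscall wrapper.
    pub fn raw(&self) -> RawFd {
        self.0
    }
}

impl Drop for FdGuard {
    fn drop(&mut self) {
        // SAFETY: by the contract of `new`, this guard is the sole owner of the fd.
        // Dropping the temporary OwnedFd closes the descriptor.
        drop(unsafe { OwnedFd::from_raw_fd(self.0) });
    }
}
```

With a guard like this, every early return in the detection function releases the descriptor automatically, so the explicit `close(fd)` branches are no longer needed.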

patnebe · 2024-09-26T05:58:16Z

We might want to look into the kernel test errors,

FWIW, the error seemed to be related to the unavailability of the hrtimer_start_range_ns kprobe (or maybe kprobes in general on the VM?). Perhaps some kernel flag is missing(?). So instead of spending time figuring out how to get that to work, I decided to hook the BPF feature detection program to a tracepoint instead. The main reason is that we already rely on tracepoints and currently don't use kprobes anywhere else.

Another thing we could do is move all this code, which is pretty much self-contained (and potentially reusable), to a top-level crate, similar to how lightswitch-proto is implemented. What do you think?

That sounds reasonable. Maybe lightswitch_sys_probe would be a good name for this crate? Open to suggestions :)

javierhonduco · 2024-09-26T14:08:16Z

lightswitch-sys-probe/src/system_info.rs

+            && self.software_perfevents_support_detected
+            && self.tracepoints_support_detected
+            && bpf_features.can_load_trivial_bpf_program
+            && bpf_features.has_ring_buf


We don't use ring buffers now, and when we do, most likely it will be dynamically chosen depending on the availability of ring buffers

Right. I've relaxed the ringbuf requirement.
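As a rough illustration of the relaxed check, the sketch below keeps `has_ring_buf` as detected information but leaves it out of the minimal requirements. The field names come from the diff above; the struct layout, the `Option` wrapper around `bpf_features`, and the derives are assumptions.

```rust
/// BPF features reported by the detection program (layout is assumed).
#[derive(Debug, Clone, Copy, Default)]
pub struct BpfFeatures {
    pub can_load_trivial_bpf_program: bool,
    pub has_ring_buf: bool,
}

/// System information gathered at startup (layout is assumed).
#[derive(Debug, Clone, Copy, Default)]
pub struct SystemInfo {
    pub software_perfevents_support_detected: bool,
    pub tracepoints_support_detected: bool,
    /// Hypothetical: `None` when BPF feature detection could not run at all.
    pub bpf_features: Option<BpfFeatures>,
}

impl SystemInfo {
    /// True when everything currently required at runtime is available.
    /// Ring buffer support is deliberately not part of the check.
    pub fn has_minimal_requirements(&self) -> bool {
        let Some(bpf_features) = self.bpf_features else {
            return false;
        };
        self.software_perfevents_support_detected
            && self.tracepoints_support_detected
            && bpf_features.can_load_trivial_bpf_program
    }
}
```

Code that later wants ring buffers can still consult `has_ring_buf` and fall back to another mechanism when it is false.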

javierhonduco · 2024-09-26T14:08:22Z

lightswitch-sys-probe/src/system_info.rs

+        })
+    }
+
+    pub fn has_minimal_requirements(&self) -> bool {


javierhonduco · 2024-09-26T14:10:11Z

lightswitch-sys-probe/src/bpf/vmlinux.h

@@ -0,0 +1,7 @@
+#ifdef __TARGET_ARCH_x86
+#include "vmlinux_x86.h"


Given how large vmlinux is and that we might want to keep just one copy, what do you think of using either symlinks or including the file from the other crate directly here?

Done. Changed this to a symlink

javierhonduco · 2024-09-26T18:17:22Z

FWIW, the error seemed to be related to the unavailability of the hrtimer_start_range_ns kprobe (or maybe kprobes in general on the VM?). Perhaps some kernel flag is missing(?). So instead of spending time figuring out how to get that to work, I decided to hook the BPF feature detection program to a tracepoint instead. The main reason is that we already rely on tracepoints and currently don't use kprobes anywhere else.

Makes sense!

That sounds reasonable. Maybe lightswitch_sys_probe would be a good name for this crate? Open to suggestions :)

What about lightswitch-features-probing / lightswitch-capabilities or something like this? -sys is typically used for C bindings

javierhonduco · 2024-10-01T12:32:40Z

lightswitch-capabilities/src/system_info.rs

+    }
+}
+
+// TODO: How can we make this an integration/system test?


It's totally fine to leave it here. Personally, I don't think the difference between purely no-IO unit tests and integration tests matters that much, unless we need to test a binary; in that case we should use the tool level folder instead. The standard in the Rust world is to put integration tests in tests/, so here they would go in lightswitch-capabilities/tests.

javierhonduco

Great job! LGTM

patnebe and others added 7 commits September 17, 2024 06:47

Detect system info

52e340a

Merge branch 'main' into feature-detection

b01a025

Prevent panic

3fe5193

cleanup

6b8313d

Merge branch 'main' into feature-detection

abdb93d

Merge branch 'main' into feature-detection

73d04a0

Hook up to main + some cleanup

109ff7f

patnebe changed the title ~~WIP: Detect system info + available BPF features on startup~~ Detect system info + available BPF features on startup Sep 23, 2024

Merge branch 'main' into feature-detection

15def31

patnebe marked this pull request as ready for review September 23, 2024 09:21

patnebe added 2 commits September 23, 2024 10:31

clippy fix

371fe01

Merge branch 'feature-detection' of github.com:patnebe/lightswitch into feature-detection

72ebbd6

javierhonduco reviewed Sep 23, 2024

patnebe added 5 commits September 25, 2024 17:57

Apply review feedback

6d93054

- use custom errors - use drop trait - update match to map_err + ?

Merge branch 'main' into feature-detection

e6c6f24

Hook into sched_switch event for feat detection

19c0d0b

Merge branch 'feature-detection' of github.com:patnebe/lightswitch into feature-detection

aebaebd

fix mount detection path

26eef57

patnebe and others added 7 commits September 26, 2024 07:38

Create top level lightswitch-sys-probe crate

20f7bde

Create top level lightswitch-sys-probe crate

4e9d7db

Delete lightswitch-sys-probe/src/bpf/features_skel.rs

ebd57a4

Signed-off-by: Okwudili Pat-Nebe <cp.nebe@gmail.com>

Merge branch 'feature-detection' of github.com:patnebe/lightswitch into feature-detection

e2ce4d5

missing newline

048ac80

Delete lightswitch-sys-probe/src/bpf/features_skel.rs

3f5cd25

Signed-off-by: Okwudili Pat-Nebe <cp.nebe@gmail.com>

nit

aa68f52

javierhonduco reviewed Sep 26, 2024

lightswitch-sys-probe/src/system_info.rs Outdated

+        })
+    }
+
+    pub fn has_minimal_requirements(&self) -> bool {

javierhonduco · Sep 26, 2024

Nice!

javierhonduco reviewed Sep 26, 2024

patnebe added 3 commits September 30, 2024 16:22

symlink vmlinux headers + remove ringbuf from min reqs

ac3ae35

Merge branch 'feature-detection' of github.com:patnebe/lightswitch into feature-detection

246459e

Rename crate to lightswitch-capabilities

9149ce5

javierhonduco reviewed Oct 1, 2024

javierhonduco approved these changes Oct 1, 2024

patnebe merged commit d510e79 into javierhonduco:main Oct 1, 2024
4 checks passed

patnebe deleted the feature-detection branch October 1, 2024 12:44
