diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 000000000000..f6fa6a216aa2
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,27 @@
+## Description
+
+
+
+## Why this approach
+
+
+
+---
+
+## Checklist
+
+- [ ] Commit prefix matches changed area (e.g., `tools/toolname:`, `libbpf-tools/toolname:`, `src/cc:`, `doc:`, `build:`, `tests/python:`)
+- [ ] Commit body explains **why** this change is needed
+
+**For new tools only**
+- [ ] Explains why this tool is needed and what existing tools cannot cover this use case
+- [ ] Includes at least one real production use case
+- [ ] Man page (`man/man8/`) with OVERHEAD and CAVEATS sections
+- [ ] Example output file (`*_example.txt`)
+- [ ] README.md entry added
+- [ ] Smoke test added to `tests/python/test_tools_smoke.py`
+
+---
+
+> **Note:** The maintainer may request a Copilot code review on this PR.
+> AI feedback is advisory only - reply with your reasoning if you disagree; the maintainer decides.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 000000000000..9f7920cd4d2e
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,54 @@
+# BCC Project - GitHub Copilot Instructions
+
+BCC is a toolkit for creating efficient kernel tracing and manipulation programs using eBPF. Tools run in **mission-critical environments as root**.
+
+## Global Review Principles (Applied to all PRs)
+
+### Commit Message Format
+
+```
+:
+
+
+- Problem being solved
+- Why this approach was chosen
+```
+
+**Prefixes:** `tools/toolname:`, `libbpf-tools/toolname:`, `src/cc:`, `build:`, `ci:`, `doc:`, `tests/python:`
+
+### Style Checks
+
+- **Python:** `scripts/py-style-check.sh` (pycodestyle, ignore E123/E125/E126/E127/E128/E302)
+- **C/C++:** `scripts/c-style-check.sh` (git clang-format against master)
+
+
+- MUST perform a NULL check after any BPF C Map lookup.
+- MUST perform a NULL check after every `malloc()`, `calloc()`, `realloc()`, and `strdup()` call in userspace C code.
+- MUST perform a bounds check for all array accesses.
+- BPF C functions: flag if stack usage appears to approach or exceed 512 bytes (eBPF verifier hard limit).
+- Default output format MUST be under 80 characters wide.
+
+
+### Documentation Requirements (New Tools)
+
+All **new tools** require these **minimum** files (enforce as blocker):
+1. Tool script
+2. Man page (`man/man8/`) with **OVERHEAD** and **CAVEATS** sections
+3. `README.md` entry
+
+Additional per-subsystem requirements apply - defer to the relevant
+`instructions/*.instructions.md` file (e.g., `tools.instructions.md` also
+requires `tests/python/test_tools_smoke.py`; `*_example.txt` is required for
+`tools/` but recommended for `libbpf-tools/`).
+
+For **bug-fix or enhancement PRs on existing tools**: flag missing docs as 🟡 Warning, not a blocker.
+> Note: ~14% of libbpf-tools currently ship without a man page - this is a known gap, not a reason to skip the requirement for new tools.
+
+### Unix Philosophy
+
+- Do one thing and do it well
+- Default output **< 80 characters wide**
+- Prefer short tool names; avoid underscores for new tools unless needed for
+  clarity or to match an existing naming pattern (e.g., `mysqld_qslower`)
+- Prefer a positional argument for the most common parameter (e.g., interval) over a flag,
+  where it makes sense for the tool's use case
diff --git a/.github/instructions/core.instructions.md b/.github/instructions/core.instructions.md
new file mode 100644
index 000000000000..67158ce7bce6
--- /dev/null
+++ b/.github/instructions/core.instructions.md
@@ -0,0 +1,66 @@
+---
+applyTo: "src/cc/**"
+---
+
+# BCC Core Library Review Instructions
+
+All BCC tools depend on this code - stability and backward compatibility are critical.
+
+
+- MUST NOT break public C++ APIs without a deprecation cycle.
+- When changing a C++ function signature, MUST update `src/python/bcc/__init__.py` ctypes bindings.
+- MUST NULL-check every `malloc()`, `calloc()`, `realloc()`, and `strdup()` call.
+- MUST use negative errno consistently for error returns.
+- MUST guard all architecture-specific code with `#ifdef __x86_64__` / `#ifdef __aarch64__` etc.
+
+
+## API & ABI Stability
+
+- Deprecate gracefully: add `[[deprecated(...)]]` and a one-time `fprintf(stderr, "Warning: …")` in the old function body
+- All new C++ APIs must be exposed to Python via ctypes; `argtypes` / `restype` must exactly match the C++ signature
+- Handle `bytes` vs `str` encoding for Python 3 in all string-passing paths
+
+## Memory & Resource Safety
+
+- Use RAII / smart pointers (`std::unique_ptr`, `std::shared_ptr`) - no raw owning pointers
+- Every allocation freed on **all** paths, including error paths (no FD/memory leaks)
+- Thread-shared state protected with mutexes or atomics; document thread-safety guarantees
+
+## LLVM/Clang Compatibility
+
+- Check minimum LLVM version in `CMakeLists.txt` before using new APIs
+- Gate version-specific code with `#if LLVM_VERSION_MAJOR >= N`
+
+## Build System
+
+- New optional dependencies guarded with `find_package` + `#ifdef HAVE_*`
+- New deps added to both `CMakeLists.txt` and `debian/control`
+
+## Documentation
+
+- Update `docs/reference_guide.md` for new or changed public APIs
+- Public functions: Doxygen-style comments (`@param`, `@return`)
+
+## Review Checklist
+
+- [ ] Public C++ API unchanged or deprecated gracefully
+- [ ] Python bindings updated to match any C++ signature change
+- [ ] No memory/FD leaks; RAII used
+- [ ] NULL checks after every `malloc`/`calloc`/`realloc`/`strdup`
+- [ ] Error handling consistent (negative errno / `StatusTuple`)
+- [ ] Thread safety considered for shared state
+- [ ] Architecture-specific code guarded with `#ifdef`
+- [ ] LLVM version compatibility maintained
+- [ ] `docs/reference_guide.md` updated for new public APIs
+- [ ] Build system changes correct (optional deps guarded)
+- [ ] Code style consistent (run `scripts/c-style-check.sh`)
+
+## Red Flags - Always Flag
+
+1. Breaking C++ API change without deprecation
+2. C++ signature changed but Python bindings not updated
+3. Memory or FD leak (missing `close()`, `free()`, destructor)
+4. Missing NULL check after allocation
+5. Thread-safety violation on shared state
+6. Platform-specific code without `#ifdef` guard
+7. New LLVM API used without version guard
diff --git a/.github/instructions/examples.instructions.md b/.github/instructions/examples.instructions.md
new file mode 100644
index 000000000000..cad42ae14be7
--- /dev/null
+++ b/.github/instructions/examples.instructions.md
@@ -0,0 +1,74 @@
+---
+applyTo: "examples/**"
+---
+
+# BCC Examples Review Instructions
+
+Examples are **educational** - prioritize clarity over production robustness.
+
+
+- Focus on **one concept** per example; target **< 150 lines** total.
+- Every major BPF step MUST have an inline comment explaining **why**, not just what.
+- Header comment MUST describe the concept demonstrated and usage.
+- Do NOT add complex argument parsing or production-grade error handling - it obscures the learning point.
+- License header MUST be present.
+
+
+## Required Header Comment
+
+Every example must start with:
+```
+# example_name.py Brief one-line description
+# Demonstrates: [what BCC/eBPF concept this shows]
+# USAGE: example_name.py
+# Copyright [year] [author] / Licensed under Apache 2.0
+```
+
+## Pedagogical Quality
+
+- One BCC concept per example; builds naturally on simpler ones
+- Clear learning objective; do not mix maps + arrays + perf buffers + USDT in one example
+- Output is labeled (column headers); explain what's being traced
+- Minimal error handling: catch `BPF()` failure and `KeyboardInterrupt` only
+
+## Kernel Compatibility
+
+- Note kernel requirements in a comment when using features requiring ≥ 4.x
+- Use `BPF.kernel_struct_has_field()` for runtime field detection; never hard-code kernel versions
+
+## File Organization
+
+- `networking/` - network-related examples
+- `tracing/` - kernel/userspace tracing
+- `usdt_sample/` - USDT examples
+- `lua/` - Lua API examples
+- `cpp/` - C++ API examples
+
+## What Examples Do NOT Require
+
+Unlike `tools/`, examples do **not** need:
+- Man pages, `*_example.txt` files, README.md entries (optional)
+- Comprehensive argparse argument handling
+- Overhead documentation
+
+## Review Checklist
+
+- [ ] ≤ 150 lines; focuses on a single BCC concept
+- [ ] Inline comments explain each BPF step
+- [ ] Header comment describes purpose and concept demonstrated
+- [ ] License header present
+- [ ] Output is labeled and explained
+- [ ] Basic error handling present (BPF compile failure, KeyboardInterrupt)
+- [ ] Correct subdirectory placement
+- [ ] Python 3 compatible
+- [ ] No undocumented external dependencies
+
+## Red Flags - Always Flag
+
+1. > 150 lines or mixes too many concepts (belongs in `tools/` instead)
+2. Missing inline comments on BPF logic
+3. No header comment describing the concept demonstrated
+4. Missing license header
+5. No output or unexplained/unlabeled output
+6. Python 2-only code (`print "..."`, `except Exception, e:`)
+7. Undocumented external Python dependencies
diff --git a/.github/instructions/instructions.instructions.md b/.github/instructions/instructions.instructions.md
new file mode 100644
index 000000000000..069c8bab8c7c
--- /dev/null
+++ b/.github/instructions/instructions.instructions.md
@@ -0,0 +1,40 @@
+---
+applyTo: ".github/instructions/*.instructions.md"
+---
+
+# Authoring BCC Instruction Files
+
+These files define Copilot review rules for the BCC project.
+When editing them, follow the steps below to avoid writing rules that contradict
+the actual codebase.
+
+## Before Writing or Updating Any Rule
+
+1. **API / function examples** - read the actual source before writing:
+   - Python BCC API → check `src/python/bcc/table.py`
+   - BPF helper signatures → check `libbpf-tools/*.bpf.c` examples
+   - Userspace libbpf patterns → check `libbpf-tools/*.c` examples
+
+2. **Conventions (shebang, imports, prefixes, etc.)** - sample the real files:
+   - `tools/*.py` for Python conventions (shebang)
+   - `libbpf-tools/*.c` / `*.bpf.c` for C conventions
+   - `git log --oneline origin/master | head -30` for commit prefix convention (format: `subsystem/toolname:`)
+
+3. **Do not invent or assume** an API method, macro, or convention exists -
+   verify it in the repo first.
+
+## Scope of Each Instructions File
+
+| File | `applyTo` | Triggers when… |
+|------|-----------|----------------|
+| `tools.instructions.md` | `tools/**/*.py` | editing a Python tool |
+| `libbpf-tools.instructions.md` | `libbpf-tools/**/*` | editing a libbpf tool |
+| `core.instructions.md` | `src/cc/**` | editing BCC core library |
+| `examples.instructions.md` | `examples/**` | editing an example |
+| `instructions.instructions.md` | `.github/instructions/*.instructions.md` | editing these rule files |
+
+## Style
+
+- Keep `` short - Copilot has a ~4,000 character review window
+- One rule per bullet; no redundancy across files
+- Flag blockers explicitly; use 🟡 Warning for non-blockers
diff --git a/.github/instructions/libbpf-tools.instructions.md b/.github/instructions/libbpf-tools.instructions.md
new file mode 100644
index 000000000000..f56a45de13f0
--- /dev/null
+++ b/.github/instructions/libbpf-tools.instructions.md
@@ -0,0 +1,83 @@
+---
+applyTo: "libbpf-tools/**/*"
+---
+
+# libbpf-tools (CO-RE) Review Instructions
+
+These are CO-RE (Compile Once - Run Everywhere) tools using libbpf.
+
+
+- MUST use `vmlinux.h` for kernel types - do NOT redefine structs manually.
+- MUST use `BPF_CORE_READ` (or `BPF_CORE_READ_USER` for user-space pointers)
+  for kernel struct field access - no direct `task->pid` style access.
+- MUST NULL-check every BPF map lookup result before dereferencing.
+- MUST NULL-check every `malloc()`, `calloc()`, `realloc()`, and `strdup()` in userspace C.
+- MUST bounds-check every array index before access.
+- BPF functions: flag if stack usage appears to approach or exceed 512 bytes (eBPF verifier hard limit).
+- Use `bpf_core_field_exists()` for kernel version compatibility - never `#if LINUX_VERSION_CODE`.
+- Use split lifecycle: `__open()` → configure rodata/map sizes → `__load()` → `__attach()`.
+  Flag `open_and_load()` if rodata fields or map max_entries are configured before load.
+- Check return values of ALL attachment calls (`bpf_program__attach_*`).
+- Do NOT use old-style map definitions (`bpf_map_def SEC("maps")`).
+- Do NOT use hard-coded kernel version numbers or struct offsets.
+- Do NOT create duplicate BPF programs with identical logic - use `bpf_program__set_attach_target()` instead.
+- When providing both fentry and kprobe fallback paths: both paths must attach to the same
+  set of kernel functions. Use `bpf_program__set_attach_target()` in the kprobe path to
+  match the fentry path's attach targets.
+
+
+## libbpf Object Lifecycle
+
+- Always split: `__open()` → set rodata/map config → `__load()` → `__attach()`
+- Flag any use of `open_and_load()` where rodata or map `max_entries` are configured
+- Check all return values; use `goto cleanup` pattern
+- All resources (skel, FDs, links) freed on all exit paths including errors
+
+## BPF Memory Safety
+
+- NULL-check every `bpf_map_lookup_elem()` result before dereferencing
+- Bounds-check every array index: `if (idx >= MAX_ENTRIES) return 0;`
+- Check `bpf_probe_read_kernel()` return value: `if (ret < 0) return 0;`
+- Keep per-function BPF stack usage well under 512 bytes; use per-CPU maps for large structs
+- String reads: always use bounded helpers (`bpf_probe_read_kernel_str`, `bpf_get_current_comm`)
+
+## Userspace Rules
+
+- Output: default **< 80 characters wide**
+- Error messages: clear, actionable, include `strerror(errno)` where applicable
+- Map FD: check `bpf_map__fd()` result is ≥ 0 before use
+- Use existing helpers (`trace_helpers.h`, `map_helpers.h`) - don't duplicate
+
+## Required Files (New Tools)
+
+### Code
+1. `libbpf-tools/tool.bpf.c` - BPF program
+2. `libbpf-tools/tool.c` - userspace program
+3. `libbpf-tools/tool.h` - shared header (if needed)
+4. Makefile entry for skeleton generation
+
+### Documentation (enforce as blocker)
+5. `man/man8/tool.8` - with **OVERHEAD** and **CAVEATS** sections
+6. `README.md` - entry added
+7. `libbpf-tools/tool_example.txt` - example output *(recommended; may be omitted if an
+   equivalent Python tools/ example already exists)*
+
+## Review Checklist
+
+- [ ] CO-RE: `vmlinux.h` used; `BPF_CORE_READ` family for all kernel struct access
+- [ ] Lifecycle: split open → configure → load → attach (flag premature `open_and_load`)
+- [ ] BPF memory safety: NULL checks after map lookups, bounds checks, stack well under 512 bytes
+- [ ] Userspace: NULL checks after all `malloc`/`calloc`/`realloc`/`strdup`
+- [ ] All attach/map FD return values checked
+- [ ] Resources freed on all paths (`goto cleanup`)
+- [ ] BTF-style map definitions (no `bpf_map_def`)
+- [ ] No hard-coded kernel versions or offsets
+- [ ] No duplicate BPF programs; fentry and kprobe paths cover same attach targets
+- [ ] Output < 80 chars wide
+- [ ] Makefile skeleton generation entry present
+- [ ] Documentation: man page + README entry (new tools); example file recommended
+
+## References
+
+- [libbpf Documentation](https://github.com/libbpf/libbpf)
+- [BPF CO-RE Reference](https://nakryiko.com/posts/bpf-portability-and-co-re/)
diff --git a/.github/instructions/tools.instructions.md b/.github/instructions/tools.instructions.md
new file mode 100644
index 000000000000..93f14eb3a23d
--- /dev/null
+++ b/.github/instructions/tools.instructions.md
@@ -0,0 +1,33 @@
+---
+applyTo: "tools/**/*.py"
+---
+
+# BCC Tools (Python/BCC API) Review Instructions
+
+Tools run in **mission-critical environments as root** - correctness and safety are mandatory.
+
+
+- BPF C code: MUST NULL-check every `map.lookup(&key)` result before dereferencing.
+- BPF C code: MUST bounds-check every array index before access.
+- BPF C code: flag if stack usage appears to approach or exceed 512 bytes (eBPF verifier hard limit).
+- Default output MUST be under 80 characters wide.
+- New tools MUST include man page (`man/man8/`) with **OVERHEAD** and **CAVEATS** sections.
+
+
+## BCC API Safety
+
+- Map lookup: use `table[key]` with `try/except KeyError`, or `table.get(key)` - check result is not `None` before use
+- BPF C macro `map.lookup(&key)` returns a pointer - NULL means key not found; always guard before dereference
+- Prefer map-based aggregation over per-event output for high-frequency events; filter in BPF, not Python
+
+## Required Documentation (New Tools)
+
+1. `man/man8/toolname.8` - with **OVERHEAD** and **CAVEATS** sections
+2. `tools/toolname_example.txt` - example output
+3. `README.md` - entry added
+4. `tests/python/test_tools_smoke.py` - smoke test entry
+
+## Kernel Compatibility
+
+- Use `BPF.kernel_struct_has_field()` for runtime struct field detection - never hard-code kernel version numbers
+- New options must not break existing default behavior
diff --git a/.github/prompts/check-bpf-safety.prompt.md b/.github/prompts/check-bpf-safety.prompt.md
new file mode 100644
index 000000000000..4542c7553841
--- /dev/null
+++ b/.github/prompts/check-bpf-safety.prompt.md
@@ -0,0 +1,61 @@
+# BPF Safety Check
+
+Perform a focused BPF safety and verifier-compliance check on the provided BPF C code.
+
+## Memory Safety Checks
+
+For each BPF map operation, verify:
+
+```c
+// Required pattern for map lookups:
+struct val_t *val = map.lookup(&key); // BCC C macro (only inside BPF programs, not Python)
+if (!val) // ← MUST exist
+    return 0;
+val->field = ...; // Safe to access now
+
+// libbpf BPF helper style (BPF program, libbpf-tools):
+struct val_t *val = bpf_map_lookup_elem(&map, &key);
+if (!val) // ← MUST exist
+    return 0;
+```
+
+For each array/buffer access, verify:
+```c
+u32 idx = ...;
+if (idx >= MAX_ENTRIES) // ← MUST exist
+    return 0;
+array[idx] = value;
+```
+
+For each `bpf_probe_read_*` / `bpf_probe_read_kernel_str`, verify the return value is checked.
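The bounds-check and return-value rules above can be tried out in ordinary userspace C, since BPF programs themselves cannot run outside the kernel. The sketch below is only an analogue under stated assumptions: `store_value`, `copy_name`, `record_event`, and the value of `MAX_ENTRIES` are hypothetical names made up for illustration, and `copy_name` merely imitates the "may fail, returns a negative value" contract of a bounded read helper rather than calling any real BPF helper.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8 /* hypothetical table size, for illustration only */

/* Store value at idx; reject out-of-range indexes instead of writing. */
int store_value(unsigned int *table, unsigned int idx, unsigned int value)
{
    if (idx >= MAX_ENTRIES) /* bounds check comes before the access */
        return -EINVAL;
    table[idx] = value;
    return 0;
}

/* Bounded string copy that reports failure as a negative errno.
 * Hypothetical stand-in for a read helper that can fail. */
int copy_name(char *dst, size_t dst_size, const char *src)
{
    size_t len = 0;

    if (!dst || !src || dst_size == 0)
        return -EFAULT;
    while (len < dst_size && src[len] != '\0')
        len++;
    if (len == dst_size) /* source plus its NUL does not fit */
        return -E2BIG;
    memcpy(dst, src, len + 1);
    return (int)len;
}

/* Caller pattern: check the return value before using the result. */
int record_event(unsigned int *table, unsigned int idx,
                 char *name, size_t name_size, const char *src)
{
    int ret = copy_name(name, name_size, src);

    if (ret < 0) /* never use name if the copy failed */
        return ret;
    return store_value(table, idx, (unsigned int)ret);
}
```

A reviewer would look for the same two shapes in real BPF code: an index comparison before every array write, and an `if (ret < 0)` branch after every helper that can fail.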
+
+## Stack Usage Check
+
+- Estimate total stack usage (sum of all local variables in each BPF function)
+- Flag if any single function likely exceeds **512 bytes**
+- Suggest moving large structs into per-CPU maps
+
+## CO-RE Compliance (libbpf-tools only)
+
+- All kernel struct accesses use `BPF_CORE_READ()` - flag any `task->pid` style access
+- Kernel version checks use `bpf_core_field_exists()` - flag any `#if LINUX_VERSION_CODE`
+- No manual struct redefinitions - all types come from `vmlinux.h`
+
+## Helper Function Usage
+
+For each BPF helper call, verify:
+- Arguments are of the correct type (no passing user pointers to kernel-only helpers)
+- Return values are checked where the helper can fail
+- `bpf_get_current_comm()`, `bpf_probe_read_*`, etc. use `sizeof()` for size arguments
+
+## Output Format
+
+List each finding as:
+
+| Severity | Location | Issue | Fix |
+|---|---|---|---|
+| 🔴 Critical | `file.bpf.c:42` | Missing NULL check after map lookup | Add `if (!val) return 0;` |
+| 🟡 Warning | `file.bpf.c:78` | Unchecked bpf_probe_read_kernel return | Check `ret < 0` |
+| 🔵 Info | `file.bpf.c:12` | Stack struct ~400 bytes, approaching limit | Monitor; move to map if adding fields |
+
+End with: **PASS** (no critical issues) or **FAIL** (critical issues found).
diff --git a/.github/prompts/review-pr.prompt.md b/.github/prompts/review-pr.prompt.md
new file mode 100644
index 000000000000..52684094d404
--- /dev/null
+++ b/.github/prompts/review-pr.prompt.md
@@ -0,0 +1,73 @@
+# BCC PR Code Review
+
+Review the changes in this pull request against the BCC project standards.
+
+## Step 1: Identify Changed File Categories
+
+Determine which categories apply based on changed files:
+- `tools/**/*.py` → apply tools/ rules
+- `libbpf-tools/**/*` → apply libbpf-tools/ rules
+- `src/cc/**` → apply core library rules
+- `examples/**` → apply examples rules
+
+## Step 2: General Checks (All PRs)
+
+### Commit Message
+- [ ] Has correct prefix (e.g., `tools/toolname:`, `libbpf-tools/toolname:`, `src/cc:`, `doc:`, `build:`, `tests/python:`)
+- [ ] Body explains **WHY** the change is needed, not just what changed
+
+## Step 3: Category-Specific Checks
+
+### If tools/*.py changed:
+- [ ] BPF C code: NULL checks after every map lookup
+- [ ] BPF C code: bounds checks before array access
+- [ ] Default output < 80 chars wide
+- [ ] Filtering done in BPF, not Python
+- [ ] Man page (`man/man8/`), example file, README entry present (new tools - blocker)
+- [ ] Smoke test added to `tests/python/test_tools_smoke.py` (new tools)
+
+### If libbpf-tools/* changed:
+- [ ] Uses `BPF_CORE_READ()` - no direct struct field access
+- [ ] Uses `vmlinux.h` - no manual struct redefinitions
+- [ ] Split open → load → attach lifecycle (flag `open_and_load()` if rodata is configured)
+- [ ] NULL checks after all map lookups
+- [ ] NULL checks after all `malloc()`, `realloc()`, `strdup()` calls
+- [ ] Bounds checks before all array accesses
+- [ ] BPF stack usage ≤ 512 bytes
+- [ ] BTF-style map definitions (not old-style `bpf_map_def`)
+- [ ] All resources freed on all paths (`goto cleanup`)
+- [ ] Return values of all attach calls checked (`bpf_program__attach_*`)
+- [ ] No hard-coded kernel version numbers or struct offsets
+- [ ] Makefile entry for skeleton generation
+- [ ] fentry and kprobe handlers are symmetric (same functions in both)
+- [ ] No duplicate BPF programs (prefer runtime attach target selection)
+- [ ] Man page (`man/man8/`), README entry present (new tools - blocker)
+
+### If src/cc/** changed:
+- [ ] Public C++ API unchanged or deprecated gracefully
+- [ ] Python bindings updated if C++ signature changed
+- [ ] No memory leaks (RAII, smart pointers); NULL checks after all allocations
+- [ ] LLVM version compatibility maintained (`#if LLVM_VERSION_MAJOR >= N`)
+- [ ] `docs/reference_guide.md` updated for new public APIs
+
+### If examples/** changed:
+- [ ] ≤ 150 lines; focuses on a single BCC concept
+- [ ] Inline comments explain BPF logic
+- [ ] Header comment with purpose and concept demonstrated
+- [ ] License header present
+
+## Step 4: Output Format
+
+Start your response with the headings and table below (no introductory text before `### 📝 Review Summary`).
+
+### 📝 Review Summary
+
+[Add overall assessment and summary of the review here]
+
+### 🔍 Detailed Findings
+
+| Severity | Location | Issue | Recommendation |
+|---|---|---|---|
+| 🔴 Critical<br>🟡 Warning<br>🔵 Info | `file.c:line` | [Description of the issue] | `[Suggested code snippet to fix]` |
+
+**Overall Assessment:** Approve / Request Changes / Comment
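Two of the userspace items in the libbpf-tools checklist of Step 3 (NULL checks after every allocation, and resources freed on all paths via `goto cleanup`) are easier to review against a concrete shape. The following is a minimal sketch in plain C, not code from the BCC tree: `struct tool_env`, `tool_env_init`, and `tool_env_free` are hypothetical names invented for this illustration, and it uses only `malloc()` and `calloc()` from the standard library.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-tool state; not a struct from the BCC tree. */
struct tool_env {
    char *comm_filter;
    unsigned long *counts;
    size_t nr_counts;
};

/* Release everything env owns; safe to call on a partly built env. */
void tool_env_free(struct tool_env *env)
{
    free(env->comm_filter); /* free(NULL) is a no-op */
    free(env->counts);
    memset(env, 0, sizeof(*env));
}

/* Fill env; returns 0 on success or a negative errno.
 * Every allocation is NULL-checked, and every failure after the
 * first allocation leaves through the single cleanup label. */
int tool_env_init(struct tool_env *env, const char *comm, size_t nr_counts)
{
    int err = 0;
    size_t len;

    memset(env, 0, sizeof(*env));
    if (!comm || nr_counts == 0)
        return -EINVAL;

    len = strlen(comm) + 1;
    env->comm_filter = malloc(len);
    if (!env->comm_filter) {
        err = -ENOMEM;
        goto cleanup;
    }
    memcpy(env->comm_filter, comm, len);

    env->counts = calloc(nr_counts, sizeof(*env->counts));
    if (!env->counts) {
        err = -ENOMEM;
        goto cleanup;
    }
    env->nr_counts = nr_counts;
    return 0;

cleanup:
    tool_env_free(env);
    return err;
}
```

Keeping a single cleanup label means a later allocation can be added without auditing every earlier error branch, which is the property the checklist item is trying to protect.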