Add pcirebind extension #488
Open: mwdomino wants to merge 1 commit into siderolabs:main from mwdomino:pcirebind
+733
−0
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
## pcirebind | ||
|
||
This is a Talos Linux extension that rebinds the driver for the device at a given PCI bus ID. It is used internally to unbind NICs from `ixgbe` and bind them to `vfio-pci` so that VPP can take control of the devices. | ||
|
||
#### Applying to a Talos Linux server | ||
After embedding the extension in your Talos installer you'll need to modify the kernel arguments. | ||
|
||
Add additional kernel args under `.machine.install.extraKernelArgs` in the format of: | ||
|
||
``` yaml | ||
# 0000:04:00.0 is the PCI bus ID (domain:bus:device.function) | ||
# ixgbe is the driver to unbind | ||
# vfio-pci is the driver to bind | ||
machine: | ||
  install: | ||
    extensions: | ||
    extraKernelArgs: | ||
      - pcirebind.rebind=0000:04:00.0_ixgbe+vfio-pci | ||
      - pcirebind.rebind=0000:04:00.1_ixgbe+vfio-pci | ||
``` | ||
|
||
Then trigger a reboot with `upgrade`: | ||
|
||
``` sh | ||
talosctl -e <endpoint> -n <node> --talosconfig=./talosconfig upgrade | ||
``` | ||
|
||
Once the server reboots you can check the status of the service with the following commands: | ||
|
||
``` sh | ||
talosctl -e <endpoint> -n <node> --talosconfig=./talosconfig service ext-pcirebind | ||
NODE 10.50.12.211 | ||
ID ext-pcirebind | ||
STATE Finished | ||
HEALTH ? | ||
EVENTS [Finished]: Service finished successfully (2m36s ago) | ||
[Running]: Started task ext-pcirebind (PID 4210) for container ext-pcirebind (2m37s ago) | ||
[Preparing]: Creating service runner (2m37s ago) | ||
[Preparing]: Running pre state (2m37s ago) | ||
[Waiting]: Waiting for file "/sys/bus/pci/drivers/vfio-pci/bind" to exist (2m42s ago) | ||
[Waiting]: Waiting for service "containerd" to be "up", file "/sys/bus/pci/drivers/vfio-pci/bind" to exist (2m44s ago) | ||
[Starting]: Starting service (2m44s ago) | ||
|
||
talosctl -e <endpoint> -n <node> --talosconfig=./talosconfig logs ext-pcirebind | ||
``` | ||
|
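Invalid `pcirebind.rebind` values are only reported once the node boots, so it can help to check them before they go into the machine config. The following is a minimal sketch, assuming the `pcirebind.rebind=<pci_bus_id>_<current_driver>+<new_driver>` format and that PCI bus IDs use the sysfs `domain:bus:device.function` form. `validateRebindArg` and the `pciAddr` pattern are hypothetical helpers for illustration and are not part of this PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pciAddr matches a PCI address as it is named under /sys/bus/pci/devices,
// for example 0000:04:00.0 (domain:bus:device.function, function 0-7).
// Assumption: lowercase hex, as sysfs prints it.
var pciAddr = regexp.MustCompile(`^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$`)

// validateRebindArg is a hypothetical helper that checks one kernel argument
// of the form pcirebind.rebind=<pci_bus_id>_<current_driver>+<new_driver>.
func validateRebindArg(arg string) error {
	spec, ok := strings.CutPrefix(arg, "pcirebind.rebind=")
	if !ok {
		return fmt.Errorf("missing pcirebind.rebind= prefix: %q", arg)
	}

	// The bus ID ends at the first underscore.
	id, drivers, ok := strings.Cut(spec, "_")
	if !ok || !pciAddr.MatchString(id) {
		return fmt.Errorf("invalid PCI bus ID in %q", arg)
	}

	// The two driver names are separated by "+".
	oldDriver, newDriver, ok := strings.Cut(drivers, "+")
	if !ok || oldDriver == "" || newDriver == "" {
		return fmt.Errorf("invalid driver pair in %q", arg)
	}

	return nil
}

func main() {
	args := []string{
		"pcirebind.rebind=0000:04:00.0_ixgbe+vfio-pci",
		"pcirebind.rebind=0000:04:00.0_ixgbevfio-pci", // missing "+"
	}
	for _, arg := range args {
		fmt.Printf("%s -> %v\n", arg, validateRebindArg(arg))
	}
}
```

The check is stricter than the parser in this PR, which accepts any string before the underscore as the bus ID.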
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
version: v1alpha1 | ||
metadata: | ||
name: pcirebind | ||
version: "$VERSION" | ||
author: 46labs | ||
description: | | ||
This system extension provides a simple binary that can bind/unbind drivers from | ||
PCI devices by writing to: | ||
/sys/bus/pci/devices/<pci_bus_id>/driver_override | ||
/sys/bus/pci/drivers/<driver_name>/unbind | ||
/sys/bus/pci/drivers/<driver_name>/bind | ||
|
||
This is accomplished using a system extension as the /sys/ filesystem is read-only | ||
during normal operations. | ||
|
||
The binary parses the kernel command line (/proc/cmdline) looking for embedded strings | ||
like the following: | ||
pcirebind.rebind=0000:04:00.0_ixgbe+vfio-pci | ||
|
||
This example would attempt to unbind `ixgbe` from `0000:04:00.0` and then bind `vfio-pci` | ||
to the device. | ||
|
||
compatibility: | ||
talos: | ||
version: ">= v1.8.0" |
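The three sysfs writes named in the description can be sketched as a pure function that lists each path with the string written to it. This is a hedged illustration: `sysfsWrite` and `planRebind` are hypothetical names, and the order (override, unbind, bind) is the one used by `main` in this PR.

```go
package main

import "fmt"

// sysfsWrite is a hypothetical pair: a sysfs path and the string written to it.
type sysfsWrite struct {
	Path    string
	Content string
}

// planRebind lists the writes that move the device id from oldDriver to
// newDriver: set driver_override, unbind the old driver, bind the new one.
func planRebind(id, oldDriver, newDriver string) []sysfsWrite {
	return []sysfsWrite{
		{fmt.Sprintf("/sys/bus/pci/devices/%s/driver_override", id), newDriver},
		{fmt.Sprintf("/sys/bus/pci/drivers/%s/unbind", oldDriver), id},
		{fmt.Sprintf("/sys/bus/pci/drivers/%s/bind", newDriver), id},
	}
}

func main() {
	// Print the equivalent shell commands for the example in the description.
	for _, w := range planRebind("0000:04:00.0", "ixgbe", "vfio-pci") {
		fmt.Printf("echo %q > %s\n", w.Content, w.Path)
	}
}
```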
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
name: pcirebind | ||
container: | ||
entrypoint: ./pcirebind | ||
security: | ||
writeableRootfs: true | ||
writeableSysfs: true | ||
rootfsPropagation: shared | ||
depends: | ||
- path: /sys/bus/pci/drivers/vfio-pci/bind | ||
restart: untilSuccess |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
name: pcirebind | ||
variant: alpine | ||
shell: /toolchain/bin/bash | ||
dependencies: | ||
- image: ghcr.io/siderolabs/tools:v1.8.0-2-g7719230 | ||
runtime: false | ||
to: / | ||
steps: | ||
- env: | ||
GOPATH: /go | ||
cachePaths: | ||
- /.cache/go-build | ||
- /go/pkg | ||
prepare: | ||
- | | ||
sed -i 's#$VERSION#{{ .VERSION }}#' /pkg/manifest.yaml | ||
build: | ||
- | | ||
export PATH=${PATH}:${TOOLCHAIN}/go/bin | ||
|
||
cd /pkg/src | ||
CGO_ENABLED=0 go build -o ./pcirebind . | ||
install: | ||
- | | ||
mkdir -p /rootfs/usr/local/lib/containers/pcirebind | ||
|
||
cp -p /pkg/src/pcirebind /rootfs/usr/local/lib/containers/pcirebind/ | ||
- | | ||
mkdir -p /rootfs/usr/local/etc/containers | ||
|
||
cp /pkg/pcirebind.yaml /rootfs/usr/local/etc/containers/ | ||
test: | ||
- | | ||
mkdir -p /extensions-validator-rootfs | ||
cp -r /rootfs/ /extensions-validator-rootfs/rootfs | ||
cp /pkg/manifest.yaml /extensions-validator-rootfs/manifest.yaml | ||
finalize: | ||
- from: /rootfs | ||
to: /rootfs | ||
- from: /pkg/manifest.yaml | ||
to: / |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
module github.com/46labs/talos-pcirebind | ||
|
||
go 1.22.7 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,154 @@ | ||
package main | ||
|
||
import ( | ||
"fmt" | ||
"os" | ||
"strings" | ||
) | ||
|
||
type rebind struct { | ||
id string | ||
oldDriver string | ||
newDriver string | ||
} | ||
|
||
// writeToSysFile writes content to an existing sysfs file. | ||
func writeToSysFile(path, content string) error { | ||
	// The 0200 mode only applies if the file is created; without O_CREATE it is ignored. | ||
	file, err := os.OpenFile(path, os.O_WRONLY, 0200) | ||
	if err != nil { | ||
		e := fmt.Errorf("failed to open %s: %w", path, err) | ||
		fmt.Printf("[error] %v\n", e) | ||
|
||
		return e | ||
	} | ||
	defer file.Close() | ||
|
||
	if _, err = file.WriteString(content); err != nil { | ||
		e := fmt.Errorf("failed to write to %s: %w", path, err) | ||
		fmt.Printf("[error] %v\n", e) | ||
|
||
		return e | ||
	} | ||
|
||
	return nil | ||
} | ||
|
||
// parseKernelCmdline parses specially formatted strings out of /proc/cmdline | ||
// | ||
// The format is as follows: | ||
// pcirebind.rebind=<pci_bus_id>_<current_driver>+<new_driver> | ||
// | ||
// example: | ||
// | ||
// pcirebind.rebind=0000:04:00.0_ixgbe+vfio-pci | ||
func parseKernelCmdline(readFunc func(string) ([]byte, error)) ([]rebind, error) { | ||
	data, err := readFunc("/proc/cmdline") | ||
	if err != nil { | ||
		return nil, fmt.Errorf("failed to read /proc/cmdline: %w", err) | ||
	} | ||
|
||
	var rebindsList []rebind | ||
|
||
	for _, part := range strings.Fields(string(data)) { | ||
		if !strings.HasPrefix(part, "pcirebind.rebind=") { | ||
			continue | ||
		} | ||
|
||
		// Extract the rebind specification | ||
		spec := strings.TrimPrefix(part, "pcirebind.rebind=") | ||
|
||
		// Split on the first "_" only: a PCI bus ID never contains one, | ||
		// but driver names such as uio_pci_generic do. | ||
		specParts := strings.SplitN(spec, "_", 2) | ||
		if len(specParts) != 2 { | ||
			fmt.Printf("Invalid rebind format: %s\n", spec) | ||
			continue | ||
		} | ||
		id, drivers := specParts[0], specParts[1] | ||
|
||
		// Split the driver pair into oldDriver and newDriver | ||
		driverParts := strings.Split(drivers, "+") | ||
		if len(driverParts) != 2 { | ||
			fmt.Printf("Invalid drivers format: %s\n", drivers) | ||
			continue | ||
		} | ||
		oldDriver, newDriver := driverParts[0], driverParts[1] | ||
|
||
		// Append the parsed rebind information to the list | ||
		rebindsList = append(rebindsList, rebind{ | ||
			id:        id, | ||
			oldDriver: oldDriver, | ||
			newDriver: newDriver, | ||
		}) | ||
	} | ||
|
||
	if len(rebindsList) > 0 { | ||
		return rebindsList, nil | ||
	} | ||
|
||
	return nil, fmt.Errorf("no rebinds found in kernel command line") | ||
} | ||
|
||
// overrideDriver sets the `driver_override` for a given PCI Bus ID | ||
// this is analogous to: | ||
// | ||
// echo "vfio-pci" > /sys/bus/pci/devices/0000:04:00.0/driver_override | ||
func (r *rebind) overrideDriver() error { | ||
bindPath := fmt.Sprintf("/sys/bus/pci/devices/%s/driver_override", r.id) | ||
return writeToSysFile(bindPath, r.newDriver) | ||
} | ||
|
||
// unbindDriver writes to `unbind` a given PCI Bus ID | ||
// this is analogous to: | ||
// | ||
// echo "0000:04:00.0" > /sys/bus/pci/drivers/ixgbe/unbind | ||
func (r *rebind) unbindDriver() error { | ||
bindPath := fmt.Sprintf("/sys/bus/pci/drivers/%s/unbind", r.oldDriver) | ||
return writeToSysFile(bindPath, r.id) | ||
} | ||
|
||
// bindDriver writes to `bind` a given PCI Bus ID | ||
// this is analogous to: | ||
// | ||
// echo "0000:04:00.0" > /sys/bus/pci/drivers/vfio-pci/bind | ||
func (r *rebind) bindDriver() error { | ||
bindPath := fmt.Sprintf("/sys/bus/pci/drivers/%s/bind", r.newDriver) | ||
return writeToSysFile(bindPath, r.id) | ||
} | ||
|
||
// main applies every rebind found on the kernel command line. | ||
// A failed `bind` makes the process exit non-zero after all devices are tried; | ||
// `driver_override` and `unbind` failures are only logged and that device is skipped. | ||
func main() { | ||
	rebinds, err := parseKernelCmdline(os.ReadFile) | ||
	if err != nil { | ||
		fmt.Printf("Error parsing kernel command line: %v\n", err) | ||
|
||
		os.Exit(1) | ||
	} | ||
|
||
	anyError := false | ||
|
||
	// r avoids shadowing the rebind type inside the loop. | ||
	for _, r := range rebinds { | ||
		if err := r.overrideDriver(); err != nil { | ||
			fmt.Printf("Error writing to `driver_override`: %v\n", err) | ||
|
||
			continue | ||
		} | ||
|
||
		if err := r.unbindDriver(); err != nil { | ||
			fmt.Printf("Error writing to `unbind`: %v\n", err) | ||
|
||
			continue | ||
		} | ||
|
||
		if err := r.bindDriver(); err != nil { | ||
			fmt.Printf("Error writing to `bind`: %v\n", err) | ||
|
||
			anyError = true | ||
|
||
			continue | ||
		} | ||
	} | ||
|
||
	if anyError { | ||
		os.Exit(1) | ||
	} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
package main | ||
|
||
import ( | ||
"fmt" | ||
"strings" | ||
"testing" | ||
) | ||
|
||
func mockReadFile(data string) func(string) ([]byte, error) { | ||
return func(filename string) ([]byte, error) { | ||
if filename == "/proc/cmdline" { | ||
return []byte(data), nil | ||
} | ||
return nil, fmt.Errorf("unexpected file read: %s", filename) | ||
} | ||
} | ||
|
||
func TestParseKernelCmdline_ValidInput(t *testing.T) { | ||
readFile := mockReadFile("pcirebind.rebind=0000:04:00.0_ixgbe+vfio-pci pcirebind.rebind=0000:04:01.0_ixgbe+vfio-pci") | ||
|
||
expected := []rebind{ | ||
{id: "0000:04:00.0", oldDriver: "ixgbe", newDriver: "vfio-pci"}, | ||
{id: "0000:04:01.0", oldDriver: "ixgbe", newDriver: "vfio-pci"}, | ||
} | ||
|
||
result, err := parseKernelCmdline(readFile) | ||
if err != nil { | ||
t.Fatalf("Expected no error, got: %v", err) | ||
} | ||
|
||
if len(result) != len(expected) { | ||
t.Fatalf("Expected %d rebinds, got %d", len(expected), len(result)) | ||
} | ||
|
||
for i, r := range result { | ||
if r != expected[i] { | ||
t.Errorf("Expected rebind %+v, got %+v", expected[i], r) | ||
} | ||
} | ||
} | ||
|
||
func TestParseKernelCmdline_InvalidInput_NoDriverSeparator(t *testing.T) { | ||
readFile := mockReadFile("pcirebind.rebind=0000:04:00.0_ixgbevfio-pci") | ||
|
||
_, err := parseKernelCmdline(readFile) | ||
if err == nil || !strings.Contains(err.Error(), "no rebinds found in kernel command line") { | ||
t.Fatalf("Expected error for invalid input, got: %v", err) | ||
} | ||
} | ||
|
||
func TestParseKernelCmdline_InvalidInput_NoUnderscoreSeparator(t *testing.T) { | ||
readFile := mockReadFile("pcirebind.rebind=0000:04:00.0+vfio-pci") | ||
|
||
_, err := parseKernelCmdline(readFile) | ||
if err == nil || !strings.Contains(err.Error(), "no rebinds found in kernel command line") { | ||
t.Fatalf("Expected error for invalid input, got: %v", err) | ||
} | ||
} | ||
|
||
func TestParseKernelCmdline_InvalidInput_EmptyInput(t *testing.T) { | ||
readFile := mockReadFile("") | ||
|
||
_, err := parseKernelCmdline(readFile) | ||
if err == nil || !strings.Contains(err.Error(), "no rebinds found in kernel command line") { | ||
t.Fatalf("Expected error for empty input, got: %v", err) | ||
} | ||
} | ||
|
||
func TestParseKernelCmdline_ValidAndInvalidMixed(t *testing.T) { | ||
readFile := mockReadFile("pcirebind.rebind=0000:04:00.0_ixgbe+vfio-pci pcirebind.rebind=0000:04:01.0_invalidinput") | ||
|
||
expected := []rebind{ | ||
{id: "0000:04:00.0", oldDriver: "ixgbe", newDriver: "vfio-pci"}, | ||
} | ||
|
||
result, err := parseKernelCmdline(readFile) | ||
if err != nil { | ||
t.Fatalf("Expected no error, got: %v", err) | ||
} | ||
|
||
if len(result) != len(expected) { | ||
t.Fatalf("Expected %d valid rebind, got %d", len(expected), len(result)) | ||
} | ||
|
||
for i, r := range result { | ||
if r != expected[i] { | ||
t.Errorf("Expected rebind %+v, got %+v", expected[i], r) | ||
} | ||
} | ||
} |
this could probably use `go-procfs` from here: https://github.com/siderolabs/go-procfs
@smira i wonder if this needs to be an extension or core talos? 🤔
yes, I almost wonder if that should be actually a controller, and operate on a proper machine config document