fix: race condition in tests involving Kea (2)#315
fix: race condition in tests involving Kea (2)#315pbreton wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the Kea test harness in carbide-dhcp to avoid PID/lock-file collisions when tests run in parallel, targeting newer Kea versions on Ubuntu.
Changes:
- Renames the per-test temp directory handle and introduces explicit
pidfile_dir/lockfile_dirfields. - Sets
KEA_PIDFILE_DIR/KEA_LOCKFILE_DIRfrom those new fields when spawningkea-dhcp4.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
crates/dhcp/tests/common/kea.rs
Outdated
| // The config file is stored in this directory | ||
| _temp_base_directory: TempDir, | ||
|
|
||
| // Directories for PID and lock files that AppArmor allows |
There was a problem hiding this comment.
The comment says these are "Directories for PID and lock files that AppArmor allows", but the implementation points them at a per-test temp directory and later notes this requires AppArmor complain mode. Please update the wording to avoid implying this path is AppArmor-allowed by default (or adjust the implementation to actually use an AppArmor-allowed location).
| // Directories for PID and lock files that AppArmor allows | |
| // Directories for PID and lock files used by Kea during tests; these paths must be | |
| // permitted by the AppArmor profile or used while AppArmor is in complain mode. |
crates/dhcp/tests/common/kea.rs
Outdated
| // Use the unique temp directory for each test instance to avoid conflicts | ||
| // when tests run in parallel. This requires AppArmor to be in complain mode. | ||
| let pidfile_dir = _temp_base_directory.path().to_path_buf(); | ||
| let lockfile_dir = _temp_base_directory.path().to_path_buf(); |
There was a problem hiding this comment.
pidfile_dir and lockfile_dir are initialized to the same value; you can avoid the duplicate to_path_buf() call by cloning a single PathBuf (or by keeping one field and reusing it for both env vars).
| let lockfile_dir = _temp_base_directory.path().to_path_buf(); | |
| let lockfile_dir = pidfile_dir.clone(); |
crates/dhcp/tests/common/kea.rs
Outdated
| pub fn run(&mut self) -> Result<(), eyre::Report> { | ||
| let mut process = Command::new("/usr/sbin/kea-dhcp4") | ||
| .env("KEA_PIDFILE_DIR", self.temp_base_directory.path()) | ||
| .env("KEA_LOCKFILE_DIR", self.temp_base_directory.path()) | ||
| .env("KEA_PIDFILE_DIR", &self.pidfile_dir) | ||
| .env("KEA_LOCKFILE_DIR", &self.lockfile_dir) |
There was a problem hiding this comment.
This change appears functionally equivalent to the previous behavior: KEA_PIDFILE_DIR/KEA_LOCKFILE_DIR were already set to a per-instance TempDir via self.temp_base_directory.path(), and now they’re set to the same TempDir path via pidfile_dir/lockfile_dir. If the failure is still producing /tmp/kea-dhcp4.kea-dhcp4.pid, something else is likely causing Kea to ignore these env vars (e.g., being unable to write to the provided directory). Please clarify in code/PR what behavior changed here that eliminates the shared PID file, or adjust the implementation so the PID/lock path actually changes for Kea in the failing environment.
crates/dhcp/tests/common/kea.rs
Outdated
| // Hold this around so that when Kea is dropped, TempDir is dropped and cleaned up | ||
| temp_base_directory: TempDir, | ||
| // The config file is stored in this directory | ||
| _temp_base_directory: TempDir, | ||
|
|
||
| // Directories for PID and lock files that AppArmor allows | ||
| pidfile_dir: PathBuf, | ||
| lockfile_dir: PathBuf, | ||
|
|
There was a problem hiding this comment.
pidfile_dir/lockfile_dir are always derived from the temp dir and never diverge from _temp_base_directory.path(). This adds state to keep in sync and forces the TempDir field to be prefixed with _ just to silence "field is never read". Consider removing pidfile_dir/lockfile_dir and using self.temp_base_directory.path() directly in run() (or keep them but also read _temp_base_directory so the extra fields aren’t needed).
1536c6f to
0de1af9
Compare
|
@pbreton make sure to sign your commits to make tests run automatically |
|
/ok to test 0de1af9 |
kensimon
left a comment
There was a problem hiding this comment.
(I'm going to ignore the copilot noise and just review this... I'm not sure why we get copilot chiming in on these PR's, is it something you set up?)
I don't think we need to store a separate pidfile dir and lockfile dir... the real fix here seems to be the fact that you're specifying the the (unique to this test run) temp dir as the KEA_PIDFILE_DIR and KEA_LOCKFILE_DIR env vars. But you don't need to actually store them in the struct... you can just set each env var to the same path (since that's what you're doing here anyway).
You can add a helper with:
impl Kea {
// ... snip ...
fn config_file_path(&self) -> PathBuf {
self.temp_base_directory.path().join("kea-dhcp4.conf")
}
}
The Kea struct can become:
pub struct Kea {
dhcp_in_port: u16,
dhcp_out_port: u16,
process: Option<Child>,
// Hold this around so that when Kea is dropped, TempDir is dropped and cleaned up
// The config file is stored in this directory
temp_base_directory: TempDir,
}
Kea::new() can become:
let temp_base_directory = tempfile::tempdir()?;
let kea = Kea {
temp_base_directory,
dhcp_in_port,
dhcp_out_port,
process: None,
};
std::fs::write(
kea.config_file_path(),
Kea::config(api_server_url).as_bytes(),
)?;
Ok(kea)
Then you run kea with:
.env("KEA_PIDFILE_DIR", &self.temp_base_directory.path())
.env("KEA_LOCKFILE_DIR", &self.temp_base_directory.path())
.arg("-c")
.arg(self.config_file_path())
0de1af9 to
479c955
Compare
|
This PR is now a no-op, just refactoring based on Ken's comment: initial conditions are no longer present. |
479c955 to
b72983d
Compare
Signed-off-by: Patrice Breton <pbreton@nvidia.com>
b72983d to
b41fc6f
Compare
Problem
If running
cargo test -p carbide-dhcp --test booturlyou can get:Error: "KEA STDOUT: kea-dhcp4 already running? Daemon::createPIDFile: PID: 3788736 exists, PID file: /tmp/kea-dhcp4.kea-dhcp4.pid"Root Cause
Tests run in parallel use the same PID file.
Solution
Ensure each test uses it's own PID file.
Notes
sudo aa-complain /usr/sbin/kea-dhcp4