Skip to content

Conversation

@yanglei-rh
Copy link
Contributor

@yanglei-rh yanglei-rh commented Nov 24, 2025

Automate VIRT-95967 Test nic driver load/unload

ID:LIBVIRTAT-22161
Signed-off-by: Lei Yang leiyang@redhat.com

Summary by CodeRabbit

Release Notes

  • Tests
    • Added test configuration and implementation for validating NIC driver unload/load operations with concurrent multi-session file transfers across guest and host systems, including file integrity verification.

✏️ Tip: You can customize this high-level summary in your review settings.

Automate VIRT-95967 Test nic drievr load/unload

Signed-off-by: Lei Yang <leiyang@redhat.com>
@coderabbitai
Copy link

coderabbitai bot commented Nov 24, 2025

Walkthrough

This pull request introduces a new libvirt QEMU NIC driver unload test to verify driver stability during active file transfers. It adds a test configuration file defining the test parameters (Linux-only, filesize 512, 5 concurrent sessions, 1000-second timeout) and a Python test script that orchestrates multi-session file transfers between host and guest, repeatedly unloading and reloading the NIC driver during transfers, then verifying file integrity via MD5 checksums.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Thread synchronization logic: Review the thread lifecycle management (all_threads_done, all_threads_alive) and concurrent file transfer coordination to ensure proper synchronization
  • Error handling paths: Verify cleanup is performed correctly in exception scenarios, particularly around guest session termination and temporary file removal
  • MD5 verification logic: Confirm file integrity checking is accurate across host-to-guest and guest-to-host transfers
  • Udev rule handling: Ensure the guest udev rule preparation and reboot sequence is robust and handles edge cases
  • Driver unload/reload timing: Verify the driver cycling doesn't cause race conditions or deadlocks with active transfers

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding a new test case for NIC driver load/unload functionality. It aligns with the changeset which introduces both a configuration file and test script for this feature.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py (4)

25-32: Consider passing vm as a parameter for clarity.

The function accesses vm from the outer scope rather than receiving it as a parameter. While this works, passing it explicitly would make the dependency clearer and improve testability.

Apply this diff:

-def reset_guest_udevrules(session, rules_file, rules_content):
+def reset_guest_udevrules(vm, session, rules_file, rules_content):
     """
     Write guest udev rules, then reboot the guest and
     return the new session

And update the call site at line 70:

-    session = reset_guest_udevrules(session, udev_rules_file, rules)
+    session = reset_guest_udevrules(vm, session, udev_rules_file, rules)

34-43: Simplify the control flow.

The else: continue clause is redundant since the function returns in the if-block.

Apply this diff:

 def all_threads_done(threads):
     """
     Check whether all threads have finished
     """
     for thread in threads:
         if thread.is_alive():
             return False
-        else:
-            continue
     return True

45-54: Simplify the control flow.

The else: continue clause is redundant since the function returns in the if-block.

Apply this diff:

 def all_threads_alive(threads):
     """
     Check whether all threads is alive
     """
     for thread in threads:
         if not thread.is_alive():
             return False
-        else:
-            continue
     return True

162-167: Consider using process.run() instead of process.system() with shell=True.

While the static analysis warning about shell=True is a false positive in this test context (no untrusted input), using process.run() would be more consistent with the rest of the codebase and avoid the shell invocation.

Apply this diff:

         for copied_file in host_file_paths:
-            if process.system(
-                "md5sum %s | grep %s" % (copied_file, file_checksum), shell=True
-            ):
+            cmd = "md5sum %s | grep %s" % (copied_file, file_checksum)
+            if process.run(cmd, ignore_status=True, shell=True).exit_status:
                 err_msg = "Host file %s md5sum changed"
                 err_info.append(err_msg % copied_file)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a302a6f and 1f52310.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_network/qemu/nicdriver_unload.cfg (1 hunks)
  • libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py
🪛 Ruff (0.14.5)
libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py

143-143: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


163-163: Function call with shell=True parameter identified, security issue

(S604)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.8
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.9
🔇 Additional comments (5)
libvirt/tests/cfg/virtual_network/qemu/nicdriver_unload.cfg (1)

1-9: LGTM!

The test configuration is well-structured with appropriate parameters for testing NIC driver stability under load. The timeout and session count values provide good stress-test coverage.

libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py (4)

1-6: LGTM!

Import statements are appropriate for the test requirements.


108-134: LGTM!

Thread creation and initialization logic is well-structured. The wait for all threads to start (line 133) provides proper synchronization before beginning the driver unload/load cycle.


135-150: LGTM!

The driver unload/load cycle with randomized sleep intervals effectively stress-tests driver stability during active transfers. The use of random.randint (flagged by static analysis) is appropriate here for introducing timing variations in a test scenario—cryptographic randomness is not required.


174-184: LGTM!

Cleanup logic properly joins all threads, removes temporary files from both host and guest, and closes sessions. The use of the else-block ensures cleanup only runs on successful completion.

Comment on lines +66 to +70
if not session.cmd_status("[ -e %s ]" % udev_rules_file):
if not rules:
test.cancel("You must set udev rules before test")
rules = rules % vm_mac_address
session = reset_guest_udevrules(session, udev_rules_file, rules)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the udev rules file existence check logic.

The condition if not session.cmd_status("[ -e %s ]") evaluates to True when the file EXISTS (cmd_status returns 0). The current logic sets rules when the file already exists, but does nothing when the file is missing. This seems inverted.

Expected behavior:

  • If file is missing and rules not provided → cancel test
  • If file is missing and rules provided → create file with rules and reboot
  • If file exists → use existing rules (no action needed)

Run the following script to verify the intended behavior:


🏁 Script executed:

#!/bin/bash
# Check similar udev rules handling patterns in the codebase
rg -n -C3 "70-persistent-net.rules" --type=py

Length of output: 728


The web search didn't find the specific function. Let me search for it directly in the codebase.


🏁 Script executed:

#!/bin/bash
# Find reset_guest_udevrules function definition
rg -n "def reset_guest_udevrules" --type=py

Length of output: 132


🏁 Script executed:

#!/bin/bash
# View the reset_guest_udevrules function and surrounding context
sed -n '25,55p' libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py

Length of output: 843


🏁 Script executed:

#!/bin/bash
# View the full test setup flow including what happens after line 70
sed -n '56,85p' libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py

Length of output: 1474


Fix the inverted file existence check logic.

The condition on line 66 is backwards. if not session.cmd_status("[ -e %s ]") executes when the file exists, but the test should set up rules when the file is missing. Remove the not operator:

if session.cmd_status("[ -e %s ]" % udev_rules_file):

This way, the block executes only when the file is missing, and the test can set required rules before proceeding.

🤖 Prompt for AI Agents
In libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py around lines 66 to
70, the file-existence check is inverted: the current condition uses "if not
session.cmd_status('[ -e %s ]' % udev_rules_file)" which runs when the file
exists; change it to "if session.cmd_status('[ -e %s ]' % udev_rules_file)" so
the block runs only when the udev rules file is missing, then keep the existing
logic to require rules, format rules with the VM MAC address, and call
reset_guest_udevrules to create/setup the file.

# get ethernet driver from '/sys' directory.
# ethtool can do the same thing and doesn't care about os type.
# if we make sure all guests have ethtool, we can make a change here.
sys_path = params.get("sys_path") % (ethname)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Missing required parameter sys_path in configuration.

Line 78 retrieves sys_path from params but this parameter is not defined in the configuration file. This will cause the test to fail with a TypeError when attempting string formatting.

Add the sys_path parameter to libvirt/tests/cfg/virtual_network/qemu/nicdriver_unload.cfg:

     rules = 'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="%s", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"'
     transfer_timeout = 1000
     sessions_num = 5
+    sys_path = "/sys/class/net/%s/device/driver"

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +102 to +103
if session.cmd_status("md5sum %s | grep %s" % (guest_path, file_checksum)):
test.cancel("File MD5SUMs changed after copy to guest")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use test.error() instead of test.cancel() for file integrity failure.

When MD5 verification fails after copying the file to the guest, it indicates a real problem with the transfer mechanism, not a missing test requirement. Use test.error() to report this as a test setup failure rather than test.cancel().

Apply this diff:

-    if session.cmd_status("md5sum %s | grep %s" % (guest_path, file_checksum)):
-        test.cancel("File MD5SUMs changed after copy to guest")
+    if session.cmd_status("md5sum %s | grep %s" % (guest_path, file_checksum)):
+        test.error("File MD5SUMs changed after copy to guest")
🤖 Prompt for AI Agents
In libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py around lines
102-103, the code calls test.cancel when the MD5 checksum verification fails;
change this to test.error to report a real test failure. Replace the
test.cancel("File MD5SUMs changed after copy to guest") call with
test.error("File MD5SUMs changed after copy to guest") (optionally include the
checksum or command output for context) so the failure is reported as an error
rather than a cancellation.

Comment on lines +170 to +173
except Exception:
for thread in threads:
thread.join(suppress_exception=True)
raise
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix the exception handling indentation.

The raise statement at line 173 is inside the for-loop, which means only the first thread will be joined before re-raising the exception. All threads should be joined before re-raising.

Apply this diff:

     except Exception:
         for thread in threads:
             thread.join(suppress_exception=True)
-            raise
+        raise
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception:
for thread in threads:
thread.join(suppress_exception=True)
raise
except Exception:
for thread in threads:
thread.join(suppress_exception=True)
raise
🤖 Prompt for AI Agents
In libvirt/tests/src/virtual_network/qemu/nicdriver_unload.py around lines 170
to 173, the exception handler currently re-raises inside the for-loop causing
only the first thread to be joined; modify the handler so that all threads are
joined first and then the exception is re-raised (i.e., move the raise outside
the for-loop), ensuring thread.join(suppress_exception=True) runs for every
thread before re-raising the caught exception.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants