Skip to content

Conversation

@meinaLi
Copy link
Contributor

@meinaLi meinaLi commented Nov 17, 2025

Automate case:
RHEL-203952 - Blockcopy disk to nvme destination with luks(with usage)
RHEL-241215 - Blockcopy disk to nvme destination without luks

Summary by CodeRabbit

  • New Features

    • Added NVMe blockcopy test variant (coldplug-only) with default blockcopy options and finish/pivot workflows.
    • Added attach/detach NVMe test flow (hotplug support).
    • Optional LUKS encryption variants for NVMe blockcopy (encrypted vs non-encrypted).
  • Tests

    • Expanded NVMe test coverage to include attach/detach and blockcopy flows with end-to-end verification and cleanup.
    • Updated disk driver/format handling to cover raw and qcow2 scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 17, 2025

Walkthrough

Adds a new blockcopy_nvme variant to the NVMe virtual disk config and modifies existing variants (adds with_attach_detach = "yes"). The test module is refactored: new imports and multiple helper functions were introduced (NVMe PCI discovery, partition/format/LUKS preparation, disk XML construction, attach/detach and blockcopy workflows, and verification). The main run() now routes to either attach/detach or blockcopy flows based on with_attach_detach or with_blockcopy flags. The code also adds LUKS handling and secret lifecycle management for blockcopy scenarios.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • One config file and one test module with substantial new logic and control flow; heterogeneous changes.

Areas to review closely:

  • NVMe PCI discovery and selection logic (race conditions, address parsing)
  • Partitioning, wipefs, GPT creation and cleanup in disk preparation
  • LUKS secret creation, XML representation, and secret undefine/cleanup
  • Disk XML construction (source attrs, driver.type) and attach/define/undefine sequences
  • Blockcopy orchestration and verification of "--finish" vs "--pivot" behavior
  • VM start, login/wait handling, get_new_disks detection and platform-specific delays, and error/cleanup paths

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding two NVMe disk blockcopy test scenarios (with/without LUKS), which is the primary focus of both the config and code changes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (1)

109-169: create_block_nvme_disk: overall logic OK; consider tightening error handling

The NVMe provisioning logic is coherent:

  • lsblk + grep nvme to enumerate candidates, then lsblk <dev> | grep -q 'SWAP\|/' to skip mounted/in‑use devices.
  • parted creates a GPT and a 1G primary partition; nvme_partition_path = nvme_device + "p1" matches nvme naming.
  • Optional LUKS flow creates a libvirt secret and sets its value, then prepare_disk_obj is invoked with disk_type=nvme_disk_type and the partition path.
  • Return tuple (disk_xml, nvme_device, nvme_partition_path, luks_sec_uuid) nicely exposes both XML and paths.

A couple of small points you may want to refine:

  • The except Exception block is very broad; in this test context it’s not dangerous, but if you care about debugging, catching a narrower set (e.g. process.CmdError, xcepts.LibvirtXMLError) or re‑raising after logging would make failures easier to interpret.
  • All shell commands here operate on fixed strings constructed in code (no user input), so shell=True is acceptable for this harness; just keep it in mind if these ever start taking dynamic input.

Functionally, the flow looks sound for the NVMe/LUKS setup.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0fba1d and dd3c128.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_nvme.cfg (2 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (6 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py
🧬 Code graph analysis (1)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (1)
provider/virtual_disk/disk_base.py (5)
  • prepare_disk_obj (114-250)
  • DiskBase (25-633)
  • add_vm_disk (95-112)
  • cleanup_disk_preparation (268-301)
  • get_source_list (56-93)
🪛 Ruff (0.14.4)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

128-128: Function call with shell=True parameter identified, security issue

(S604)


136-136: Function call with shell=True parameter identified, security issue

(S604)


145-145: Function call with shell=True parameter identified, security issue

(S604)


150-150: Function call with shell=True parameter identified, security issue

(S604)


165-165: Consider moving this statement to an else block

(TRY300)


167-167: Do not catch blind exception: Exception

(BLE001)


246-246: Function call with shell=True parameter identified, security issue

(S604)


248-248: Local variable result is assigned to but never used

Remove assignment to unused variable result

(F841)


256-256: Function call with shell=True parameter identified, security issue

(S604)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.9
  • GitHub Check: Python 3.8
🔇 Additional comments (4)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (4)

1-1: Imports and extended docstring look consistent with new functionality

The added imports (ast, libvirt_version, libvirt from virttest.utils_test, and disk_base) are all used in the new blockcopy flow and replace unsafe eval usage with ast.literal_eval. The updated run() docstring clearly explains both the attach/detach and new blockcopy scenarios and matches the high‑level steps in the code.

Also applies to: 13-21, 28-41


170-208: run_attach_detach_test refactor preserves original behavior

The extracted run_attach_detach_test() keeps the original attach/detach semantics:

  • Ensures the VM is started to compute old_partitions, then destroys it before reconfiguring XML.
  • Uses get_usable_nvme_pci_address() and create_customized_disk() to build the NVMe device.
  • Handles both coldplug (vmxml.add_device + define) and hotplug (virsh.attach_device --live).
  • Waits for exactly one new disk and validates I/O with libvirt_disk.check_virtual_disk_io.
  • Detaches the device via virsh.detach_device.

The refactor improves readability without changing logic; no issues seen.


295-323: Top-level run() flow and cleanup look good

The main control flow:

  • Backs up inactive VM XML (vmxml_backup) and works on a copy.
  • Verifies required host tooling (lspci) before proceeding.
  • Chooses between run_blockcopy_test() and run_attach_detach_test() based on with_blockcopy.
  • Catches VMStartError and LibvirtXMLError and fails the test with clear messages.
  • In finally, ensures the VM is destroyed if still running and restores the original XML via vmxml_backup.sync().

Given that run_blockcopy_test() and run_attach_detach_test() also handle their own short‑lived VM lifecycles as needed, this outer cleanup is a reasonable safety net and should restore the environment reliably after each test.


262-294: Update docstring to reflect actual parameter type; consider guarding [0] index access on get_source_list result

The verification helper logic is sound. After checking the codebase:

  1. Docstring parameter type mismatch — Confirmed

    The parameter docstring says nvme_dest_disk: NVME destination disk XML object, but the function is called with a string path (nvme_partition_path at line 252) and used as a path in the comparison. Update:

  •    :param nvme_dest_disk: NVME destination disk XML object used in blockcopy
    
  •    :param nvme_dest_disk: NVME destination disk path used in blockcopy
    
    
    
  1. Unguarded [0] indexing on get_source_list result — Confirmed

    Lines 282 and 289 index directly without verifying the list is non-empty. While you check len(disk_sources) < 2 on line 278, that uses get_disk_source(), not get_source_list(). If these methods diverge, an IndexError could occur. Add a guard:

    source_list = disk_obj.get_source_list(current_vmxml, disk_type, target_dev)
    if not source_list:
        test.fail("No disk source found for target device...")
    current_disk_source = source_list[0]
  2. get_disk_source(vm_name) on instance — Requires verification

    Line 276 calls current_vmxml.get_disk_source(vm_name) (instance method with vm_name parameter). While this pattern appears elsewhere in the codebase, please verify the method signature is correct; the external library definition is not in this repository.

@meinaLi meinaLi force-pushed the disk_nvme branch 2 times, most recently from 3a6018b to b3ff3e8 Compare November 17, 2025 07:01
@meinaLi
Copy link
Contributor Author

meinaLi commented Nov 17, 2025

# avocado run --vt-type libvirt --vt-omit-data-loss --vt-machine-type q35 virtual_disks.nvme
JOB ID     : 07c3e5d592ed8b38023b8c8bb979dc29636dddd3
JOB LOG    : /var/log/avocado/job-results/job-2025-11-17T01.55-07c3e5d/job.log
 (1/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.attach_nvme: STARTED
 (1/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.attach_nvme: PASS (83.29 s)
 (2/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.without_luks.finish: STARTED
 (2/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.without_luks.finish: PASS (53.91 s)
 (3/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.without_luks.pivot: STARTED
 (3/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.without_luks.pivot: PASS (74.15 s)
 (4/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.with_luks.finish: STARTED
 (4/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.with_luks.finish: PASS (60.65 s)
 (5/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.with_luks.pivot: STARTED
 (5/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.coldplug.blockcopy_nvme.with_luks.pivot: PASS (56.56 s)
 (6/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.hotplug.attach_nvme: STARTED
 (6/6) type_specific.io-github-autotest-libvirt.virtual_disks.nvme.hotplug.attach_nvme: PASS (93.23 s)
RESULTS    : PASS 6 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /var/log/avocado/job-results/job-2025-11-17T01.55-07c3e5d/results.html
JOB TIME   : 428.25 s

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (1)

42-61: Minor clarity and efficiency tweaks (docstring, helper reuse)

A few small refinements to make the test easier to maintain:

  • verify_blockcopy_result’s docstring currently states nvme_dest_disk is an XML object; in reality the caller passes the NVMe destination path. Updating the docstring (and possibly renaming the parameter to nvme_dest_path) would better match actual usage.

  • In run_attach_detach_test, get_new_disks(old_partitions) is evaluated twice in succession:

    utils_misc.wait_for(lambda: get_new_disks(old_partitions), 20)
    new_disks = get_new_disks(old_partitions)

    You could avoid the second guest login by having wait_for return the new partitions (e.g., store the result it returns or wrap it in a small helper), or by caching inside get_new_disks if that fits your style.

  • Overall branching on with_blockcopy at the end of run() is clear; once the LUKS and NVMe selection issues are addressed, this split between attach/detach and blockcopy flows is straightforward.

These are non-blocking, but would make the test logic a bit cleaner.

Also applies to: 170-201, 261-268, 294-313

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a6018b and b3ff3e8.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_nvme.cfg (2 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (6 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T06:35:25.679Z
Learnt from: meinaLi
Repo: autotest/tp-libvirt PR: 6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.679Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py
🧬 Code graph analysis (1)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (1)
provider/virtual_disk/disk_base.py (4)
  • prepare_disk_obj (114-250)
  • add_vm_disk (95-112)
  • cleanup_disk_preparation (268-301)
  • get_source_list (56-93)
🪛 Ruff (0.14.4)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

128-128: Function call with shell=True parameter identified, security issue

(S604)


136-136: Function call with shell=True parameter identified, security issue

(S604)


145-145: Function call with shell=True parameter identified, security issue

(S604)


150-150: Function call with shell=True parameter identified, security issue

(S604)


165-165: Consider moving this statement to an else block

(TRY300)


167-167: Do not catch blind exception: Exception

(BLE001)


246-246: Function call with shell=True parameter identified, security issue

(S604)


256-256: Function call with shell=True parameter identified, security issue

(S604)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.9
  • GitHub Check: Python 3.8
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.11
🔇 Additional comments (2)
libvirt/tests/cfg/virtual_disks/virtual_disks_nvme.cfg (1)

6-39: Blockcopy NVMe/LUKS cfg wiring looks consistent with Python implementation

The disk_image_format default and the with_luks override to "qcow2" align with the Python path that creates a qcow2+LUKS NVMe destination, and both disk_dict and nvme_disk_dict correctly pick up ${disk_image_format}. The new blockcopy_nvme / finish / pivot variants and NVMe type separation (disk_type="file", nvme_disk_type="block") look coherent with run_blockcopy_test() and create_block_nvme_disk().

libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (1)

94-101: Safe use of disk_image_format and ast.literal_eval

Using params.get("disk_image_format") and ast.literal_eval(params.get("source_attrs", "{}")) removes the earlier literal_eval TypeError risk and keeps parsing constrained to literal structures. This is a solid, minimal change.

Comment on lines 128 to 139
lsblk_result = process.run("lsblk -d -n -o NAME | grep nvme",
timeout=10, ignore_status=True, shell=True)
if lsblk_result.exit_status != 0:
test.cancel("No NVME device found in host")

# Backup vm xml
vmxml_backup = vm_xml.VMXML.new_from_inactive_dumpxml(vm_name)
vmxml = vmxml_backup.copy()
nvme_devices = lsblk_result.stdout_text.strip().split('\n')
for device in nvme_devices:
device_path = "/dev/" + device.strip()
mount_check = process.run(f"lsblk {device_path} | grep -q SWAP\\|/",
ignore_status=True, shell=True)
if mount_check.exit_status != 0:
nvme_device = device_path
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Fix NVMe “unused” detection to avoid wiping in-use host disks

The logic:

lsblk_result = process.run("lsblk -d -n -o NAME | grep nvme", ...)
...
mount_check = process.run(
    f"lsblk {device_path} | grep -q SWAP\\|/", ignore_status=True, shell=True)
if mount_check.exit_status != 0:
    nvme_device = device_path
    break
...
process.run("sgdisk --zap-all %s" % nvme_device, shell=True, ignore_status=False)

has two coupled problems:

  • grep -q SWAP\|/ under basic grep does not implement alternation; it looks for the literal string SWAP|/, which will never match typical lsblk output. As a result, mount_check.exit_status is effectively always non‑zero, so the first NVMe device is treated as “unused” even if it holds mounted filesystems or swap.
  • The subsequent sgdisk --zap-all on that device can therefore operate on an in-use NVMe disk (e.g., root or data disk), which is extremely destructive.

You should reliably detect any mounted/swap usage and only pick genuinely unused devices (or make the device explicit via params). For example:

-                mount_check = process.run(f"lsblk {device_path} | grep -q SWAP\\|/",
-                                          ignore_status=True, shell=True)
-                if mount_check.exit_status != 0:
+                # Treat any non-empty mountpoint (including SWAP) as "in use"
+                mount_check = process.run(
+                    f"lsblk -n -o MOUNTPOINT {device_path} | grep -qE '\\S'",
+                    ignore_status=True, shell=True)
+                if mount_check.exit_status != 0:
                     nvme_device = device_path
                     break

and keep sgdisk --zap-all restricted to devices that have passed this stricter check (or additionally guard via a whitelist/parameter).

Also applies to: 256-257


I'll help you verify this review comment. Let me examine the actual code in the repository to confirm the issues described.

Now let me examine the actual code in the repository:


---

🏁 Script executed:

```shell
#!/bin/bash
# View the actual code at the specified lines
if [ -d "libvirt" ]; then
  cat libvirt/tests/src/virtual_disks/virtual_disks_nvme.py | sed -n '120,160p'
else
  echo "Repository structure check:"
  find . -name "virtual_disks_nvme.py" -type f 2>/dev/null | head -5
fi

Length of output: 2238


🏁 Script executed:

#!/bin/bash
# Also check lines around 256-257 for similar issues
if [ -f "libvirt/tests/src/virtual_disks/virtual_disks_nvme.py" ]; then
  echo "=== Lines 250-265 ===" 
  sed -n '250,265p' libvirt/tests/src/virtual_disks/virtual_disks_nvme.py
fi

Length of output: 687


🏁 Script executed:

#!/bin/bash
# Search for sgdisk calls in the file
rg -n "sgdisk" libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

Length of output: 93


🏁 Script executed:

#!/bin/bash
# Get context around grep patterns used in the file
rg -n "grep.*SWAP" -A3 -B3 libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

Length of output: 448


🏁 Script executed:

#!/bin/bash
# Check lsblk output to understand what the grep is actually matching against
# Show what lsblk -d -n -o NAME outputs and what lsblk on a device outputs
cat <<'EOF'
=== Understanding lsblk output formats ===
lsblk -d -n -o NAME outputs:
nvme0n1
sda
...

lsblk /dev/nvme0n1 outputs columns like:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT ...
nvme0n1 259:0 0 476.97G 0 disk

With mounted filesystem:
nvme0n1 259:0 0 476.97G 0 disk
nvme0n1p1 259:1 0 512B 0 part /boot
nvme0n1p2 259:2 0 100G 0 part /

With swap:
nvme0n1 259:0 0 476.97G 0 disk
nvme0n1p1 259:1 0 20G 0 part [SWAP]
EOF

Length of output: 408


🏁 Script executed:

#!/bin/bash
# Search for all process.run commands with grep patterns
rg -n 'process\.run.*grep' -A2 -B2 libvirt/tests/src/virtual_disks/virtual_disks_nvme.py | head -60

Length of output: 635


🏁 Script executed:

#!/bin/bash
# Look for the full context of nvme_device assignment and sgdisk usage
ast-grep --pattern 'nvme_device = $_' libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

Length of output: 103


🏁 Script executed:

#!/bin/bash
# Search for sgdisk execution pattern
rg -n "sgdisk.*nvme_device" -B5 -A5 libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

Length of output: 583


🏁 Script executed:

#!/bin/bash
# Get full function context to understand the flow
rg -n "def.*test.*nvme" -A200 libvirt/tests/src/virtual_disks/virtual_disks_nvme.py | head -150

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for the loop and device selection logic
rg -n "for device in" -A15 libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

Length of output: 931


🏁 Script executed:

#!/bin/bash
# Final check - see if break statement exists after nvme_device assignment
rg -n "nvme_device = device_path" -A5 libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

Length of output: 266


I need to verify the exact behavior of the grep command with alternation syntax:

Now let me verify the actual behavior of lsblk to understand what the grep is matching against:

Based on my verification, I've discovered a critical distinction: GNU grep allows you to escape | in BRE (basic regular expressions) and get alternation, so grep -q SWAP\|/ will actually work on Linux systems with GNU grep.

However, your review's actual concern about the device selection logic is valid and critical. The real problem isn't the grep syntax—it's the pattern itself:

The mount point will be empty if it is unmounted, and unmounted partitions will have an empty MOUNTPOINT. When running lsblk /dev/nvme0nX, the output shows all partitions nested under that device with their mount points. The current grep pattern SWAP\|/ only detects devices mounted at root or swap—but misses devices mounted at /data, /mnt, /home, or other paths.

This means a device with partitions mounted anywhere except root or swap would be incorrectly classified as "unused" and subsequently destroyed by sgdisk --zap-all.

Fix NVMe device selection to detect all mount points, not just root/swap

Devices can be mounted at arbitrary paths (e.g., /data, /mnt). The current grep pattern only catches / and [SWAP]:

-                mount_check = process.run(f"lsblk {device_path} | grep -q SWAP\\|/",
-                                          ignore_status=True, shell=True)
-                if mount_check.exit_status != 0:
+                # Check for ANY non-empty mountpoint (catches all mount paths, not just / and SWAP)
+                mount_check = process.run(
+                    f"lsblk -n -o MOUNTPOINT {device_path} | grep -qE '\\S'",
+                    ignore_status=True, shell=True)
+                if mount_check.exit_status != 0:
                     nvme_device = device_path
                     break

This reliably detects any partition with a non-empty mountpoint before invoking sgdisk --zap-all.

🧰 Tools
🪛 Ruff (0.14.4)

128-128: Function call with shell=True parameter identified, security issue

(S604)


136-136: Function call with shell=True parameter identified, security issue

(S604)

🤖 Prompt for AI Agents
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py around lines 128-139: the
current mount check uses grep -q SWAP\|/ which only detects root or swap and
will miss partitions mounted at other paths, causing mounted devices to be
treated as unused; replace the grep-based check with a test that detects any
non-empty MOUNTPOINT for the device and its partitions (e.g., query lsblk for
MOUNTPOINTs for the device and consider the device mounted if any line is
non-empty), keep ignore_status/shell handling, and only set nvme_device when no
mounted partitions are found.

if with_blockcopy:
run_blockcopy_test()
else:
run_attach_detach_test()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better we can have condition when call run_attach_detach_test in order to further expand:
if with_blcok:
xx
elif attach_detach_test:
xx
else:
logging.info()

except Exception as err:
test.fail("Failed to create block NVME disk: %s" % str(err))

def run_attach_detach_test():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move this out of scope :runtest(), and put them ahead of runtest(). If you implement in runtest(), those will become method inline code

virsh.detach_device(vm_name, device_obj.xml, flagstr="--live",
debug=True, ignore_status=False)

def run_blockcopy_test():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest move it out of runtest() scope

if with_luks:
virsh.secret_undefine(luks_sec_uuid, debug=True, ignore_status=False)

def verify_blockcopy_result(params, original_source, nvme_dest_disk):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest move out of runtest scope

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reorganized the positions of all the functions, and the tests have passed.

Automate case:
RHEL-203952 - Blockcopy disk to nvme destination with luks(with usage)
RHEL-241215 - Blockcopy disk to nvme destination without luks

Signed-off-by: Meina Li <meili@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (3)

71-95: ast.literal_eval default handling fix is correct

Using:

source_dict = ast.literal_eval(params.get("source_attrs", "{}"))

is the right way to provide a default, and avoids the previous literal_eval TypeError when source_attrs was missing. Attaching the PCI address via disk_source.address = pci_address on the primitive disk XML also looks correct for hostdev NVMe passthrough.


205-238: Docstring for nvme_dest_disk is misleading (it’s a path, not XML)

The docstring says:

:param nvme_dest_disk: NVME destination disk XML object used in blockcopy

but run_blockcopy_test() passes nvme_partition_path (a string path) and the comparison is:

if current_disk_source != nvme_dest_disk:

Consider updating the docstring (and optionally the parameter name) to reflect the real type:

-    :param nvme_dest_disk: NVME destination disk XML object used in blockcopy
+    :param nvme_dest_disk: NVME destination block device path used in blockcopy

This will save future readers from assuming an XML object here.


241-291: Blockcopy NVMe/LUKS flow and format wiring now look consistent

run_blockcopy_test() now:

  • Uses disk_dict / nvme_disk_dict from cfg, with driver_type set to "qcow2" in the blockcopy_nvme variant.
  • Delegates NVMe partition and disk XML creation to create_block_nvme_disk(), which merges disk_dict with the appropriate luks_sec_dict (with or without encryption).
  • For with_luks, pre-creates a qcow2+LUKS image via qemu-img create -f qcow2 <luks_extra_parameter> <nvme_partition> <size>; for without_luks, relies on blockcopy itself to create the qcow2 image on the block device.
  • Verifies --finish vs --pivot behavior via verify_blockcopy_result() and cleans up with wipefs and optional secret_undefine.

This resolves the earlier qcow2/raw mismatch and the lost-encryption issue in the NVMe destination XML.

🧹 Nitpick comments (5)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (5)

25-45: Tighten exception handling in get_new_disks

Current pattern:

except Exception as err:
    raise Exception("Error happens when get new disk: %s" % str(err))
  • Loses the original exception type and traceback.
  • Triggers static checks about blind Exception.

You can preserve context with minimal change:

-    except Exception as err:
-        raise Exception("Error happens when get new disk: %s" % str(err))
+    except Exception as err:
+        raise RuntimeError(f"Error getting new disk: {err}") from err

Optionally, you could also rely on the original exception and just log here, but not strictly required.


48-68: NVMe PCI discovery is fine for “first device” semantics

get_usable_nvme_pci_address() correctly:

  • Uses lspci | grep -i 'Non-Volatile memory controller' to locate NVMe devices.
  • Parses the PCI address into domain/bus/slot/function fields compatible with libvirt XML.
  • Cancels the test early if no NVMe is present.

This is adequate for the current tests that just need “one usable NVMe”. If you ever need to support multiple NVMe devices, consider parameterizing which controller to pick instead of always taking the first.


191-193: Avoid double login in new-disk detection (minor)

Here you call get_new_disks() twice:

utils_misc.wait_for(lambda: get_new_disks(vm, old_partitions), 20)
new_disks = get_new_disks(vm, old_partitions)

Each call performs a full VM login. You can avoid the duplicate login by having wait_for return the list and reuse it:

new_disks = utils_misc.wait_for(
    lambda: get_new_disks(vm, old_partitions) or None, 20)

and then skip the second call. Not critical, but it will make the test a bit lighter and logs less noisy.


55-57: Shell commands are fine for tests but consider guarding inputs

Several new calls use process.run(..., shell=True, ...) for host tools like lspci, lsblk, parted, sfdisk, qemu-img, and wipefs. In this test context:

  • Commands and arguments are static or come from controlled params, so injection risk is low.
  • shell=True is convenient for pipelines (lsblk | grep, etc.).

If you ever need to pass user-influenced values into these commands, consider switching to list-style argv (shell=False) or carefully sanitizing the interpolated pieces to keep things robust and avoid tripping security linters.

Also applies to: 118-119, 126-127, 136-137, 141-142, 178-179, 274-277, 287-288


98-162: Unused parameters in create_block_nvme_disk

The parameters disk_type and disk_obj are not referenced anywhere in the function body and should be removed from the signature. Update the function call in run_blockcopy_test() accordingly.

Python version compatibility for dict union operator

The dict union operator (|) on line 141 requires Python ≥ 3.9. Confirm your CI environment and all supported test hosts meet this requirement. If earlier Python versions must be supported, use the compatible pattern instead:

-        nvme_disk_dict = disk_dict | luks_sec_dict
+        nvme_disk_dict = disk_dict.copy()
+        nvme_disk_dict.update(luks_sec_dict)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3ff3e8 and c65a4d3.

📒 Files selected for processing (2)
  • libvirt/tests/cfg/virtual_disks/virtual_disks_nvme.cfg (2 hunks)
  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/virtual_disks/virtual_disks_nvme.py
🪛 Ruff (0.14.7)
libvirt/tests/src/virtual_disks/virtual_disks_nvme.py

40-40: Consider moving this statement to an else block

(TRY300)


41-41: Do not catch blind exception: Exception

(BLE001)


42-42: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


42-42: Create your own exception

(TRY002)


55-55: Function call with shell=True parameter identified, security issue

(S604)


98-98: Unused function argument: disk_type

(ARG001)


98-98: Unused function argument: disk_obj

(ARG001)


118-118: Function call with shell=True parameter identified, security issue

(S604)


126-126: Function call with shell=True parameter identified, security issue

(S604)


136-136: Function call with shell=True parameter identified, security issue

(S604)


141-141: Function call with shell=True parameter identified, security issue

(S604)


158-158: Consider moving this statement to an else block

(TRY300)


160-160: Do not catch blind exception: Exception

(BLE001)


178-178: Function call with shell=True parameter identified, security issue

(S604)


277-277: Function call with shell=True parameter identified, security issue

(S604)


287-287: Function call with shell=True parameter identified, security issue

(S604)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.8
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.9
🔇 Additional comments (2)
libvirt/tests/cfg/virtual_disks/virtual_disks_nvme.cfg (1)

17-40: Blockcopy NVMe variant wiring and LUKS params look consistent with Python helper

The new blockcopy_nvme variant is wired coherently with run_blockcopy_test() / create_block_nvme_disk():

  • disk_type="file" and nvme_disk_type="block" align with how disk_dict and nvme_disk_dict are instantiated.
  • Overriding driver_type = "qcow2" inside this variant ensures the XML driver type matches the qcow2 usage in the blockcopy/LUKS flow, resolving the earlier raw/qcow2 mismatch.
  • luks_sec_dict for without_luks (simple dev source) vs with_luks (adds source.encryption.secret with ${sec_volume}) matches how create_block_nvme_disk() uses params["luks_sec_dict"] % nvme_partition_path and merges it into the final disk dict.

This configuration should exercise both with- and without-LUKS blockcopy paths correctly.

libvirt/tests/src/virtual_disks/virtual_disks_nvme.py (1)

293-342: Run-mode dispatch and error handling are clear

The main run():

  • Extracts with_blockcopy and with_attach_detach flags and dispatches to run_blockcopy_test() or run_attach_detach_test() accordingly, cancelling when neither is set.
  • Ensures lspci is available before NVMe operations.
  • Wraps the body to convert VMStartError and LibvirtXMLError into clear test.fail() messages.
  • Restores the original VM XML and destroys any running VM in finally, which is consistent with tp-libvirt patterns (teardown is still respected). Based on learnings, this explicit cleanup is acceptable outside the specialized IOMMU helper teardown.

This control flow looks good and matches the new cfg variants.

Comment on lines +164 to +203
def run_attach_detach_test(test, params, vm, vmxml, vm_name, hotplug):
"""
Run original attach/detach test with nvme device
"""
if vm.is_dead():
vm.start()
session = vm.wait_for_login()
old_partitions = utils_disk.get_parts_list(session)
session.close()
vm.destroy(gracefully=False)

pci_addr_in_dict = get_usable_nvme_pci_address(test)

# Delete partitions if exist on nvme device
process.run("sfdisk --delete /dev/nvme[0-9]n[0-9]",
timeout=10, ignore_status=True, verbose=True, shell=True)

device_obj = create_customized_disk(params, pci_addr_in_dict)
if not hotplug:
vmxml.add_device(device_obj)
vmxml.sync()
vm.start()
vm.wait_for_login().close()
if hotplug:
virsh.attach_device(vm_name, device_obj.xml, flagstr="--live",
ignore_status=False, debug=True)

utils_misc.wait_for(lambda: get_new_disks(vm, old_partitions), 20)
new_disks = get_new_disks(vm, old_partitions)
if len(new_disks) != 1:
test.fail("Attached 1 virtual disk but got %s." % len(new_disks))
new_disk = new_disks[0]
if platform.platform().count('ppc64'):
time.sleep(10)
if not libvirt_disk.check_virtual_disk_io(vm, new_disk):
test.fail("Cannot operate the newly added disk in vm.")

virsh.detach_device(vm_name, device_obj.xml, flagstr="--live",
debug=True, ignore_status=False)

Copy link

@coderabbitai coderabbitai bot Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Avoid sfdisk --delete /dev/nvme[0-9]n[0-9] wildcard to protect host disks

This call:

process.run("sfdisk --delete  /dev/nvme[0-9]n[0-9]",
            timeout=10, ignore_status=True, verbose=True, shell=True)

attempts to delete all /dev/nvmeXnY partition tables on the host, regardless of whether they are in use. On systems where root or data volumes live on NVMe, this is extremely dangerous and can corrupt the host.

You already implemented a safe NVMe selection in create_block_nvme_disk() using lsblk + MOUNTPOINT checks. Reuse the same idea here and operate only on a single, verified-unused device, e.g.:

-    # Delete partitions if exist on nvme device
-    process.run("sfdisk --delete  /dev/nvme[0-9]n[0-9]",
-                timeout=10, ignore_status=True, verbose=True, shell=True)
+    # Delete partitions only on an explicitly selected, unmounted NVMe device
+    # (similar logic as in create_block_nvme_disk)
+    lsblk_result = process.run(
+        "lsblk -d -n -o NAME | grep nvme",
+        timeout=10, ignore_status=True, shell=True)
+    if lsblk_result.exit_status != 0:
+        test.cancel("No NVME device found in host for attach/detach test")
+    for device in lsblk_result.stdout_text.strip().splitlines():
+        device_path = "/dev/" + device.strip()
+        mount_check = process.run(
+            f"lsblk -n -o MOUNTPOINT {device_path} | grep -qE '\\S'",
+            ignore_status=True, shell=True)
+        if mount_check.exit_status != 0:
+            process.run(f"sfdisk --delete {device_path}",
+                        timeout=10, ignore_status=False, verbose=True, shell=True)
+            break
+    else:
+        test.cancel("No available unmounted NVME device found for attach/detach test")

This keeps the test behavior while avoiding wiping unrelated NVMe devices.

🧰 Tools
🪛 Ruff (0.14.7)

178-178: Function call with shell=True parameter identified, security issue

(S604)

🤖 Prompt for AI Agents
In libvirt/tests/src/virtual_disks/virtual_disks_nvme.py around lines 164 to
203, the current call uses a wildcard "sfdisk --delete /dev/nvme[0-9]n[0-9]"
which can wipe all host NVMe partition tables; instead, determine the single
target NVMe device chosen for the test (reuse the earlier
get_usable_nvme_pci_address()/create_block_nvme_disk() logic or use
lsblk+MOUNTPOINT to verify the device is unmounted and unused), construct the
explicit partition device path (e.g. /dev/nvmeXnY) for that device only, and
call sfdisk --delete on that specific path (avoid shell=True with a glob and
prefer passing the explicit path to process.run) so only the verified-unused
device is modified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to our current test environment, it won't affect our host. So I think we can keep the above usage in this test script.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@meinaLi, I understand and respect your decision. Since your test environment is controlled and you're confident this won't affect the host, that makes sense for your use case.

For future maintainers who might run this test in different environments, you could optionally add a brief comment above that line explaining the assumption, something like:

# Safe in controlled test environment where NVMe devices are dedicated to testing
process.run("sfdisk --delete  /dev/nvme[0-9]n[0-9]",
            timeout=10, ignore_status=True, verbose=True, shell=True)

But this is entirely optional—feel free to proceed as you see fit. Thank you for the clarification!


🧠 Learnings used
Learnt from: meinaLi
Repo: autotest/tp-libvirt PR: 6628
File: libvirt/tests/src/virtual_disks/virtual_disks_io_error.py:128-129
Timestamp: 2025-10-24T06:35:25.679Z
Learning: In libvirt tests, `virsh.snapshot_delete()` without the `--metadata` flag automatically deletes both snapshot metadata and external disk files. Explicit `os.remove()` is only needed when using `virsh.snapshot_delete()` with the `--metadata` flag, which removes only metadata.

@meinaLi meinaLi requested a review from chunfuwen December 4, 2025 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants