Skip to content

Conversation

@Lankou66
Copy link

@Lankou66 Lankou66 commented Nov 12, 2025

Implement recursive CBT deactivation for VDI snapshot chains

Add recursive CBT deactivation for VDI and all child snapshots Delete CBT log files when present
Update parent/child references in CBT metadata

Fixes: Incomplete CBT deactivation on snapshot chains

stormi and others added 30 commits August 28, 2025 17:18
This was a patch added to the sm RPM git repo before we had this
forked git repo for sm in the xcp-ng github organisation.
Originally-by: Ronan Abhamon <ronan.abhamon@vates.fr>

This version obtained through merge in
ff1bf65:

 git restore -SW -s ydi/forks/2.30.7/xfs drivers/EXTSR.py
 mv drivers/EXTSR.py drivers/XFSSR.py
 git restore -SW drivers/EXTSR.py

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Some important points:

- linstor.KV must use an identifier name that starts with a letter (so it uses a "sr-" prefix).

- Encrypted VDI are supported with key_hash attribute (not tested, experimental).

- When a new LINSTOR volume is created on a host (via snapshot or create), the remaining diskless
devices are not necessarily created on other hosts. So if a resource definition exists without
local device path, we ask it to LINSTOR. Wait 5s for symlink creation when a new volume
is created => 5s is is purely arbitrary, but this guarantees that we do not try to access the
volume if the symlink has not yet been created by the udev rule.

- Can change the provisioning using the device config 'provisioning' param.

- We can only increase volume size (See: LINBIT/linstor-server#66),
it would be great if we could shrink volumes to limit the space used by the snapshots.

- Inflate/Deflate can only be executed on the master host, a linstor-manager plugin is present
to do this from slaves. The same plugin is used to open LINSTOR ports + start controller.

- Use a `total_allocated_volume_size` method to have a good idea of the reserved memory
Why? Because `physical_free_size` is computed using the LVM used size, in the case of thick provisioning it's ok,
but when thin provisioning is choosen LVM returns only the allocated size using the used block count. So this method
solves this problem, it takes the fixed virtual volume size of each node to compute the required size to store the
volume data.

- Call vhd-util on remote hosts using the linstor-manager when necessary, i.e. vhd-util is called to get vhd info,
the DRBD device can be in use (and unusable by external processes), so we must use the local LVM device that
contains the DRBD data or a remote disk if the DRBD device is diskless.

- If a DRBD device is in use when vhdutil.getVHDInfo is called, we must have no
errors. So a LinstorVhdUtil wrapper is now used to bypass DRBD layer when
VDIs are loaded.

- Refresh PhyLink when unpause in called on DRBD devices:
We must always recreate the symlink to ensure we have
the right info. Why? Because if the volume UUID is changed in
LINSTOR the symlink is not directly updated. When live leaf
coalesce is executed we have these steps:
"A" -> "OLD_A"
"B" -> "A"
Without symlink update the previous "A" path is reused instead of
"B" path. Note: "A", "B" and "OLD_A" are UUIDs.

- Since linstor python modules are not present on every XCP-ng host,
module imports are protected by try.. except... blocks.

- Provide a linstor-monitor daemon to check master changes
- Check if "create" doesn't succeed without zfs packages
- Check if "scan" failed if the path is not mounted (not a ZFS mountpoint)
Co-authored-by: Piotr Robert Konopelko <piotr.konopelko@moosefs.pro>
Signed-off-by: Aleksander Wieliczko <aleksander.wieliczko@moosefs.pro>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
`umount` should not be called when `legacy_mode` is enabled, otherwise a mounted dir
used during SR creation is unmounted at the end of the `create` call (and also
when a PBD is unplugged) in `detach` block.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
A sm-config boolean param `subdir` is available to configure where to store the VHDs:
- In a subdir with the SR UUID, the new behavior
- In the root directory of the MooseFS SR

By default, new SRs are created with `subdir` = True.
Existing SRs  are not modified and continue to use the folder that was given at
SR creation, directly, without looking for a subdirectory.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Ensure all shared drivers are imported in `_is_open` definition to register
them in the driver list. Otherwise this function always fails with a SRUnknownType exception.

Also, we must add two fake mandatory parameters to make MooseFS happy: `masterhost` and `rootpath`.
Same for CephFS with: `serverpath`. (NFS driver is directly patched to ensure there is no usage of
the `serverpath` param because its value is equal to None.)

`location` param is required to use ZFS, to be more precise, in the parent class: `FileSR`.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
SR_CACHING offers the capacity to use IntelliCache, but this
feature is only available using NFS SR.

For more details, the implementation of `_setup_cache` in blktap2.py
uses only an instance of NFSFileVDI for the shared target.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
* `except` syntax fixes
* drop `has_key()` usage
* drop `filter()` usage (but drop their silly `list(x.keys())` wrappings)
* drop `map()` usage
* use `int` not `long`
* use `items()` not `iteritems()`

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
…store

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Guided by futurize's "old_div" use

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
PROBE_MOUNTPOINT in a some drivers is a relative path, which is resolved
using MOUNT_BASE at probe time, but CephFS, GlusterFS and MooseFS it is
set on driver load to an absolute path, and this requires MOUNT_BASE to be
looking like a path component.

```
drivers/CephFSSR.py:69: in <module>
    PROBE_MOUNTPOINT = os.path.join(SR.MOUNT_BASE, "probe")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a = <MagicMock name='mock.MOUNT_BASE' id='140396863897728'>, p = ('probe',)

    def join(a, *p):
        """Join two or more pathname components, inserting '/' as needed.
        If any component is an absolute path, all previous path components
        will be discarded.  An empty last part will result in a path that
        ends with a separator."""
>       a = os.fspath(a)
E       TypeError: expected str, bytes or os.PathLike object, not MagicMock

/usr/lib64/python3.6/posixpath.py:80: TypeError
```

Note this same idiom is also used in upstream SMBFS, although that does not
appear to cause any problem with the tests.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
(coverage 7.2.5)

Without these changes many warns/errors are emitted:
- "assertEquals" is deprecated, "assertEqual" must be used instead
- mocked objects in "setUp" method like "cleanup.IPCFlag" cannot be repatched
  at the level of the test functions, otherwise tests are aborted,
  this is the  behavior of coverage version 7.2.5

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
The probe method is not implemented so we
shouldn't advertise it.

Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
Impacted drivers: LINSTOR, MooseFS and ZFS.
- Ignore all linstor.* members during coverage,
  the module is not installed in github runner.
- Use mock from unittest, the old one is not found now.
- Remove useless return from LinstorSR scan method.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: BenjiReis <benjamin.reis@vates.fr>
)

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
A SR inheriting from a EXTSR allowing to use a 4KiB blocksize device as
SR.
Create a 512 bytes loop device on top of a 4KiB device then give it to
EXTSR code.
It uses the same device-config as a normal local SR, i.e.
`device-config:device=/dev/nvme0n1`

After creation, the driver find the device under the VG to identify the
correct disk.
It means that creating the SR with a non-stable disk identifier is
doable and it will work as EXTSR would by ignoring the device-config
after creation.
Identifying the correct disk by using LVM infos.

The VG is created using a different prefix name from EXTSR.
It is `XSLocalLargeBlock-<SR UUID>`.

The SR artificially limits the creation to disk not being 512b.
It will throw an error if a disk whose blocksize is 512 is given.

We currently don't support multi devices, it fails at the EXTSR creation.
We added an error to explicitly say that multi devices SR is not supported
on the driver.
Before that, it would make another error:
```
Error code: SR_BACKEND_FAILURE_77
Error parameters: , Logical Volume group creation failed,
```
Sometimes the pvremove from EXTSR using the loop device fails.
In this case, we need to remove the real device from PV list ourself in
the error handling.

Signed-off-by: Damien Thenot <damien.thenot@vates.tech>
With this change the driver supports a "lvm-conf" param on "other-config".
For now The configuration is only used by "remove" calls from LVMCache.

Example to issue discards after a lvremove command:

> xe sr-param-set uuid=<SR_UUID> other-config:lvm-conf=issue_discards=1

And to remove the param:

> xe sr-param-remove uuid=<SR_UUID> param-name=other-config param-key=lvm-conf

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Benjamin Reis <benjamin.reis@vates.tech>
Last commit: 9207abe
"fix(linstor): check if resource is tiebreaker (#62)"

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
…#73)

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.fr>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
Wescoeur and others added 12 commits September 15, 2025 18:56
Impacted functions: `_get_volumes_info` and `_get_volume_node_names_and_size`.

Before this change "usable_size" validity was
checked too early and which could lead to an exception for
no good reason while the size could be known on at
least one host despite an issue on other machines.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
…ll context

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
…mutators

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
Session attr is not set during "attach/detach calls from config".
In this context local method must always be called.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
A change in lvm2 `https://github.com/xcp-ng-rpms/lvm2/pull/3/files`
introduces an issue in LargeBlockSR: `/dev/` is not scanned now meaning
the loop device is never used for VG activation. So we must add a custom
scan parameter to LVM commands.
We also now systematically do the call to _redo_vg_connection to use our
custom parameters to enable the LV on the correct device before calling
`EXTSR.attach()`.

Signed-off-by: Damien Thenot <damien.thenot@vates.tech>
This is not done on every and each implementation of SR but only on ones that calls cleanup.start_gc_service (like FileSR)
and on the classes that inherits from them and don't call super on detach.

This is to prevent useless errors logs like Failed to stop xxx.service: Unit xxx.service not loaded.

Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
When the pool master is changed and if it doesn't have a local DB path
then `get_database_path` fails during SR.scan call.
This patch allows creating a diskless path if necessary.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
…104)

In `_request_device_path`:
Before this change, an exception was thrown when a resource was missing,
but not when the returned path was empty. Now it's raised in both cases.

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
@Lankou66 Lankou66 force-pushed the allow_CBT_disabling_on_snapshot branch from 6642746 to fc458c8 Compare November 12, 2025 13:19
@stormi
Copy link
Member

stormi commented Nov 12, 2025

Here's a reminder regarding the commit message conventions. The commit title is largely over 70 chars, here :)

https://docs.xcp-ng.org/project/development-process/commit-message-conventions/

@Lankou66 Lankou66 force-pushed the allow_CBT_disabling_on_snapshot branch from fc458c8 to 315794d Compare November 12, 2025 15:39
@Lankou66 Lankou66 changed the title implement the capability to desactivate CBT on all vdi snapshot child… implement the capability to desactivate CBT on all vdi snapshot child Nov 12, 2025
@Lankou66 Lankou66 marked this pull request as ready for review November 13, 2025 09:57
@Wescoeur Wescoeur requested review from Nambrok and Wescoeur November 14, 2025 10:26
Restrict `coverage` version to avoid broken ci

Signed-off-by: Antoine Bartuccio <antoine.bartuccio@vates.tech>
@Lankou66 Lankou66 force-pushed the allow_CBT_disabling_on_snapshot branch from 315794d to 2648b19 Compare November 17, 2025 10:10
drivers/VDI.py Outdated
return [
self.session.xenapi.VDI.get_uuid(ref)
for ref in record.get("snapshots", [])
if self.session.xenapi.VDI.get_uuid(ref)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two call for the XAPI is not optimal but also, this will raise if ref is not a correct OpaqueRef

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also return the ref directly since you use it later

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've removed the comprehension list
now we get ant return the list of rest directly

drivers/VDI.py Outdated
blktap2.VDI.tap_unpause(self.session, sr_uuid, vdi_uuid)

def _disable_cbt(self, vdi_uuid, visited=None):
# Prevent infite loop
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Prevent infite loop
# Prevent infinite loop

I don't think you need to do this, you should call for the VDI and then the same function doing the disable on the snapshots.
The snapshot won't have snapshots.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed it
indeed this protection was useless

@Lankou66 Lankou66 force-pushed the allow_CBT_disabling_on_snapshot branch from 2648b19 to 832aef0 Compare November 20, 2025 09:35
drivers/VDI.py Outdated
self.session.xenapi.VDI.set_cbt_enabled(vdi_ref, False)
for snapshot_ref in self._list_vdi_snapshots(self.uuid):
snapshot_uuid = self.session.xenapi.VDI.get_uuid(snapshot_ref)
self._disable_cbt(snapshot_uuid)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do a function doing the above with just the ref and not the snapshot loop.
And call this once for the normal VDI then for the snapshot refs.
It might need to go deeper in self._cbt_op to know if they use info stored in self that are from the main VDI.

- Recursive CBT disabling on VDI all child snapshots
- Delete CBT log files when present

This resolves issues where CBT deactivation was incomplete
when processing multi-level snapshot chains.

Fixes: incomplete CBT deactivation on snapshot chains
Signed-off-by: Goulven Riou <goulven.riou@vates.tech>
@Lankou66 Lankou66 force-pushed the allow_CBT_disabling_on_snapshot branch from 832aef0 to c9dcd5f Compare November 21, 2025 09:54
@Wescoeur Wescoeur force-pushed the 3.2.12-8.3 branch 3 times, most recently from 66c7c8e to aaaab27 Compare December 1, 2025 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.