LXD fails to pickup non-pristine disks #142

Open
ethanmye-rs opened this issue Jul 20, 2023 · 16 comments
Labels
Feature New feature, not a bug

Comments

@ethanmye-rs
Member

In the microcloud init screen, the wizard seems to fail to pick up non-pristine disks. It offers to wipe the disk in the next screen, so I assume this is a bug. If I wipe a non-pristine disk with:

sudo wipefs -a /dev/sdb && sudo dd if=/dev/zero of=/dev/sdb bs=4096 count=100 > /dev/null

then microcloud picks up the disk next time the wizard is run.
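A quick way to double-check that the disk really does look pristine before re-running the wizard (the device name here is just an example):

sudo wipefs /dev/sdb   # without -a this only lists signatures; no output means none are left
lsblk -f /dev/sdb      # the FSTYPE and LABEL columns should be empty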

@masnax
Contributor

masnax commented Jul 20, 2023

At the moment, MicroCloud won't pick up any partitioned disks. That will definitely change in the near-ish future.

@masnax
Contributor

masnax commented Nov 20, 2023

We're close to having support for partitions on local (zfs) storage, but it seems ceph might take a bit longer:
canonical/microceph#251

For ZFS, we'll be able to add partition support once canonical/lxd#12537 is merged in LXD.

@tomponline
Member

@masnax WRT canonical/lxd#12537, why do we need to ascertain whether the partition is mounted? Isn't MicroCloud only showing empty partitions anyway?

@roosterfish added the Feature (New feature, not a bug) label Nov 22, 2023
@dnebing

dnebing commented Nov 23, 2023

Because I couldn't add partitions as local storage during the microcloud init command, I chose "no" when asked about adding local storage. MicroCloud completed initialization and it all looks great.

Is there a command I can execute to manually create the local storage pool and add the partitions from the cluster nodes?

At least until this new feature is ready?

@masnax
Contributor

masnax commented Nov 23, 2023

Because I couldn't add partitions as local storage during the microcloud init command, I chose "no" when asked about adding local storage. MicroCloud completed initialization and it all looks great.

Is there a command I can execute to manually create the local storage pool and add the partitions from the cluster nodes?

At least until this new feature is ready?

Sure, to create a local zfs storage pool like MicroCloud would, you can do the following:

Once on each system:

lxc storage create local zfs source=${disk_path} --target ${cluster_member_name}

And finally, from any system:

lxc storage create local zfs
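For a concrete (hypothetical) three-member cluster, the full sequence would look roughly like this, with the disk path adjusted per member:

# run once per cluster member, pointing at that member's own disk (names and paths are examples)
lxc storage create local zfs source=/dev/disk/by-id/nvme-example-disk1 --target micro1
lxc storage create local zfs source=/dev/disk/by-id/nvme-example-disk2 --target micro2
lxc storage create local zfs source=/dev/disk/by-id/nvme-example-disk3 --target micro3

# then, from any one member, finish creating the pending pool
lxc storage create local zfs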

@dnebing

dnebing commented Nov 23, 2023

Thanks for that, extremely helpful!

I noticed in the docs that there are default volumes (backups, images) tied to the target systems.

Are those required, or should I just skip them?
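If it turns out they're wanted, I'm guessing the manual equivalent is something along these lines; the volume names and the per-member --target usage are assumptions on my part rather than anything MicroCloud-specific:

# create the custom volumes on the local pool for each cluster member
lxc storage volume create local backups --target node1
lxc storage volume create local images --target node1

# point the server at them (these keys are per-member server config)
lxc config set storage.backups_volume=local/backups --target node1
lxc config set storage.images_volume=local/images --target node1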

@masnax
Contributor

masnax commented Nov 23, 2023

@masnax WRT canonical/lxd#12537, why do we need to ascertain whether the partition is mounted? Isn't MicroCloud only showing empty partitions anyway?

There's no way MicroCloud can know if the partitions are empty without LXD's super-privileges. So no, it will list every single partition on the system. The list is ripped straight from lxd info --resources.
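If you want to see the same raw list MicroCloud works from, something like this should show it (the jq path into the resources structure is my assumption):

# dump the server's resource list and print disk and partition IDs
lxc query /1.0/resources | jq -r '.storage.disks[] | .id, (.partitions[]?.id)'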

@tomponline
Member

@masnax I commented over at canonical/lxd#12537 (review)

@tomponline
Member

MicroCeph support for partitions is being tracked here canonical/microceph#251

@rmbleeker

Sure, to create a local zfs storage pool like MicroCloud would, you can do the following:

Once on each system:

lxc storage create local zfs source=${disk_path} --target ${cluster_member_name}

And finally, from any system:

lxc storage create local zfs

When I follow these instructions to the letter, or even when I add sudo, I always get the same error:

Error: Failed to run: zpool create -m none -O compression=on local /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3: exit status 1 (invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3 is part of active pool 'local')

This is on a TuringPi 2 cluster board with 4 Turing RK1 nodes (Rockchip RK3588 based compute modules with 32GB of eMMC storage). The nodes were freshly imaged, and the 3rd partition was newly created on all of them using parted before the nodes were first powered on, to prevent the second partition (the root partition) from growing to the full size of the eMMC storage. What am I missing?
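For reference, the third partition was created with something along these lines (the device name and size boundary here are only illustrative, not the exact values used):

# cap the root partition at an assumed 8GiB and give the rest of the eMMC to a new partition
sudo parted --script /dev/mmcblk0 mkpart primary 8GiB 100%
sudo parted --script /dev/mmcblk0 print   # verify the resulting layout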

@roosterfish
Contributor

roosterfish commented Oct 22, 2024

What am I missing?

It looks like there already is a storage pool called local which is using the disk /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3.

You can run zpool list on your system to verify this.

@rmbleeker Have you skipped local storage pool setup during microcloud init?

@rmbleeker

It looks like there already is a storage pool called local which is using the disk /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3.

You can run zpool list on your system to verify this.

I realize that's what it looks like, but it's not the case. zpool list came up empty (no pools available). In fact it still does since the storage pool is still pending and hasn't been created yet.

@rmbleeker Have you skipped local storage pool setup during microcloud init?

Yes I have.

@rmbleeker

Alright, it seems to work when I pick a different approach and slightly alter the commands. I got the idea from the Web UI, which states that when creating a ZFS storage pool, the name of an existing ZFS pool is a valid source. So I created a storage pool with

sudo zpool create -f -m none -O compression=on local /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3

on each node, filling in the proper disk ID for each one. I then used

sudo lxc storage create local zfs source=local --target=${nodename}

to create the local storage, filling in the name of each node in the cluster as the target. Then finally

sudo lxc storage create local zfs

properly initialized the storage pool, giving it the CREATED state instead of PENDING or ERRORED. It cost me an extra step, which isn't a big deal, but it's still a workaround and not a solution in my view.
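Putting the workaround together for a four-node cluster, the whole sequence looks roughly like this (disk IDs and node names are examples):

# on every node, create the zpool directly, forcing past any stale label
sudo zpool create -f -m none -O compression=on local /dev/disk/by-id/mmc-EXAMPLE-part3

# from one node, register the existing zpool as the source for each cluster member
sudo lxc storage create local zfs source=local --target node1
sudo lxc storage create local zfs source=local --target node2
sudo lxc storage create local zfs source=local --target node3
sudo lxc storage create local zfs source=local --target node4

# finally, from any node, turn the pending pool into a created one
sudo lxc storage create local zfs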

@masnax
Contributor

masnax commented Oct 22, 2024

Out of curiosity, if you have another partition you're able to test on, I'd be very interested to see whether the storage pool can be created with a name other than local.

The setup that eventually worked for you seems to just ignore the existing pool error with the -f flag. It's not yet clear if this is an issue with existing zpool state or some race when creating the pool in LXD.

@rmbleeker

There are no other disks or partitions available on the nodes, but since I wasn't far into my project anyway I decided to do some testing and flash the nodes again with a fresh image. I did this twice and set up the cluster again both times. After the first time I used the lxc storage create commands to create a ZFS storage pool with the partition as its source, giving it the name local-zfs. This got me the same errors, leaving the storage pool in the ERRORED state. The second time I used zpool create to first create a pool named local-zfs on the partition, and then used the lxc commands to use that pool as a source for the storage pool. This worked without using the -f flag to force overriding an existing pool, except on node 2, where it claimed a pool named local already existed on the partition.

With all that said and done, these tests weren't conclusive. The fact that the issue still occurred on node 2 after applying a fresh image leads me to believe that some remnants of the contents of a partition are left behind when you re-create the partition with exactly the same parameters, if the storage device isn't properly overwritten beforehand. But apparently that's not always the case, because I could create a new pool without forcing it on 3 of the 4 nodes.

In any case I think that perhaps a --force flag should be implemented for the lxc storage create command, which is then passed along to the underlying command that is used to create a storage pool, just so you can resolve errors like the one I ran into.

@roosterfish
Contributor

In any case I think that perhaps a --force flag should be implemented for the lxc storage create command

You can already pass source.wipe=true when creating the storage pool to wipe the source before trying to create the pool.
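For example, something along these lines per member wipes the partition and creates the pool in one go (paths and member names are examples):

# per cluster member, wiping the source before the pool is created on it
lxc storage create local zfs source=/dev/disk/by-id/mmc-EXAMPLE-part3 source.wipe=true --target node1

# then finalize from any member
lxc storage create local zfs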
