Local ZFS on XCP-NG

Hi all

We bought a new server that runs XCP-NG and we would like to use ZFS for the storage.
It has 2 NVMe drives of 1 TB in a mirror for XCP-NG.
It also has 4 NVMe drives of 2 TB for local storage, set up as a ZFS pool in raidz1.

I have a few questions regarding best practices.

  1. Is it good practice to install my VMs directly on this pool, or should I use datasets instead?

  2. I created a zvol (block storage) for a VM. Should I format this zvol with LVM/EXT4 before installing a VM on top of it, or should I pass the zvol directly to the VM as a block device and only choose the filesystem inside the VM?

     2.1) If I should format the block device before presenting it to the VM, which filesystem should I choose? Let’s say the zvol is formatted with LVM and the VM on top of it uses LVM too, is that considered good practice?

  3. In case both methods are considered valid, what are the pros and cons of each?

  4. The total capacity of the SSDs shown by ZFS is around 5.5TB, but XCP-NG only shows 4.8TB. What can cause this?

I’m coming from Microsoft Hyper-V, so my apologies if these questions are basic.

You cannot use a ZFS pool directly as a storage device; you’ll always need a dataset or volume. That being said, every pool has a root dataset with the same name as the pool. In my opinion it is not good practice to store data in that root dataset — you should create a sensible dataset hierarchy instead. If you really need to, you can even reorganize this hierarchy at a later point.
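For example, a minimal hierarchy might look something like this (just a sketch; tank, vms and the VM names are placeholders, use whatever fits your setup):

zfs create tank/vms
zfs create tank/vms/vm01          # one filesystem dataset per VM
zfs create -V 100G tank/vms/vm02  # or one zvol per VM if you want block storage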

Leave the filesystem and formatting up to the guest system.

Since you mentioned you have 4x 2 TB drives, I assume you are using raidz1. This will give you a usable pool capacity of about 6 TB (terabytes, decimal prefix) = 5.5 TiB (tebibytes, binary prefix). When using zpool list or zfs list, ZFS uses the binary prefix system, so 5.5T means 5.5 TiB; the number shown by ZFS is therefore correct in your case.
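If it helps, here is the arithmetic behind that conversion (assuming roughly three drives’ worth of usable space in a 4-disk raidz1):

# 3 data drives x 2 TB = 6 TB (decimal) of usable space
# 1 TiB = 2^40 bytes, so 6e12 / 2^40 ≈ 5.46 TiB, which zfs list rounds to 5.5T
python3 -c 'print(6e12 / 2**40)'    # -> 5.456968210637569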

As to where the discrepancy comes from, I have no idea. I don’t use XCP-ng so I don’t know where the “4.8TB” would be shown and what it refers to. Could it simply be the remaining free storage after you installed some VMs?

Their docs are pretty good for setting this up.


I also have a video covering storage options in XCP-NG.

Thank you, your info was very insightful!
It’s a clean system, so no VMs installed on it. Still haven’t figured that one out.

Thank you, I have read the documentation, but not everything was/is clear regarding best practices (things like whether I should install a VM on a zvol or directly on a dataset).
Also, finding out how to attach a zvol to a VM to install an OS on it seems to be impossible (probably because I’m finding it hard to wrap my head around certain concepts).

I have seen one of Tom’s videos where he is doing the exact same thing I want, but on TrueNAS (attaching a zvol to a VM through the GUI), and it seems so simple, but on XCP-NG I can’t really seem to make it work.

You DO NOT attach the zvol from a ZFS mount on the host itself; the way I do it in the video is using a zvol on TrueNAS to present iSCSI as a storage target for XCP-NG.

Hi Tom

Thank you for taking the time to respond.

I meant the following video of yours (after the 9:15 mark):

There it seems to me that you are attaching a local zvol to a VM as a disk (no iSCSI or NFS…).
I want to achieve the same thing on XCP-NG.

thank you :slight_smile:

I’m not sure what you are trying to do here. If you need a VM to have a drive connected that comes from a local ZFS pool, then all you need to do is create a new drive from the ZFS SR within Xen Orchestra.

That is not possible in XCP-NG, and the VMs in that video were using the TrueNAS hypervisor.

I’m trying (or rather was trying) to create a zvol and use that zvol as a disk for a VM. But apparently that is not possible in XCP-NG, as Tom has pointed out (it is in TrueNAS).

That clears everything up. I was aware that you were using TrueNAS, but wanted to achieve the same thing on XCP-NG.

What are you gaining by trying to attach a zvol to a VM? That is essentially what you are doing when you create a new virtual drive in Xen Orchestra.

If I’m correct, they are not entirely the same. When you use Xen Orchestra to create a virtual drive, it just creates a VHD file (and VHD files have some limitations).

A VHD file is not the same as a zvol. A zvol is more like using the RAW format feature that XCP-NG offers, but with the added benefits of ZFS.

You are still getting the benefits of ZFS because the VHD is on that dataset. I have local ZFS on my XCP-ng host. You are correct that VHDs cannot be grown while the VM is running, but that is a limitation of XCP-ng. I know they are working on supporting more formats, but I don’t think it is a priority for them.

You still get snapshots and data integrity with synchronous writes on the dataset.

Thank you for sharing your experience. It helped me with some new insights and less worrying. Will for sure hit you up if I have any questions :slight_smile:

It is actually possible to pass through a zvol to a VM in XCP-NG. I have done it, but I have not tested it extensively. If you want to try it, just create a zvol like:

zfs create -V 100G zfspool/MyZvol

You will then have a device at /dev/zd0.

Then follow the instructions for passing through physical disks and creating a local SR. You will eventually end up with a symbolic link from /dev/zd0 to /srv/pass_drives/zd0. Just make sure you do not try renaming the link to something fancy or it will not work.
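In case it helps, this is roughly what those steps looked like on my host (a sketch only; the SR name, the /srv/pass_drives path and the UUID placeholders are mine, and the udev SR type is what the physical-disk passthrough instructions use, so double-check against the XCP-ng docs):

mkdir -p /srv/pass_drives
ln -s /dev/zd0 /srv/pass_drives/zd0
xe sr-create host-uuid=<host-uuid> name-label="local-zvol-passthrough" type=udev content-type=disk device-config:location=/srv/pass_drives
xe sr-scan uuid=<sr-uuid>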

After creating the SR and rescanning it, you will see a disk labeled as unknown bus type. However, you should see the accurate disk size. You can go into the disk properties and change this name to something more descriptive like VmName-Disk1 or whatever.

The only thing to keep in mind is that you will not be able to attach the disk to the VM at creation time. You will have to let the wizard create a temporary disk. Then you can shut down the VM, delete the temp disk, and attach the zvol-backed disk. After that you can boot to ISO and install whatever OS. The process is the same whether you use XCP-NG Center or XO.

This is as far as I have gotten in my lab so far, so I don’t know if something could break. If you are still interested in doing this, I hope this works for you.

After playing around with this for a bit, I have observed that with this approach you lose a significant amount of manageability of the VM within XCP-ng and XO. For instance, copying a VM is not possible unless the disk is detached beforehand.

However, with the zvol-backed VM I was able to make instant snapshots and clones of the zvol, much faster than through XCP-ng. With the ZFS snapshots I can now send them incrementally to another ZFS pool for backup or disaster recovery. With a file-based SR you really do lose out on many of the benefits of ZFS if you use just a single dataset as a local SR with a bunch of VHDs. ZFS snapshots are pretty useless in that scenario, because you would have to snapshot the whole SR with all the VHDs in varying states, which might end up making them all unusable after a restore. An SR driver would be ideal to manage all the ZFSisms under the hood while allowing proper management of the VM from within the console.
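For reference, the ZFS side of that workflow is just the standard snapshot/send commands (a sketch with made-up pool and snapshot names):

zfs snapshot zfspool/MyZvol@snap1
zfs send zfspool/MyZvol@snap1 | zfs receive backuppool/MyZvol                            # initial full copy
zfs snapshot zfspool/MyZvol@snap2
zfs send -i zfspool/MyZvol@snap1 zfspool/MyZvol@snap2 | zfs receive backuppool/MyZvol    # incremental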

A workable and promising compromise, perhaps, is creating a separate ZFS dataset for each VM and making each dataset a dedicated file-based SR. From the perspective of XCP-ng, each VM would be a typical VHD-backed VM, and you would not lose any manageability of the VM. From the ZFS side, you would now be able to make instant ZFS snapshots of the SR specific to that VM without XCP noticing anything is going on. Snapshots and rollbacks could more or less be done behind the scenes without XCP being any the wiser. Cloning via ZFS might pose a problem, though, in that you would then have to deal with changing the UUID of the resulting SR, I think. Still, I think the added manual effort would be substantially less than dealing with the zvol. I would have to play with this method and see. This method would probably give the most benefits from both sides, although a dedicated SR driver for it would be even better.
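As a rough sketch of what I mean (hypothetical names throughout; type=file is the standard XCP-ng local file-based SR, so each dataset’s mountpoint becomes its own SR):

zfs create zfspool/sr-vm01
xe sr-create host-uuid=<host-uuid> name-label="vm01-sr" type=file device-config:location=/zfspool/sr-vm01
# later, snapshot just this VM's SR behind XCP-ng's back
zfs snapshot zfspool/sr-vm01@pre-update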

The final thing to keep in mind is that you would probably have to shut down the VM before doing any kind of ZFS operation on the currently live dataset, whether it backs a zvol or a VHD, to avoid corruption. Working with existing prior snapshots should be safe while the VM is running, I think.