XCP-ng loses iSCSI LUN after target reboots

ldellus · June 14, 2021, 8:53am

Hi,

I have a strange problem and I’d appreciate some guidance. I have an XCP-ng 8.2 host and a Debian 10 iSCSI target. On the target I tried all combinations of tgt, LIO, LVM and ZFS, and could make each one work perfectly fine until the target was rebooted, at which point the host lost the SR, and the only solution was to start from scratch.

What I did:

On the target, created a VG named vg01 and an LV named lv01.
Set up the tgt configuration file with backing-store /dev/mapper/vg01-lv01
Restarted the tgt daemon
On the host, added an iSCSI SR. It found the target immediately, attached it, and formatted it.
On the SR, created 3 disks
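
A tgt target definition matching the steps above would look roughly like the sketch below. The file name and the IQN are illustrative placeholders, not values taken from the actual setup; only the backing-store path is the one described above.

```
# /etc/tgt/conf.d/xcp-sr.conf
# (hypothetical file name and IQN, shown for illustration only)
<target iqn.2021-06.lan.example:vg01.lv01>
    # the LV created in the first step, exported as the LUN
    backing-store /dev/mapper/vg01-lv01
</target>
```

After restarting tgt, running `tgtadm --lld iscsi --mode target --op show` on the target should list the LUN together with its backing store path.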

At this point, everything worked fine. On the target, under /dev/mapper, I could see the following logical volumes: one for the SR metadata (MGT) and one for each disk I created on the SR:
VG_XenStorage--efd60b3a--a27b--65ea--c235--2bbb87e8bf8b-MGT
VG_XenStorage--efd60b3a--a27b--65ea--c235--2bbb87e8bf8b-VHD--1caf0bc3--4d80--4363--8e85--62517dd34832
VG_XenStorage--efd60b3a--a27b--65ea--c235--2bbb87e8bf8b-VHD--c5ee92f5--b05b--4b77--9a33--612218e8da99
VG_XenStorage--efd60b3a--a27b--65ea--c235--2bbb87e8bf8b-VHD--f347b9ef--ba3a--4d68--ad3d--83f20edd5eaa
I could attach the disks to a VM, format them and use them inside the VM, no problems.

Then I rebooted the target. The SR disconnected from the target, as expected, but when I tried to reattach it, I got a backend error. From this point on it was impossible to reattach the SR and recover the disks on it. The only solution was to forget the SR, delete all the VG_XenStorage logical volumes on the target, restart tgt, and recreate the SR, which then worked. It seems that the logical volumes created by the host are screwing things up. I got the exact same problem with LIO and ZFS.

Thanks,

Laurent

LTS_Tom · June 14, 2021, 10:55am

All the iSCSI systems we normally use have been either TrueNAS or Synology, and they have always worked fine, even across reboots. I’m not really sure where the issue is, but I would look on the Debian side.

ldellus · June 14, 2021, 11:53am

Thanks Tom for the quick reply. I was using openmediavault, and that’s where the problem started, so for debugging purposes I thought I’d simplify the setup and use plain Debian. I agree with you, I think there’s a problem with the way Debian works as an iSCSI target.
I’ll give TrueNAS a try. The funny thing is that I hesitated between TrueNAS and openmediavault.

Laurent