In order to make the best use of a server with 2 x 4 TB of storage, I have installed XCP-ng and created 2 VMs:

1st on the 1st drive / SR (2 x 2 TB disks merged with LVM to create roughly 4 TB)
2nd on the 2nd drive / SR (2 x 2 TB disks merged with LVM to create roughly 4 TB)

Both do nothing other than run rsnapshot (this setup halved the time it took a single server to back up the same number of servers).

So far so good. I am really happy with the performance and everything was working great… until the file system on the second VM went read-only. No problem: quick reboot and fsck… nope, this disk is trashed! Four passes of fsck and it is still showing errors… lots of errors.

So… there is a good chance this is just a broken hard drive…


smartctl --test=long /dev/sdb

smartctl -l selftest /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.19.0+1] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
1    Extended offline    Completed without error  00%        2934             -
2    Short offline       Completed without error  00%        1904             -
3    Short offline       Completed without error  00%        1904             -

So if there is no problem on the disk… how on earth did the VM get so completely trashed?

Am I doing anything inherently stupid by setting things up like this? All the VMs do is pull a bunch of files over the network once a night, so that is not much work.

Just because SMART does not report an error does not mean the drive has no issues. Go through the logs in /var/log/xensource.log and /var/log/xenstored-access.log and look for error messages.
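
A rough way to do that log check is sketched below. The function name `scan_log_errors` and the keyword list are my own assumptions, not an official XCP-ng list, so widen or narrow the pattern as needed. It prints each matching line with its file name and line number.

```shell
#!/bin/sh
# scan_log_errors is a hypothetical helper name, not part of XCP-ng.
# It prints lines containing common failure keywords, prefixed with
# the file name and line number. The keyword list is an assumption.
scan_log_errors() {
    for log in "$@"; do
        # Skip files that do not exist or cannot be read
        [ -r "$log" ] || continue
        # -H: show file name, -n: line number, -i: ignore case, -E: extended regex
        grep -H -n -i -E 'error|fail|i/o|read-only|corrupt' "$log" || true
    done
    return 0
}

# The two log files mentioned above (run in dom0)
scan_log_errors /var/log/xensource.log /var/log/xenstored-access.log
```

Expect some noise, since a broad keyword match will also catch harmless lines. Redirecting the output to a text file also makes it easy to paste the results here.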

I am guessing none of this is good news for the drive…

That’s hard to read; it is always better to copy and paste logs instead of posting screenshots. But yes, that looks bad.