XCP-ng backup and DR - Windows Server 2025 with AD-DS

I’m working out a better backup and DR plan, and remembered something while thinking about the health checks: when a health check runs, does the process time out after XX minutes?

Yes, I know, you are thinking that the management agent should be up pretty quickly, but there’s a wrench in the Server 2025 gears.

When running Server 2025 with Active Directory (and maybe other roles), something prevents the management agent (MA) from starting. After some fooling around, I found that the simple fix was to open Services → select the MA service and stop it → then double-click it and set the startup type to Automatic (Delayed Start). It’s best to then restart the VM to reload everything. Now the system boots, but it takes “a long time” for the MA to start. It could be 1 minute, could be longer; I never really timed it on the single 2025 VM that I have in my lab.
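For anyone who would rather script that change than click through the Services console, here is a rough Python sketch that shells out to sc.exe. The service name below is only a placeholder assumption on my part; look up the real MA service name in Services before using it, and run it from an elevated prompt inside the guest. I have not run this against the stuck-MA condition itself.

```python
import subprocess
import sys

# ASSUMPTION: placeholder service name, not confirmed anywhere in this thread.
# Look up the real management agent service name in the Services console.
MA_SERVICE = "XenSvc"


def delayed_start_commands(service):
    """Build the sc.exe calls: stop the service, then set Automatic (Delayed Start)."""
    return [
        ["sc.exe", "stop", service],
        # sc.exe expects "start=" and its value as two separate tokens
        ["sc.exe", "config", service, "start=", "delayed-auto"],
    ]


def apply(service=MA_SERVICE):
    """Run each command and print its exit code."""
    for cmd in delayed_start_commands(service):
        # check=False: "stop" returns an error if the service is not running
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(" ".join(cmd), "->", result.returncode)


if __name__ == "__main__":
    if sys.platform == "win32":
        apply()
    else:
        # Anywhere else, just show what would be run
        for cmd in delayed_start_commands(MA_SERVICE):
            print("would run:", " ".join(cmd))
```

A reboot afterwards is still a good idea, same as with the manual steps.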

So does the health check time out if the MA takes too long to start?

Back to the MA getting stuck… This has some side effects! Things like the Windows Installer will not function while the MA is locked up, so you cannot uninstall the MA, install updates, etc. Another user found a few other things that didn’t work in this condition; I’d have to go back through the thread to see which parts do not function.

This testing was done with the MA/drivers from XenServer versions 9.33 and 9.4; I have not tested this with the XCP-ng drivers/MA yet. Just haven’t had the kind of time I want to work on all this stuff!

This is quite interesting. I stood up a 2025 server a while back for testing purposes, but never tested the backups like that since it was a testing box.

I’ll do some testing on this and see what I find.

The important thing with the MA is that if you have AD-DS installed and configured, it is going to fail to start, which might fail the backup process. It may fail a migration too; again, I haven’t tested all scenarios with this issue yet.

I was using the Eval version, but one of the other people on the XCP forums was using the production version.

There’s also an odd thing where 2025 with the drivers lets you safely eject the OS drive. There’s a registry hack for fixing this (also in the XCP forums). If memory serves, this drive eject also happens in Windows 11.

Well, think I did something wrong here:

Going to let it go for a while and see what happens. Going from the lab pool with NFS storage to the lab backup pool with local NVMe storage, I don’t think the storage is the problem. XO is running on LAB and everything has a 10GbE connection. But the hosts are little HP T740 computers and this might be where the problems start; 64GB of RAM though, so should be OK.

The timeout for the Health Check is 10 minutes:

healthCheckTimeout = '10m' # the default

There is a discussion in their forums about it.
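To picture what that limit means for a slow management agent, here is a minimal Python sketch of a poll-until-deadline loop. This is not XO's actual code: `parse_duration`, `wait_for_agent` and the `agent_is_up` callback are made-up names, and I am only assuming the health check behaves roughly like this.

```python
import time

# Seconds per unit suffix; only the simple forms like '10m' are handled here.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Turn strings like '10m' or '90s' into seconds (hypothetical helper)."""
    return int(text[:-1]) * _UNITS[text[-1]]


def wait_for_agent(agent_is_up, timeout="10m", poll_every=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll agent_is_up() until it returns True or the timeout expires.

    agent_is_up is a caller-supplied function (hypothetical) that reports
    whether the guest's management agent has checked in yet.
    """
    deadline = clock() + parse_duration(timeout)
    while clock() < deadline:
        if agent_is_up():
            return True   # agent reported in before the deadline
        sleep(poll_every)
    return False          # timed out: the check would count as failed
```

If the check works anything like this, an MA on delayed start that shows up after a minute or two should still pass with the default, while one that never starts would fail once the 10 minutes are up.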