XCP-NG High Availability

I am trying to wrap my head around HA in XCP-NG, specifically whether it’s needed or required.
This assumes shared storage, let’s say NFS.
The scenario: if I have, say, 3x XCP-NG servers in a pool with no HA configured, and one of the servers in the pool dies while hosting a VM, does that VM automatically start up on another server in the pool?
Or let’s say I need to do maintenance on a server, so I install my updates and click reboot. Would those VMs be moved to another server?
If these conditions are true, then why should HA be enabled?

The documentation seems to nudge away from using HA.

Without HA on any VMs, but with autostart on the VMs here is what I have seen happen (but might not always happen). This also assumes shared storage (NFS or iSCSI) for the VMs.

The host that is running the VM dies, and after several minutes the VM will autostart on another host. This does leave you with a problem: if the dead host powers back up, it may try to launch that VM again. I’ve only tested this once or twice.

If you are installing updates on the pool, you can use the rolling pool update. This will migrate each VM off of the host that is getting updated, reboot that host, then migrate VMs back to the host as well as migrate VMs off the next host that needs updates. These automatic migrations require that you have the management agent installed on each VM; it’s failed every time I’ve tried this without the management agent installed, at least with Windows VMs.
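
To make the order of operations concrete, here is a rough Python sketch of the sequence described above. It is only a toy model of the behavior as I described it, not how Xen Orchestra actually implements it: the `rolling_update` function, the host names, and the round-robin placement are all made up for illustration.

```python
from typing import Dict, List


def rolling_update(pool: Dict[str, List[str]]) -> List[str]:
    """Simulate a rolling update on a toy pool and return a log of steps.

    `pool` maps a host name to the list of VM names running on it.
    Hypothetical model only: real placement depends on memory, storage, etc.
    """
    hosts = list(pool)
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to evacuate one")

    log: List[str] = []
    for host in hosts:
        evacuated = pool[host]
        pool[host] = []
        others = [h for h in hosts if h != host]

        # Move every VM off the host, spreading them round-robin.
        placed = []
        for i, vm in enumerate(evacuated):
            target = others[i % len(others)]
            pool[target].append(vm)
            placed.append((vm, target))
            log.append(f"migrate {vm}: {host} -> {target}")

        log.append(f"update and reboot {host}")

        # Bring the VMs back once the host has returned.
        for vm, target in placed:
            pool[target].remove(vm)
            pool[host].append(vm)
            log.append(f"migrate {vm}: {target} -> {host}")
    return log


demo_pool = {"host1": ["vm-a", "vm-b"], "host2": ["vm-c"], "host3": []}
for step in rolling_update(demo_pool):
    print(step)
```

Every "migrate" line in that log is a live migration, which is the step that fails for me when the management agent is missing.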

For any other work on a host, you should migrate the VMs off manually to be safe, and definitely before a reboot. If you use the reboot button in XO, it may migrate the VMs automatically; I haven’t tried it myself.

Interesting. Maybe I’m misunderstanding here, but based on what you wrote there doesn’t seem to be much advantage to setting up HA. As long as you have multiple servers in the pool, the VMs would migrate, whether for maintenance or if you pull the plug on a server.

I have a video here that explains how HA works in XCP-NG

I’ve watched your video a few times so I get the concept, but where I’m confused is: what does HA provide that having a few servers in a pool doesn’t?
If a server goes down in my pool, doesn’t the VM still start up on another server without an HA configuration?

No. If an XCP-NG host stops in a non-HA setup, it is up to you to restart all the VMs that were on that server. If an HA pool is configured and you have a VM with HA enabled under the Advanced tab, and the XCP-NG host that VM is running on stops, then that VM will restart on another available XCP-NG host in the HA pool.
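
If you would rather do this from the command line than from the XO tabs, the two steps (enable HA on the pool, then mark a VM as protected) map to two `xe` commands, as far as I understand the XCP-ng HA documentation. The Python below only builds the argument lists and prints them; it does not run anything. The `ha_enable_commands` helper and the `<SR_UUID>` / `<VM_UUID>` placeholders are mine, so check the commands against the docs for your version before running them on a pool master.

```python
import shlex
from typing import List


def ha_enable_commands(heartbeat_sr_uuid: str, vm_uuid: str) -> List[List[str]]:
    """Build (but do not run) the xe commands for a basic HA setup.

    Assumption: subcommand and parameter names follow the XCP-ng HA docs.
    """
    return [
        # Enable HA on the pool, using a shared SR for the heartbeat.
        ["xe", "pool-ha-enable", f"heartbeat-sr-uuids={heartbeat_sr_uuid}"],
        # Mark one VM as protected so HA restarts it after a host failure.
        ["xe", "vm-param-set", f"uuid={vm_uuid}", "ha-restart-priority=restart"],
    ]


for cmd in ha_enable_commands("<SR_UUID>", "<VM_UUID>"):
    print(shlex.join(cmd))
```

VMs that never get the second command stay unprotected, which matches the non-HA behavior: they stay down until you start them yourself.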

We mostly don’t find HA necessary because having all the VMs in the same resource pool on shared storage makes it simple to move a VM from one XCP-NG host to another, and unless you are running on really old, worn-out hardware, host failures are not very common.

The autostart situation that I described is not guaranteed. I’ve seen it work, but it’s not something I would rely on, and it can take many minutes to decide to start again. HA starts the VM the next time the heartbeat happens and it sees the VM down. HA also checks the condition of all VMs when the failed host starts back up, to make sure it isn’t starting a VM that is already running and in a newer state. HA is very clever, and I’m still deciding whether I want to set it back up on my production system.
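
Here is a small Python toy model of the two behaviors described in this thread: protected VMs get restarted on a surviving host, unprotected ones stay halted, and a returning host should only start VMs that are not already running somewhere else. It is a simplified illustration with made-up names (`Vm`, `host_failed`, `safe_to_start`), not the real HA logic, which also looks at capacity, restart order, and the storage heartbeat.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vm:
    name: str
    protected: bool        # True if HA restart is enabled for this VM
    host: Optional[str]    # host currently running the VM, None if halted


def host_failed(vms: List[Vm], failed: str, survivors: List[str]) -> None:
    """Toy reaction to a host failure.

    Protected VMs restart on the first surviving host; unprotected VMs
    stay halted until someone starts them by hand.
    """
    for vm in vms:
        if vm.host != failed:
            continue
        vm.host = survivors[0] if (vm.protected and survivors) else None


def safe_to_start(vms: List[Vm], wanted: List[str]) -> List[str]:
    """VMs a returning host may start: only those not running elsewhere."""
    return [vm.name for vm in vms if vm.name in wanted and vm.host is None]


vms = [Vm("fw", True, "host1"), Vm("test", False, "host1"), Vm("db", True, "host2")]
host_failed(vms, "host1", ["host2", "host3"])
print([(vm.name, vm.host) for vm in vms])
print(safe_to_start(vms, ["fw", "test"]))
```

The second function is the part that plain autostart does not give you: without that check, the old host can come back and launch a second copy of a VM that is already running.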

Also, again, to get the full benefits of running in a pool, you really need shared storage or hyperconverged storage.

I’d suggest getting some old servers (3 of them) and something to work as shared storage, and setting up a lab to test some of this.

I guess this leads into my other question regarding HA.
So far, for deployments we put 2x Supermicros in a datacenter. These SMs run an instance of, say, a firewall (Palo) or SD-WAN (Versa).

What is the proper way of virtualizing a network function such as a firewall? Does it make sense to have a shared resource pool, or is it common to have your virtual instances on local storage? HA within the instance is spread to the other server on site. Also, in that design, does it make sense to set up the hypervisors in HA anyway?

You need at least 3 XCP-NG hosts and shared storage for HA to work.

Also if you have VMs on local storage, you can’t do automatic rolling pool updates/upgrades. Or at least it isn’t working for me. If you bought XOStor then things might be different, but I haven’t been able to test it.