XCP/XOA - Any risk in this hardware setup (and a few other questions)?

Hello,

I am in the fairly early stages of planning an XCP-NG/XOA infrastructure deployment for both Production and UAT/Test environments.

The goal:
Move our currently externally hosted Microsoft Dynamics 2016 CRM and website to our own infrastructure (this will save us $75k/y in hosting fees), along with a few other VMs, so a public DMZ is needed.

Backstory:
We were originally going to use 2x VMware Essential licenses to manage the whole setup, but I then discovered XCP-NG and I’m keen to try it out (all the additional functionality is hard to beat). All the hardware was originally purchased with the intent to use VMware, so I might not have things optimally set up for XCP-NG, which is why I’m here.

The hardware:

I had a killer opportunity to buy 6 servers for significantly less than my original budget for 1 current high-end server (I was looking at a single AMD Epyc based server, but could only afford one at ~$23k).

  • 4 x Dell R620 (128GB DDR3 RAM, E5-2660 V2, no HDD, 2/2 SFP+/RJ45G NICs)

  • 2 x Dell R730xd (128GB DDR4 RAM, E5-2660 V3, no HDD, 4 SFP+/2 RJ45G NICs). The vendor only had 2!

  • Synology FS3400 with ~48TB and 1.6GB of NVMe Cache.

  • Mikrotik CRS317-1g-16s+RM (running purely in switch mode)

  • 2 x Mikrotik CSS610-8G-2S+in (also running in switch mode)

  • Currently a compiled version of XO, just to get set up and test everything before committing to the enterprise license (and transferring the config to a properly licensed XOA).

All of these servers are interconnected with dual 10Gb interfaces (I still have to learn XO networking) via the Mikrotik CRS317-1g-16s+RM, purely for the VM traffic. All actual VM network traffic will traverse the pair of Mikrotik CSS610-8G-2S+in switches, each configured for DMZ traffic (web and VPN), and all other network traffic will traverse a D-Link DGS-1210-24P, which we are re-using for now but will replace soon enough.

All of this traffic traverses a Fortigate firewall that is capable of handling a maxed out 1Gig fiber connection with all inspections enabled.

The question:
I’m just setting up my PROD/UAT Pools, and I was thinking:

Prod: 2 x R730xd and 2 x R620
UAT: 2 x R620

All VM storage will live on the Synology NAS using NFS/iSCSI (mostly NFS; the SQL database will use iSCSI).

Will having mixed generations of servers cause problems for HA? (I understand XCP-NG will operate with the lowest CPU capabilities.) I’m just not sure whether there are other “hidden” problems that will arise and that aren’t well documented.
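To make the "lowest CPU capabilities" point concrete, here is a minimal sketch of the idea: a mixed pool can only expose the CPU features that every member host has. The flag lists are heavily abbreviated and the host names are hypothetical; this illustrates the principle only and is not how XCP-NG itself computes the pool feature mask.

```python
# Abbreviated, illustrative feature sets; real hosts report many more flags.
# E5-2660 v2 is Ivy Bridge-EP, E5-2660 v3 is Haswell-EP (which adds AVX2, FMA, BMI1/2).
IVY_BRIDGE_EP = {"sse4_2", "aes", "avx", "f16c", "rdrand"}
HASWELL_EP = IVY_BRIDGE_EP | {"avx2", "fma", "bmi1", "bmi2"}


def pool_feature_set(hosts):
    """Return the CPU features common to every host in the pool."""
    feature_sets = list(hosts.values())
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


def masked_features(hosts):
    """Map each host to the features it has but the pool cannot expose."""
    common = pool_feature_set(hosts)
    return {name: sorted(flags - common) for name, flags in hosts.items()}


# Hypothetical host names for the proposed mixed Prod pool.
mixed_prod = {
    "r730xd-1": HASWELL_EP,
    "r730xd-2": HASWELL_EP,
    "r620-1": IVY_BRIDGE_EP,
    "r620-2": IVY_BRIDGE_EP,
}

if __name__ == "__main__":
    print("Pool-wide features:", sorted(pool_feature_set(mixed_prod)))
    for host, hidden in masked_features(mixed_prod).items():
        print(host, "features hidden from VMs:", hidden)
```

In this sketch the R730xd hosts lose the newer instruction sets inside the mixed pool, which only matters if a workload actually depends on them.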

The other thought I had, was doing a 2 x 2 x 2 setup:

Prod: 2 x R730’s
Prod-Backup: 2 x R620’s
UAT: 2 x R620’s.
The idea is that should “Prod” go down, all VMs can be moved to Prod-Backup if needed (but somehow this seems like not the best use of hardware to me).

Second Question:
Does anybody have a good video link explaining how to properly set up each server’s 10Gig interfaces on a specific VLAN for VM traffic only, and then assign the other 1Gig interfaces as dedicated to a specific VM as needed (for my DMZ, for example)?

Third Question:
Does anybody know if you can transfer the entire config from a compiled XO setup to a licensed XOA server?

Looking forward to your recommendations.

When mixing servers in the same pool, as you noted, XCP-NG will operate with the lowest CPU capabilities. In addition, if the servers in the pool are not identical and their network interfaces do not line up, such as eth0 being 1Gb on one system but 10Gb on another, you will have to do some per-system rearranging of the networks to get them in the correct order.
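As a rough illustration of the interface-ordering problem, the sketch below compares per-host NIC layouts against the first host and reports where the same interface name has a different link speed. The host names and layouts are made up for the example; on real hosts you would collect this information from each server yourself.

```python
def nic_mismatches(layouts):
    """Compare every host's NIC layout with the first host's layout.

    `layouts` maps host name -> {interface name: link speed in Gbit/s}.
    Returns {host: {interface: (reference_speed, host_speed)}} for each
    interface whose speed differs or that is missing on one side (None).
    """
    hosts = list(layouts)
    if not hosts:
        return {}
    reference = layouts[hosts[0]]
    result = {}
    for host in hosts[1:]:
        layout = layouts[host]
        diffs = {}
        for nic in sorted(set(reference) | set(layout)):
            ref_speed = reference.get(nic)
            speed = layout.get(nic)
            if ref_speed != speed:
                diffs[nic] = (ref_speed, speed)
        if diffs:
            result[host] = diffs
    return result


# Hypothetical layouts: the same ports exist on both hosts, but in a different order.
example_layouts = {
    "host-a": {"eth0": 10, "eth1": 10, "eth2": 1, "eth3": 1},
    "host-b": {"eth0": 1, "eth1": 1, "eth2": 10, "eth3": 10},
}

if __name__ == "__main__":
    for host, diffs in nic_mismatches(example_layouts).items():
        for nic, (expected, actual) in diffs.items():
            print(f"{host}: {nic} is {actual}Gb, reference host has {expected}Gb")
```

Any host that shows up in the result is one where the networks would need rearranging before a pool-wide network maps to the same kind of port everywhere.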

HA requires at least 3 servers, and for the most trouble-free setup they should be identical. But do you really need HA? HA complicates things, and if a server fails, manually restarting the VM on another server is really easy and much less complex.

I have networking covered here

And transferring the config should not be an issue between versions of XO.


Could you describe how HA complicates things?

I’ve seen that mentioned, but nobody has said how. The main reason for HA is that I will be hosting a publicly accessible website. This production environment is only expected to be alive for ~2 years while we work on replacing the CRM. HA and the ability to move live VMs across servers were the main drivers to switch to XCP (over the high expense of VMware licensing to get access to these features).

It is just more complex to set up and configure, because you need at least three hosts and redundant switches to do it well, but it does allow a VM to restart automatically on another host if a host fails, without any user intervention. We only have two servers for many of our clients because if a system fails, we first examine why it failed; then, if the host cannot be brought back in a reasonable / acceptable amount of time, we can simply start the VMs on the other host. Also, you don’t need HA for live migrations to work.
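Whether the restart is automatic (HA) or manual, the remaining hosts need enough free memory to take over the failed host's VMs. Here is a minimal sketch of that aggregate check. The VM sizes and the per-host reserve are assumed numbers for illustration, and the check only compares totals, so it ignores how individual VMs pack onto individual hosts.

```python
def survives_host_failure(host_ram_gb, vm_ram_gb, failures=1, reserve_gb=8):
    """Return True if the total VM RAM fits after losing the largest hosts.

    host_ram_gb: physical RAM per host, in GB.
    vm_ram_gb:   RAM assigned to each VM, in GB.
    failures:    how many hosts are assumed to fail (worst case: the largest).
    reserve_gb:  assumed RAM kept back per host for the hypervisor itself.
    """
    usable = sorted(max(ram - reserve_gb, 0) for ram in host_ram_gb)
    # Drop the largest hosts, since losing them is the worst case.
    surviving = usable[: max(len(usable) - failures, 0)]
    return sum(vm_ram_gb) <= sum(surviving)


# Hypothetical VM sizes (GB); the hosts all have 128GB as in the hardware list.
example_vms = [64, 64, 32, 32, 16]

if __name__ == "__main__":
    print("4-host pool, 1 failure:", survives_host_failure([128] * 4, example_vms))
    print("2-host pool, 1 failure:", survives_host_failure([128] * 2, example_vms))
```

With these assumed VM sizes a four-host pool can absorb one host failure, while a two-host pool cannot, which is the kind of sizing question worth answering before choosing between the pool layouts discussed above.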

It's an older post, but there is a discussion about the scenarios for HA here


Interestingly, I double-checked my Ethernet adapters and realized that my R730s come with essentially the same NIC as my R620s from a port-to-port perspective. I had added a dual SFP+ card to both before realizing that 2 of the 4 RJ45 ports are actually 10Gig. I just need to order 2 Gbic10-T adapters for my switch to accommodate them and remove the two extra cards to make the servers match entirely. This should allow me to do HA with minimal concerns. I also plan on purchasing a second Mikrotik CRS317-1g-16s+RM switch as a fail-over switch (they are so cheap, why not).