New servers for XCP-NG?

I need to budget for new servers and want to move everything from real hardware to an XCP-NG system; budget is, of course, an issue. The question is: would you go with a single-CPU or dual-CPU system? This is for my production system at work, so a fair bit more money than my out-of-pocket lab.

I’m seeing enough core count in Intel Silver or Gold processors that I could go with a single CPU. RAM, on the other hand, is not exactly where I’d like it, but maybe I can make up some ground on the price.

What I want is around 24 cores (48 threads) and 128GB of RAM or more. With over-provisioning I could probably get by with 10 to 12 cores and 96GB.

The goal is to have 3 of these servers and run them in HA. Local storage is not something I’m looking at unless XOSTOR matures quickly, but I’ll have at least 4 drive bays in any chassis I buy.

Looking at Supermicro because they have worked well for me for the last 18 years; I may look at Dell too.

I’m also trying to decide between 1U and 2U servers; 2U is much easier to work inside and may run cooler. I’ll probably run the OS drives as RAID1, and a 2U chassis would give me at least 6 more drive bays if I want/need to have local storage.

Any help on this choice would be appreciated.

If any of the VMs will run Windows, it’s difficult to over-provision cores. Windows is noisy: its VMs frequently send interrupts to the CPU, and if you over-provision, other VMs will stall (really just slow down, since this happens on nanosecond timescales) when they need CPU cycles. This is compounded when you treat hyper-threaded cores as real cores (acting as if you have 48 cores and provisioning 60 vCPUs on a 24-core CPU).
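To make that concrete, here is a back-of-the-envelope sketch (using the hypothetical numbers above, not measurements) of why counting hyper-threads as cores hides the real over-provisioning ratio:

```python
# Back-of-the-envelope over-provisioning check; all numbers are the
# hypothetical ones from the paragraph above, not measurements.
physical_cores = 24           # real cores on the host
threads = physical_cores * 2  # hyper-threaded "cores" = 48
vcpus_assigned = 60           # sum of vCPUs across all VMs on the host

ratio_vs_threads = vcpus_assigned / threads        # looks like ~1.25:1
ratio_vs_cores = vcpus_assigned / physical_cores   # really 2.5:1

print(f"vCPU:thread ratio = {ratio_vs_threads:.2f}:1")
print(f"vCPU:core   ratio = {ratio_vs_cores:.2f}:1")

# Rule of thumb from above: for interrupt-heavy Windows guests, keep
# total vCPUs at or below the PHYSICAL core count (ratio <= 1.0).
if ratio_vs_cores > 1.0:
    print("Over-provisioned against physical cores; expect Windows VM stalls.")
```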

My experience comes from running Horizon VDI and a crap ton of Win10 machines. I learned this lesson the hard way. VMware’s documentation does state somewhere not to exceed the physical core count for running machines (too lazy to source/cite that).

I can’t say how this translates to XenServer / XCP-ng, but considering the issue is Windows plowing through CPU cycles like a heartbeat, I would imagine it’s present in any hypervisor. So just something to keep in mind. One solution may be to containerize some of your services where possible.

I’m not sure what your goals are for storage, but using shared silo storage with 3 hypervisor servers can quickly rob you of IOPS, especially if you do not have flash pools / volumes. SATA spinners top out around 200 IOPS, so you would be relying on cache or the controller to handle large demands. If possible, get some flash storage, or pick up GlusterFS / Ceph and make the 3 hosts a hyper-converged solution with distributed storage.
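As a rough illustration (all figures here are assumptions, not benchmarks), even a modest VM count eats spinner IOPS quickly:

```python
# Rough aggregate-IOPS estimate for shared storage feeding 3 hosts.
# All numbers are illustrative assumptions, not measurements.
SPINNER_IOPS = 200   # rough ceiling for one 7.2k SATA spinner
SSD_IOPS = 50_000    # conservative figure for one SATA SSD

def mirrored_read_iops(drives: int, per_drive: int) -> int:
    """Reads in RAID1/RAID10 can be serviced by every drive in the set;
    writes land on every mirror, so budget roughly half this for writes."""
    return drives * per_drive

hosts, vms_per_host, iops_per_vm = 3, 5, 100  # assumed steady-state demand
demand = hosts * vms_per_host * iops_per_vm

print(f"Estimated demand: {demand} IOPS")
print(f"6x SATA spinners (RAID10, reads): {mirrored_read_iops(6, SPINNER_IOPS)} IOPS")
print(f"2x SATA SSDs     (RAID1,  reads): {mirrored_read_iops(2, SSD_IOPS)} IOPS")
```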

However, the nice thing about storage is you can always start at the bottom and work your way up, moving the VMs to the new storage solution.

I haven’t looked at server prices in a while since we just bought ~20x Dell VxRail systems in 2019, but keep an eye on chip shortage news, etc… If it can wait, it may be cheaper next summer.

Thanks.

My plan is not to over-provision anything, but my plan does not equal administration’s plan. If I could get 40 threads per box, I would only have minimal over-provisioning even when the other two hosts were down. That was my goal from the start, but it’s looking grim.

I need 5 Windows servers with 8 “cores” (really threads) and at least 8GB of RAM each (16 would be better for some). Two of those servers will be running AD.
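Sketching that worst case (one surviving host carrying everything; numbers taken from the requirements above, with RAM budgeted at the higher 16GB figure) shows why 40 threads per box is the magic number:

```python
# Failover sizing sketch: 5 Windows servers x 8 vCPUs, budgeted at 16 GB
# RAM each, with a worst case of ONE host carrying the whole pool.
vms, vcpus_per_vm, ram_per_vm_gb = 5, 8, 16

total_vcpus = vms * vcpus_per_vm    # 40
total_ram_gb = vms * ram_per_vm_gb  # 80

# Two candidate host sizes from the discussion (threads, GB of RAM).
for host_threads, host_ram_gb in [(40, 96), (48, 128)]:
    fits = total_vcpus <= host_threads and total_ram_gb <= host_ram_gb
    print(f"{host_threads} threads / {host_ram_gb} GB host: "
          f"{'fits' if fits else 'does NOT fit'} the full load")
```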

As far as price goes, I’m looking to budget now, and the purchase (if I’m lucky) won’t happen until summer or after September 2022. If things take a price dive, that just means I can stuff more performance into the box.

XCP-NG does let you set a minimum and maximum number of cores per VM. I’m not going to pretend I’m fully up to speed on this yet, but it is a setting. The same goes for RAM.
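For reference, those knobs are exposed through the standard `xe` CLI on the host; here’s a minimal sketch (untested, with a placeholder UUID and example values) of driving them from Python:

```python
# Minimal sketch of setting per-VM CPU and RAM limits on an XCP-ng host
# via the `xe` CLI. The UUID is a placeholder; values are just examples.
import subprocess

VM_UUID = "00000000-0000-0000-0000-000000000000"  # placeholder

def xe(*args: str) -> None:
    """Run an xe command, raising if it fails."""
    subprocess.run(["xe", *args], check=True)

# vCPU limits: VCPUs-at-startup is what the VM boots with; VCPUs-max is
# the ceiling it can be raised to (the VM must be halted to change it).
xe("vm-param-set", f"uuid={VM_UUID}", "VCPUs-max=8")
xe("vm-param-set", f"uuid={VM_UUID}", "VCPUs-at-startup=4")

# RAM limits: dynamic-min/dynamic-max is the range the balloon driver
# works within; static-max is the hard ceiling fixed at boot.
xe("vm-memory-limits-set", f"uuid={VM_UUID}",
   "static-min=4GiB", "dynamic-min=8GiB",
   "dynamic-max=16GiB", "static-max=16GiB")
```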

I’ll have to keep loading up my lab system and see what happens when I get heavily over-provisioned.

And yes, I get what you say about the storage. XCP-NG doesn’t have a hyper-converged system yet; it’s in beta, and I’d like to hope that I can buy it for my production system (it’s a paid feature). Again, my goals != administration’s goals, which is always frustrating. The administration’s idea would be to buy the cheapest used 4-core boxes they could find, run the OS on bare metal, and only replace the three oldest servers: be in for $2500 total instead of $9000. That doesn’t address any of the other reasons to run a hypervisor, or what happens when one of those pieces of trash burns up.

I’m running on bare metal now, simply because it was cheap back when I couldn’t afford Citrix Xen or some other big hypervisor. And making failover pools with Windows Hyper-V is not something I really want to look into; I dug into it a few years ago and decided it wasn’t what I wanted. With XCP-NG being all grown up (and rather easy to handle), it is time for this change.

But it seems like physical cores are physical cores, and I shouldn’t be too concerned about single versus dual processor as long as I have enough cores to make it go.

I don’t know exactly what you are looking for or what the budget calls for, but if you are looking for cores on the cheap, I have had good luck running XCP-NG on this server. It runs AMD Ryzen CPUs, which get you significantly more cores for a lot less money.

I have one system with a Ryzen 9 3900X (12 cores / 24 threads) and 64GB of RAM running several Windows VMs with no issues at all. The system also comes with dual gigabit LAN, four 3.5" SATA bays, and license-free HTML5 IPMI.

The system is 1U and supports certain ECC RAM configurations. I haven’t noticed any real issues with noise or heat in a pretty warm server room; every once in a while I can hear the fans spin up, but it never lasts long, and the idle noise is no more than a desktop fan. For $9000, three of these systems with 12 or 16 cores and 128GB of RAM each would be no big deal.

Your only downside might be having just four RAM slots.

Thanks, I need to look into AMD servers more; you’re right, they normally offer more cores per dollar. The lower the cost, the more likely my plan will be met.

I did price some new Dell stuff… I left pretty quickly when I started to add RAM and the prices went through the roof. I hit $7500 for a single server really fast, and that’s not going to happen.

Ah, I gotcha: have a single node able to handle the other 2 nodes’ load in case of an emergency. It would certainly be good to have a disaster recovery plan and figure out what can be turned off during an emergency (for example, 1 of 2 DCs), etc…

I work at a school, and as such our network is not complex so much as it is big-ish. We have several campuses across the states with a user base of around… 5k? So small to medium sized. We have 2 DCs with only 4 cores and 16GB each… Looking at the last month’s performance charts, we never go over 5% memory usage, and the same for CPU. In all honesty you could get away with 8GB; Server 2016 and above have gotten quite efficient with RAM.

With my limited knowledge being from the vSphere world: more cores aren’t always needed, since ESXi will distribute the load across all cores depending on priority. It’s only when you have an application designed to use 8+ cores that you’d see a real benefit from assigning that many. Not to mention Microsoft has switched to charging by core count anyway. Something to consider anywho, when planning it all out.

Yeah, I saw an option in XOA (XOSAN), but it looks like a paid-for option, despite being self-compiled. At the very least, storage is something that can be easily evolved, unlike the servers themselves.

If you’re only running 5 servers total, then normal disk arrays will most likely be fine (assuming you don’t have some crazy data-heavy I/O apps). Maybe re-evaluate what they actually need and do a little bit of testing and tuning. Windows AD really doesn’t require many resources until you start having hundreds of thousands of login events per minute, which I would’ve never believed possible until I started working where I work now…

My maximum is around 60 users logging in at once; that has never been much of a problem, even with roaming profiles stored on a TrueNAS share. I can probably get away with very little, at least until it comes time to image one of my classrooms and I’m trying to pull from a WDS server. Right now my WDS has a SATA3 spinning drive in it, and normally it is fast enough that I can only get the fifth or maybe sixth machine booted and ready for transfer before the first is done. I don’t mess with multicast because I just can’t move fast enough through the boot and prep sequence for it to matter.

The most used server is probably my McAfee ePO, which also has something else running on it, and it is normally pretty good with its 4-core Xeon and, I think, only 8GB; I’ll have to look at it when I log in to do updates. It’s a little E3 processor, so there’s no real upgrade path; otherwise I might use it as a host, since it is fairly new. That’s the downside of buying just what you need: if I could chuck a 10-core processor in that thing, I’d be part way there.

I think I’m just going to go single processor and see if that helps me actually be able to buy them. I’ll get a Xeon Scalable version in case I need more cores in the future; I think I can get up to 24 or 26 cores (plus hyper-threading) in a single Gold or Platinum processor for future needs.

If you need to run a lot of Windows servers but don’t want the overhead of Windows, check out https://zentyal.com for a Linux-based alternative. It’s used widely in Europe by many government organisations.

I have Zentyal running at home, but haven’t really taken the time to dive deeply into it.


I was in your boat a few months ago, and I ended up looking at used servers. I would recommend something like a pack of Dell R730xds (E5-2660 v3, 2x 10 cores = 40 threads total per machine), 128GB of DDR4 RAM, and no disks, at ~$1800 US per box. You can then look at swapping out the CPUs for cheap and upgrading the NIC (mine came with dual 10G and dual 1G interfaces). Since it sounds like your storage is already a non-issue, this would let you deploy at least 3 (and probably 4) for what seems like your budget.

If you have to go new, then yes, you can’t go wrong with AMD at this point, but all new servers get expensive, quick.