Just watched your latest video Tom, and then I wrote a long comment with explanations on YouTube that just disappeared.
So I'm rewriting it here; I may have lost some details, but I expanded others.
I suppose @LTS_Tom could probably make a video version of this so it’s more convenient to listen/watch than read this big TL;DR
Note: sorry for any typos and grammar, I’m not double checking my text, it’s not meant to be a blog post
Xen vs XenServer: the hypervisor is just “Xen”, not “XenServer”. XenServer is the whole platform made by Citrix, and the source of the fork that gave XCP-ng.
To be even more precise, it's called Xen Project, since Citrix didn't want to give the Xen brand name to a project hosted in the Linux Foundation (to avoid business confusion). This project is a true GPLv2 open source project, with contributions from Citrix, Arm (a LOT! see the note below), SUSE, AWS and so on. There are also big names on the pre-disclosure mailing list (members can be seen on the "Xen Security Problem Response Process" page on the Xen Project website; BTW, the security policy of the Xen Project is one of the best I know of in the entire IT world). Those members are in general big Xen users, since they MUST patch some issues before they become public, so the list gives an interesting view.
Why Arm and Xen? In the automotive world, Xen is a perfect fit: you can statically "partition" the hardware, eg one motherboard with multiple CPUs assigned to various VMs. For example, one VM for entertainment, another for the ECU/critical assets. Without security risks or memory data leaks, since the memory is truly isolated and Xen controls it.
And to add more complexity, Citrix renamed XenServer to Citrix Hypervisor, but that’s the exact same thing.
This is also a question I've often seen from people without deep knowledge of the Xen Project and Citrix. Xen Project is the upstream, as is the API called XAPI. Those projects are hosted in the Linux Foundation. Everyone contributes to Xen, as I said just before.
Citrix Hypervisor/XenServer is a downstream of these projects. XCP-ng is also a downstream, at the same level as Citrix Hypervisor. So there's one upstream for the hypervisor (Xen Project) and two downstreams (XenServer and XCP-ng). So no, XCP-ng is NOT a downstream of XenServer.
Note: I'm not working at AWS; this is what I've collected over years of conversations with different people in the Xen and AWS worlds.
It's simple: AWS became the 1st public Cloud provider thanks to the Xen Project. It's a fact. Note that even though Xen is GPLv2, you can modify the code without contributing back to the upstream: as a Cloud provider, you aren't actually distributing the software. AWS did that a lot, because they wanted to protect their competitive edge in the virtualization space for their Cloud service. They also built their own toolstack, which isn't public nor open source at all.
Anyway, at some point, AWS realized they had the power to build their own stack, from the hardware to the software. They practically invented the concept of the "DPU" (Data Processing Unit) and, thanks to their huge experience running Xen at the scale of millions of machines, decided to experiment from scratch: building their own dedicated hardware to get the most out of it, with a custom software stack on top.
Funnily enough, writing a hypervisor isn't that hard if you know which hardware it will run on. What makes Xen relatively complex is the fact that it can run on almost every x86 platform in the world. Believe me, it's really HARD. Hardware is buggy as hell, and you constantly need to work around various bugs, from the hardware itself, the BIOS/UEFI or the firmware. It's a nightmare. Linux is full of workarounds too.
But with the opportunity to build their own hardware, AWS started from a blank page. Nitro, their hypervisor, isn't your usual KVM. It's stripped down to the bare minimum, with probably more code specific to their dedicated hardware accelerators than remaining KVM code. I think their design is really great, but don't expect to find this in your distro on your regular hardware. That's why you can't say that "AWS switched from Xen to KVM", it's simply not true.
It's also not true because they won't replace all their instances with Nitro. Most of their fleet is probably still running on Xen, because it's there, it's secure and it's compatible with any off-the-shelf machine they can buy. So AWS just added a new solution to their stack, they did NOT replace the existing one.
KVM's success is undeniable. There are multiple reasons for that: some are technical (but there's a cost for that, see below), some are due to the main players.
Since Xen was the first Open Source hypervisor, it could have been the only player in that area. But that's not the case today, since you can see a lot of KVM deployments all around the world. But why does KVM even exist?
Because when Citrix acquired XenSource, Inc. (circa 2007), everyone was afraid. Citrix didn't have a great track record with Open Source, since their core business was virtual desktops and working with Microsoft. Imagine you are Red Hat at that time: you are the Linux kernel champions, and now a hypervisor owned by Citrix is going everywhere. So you react and create KVM, a module on top of the Linux kernel (because you know it very well) to do virtualization. That way, you are not dependent on Citrix doing shit with its freshly acquired software.
Also, as Red Hat, you push it everywhere. After all, you are the number 1 company in Open Source and Linux. So you integrate it within your products so it's easier to use in the server virtualization context. On the other side, Citrix doesn't really care about the server virtualization market, despite having a great product called XenServer: they are all in on the desktop virtualization market (more margins/$$$, and their core business after all).
That’s where KVM started to get more traction.
The other reason is technical. Since it's integrated/pushed by Red Hat, people started to rely on it, but also to contribute to it. From a technical perspective, KVM is far more permissive than Xen. So you can do fun stuff relatively quickly, like virtio and such. However, this comes at the cost of less isolation, due to a less "isolated" model/architecture than Xen, which is a "true Type 1". Xen's design is closer to ESXi's; KVM is more "open bar". But this aspect was also part of its success: simplicity vs security, in terms of both adoption and development. I'm not saying KVM is insecure; it's a different design, one that is inherently less isolated. Is one better than the other? Hard to tell, there's no silver bullet.
In my opinion, that's not what matters in the end. What matters is integration and a product that's easy to use. "We" have an edge on the security aspects with XCP-ng, but we have more work to do to develop new features.
Another aspect I love about Xen is its size. It's not a perfect metric, but to give you an order of magnitude: the entire Xen code base is about 2% the size of the Linux kernel in terms of lines of code. This allows very interesting things:
- As a true Type 1, your hardware boots first on Xen, which is a kind of micro-kernel in the end. It means the attack surface is REALLY low
- Since it's not that big, you can actually read the whole code base and understand everything. This is a fantastic way to discover what a hypervisor is, and then move further and improve it
- And the Xen Project itself isn't too big either, meaning you can actually make things move faster if you invest in it, vs Linux/KVM contributions, driven mostly by IBM, Google and such.
This is where it matters. Here at Vates, we are committed to becoming one of the biggest maintainers of the Xen Project. We invest a LOT of R&D into it, because yes, there's a lot to do! That's why our XCP-ng dev team has doubled since last year and crossed 10 dedicated devs (already 3 new devs this year!). At this pace, we'll clearly outnumber even the Citrix XenServer team very soon (if it's not already the case).
That's also why we are not afraid of anything that could happen to the Xen Project, even if Citrix decides to step away at some point for whatever reason (it's really hard to guess their intentions). Xen Project is a fully independent Open Source project. And it found a new welcoming home at Vates.
I've heard you had some heated comments @LTS_Tom . We had the same in France when someone on Twitter who previously used Proxmox decided to switch to XCP-ng. Some people from the Proxmox community got angry about it. I don't really understand why some people feel threatened by other people's choices.
Here at Vates, we deeply respect the Proxmox project, since it's fully Open Source. I don't have the same level of respect for the corporate policy of the market-leading virtualization company that keeps pushing its pricing through the roof. If you love it, good for you, nobody is forcing you to switch to XCP-ng! (but we respect their great technical level and their engineers, obviously!)
Also, it's important (I think) to understand the difference in philosophy between both platforms. Here at Vates, we want to build the most integrated virtualization platform, and we are 100% dedicated to that. We don't build any other stuff, like email gateways and such. As I said, our real ambition is to entirely master all the components in the stack, even becoming the biggest contributor to the Xen Project. I don't think that's a priority at Proxmox, since they are mostly integrating KVM and other open source technologies (which is fine! it's just not the same perimeter/investment).
Another difference in philosophy is being truly open, not just with the code, but with our community. Xen Orchestra and XCP-ng are directly the result of user feedback. We do our best to simplify user contributions (on GitHub) but also to work with partners, in hardware or software. We truly want to build an ecosystem. It's not people coming to us, but truly us coming to different communities (as with this very post).
Finally, the other big difference is the technical aspect (beyond Xen vs KVM): we want to provide an integrated experience that doesn't require any Linux knowledge. That's why XAPI (the API of XCP-ng/XenServer) is great: creating a storage repository doesn't require any Linux command, same for networks and such. So it's not meant to be tinkered with like you can with Proxmox! That's a direct consequence of the KVM/Linux philosophy, where the host has control over the entire system. This is not the case in XCP-ng: your "host" is in fact your dom0, which is itself running on top of Xen.
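To illustrate the "no Linux command needed" point: a storage repository can be created entirely through XAPI. Here's a minimal sketch using the official Python XenAPI bindings; the host address, credentials and the /dev/sdb device are placeholders, and this obviously needs a live XCP-ng/XenServer host to actually run.

```python
import XenAPI  # official XAPI bindings, shipped with XCP-ng (also on PyPI as "XenAPI")

# Placeholder pool master address and credentials -- adapt to your own setup.
session = XenAPI.Session("https://xcp-host.example.com")
session.xenapi.login_with_password("root", "secret")
try:
    host = session.xenapi.host.get_all()[0]
    # Create a local LVM SR on /dev/sdb -- no SSH, no Linux command on the host.
    sr = session.xenapi.SR.create(
        host,
        {"device": "/dev/sdb"},  # device-config, SR-type specific
        "0",                     # physical size (0 = let the driver decide)
        "Local LVM storage",     # name label
        "",                      # description
        "lvm",                   # SR type
        "user",                  # content type
        False,                   # shared across the pool?
        {},                      # sm-config
    )
finally:
    session.xenapi.session.logout()
```

The same call exists for networks, VMs and so on: everything the host can do is exposed through the API, which is exactly what Xen Orchestra builds on.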
I hope I answered some relevant questions I’ve seen in the community and in your video.
Notes on other parts of the video to answer some questions asked.
This doesn't matter as long as you are using PV drivers. When you boot Linux for example, the emulated drivers are only used BEFORE loading the kernel (ie in GRUB), then Linux will automatically switch to PV drivers, so there's no "emulated NIC". No emulation at all. So there's also no NIC speed limitation: throughput will be limited by the PV drivers (mostly by the CPU speed of your host, due to the Xen PV calls).
e1000 only matters if you are on Windows and you don’t want to use any PV drivers (maybe because you are still running very OLD Windows versions).
If all your hosts are in the same pool, there's an automated mechanism that will level the CPU features down to those of the oldest CPU, so there's no issue live migrating VMs between those hosts.
If you want to migrate between different pools, then always migrate from the oldest to the newest CPUs (more instructions on the destination isn't a problem, unlike the opposite). Indeed, if your VM booted with some CPU instructions available and they suddenly disappear on the destination host, the guest will crash when trying to call them. Because they don't exist there!
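The leveling and migration rules above boil down to simple set logic, which can be sketched like this (a hypothetical illustration only, not the actual Xen implementation; host names and feature flags are made up):

```python
# Hypothetical sketch of CPU feature leveling and migration safety.
# Host names and feature flags below are made up for illustration.

def pool_feature_level(hosts):
    """Within a pool, guests see only the intersection of all hosts' CPU features."""
    common = None
    for features in hosts.values():
        common = features if common is None else common & features
    return common

def can_migrate(guest_features, dest_features):
    """A guest is safe only if every feature it booted with exists on the destination."""
    return guest_features <= dest_features

hosts = {
    "old-host": {"sse2", "sse4_2"},
    "new-host": {"sse2", "sse4_2", "avx", "avx2"},
}

# In a pool, guests boot with the leveled (common) feature set...
level = pool_feature_level(hosts)
# ...so they can migrate in both directions safely.
assert can_migrate(level, hosts["old-host"])
assert can_migrate(level, hosts["new-host"])

# Across pools (no leveling): old -> new is fine, new -> old would crash the guest
# the moment it calls an instruction that no longer exists.
assert can_migrate(hosts["old-host"], hosts["new-host"])
assert not can_migrate(hosts["new-host"], hosts["old-host"])
```

That last assertion is exactly the "always migrate from oldest to newest" rule.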