the nutanixist 01: the deeply misunderstood AHV

July 3, 2025 by kostadis roussos

Eleven years ago, when Nutanix announced AHV, my initial reaction was: This will fail.

The idea that someone could successfully introduce a new commercial hypervisor into the market seemed ridiculous.

But 11 years later, it has proven me wrong.

Despite this, AHV remains deeply misunderstood because of its uniqueness.

Most operating systems are stateful, meaning that the system’s state is stored within the OS, and when the server restarts, the OS retrieves its state from disk.

AHV, however, is stateless. So what?

Consider a VM. With ESXi, for example, you can create a VM through the ESXi console, and its state is saved on local disks. If ESXi crashes, it reads that state back from disk to restart the VM.

Much of vCenter’s job is to monitor what’s happening in ESXi and respond to an environment that doesn’t match its expectations of what ESXi was doing.

With AHV, creating a VM requires accessing the cluster control plane, which runs on the CVM—a special VM that manages the cluster’s state. For more details, see The Nutanix Bible.

Thus, when the OS boots, the cluster control plane determines its state.
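To make the contrast concrete, here is a minimal toy model in Python. It is a sketch, not how ESXi or AHV is implemented: every class and method name (`StatefulHost`, `StatelessHost`, `ClusterControlPlane`, `vms_for`) is hypothetical and exists only to illustrate where the state lives.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulHost:
    """Toy stand-in for a hypervisor that persists its own VM inventory."""
    disk: set = field(default_factory=set)     # survives a reboot
    running: set = field(default_factory=set)  # lost on a reboot

    def create_vm(self, name):
        # Created locally; no outside party has to know about it.
        self.disk.add(name)
        self.running.add(name)

    def reboot(self):
        # The host restores whatever it recorded on its own disk.
        self.running = set(self.disk)


@dataclass
class ClusterControlPlane:
    """Toy stand-in for the cluster-wide source of truth."""
    placement: dict = field(default_factory=dict)  # VM name -> host id

    def create_vm(self, name, host_id):
        self.placement[name] = host_id

    def vms_for(self, host_id):
        return {vm for vm, host in self.placement.items() if host == host_id}


@dataclass
class StatelessHost:
    """Toy stand-in for a host that keeps no VM inventory of its own."""
    host_id: str
    running: set = field(default_factory=set)

    def reboot(self, control_plane):
        # Nothing is read from local disk; the control plane dictates state.
        self.running = control_plane.vms_for(self.host_id)
```

In this model, a VM can only appear on a `StatelessHost` if the control plane placed it there, so there is no second copy of the inventory that could disagree with it.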

Why is this so powerful?

During a reboot, the control plane doesn’t need to determine what AHV considers to be running, nor does it need to stop or start processes.

More importantly, the AHV state can’t be out of sync with what the control plane believes.

This setup also significantly simplifies everything. Systems like ESXi or Linux require building a layered control plane on top of the OS’s control system. This layered system must interpret and respond to the actions of the underlying control plane. If it needs to stop or change something, it is at the mercy of the underlying controls.

Most of the time, this isn’t an issue because complex software, such as kubevirt or hostd, tries to reconcile conflicting control goals.
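The reconciliation work a layered control plane has to do can be sketched as a diff between what it expects and what the host reports. This is an illustrative Python sketch only; the function name, the action labels, and the `'on'`/`'off'` power states are assumptions and do not correspond to any real hostd or kubevirt API.

```python
def reconcile(expected, observed):
    """Return the corrective actions for a layered control plane.

    expected: what the upper control plane believes (VM name -> 'on'/'off')
    observed: what the host's own control plane reports (same shape)
    """
    actions = []
    for vm in sorted(expected.keys() | observed.keys()):
        want = expected.get(vm)
        have = observed.get(vm)
        if want == have:
            continue  # both layers agree, nothing to do
        if want is None:
            # The VM was created on the host behind the upper layer's back.
            actions.append(("adopt_or_remove", vm))
        elif have is None:
            # The upper layer expects a VM the host no longer knows about.
            actions.append(("recreate", vm))
        else:
            # Both know the VM but disagree on its power state.
            actions.append(("set_power", vm, want))
    return actions
```

With a stateless host there is no independent `observed` inventory to diff against, so this whole class of logic has nothing to operate on.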

With AHV, that intermediary layer doesn’t exist.

This leads to the misunderstanding: AHV is stateless and can’t be directly compared to ESXi or Linux/KVM. Instead, you should compare it to ESXi + vCenter or Linux with something like OpenShift.

When you make this comparison, you realize that AHV provides a level of availability and control that’s unmatched.

For example, vCenter manages multiple clusters, and any single cluster can impact all workloads. If vCenter fails, every workload it manages is affected, so operators deploy numerous vCenters, which in turn leads to vCenter sprawl and increased operational overhead.

With AHV, the boundary is a cluster. Each cluster is isolated from the others.

Thanks to Prism Central, it’s more feasible to run multiple workloads on separate, isolated clusters.

Because the AHV control plane is clustered, its availability is tied to the availability of Nutanix clusters, unlike vCenter, which runs as a single VM.
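The difference in failure boundary can be shown with a tiny Python sketch. The workload names, control-plane ids, and the `affected_workloads` helper are all made up for illustration; the point is only how many workloads sit behind one control plane.

```python
def affected_workloads(managed_by, failed):
    """Return the workloads that lose their control plane when `failed` goes down.

    managed_by: workload name -> id of the control plane that manages it
    """
    return sorted(w for w, cp in managed_by.items() if cp == failed)


# One shared management server in front of every cluster (hypothetical ids).
shared = {"w1": "vc-1", "w2": "vc-1", "w3": "vc-1"}

# One control plane per cluster, so the cluster is the boundary.
per_cluster = {"w1": "cluster-a", "w2": "cluster-b", "w3": "cluster-c"}
```

Calling `affected_workloads(shared, "vc-1")` returns all three workloads, while `affected_workloads(per_cluster, "cluster-a")` returns only `w1`.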

All of this is possible because AHV’s stateless design allows the creation of a cluster control plane with a small team, achieving something once thought impossible.

Trackbacks

the-nutanixist-02: the problems with stateful hypervisors says:

July 6, 2025 at 3:58 pm

[…] I recently wrote about how AHV is deeply misunderstood. And what struck me is how deeply misunderstood AHV, ESXi, and Openshift are. ESXi is amazing software with a dedicated team and satisfied customers. Its architecture enables continuous operation of VMs even if hosts become disconnected, thanks to a local control plane on each host that manages registered VMs independently, relying solely on storage. However, this control plane maintains state, so if it fails or becomes unavailable, ESXi becomes inaccessible even though VMs continue running. The control plane relies on user-space services, such as hostd and vpxa, which can fail for various reasons. This architectural design, while effective, has inherent limitations. Notably, this architecture serves as the foundation for every other commercial Hypervisor, except AHV. Openshift faces similar issues, due to kubevirt and kubetcl. I will now focus on the architecture rather than the products. When you build a clustered system, you are creating a clustered control plane. And you have two design choices. One is to build on top of a local cluster control, and the other is to build directly on the data plane. If you build on the local cluster control plane, you have two challenges. The first step is to detect any actions taken by the local cluster control plane and reconcile those actions. The second is that there are plenty of scenarios where the VM is running but the local cluster control plane is down, and can’t be recovered. And so, it becomes very tricky to determine whether a host is up or down. Because, as the cluster control plane, you don’t know if the VM is running or not. A clustered control plane is almost always running in a split-brain mode, where it hopes that it knows just enough of the local state to make the right decisions, and it expects the local control plane won’t make decisions that break it.
Not being able to determine the state of a host deterministically, and whether it is up or down, makes the system fragile. Why? Because the host, while “seemingly” being down, can be up. And while it’s in this disconnected mode, the host can be modified. At the same time, the cluster state can also change. When the host rejoins the cluster, a human must reconcile a state that cannot be reconciled. Although this happens infrequently, it can’t be guaranteed, so all this complexity exists solely to handle issues that arise from the basic guarantee of stateful hypervisors. So, are clustered control planes good, and are stateful ones bad? No. That’s simplistic. However, if your system relies on a clustered service, such as HCI, or if the applications are clustered, or if you have modern workloads that require interaction with the underlying control plane to operate, a clustered control plane is necessary. And if it is essential, then the choices for building it and the implications of those choices matter. […]

the nutaxinist 13: x86 virtualization may not be what you think it is, bare metal is roaring back and why you need a different platform like AHV says:

July 24, 2025 at 1:37 pm

[…] AHV special. And although I talk about the database here, it’s not just about the database; it’s also about the kind of OS and the availability models of that system. One of the more enduring mysteries about x86 […]

the architecturalist 63: nutanix was the correct answer says:

September 13, 2025 at 12:03 am

[…] What Stargate guarantees is that the cluster knows who is connecting to which disk, and it provides a point of control for the disk. It is not possible to change the state without Stargate knowing. And so, for a cluster, Stargate can prevent anyone from accessing disks and assert who is accessing them. The second is AHV, which, when it reboots, doesn’t remember what it was doing before it reboot… […]
