the nutanixist 26: the unbounded step – how desired state systems create dual-database drift

After I wrote about the importance of single databases and control planes, folks observed that OpenShift and KubeVirt use a single database.

And what I realized was that there was a deeper point being made: the Nutanix system guarantees that the host is under the control system’s authority, or it will automatically stop running.

Ensuring that the host runs only what you want is a complex first step, but the next step, making sure it runs only when you can control it, is even more fundamental.

If the host doesn’t stop running automatically, the VM is both running and not running for some period of time. During that time, there are effectively two databases.
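One common way to get the "stop running automatically" property is a host-side lease. The sketch below is a minimal illustration under that assumption, not a description of how Nutanix implements it. The `HostLease` class, its fields, and the idea of a local watchdog calling `tick` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class HostLease:
    """Hypothetical host-side lease; an illustration, not a real product API."""

    duration: float           # seconds the lease stays valid after a renewal
    expires_at: float = 0.0   # timestamp at which the lease lapses
    running: bool = False     # whether the host is allowed to run workloads

    def renew(self, now: float) -> None:
        # Called whenever the control plane acknowledges a heartbeat.
        self.expires_at = now + self.duration
        self.running = True

    def tick(self, now: float) -> bool:
        # Called periodically by a local watchdog. If the lease has lapsed,
        # the host fences itself (a real system would stop VMs or reset).
        if self.running and now >= self.expires_at:
            self.running = False
        return self.running


lease = HostLease(duration=5.0)
lease.renew(now=100.0)
still_up = lease.tick(now=104.0)    # lease valid until t=105
fenced = not lease.tick(now=105.0)  # lease lapsed, host stops itself
```

With a scheme like this, the control plane can treat the VM as stopped once `duration` has passed since the last renewal it granted, so the window in which the VM is "both running and not running" has a known upper bound.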

Because manual intervention is necessary, a third database comes into play: the operations database that detects this condition, and the system that actually reboots the machine or tracks its reboot.

The Kubernetes desired-state system allows for this manual intervention by essentially waiting for the state to converge.

Desired-state systems enable these manual procedures, but the drawback is that they permit an implicitly unbounded step. Another system must address that unbounded step.
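To make the unbounded step concrete, here is a minimal sketch of a convergence loop. It is not Kubernetes code; `converge`, `observe`, and `max_polls` are hypothetical names for illustration. With `max_polls=None` the loop waits forever, which is the implicit behavior described above. With a bound, it gives up and returns `False`, which is the point where another system has to take over.

```python
from typing import Callable, Optional


def converge(
    observe: Callable[[], str],
    desired: str,
    max_polls: Optional[int] = None,
) -> bool:
    """Poll until the observed state equals the desired state.

    max_polls=None means wait forever (the unbounded step).
    A real loop would also sleep between polls; omitted here for brevity.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if observe() == desired:
            return True
        polls += 1
    # Bound reached without converging: hand off to an operator or a
    # fencing mechanism, since waiting longer will not resolve it.
    return False


# A host that reports "running" twice and then "stopped" converges.
states = iter(["running", "running", "stopped"])
converged = converge(lambda: next(states), "stopped", max_polls=5)

# A host that never stops exhausts the bound.
stuck = converge(lambda: "running", "stopped", max_polls=3)
```

The bound does not make the host stop. It only makes the waiting explicit, so the handoff to the other system is a designed step rather than an accident.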

What makes this reboot more complicated is that, for stateful services, rebooting the host is expensive: the service may experience downtime or degraded performance. As a result, when the reboot will happen, or whether it will happen at all, is unknowable. A control plane that wants to take action on a VM or machine in this quantum state therefore has to stall.

Because rebooting a stateful service is not free, relying on control-plane connectivity to make the decision is fraught with operational challenges. Do you trust the control plane’s correctness more than the database or the OS that the control plane is running on? In other words, is the connectivity failure due to the control plane itself, or something else?

More specifically, is the problem in Kubernetes? Or is the problem in the underlying host and VM? To reboot the host automatically, you have to believe that the control plane, in this case Kubernetes, is more reliable than any single host.

Brian Oki, a colleague of mine at VMware and the author of the original Viewstamped Replication, made a point to me early on that took me a while to appreciate: guaranteeing that the host is only running when it is connected to the control plane is a tough, but essential, property.
