All week chasing VMware cluster problems. It'll now roll over into Monday. It always sucks when things fail in ways so subtle the fault tolerance doesn't catch them and you need to start pressing buttons.
@feld Ha! Sorry dude.
@sullybiker Ceph has the opposite problem: if a node sneezes it starts recovering data immediately unless you go out of your way to tell it not to.
A hypervisor has had some weird storage failure where a couple of devices have just vanished, but the vSAN stuff hasn't properly accommodated this fact, so it's just sitting there protesting.