I can assume that all the corruptions happened to nodes running in a VM whose data was on partitions passed through from the host (without using virtual disks). Perhaps this is the root cause.
Could you please try to use a virtual disk instead, at least for the databases?
I do not trust software virtualization such as QEMU too much; when I use Linux as a host, I usually use KVM instead.
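If you want to try the virtual-disk route just for the databases, a minimal sketch with qemu-img and virsh could look like this (the image path, size, VM name, and target device are only examples):

```sh
# Create a small qcow2 image that will hold only the node's databases:
qemu-img create -f qcow2 /var/lib/libvirt/images/node-dbs.qcow2 10G

# Attach it to the guest as a second disk (shows up as /dev/vdb inside the VM),
# then format and mount it from inside the guest:
virsh attach-disk my-node-vm /var/lib/libvirt/images/node-dbs.qcow2 vdb \
  --driver qemu --subdriver qcow2 --persistent
```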
"no killing the node process, give it as much time to shut down as needed."
If we're talking about Docker, is this "docker stop storagenode", or is there something more "graceful"?
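Assuming the container really is named storagenode, one way to make a plain docker stop more graceful is to raise the stop timeout, since Docker only waits 10 seconds by default before sending SIGKILL (300 s below is just an example value):

```sh
# docker stop first sends SIGTERM, then SIGKILL after the timeout expires.
# Give the node several minutes to finish a clean shutdown:
docker stop -t 300 storagenode
```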
For sure, because I didn't use any virtual hard disk; with one you're depending on 2 filesystems, the host file system and the guest file system. But I can give it a try in a few days to see how it works out.
Yup, but usually it's KVM/QEMU (as it is in my case). I haven't seen it happen much that people run KVM standalone without any (para)virtualization. As far as I'm aware, libvirt/virsh/QEMU/KVM are all on the same page. See for example KVM - Debian Wiki. KVM is just the hypervisor QEMU uses to get hardware acceleration, as opposed to full software emulation (Tiny Code Generator), which exists so that foreign instruction sets can be run. If you run it otherwise, please enlighten me. But as far as I know, KVM is always run with QEMU.
Any help on how to configure the system in order to prevent the VMs from being killed?
Use KVM as a backend, no jokes. At least I never met such a problem when I used it.
And passed partitions become rock solid, unlike with QEMU. Your load is not the testing/development workload that QEMU is designed for, so use normal hardware virtualization instead of software virtualization.
Because QEMU uses KVM under the hood, and as far as I know there are no KVM CLI utilities aside from QEMU. That was the reason I referred to that KVM page from Debian.
and manage VMs via virt-manager or virsh. I did use Ubuntu, but I think it doesn't matter too much.
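For reference, a rough sketch of that setup on Ubuntu/Debian (package names are taken from their repos and may differ on other distributions; the VM name is a placeholder):

```sh
# Install KVM/QEMU with libvirt and the management front ends:
sudo apt install qemu-kvm libvirt-daemon-system virtinst virt-manager

# Check that hardware virtualization and libvirt are configured correctly:
virt-host-validate

# Day-to-day management without the GUI:
virsh list --all
virsh shutdown my-node-vm    # graceful ACPI shutdown instead of killing the process
```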
Many SNOs use Proxmox without database issues too.
Unless you are passing through the hardware disk device, disk management is running on the host. That's the problem.
See Alexey's comment below on QEMU. Any reason why you need virtualization in the first place? You can switch to Docker (or podman) and run the containers rootless if you want better security; the containers will share the kernel, so you will avoid the file locking issues, and your problems will likely go away.
Yeah, I think it's exactly that: the filesystem is managed by the host and the VMs get to access it. This is the root of all evil.
Alternatively, you can just keep the databases on the VM's disk. Then the VM's kernel will take care of correct locking and durability promises.
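A sketch of what that split could look like, done with the node stopped; storage2.database-dir is the storagenode option for relocating the sqlite databases, and all paths below are just examples:

```sh
# Create a directory on the VM's own virtual disk and move the databases there:
mkdir -p /mnt/vm-disk/storj-dbs
mv /mnt/storj/storage/*.db /mnt/vm-disk/storj-dbs/

# Point the node at the new location (or edit config.yaml by hand) and start it again:
echo 'storage2.database-dir: "/mnt/vm-disk/storj-dbs"' >> /mnt/storj/config.yaml
```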
If you pass through the USB device (not the filesystem!) to the VM, it will also work.
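Roughly, passing the whole USB device with libvirt looks like this; the vendor/product IDs are placeholders you would take from lsusb, and the VM name is made up:

```sh
# Describe the physical USB device (IDs from `lsusb`, here only placeholders):
cat > usb-disk.xml <<'EOF'
<hostdev mode='subsystem' type='usb' managed='yes'>
  <source>
    <vendor id='0x1234'/>
    <product id='0x5678'/>
  </source>
</hostdev>
EOF

# Hand the device itself (not a host filesystem) to the guest:
virsh attach-device my-node-vm usb-disk.xml --persistent
```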
But seriously, get rid of so many VMs. Use rootless podman containers instead.
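A minimal rootless sketch, assuming the usual storagenode image and mount points from the official docs; the identity/data paths are placeholders and the WALLET/EMAIL/ADDRESS variables are omitted for brevity:

```sh
# Run the node as a rootless podman container instead of inside a VM:
podman run -d --name storagenode \
  --mount type=bind,source=$HOME/storj/identity,destination=/app/identity \
  --mount type=bind,source=$HOME/storj/data,destination=/app/config \
  -p 28967:28967/tcp -p 28967:28967/udp \
  storjlabs/storagenode:latest
```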
Syncthing is replication. For backup you want point-in-time snapshots and versioning.
Imagine you corrupted a photo and it diligently replicated everywhere, replacing the good copies.
This addition makes it somewhat suitable for backup. But just 3 months is too little; you can't be sure you'll detect corruption in such a short time. Use actual backup tools, no need to reinvent the wheel. But while I can talk about data backup for hours, this is off topic.
Do block device writes happen atomically? The VM does not know if the blocks were actually written by the host; there is still a window of opportunity for corruption.
But the difference is that not many people build such a complex multilayer system with drive sharing and VMs (with the few suspect points of failure outlined above), and as a result they don't have these issues.
As an experiment, switch from VMs to containers and see if you notice any improvement. This should be fairly easy to do.
TL;DR: let's either get away from QEMU/KVM/etc., or pass through the PCIe SATA controller to the VM. You want to remove the middle layer that can screw up the atomicity of your transactions.
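For the second option, a rough libvirt sketch of handing the whole SATA controller to the guest; the PCI address and VM name are placeholders, IOMMU has to be enabled, and for a controller it is usually safer to apply the change to the persistent config and cold-boot the VM:

```sh
# Find the controller's PCI address and detach it from the host driver:
lspci -nn | grep -i sata              # e.g. 03:00.0
virsh nodedev-detach pci_0000_03_00_0

# Give the controller to the guest (the address below matches the lspci output above):
cat > sata-ctrl.xml <<'EOF'
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>
EOF
virsh attach-device my-node-vm sata-ctrl.xml --config   # takes effect on next boot
```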
I would support that. I did use a whole drive connected to the VM, not partitions, though not for nodes. And usually you get better results with virtual disks than with passed-through partitions...