Pretty high activity on node after migration

Hi all,
I've run into a strange “issue”: yesterday I migrated my node to another location on different hardware (just moved the HDD into the new system).
The old system had 2 CPU cores and 4 GB of RAM.
The new system has 1 CPU core at 2 GHz allocated and 1.5 GB of RAM, which according to the docs should be enough.
The databases were relocated to an M.2 NVMe drive.
But for some reason, for the last 24 hours since the migration the drive has been under high load, continuously reading at 11-16 MB/s and averaging around 300 IOPS.
Can you explain why? At my old location this only happened for about the first 5 minutes after a system reboot.
Is this some kind of activity from the satellite side because of the relocation, or something else?

Sounds like the filewalker; it runs when the node starts up. There is a thread on here discussing how to tune it.

OK, so it seems to be the filewalker.
But that doesn't explain why this process used to take 5-10 minutes and has now been running for more than 24 hours.
The node is not that big, around 5 TB…

Can we expect this process to take less time next time, or not?

Likely it will take about the same time, unless you add more CPU. Perhaps the SATA controller is cheaper and uses more CPU.

No special activity is supposed to happen from the satellite side. Sounds more like your new system has some kind of a bottleneck your old system didn't have.

Is the HDD the same, or did you copy data to a new HDD? How is it connected? What OS?

Proxmox, with Ubuntu Server running under it.
The HDD is the same, just moved from one system to the other.
The old system had an Intel Pentium Gold G4560 and 4 GB RAM;
the new one has a J4005 with one core and 1.5 GB RAM.
The connection is SATA, and the HDD is redirected to a specific VM inside Proxmox.

What do you mean by redirected? Some kind of PCI HBA passthrough, virtio, or maybe just a standard emulated SATA controller?

Did you also move the guest OS, or is it new?

What file system? How much data does this node have?

I’d suspect just some misconfiguration or a bad interaction between the guest and the host systems…

[screenshot: VM configuration showing the disk attached via qm set]
As you can see, the disk is attached to the VM with the qm set command. Nothing special.
All other settings are default.
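
For reference, attaching a whole physical disk to a Proxmox VM with qm set typically looks something like this (the VM ID and disk identifier below are placeholders, not my actual values):

```
# Pass the physical HDD through to VM 101 as a SCSI disk (IDs are placeholders)
qm set 101 --scsi2 /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL
```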

The OS is new, installed on a clean Proxmox.

Does it make sense to add more RAM?

So, SCSI (scsi2) with the default controller (LSI 53C895A). I guess this should be good enough (though changing to VirtIO SCSI would reduce CPU load). No other ideas, sorry. You can surely play with VM settings, maybe you’ll find something interesting.
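
If you want to try that, switching the controller is a single Proxmox setting; a rough sketch, assuming VM ID 101 (the guest needs VirtIO drivers, which Linux guests already include):

```
# Switch the VM's SCSI controller to the paravirtualized VirtIO SCSI (VM ID is a placeholder)
qm set 101 --scsihw virtio-scsi-pci
```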

Regarding RAM, I was running 12 TB worth of nodes on a 2GB RAM machine and it was good enough. Slowish, but never had a problem like yours.

There are circumstances where two file walker processes (one doing GC, one doing space calculation) compete for resources. You may disable the latter and check whether it helps.
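
If I recall correctly, the startup space-calculation scan can be turned off with a single option in the node's config.yaml (option name from memory, so verify it against your node version before relying on it):

```
# config.yaml: skip the used-space scan at node startup (assumed option name, please verify)
storage2.piece-scan-on-startup: false
```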

I just can't understand how a Raspberry Pi 4 with 4 GB RAM handles two full 8 TB SMR USB drives with no problems at all… (the IOPS issue goes away 5-10 minutes after startup)

Well, there’s clearly something going wrong with the new setup. Only you have direct access to it, so only you can diagnose it.

Thanks for your help. I played with the settings a bit and got that process down to 2 hours.

But it would be cool if Storj ran it only once a week or so, because as I understood it, it only affects the graphs in the dashboard.

The file walker is executed at startup (so at setup, after upgrades, and after restarts) to calculate the disk space used. This is the part you can disable.

It is also run every few days, when the satellite sends data for garbage collection. You cannot currently disable that one. There was some discussion recently about having this instance of the file walker also calculate disk space, but so far no action has been taken.

It affects graphs, but it also affects what the node reports as free disk space to satellites, and so it is sometimes useful to ensure it’s correct.

Out of curiosity, what did you change?

Changing the SCSI controller was helpful. I also shared the CPU between the two nodes
(earlier it was 1 core = 1 VM),
and added more RAM.
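
For anyone finding this later, those changes map to something like the following qm commands on the Proxmox host (the VM ID and values are placeholders, not my exact numbers):

```
# Example only: VM ID, core count and memory size are placeholders
qm set 101 --scsihw virtio-scsi-pci   # switch to the VirtIO SCSI controller
qm set 101 --cores 2                  # give the VM both cores instead of a single dedicated one
qm set 101 --memory 3072              # allocate more RAM (amount is a placeholder)
```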
