Start up piece scan custom schedule

snorkel · October 6, 2024, 4:29am

A user-defined schedule for running the used-space file walker would be useful if you don’t want it to run on every restart because it takes too long.
I’m thinking of something like a user-defined interval and user-defined conditions:

run it once in: x days (1-365).
run it once in x days, if the allocated free space reaches zero.
run it after every restart: true/false.
run it after every update: true/false.

The history is kept in the log, as long as you don’t remove the log or remove the container. You can have multiple triggers.
I imagine a custom config like:

run it once in 180 days.
run it once in 90 days if the allocated free space reaches zero.
run it after every restart: false.
run it after every update: false.
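The two interval rules above can be sketched as a small decision helper. This is only an illustration under assumptions: the function names `scan_due` and `should_scan` are made up, the 180/90 day values are the ones from the example config, and where the last scan time and the free space figure come from is left to the caller.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the storagenode software.

# scan_due LAST NOW DAYS
# Succeeds when at least DAYS days lie between LAST and NOW (epoch seconds).
scan_due() {
    last=$1 now=$2 interval_days=$3
    [ $(( now - last )) -ge $(( interval_days * 86400 )) ]
}

# should_scan LAST NOW FREE_BYTES
# Applies the example config: every 180 days unconditionally,
# or every 90 days when the allocated free space has reached zero.
should_scan() {
    last=$1 now=$2 free=$3
    if scan_due "$last" "$now" 180; then
        return 0
    fi
    if [ "$free" -le 0 ] && scan_due "$last" "$now" 90; then
        return 0
    fi
    return 1
}
```

A caller could run something like `should_scan "$last_scan" "$(date +%s)" "$free_bytes" && echo "scan due"`, where `last_scan` and `free_bytes` are values it has stored or measured itself.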

arrogantrabbit · October 6, 2024, 4:40am

Why not disable it and rely on lazy file-walker?

Alexey · October 6, 2024, 4:53am

It’s pretty easy to trigger that via cron: all you need is to enable the scan in the config, restart the container/service, and then disable the scan in the config again.

For example with this script:

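#!/bin/sh

# Enable the startup scan in the config (this also replaces a commented-out line).
sed -i 's/^.*storage2.piece-scan-on-startup:.*$/storage2.piece-scan-on-startup: true/g' /mnt/x/storagenode2/config.yaml
# Restart the node so it reads the new value; allow up to 300 seconds for a graceful stop.
docker restart -t 300 storagenode2
# Disable the scan again so that later restarts skip it.
sed -i 's/^.*storage2.piece-scan-on-startup:.*$/storage2.piece-scan-on-startup: false/g' /mnt/x/storagenode2/config.yaml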

The next restart of the container/service would run the node with the disabled scan.
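If you want to reuse the config toggle from a larger script, the two sed calls could be wrapped in one function. This is a minimal sketch, not an official tool: the name `set_piece_scan` is made up, it assumes GNU sed (for `-i`, as in the script above), and it appends the option when the config has no such line at all.

```shell
#!/bin/sh
# Hypothetical helper: set storage2.piece-scan-on-startup in a config.yaml.
# Usage: set_piece_scan CONFIG true|false
set_piece_scan() {
    config=$1 value=$2
    # Refuse anything other than the two valid values.
    case $value in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 2 ;;
    esac
    if grep -q 'storage2\.piece-scan-on-startup:' "$config"; then
        # Replace the whole line, including a commented-out default.
        sed -i "s/^.*storage2\.piece-scan-on-startup:.*\$/storage2.piece-scan-on-startup: $value/" "$config"
    else
        # The option is missing entirely, so append it.
        printf 'storage2.piece-scan-on-startup: %s\n' "$value" >> "$config"
    fi
}
```

A scheduled run would then call `set_piece_scan /mnt/x/storagenode2/config.yaml true`, do the `docker restart -t 300 storagenode2`, and call it again with `false`. A crontab entry such as `0 3 1 */6 * /path/to/your-script.sh` (the path is a placeholder) would fire it at 03:00 on the first day of every sixth month.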

snorkel · October 6, 2024, 7:41am

The lazy walker created all sorts of problems. It stays off forever.

snorkel · October 6, 2024, 7:42am

This is still manual management. You have to check the logs regularly to see when the scan has finished, then modify the config.

Alexey · October 6, 2024, 8:29am

Why do you need that?
From the description, you want to have a schedule?
If not, OK, you could extend the script to detect when an error occurs and restart the process, but this time with the lazy mode disabled as well, not only with the scan enabled.

I do not see why this could not be implemented with scripting.
This is a completely local issue and not many setups want this customization, so why should it be implemented in the code?
Especially when you have the option to give the whole disk to the node?
See

Roxor · October 6, 2024, 10:29am

That new feature will be so nice for whole-disk nodes. Never have to run used-space-filewalker again, ever? Sweet!

arrogantrabbit · October 6, 2024, 2:56pm

What problems? And if so, it would be better to fix existing problems if any, than to write more new code.

No, filewalker is a separate process. You can easily script the check.
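As a rough idea of what such a check could look like, the sketch below scans log text on stdin for a completion message. The function name `scan_finished` is made up, and the default pattern is only my assumption about how the completion line is worded; check it against your own node’s log and pass your own pattern if it differs.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the log on stdin shows a finished used-space scan.
# Usage: scan_finished [PATTERN] < logfile
scan_finished() {
    pattern=$1
    # ASSUMPTION: default wording of the completion message; verify it in your own log.
    [ -n "$pattern" ] || pattern='used-space-filewalker.*(completed|finished successfully)'
    grep -Eq "$pattern"
}
```

With a docker setup like the one above this could be used as `docker logs storagenode2 2>&1 | scan_finished && echo "scan done"`.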

But I still see no point in jumping through hoops. What exact problem are you trying to solve?

This is not a problem that needs solving.

snorkel · October 6, 2024, 4:18pm

The obvious problems:

automation for unattended nodes;
discrepancy between real used space and reported used space when there are bugs, like we had in the past and may still have.

Why do I have to explain it? We have had these discussions over the past 2 years in many threads.
I propose a solution for small setups with limited RAM (like the big majority of nodes) whose operators are tired of running endless file walkers. What’s so unclear? It’s not for servers with 128GB RAM and dedicated SSD caches.

Alexey · October 7, 2024, 4:07am

Then I would suggest implementing a script that does whatever logic you like, and running it on whatever schedule you think is best for such setups.

However, if we are talking about unattended nodes, then perhaps it’s better to use this new feature for sharing the whole disk/partition and disable the scan on startup. The stats on the dashboard could be wrong, but that wouldn’t affect the ability to accept data as long as there is free space on the disk.

If you think that lazy mode is not good for some reason (for example, the node runs in a VM, so the host is not aware of any low-priority IO processes inside the VM), you may disable it, but in that case I would also suggest enabling the badger cache to handle all the other filewalkers.