Avg disk space used dropped by 60-70%

This very often happens when the node is being stopped. I would suggest searching for FATAL errors in your logs around that time. The other possible problem is a very slow disk (SMR, network-attached, virtualized, or a USB/external disk).
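For example, a minimal check (a sketch; substitute your own log path - the one below is just a placeholder):

grep -i FATAL /path/to/storagenode.log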

This usually only happens when the databases are locked, corrupted, or not updated because of failed filewalkers.
So here you need to search your logs for database-related errors and also check the databases themselves:
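A minimal sketch of such a check, assuming sqlite3 is installed and the databases live in the node's storage directory (adjust the path for your setup):

# each database should print "ok"; anything else indicates corruption
for db in /path/to/storage/*.db; do sqlite3 "$db" "PRAGMA integrity_check;"; done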

Thank you for the hints, I’ll research what might be wrong with the disk. But to buy me some time, is there a way to manually remove at least some garbage from the disk before it runs out of space?


Unfortunately not. It’s very dangerous to touch the node’s data - it could be disqualified as a result.
It shouldn’t run out of space if you set the allocation below the current usage; this would also stop any ingress to your node and reduce the load.
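For example, in config.yaml (a sketch - pick a value below your actual usage and restart the node afterwards):

# allocate less than the currently used space to stop ingress
storage.allocated-disk-space: 500 GB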

I made a similar assumption yesterday and changed the configured storage size from 580GB to 510GB. It slowed the ingress down but didn’t stop it completely. At the moment it shows

Total Disk Space   510GB
Used   505.13GB
Free   4.87GB

Less than 5GB of free space is already reported, but the logs still show active uploads.

You need to check the space in SI units:

df --si -T

However, if some deletion happens, your node will report free space to the satellites again and will happily accept traffic until it hits the threshold again.

I believe you need to enable the scan on startup if you disabled it, then wait until the databases are updated with the actual usage, keeping the allocation below the usage shown on the dashboard.
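A sketch of the relevant config.yaml option (the node must be restarted for it to take effect):

# rescan the used space on every start to update the databases
storage2.piece-scan-on-startup: true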

Right, and I believe the Storj dashboard uses SI units. This is reported by the dashboard alone, irrespective of what the filesystem reports:

The file system reports 610 023 081 512 bytes of used space, which is about 610GB.

I have had it enabled since yesterday:

It hasn’t helped much so far.


Did they finish for each satellite already?

Some failed with:

$ cat '/cygdrive/c/Program Files/Storj/Storage Node/storagenode.log' | grep used-space-filewalker
. . .
2024-06-11T15:53:45+04:00       ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}

So I guess it doesn’t like the speed of the disk. But the disk is idle at the moment; is there a way to force the filewalker to run, even if slowly?

Another question: is there a way to get a list of the blob file paths the node knows it manages? By comparing it to the filesystem I could identify and remove the files the node has lost track of, thus freeing up the 100GB of files the node lost in my file system.

I think that instead of doing so much testing, it would be better to solve errors that are repeated every month, such as:


Hi all,

For a few months now I have seen my Average Disk Space Used drop within the month, even though a lot of new traffic is incoming and the reported used space increases.

Anyone an idea what to do?

That TBm graph is a cosmetic issue: it doesn’t affect payouts. I can wait for a fix.


Yes, it is very weird when used space increases and at the same time average used space goes down.


You may disable the lazy mode; the filewalker will then run with normal priority and perhaps be able to finish.
However, if you also have FATAL errors around this time, it is likely a consequence of the node being killed.
A simple restart of the node will also restart the used-space-filewalker if the scan is not disabled on start.
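A sketch of the config.yaml option for lazy mode (again, a restart is needed for the change to apply):

# false = run the filewalkers with normal I/O priority instead of lazy (low) priority
pieces.enable-lazy-filewalker: false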

This will disqualify your node for sure. I wouldn’t recommend doing so. The only thing you can safely remove is files in the temp folder older than 48h.
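For example, a minimal sketch (the storage path is a placeholder for your own; double-check it before running anything with -delete):

# remove files in the temp folder older than 48 hours (2880 minutes)
find /path/to/storage/temp -type f -mmin +2880 -delete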

Not weird, it’s a monthly average. Since data from the US1 and SLC satellites is missing for a few days, the average is lower than it should be. When the gaps are filled, the average will be recalculated too.
Our engineers are working on filling these gaps. SLC is already filled for several past days, but the data will appear on your dashboard roughly 12 hours after the update. You may force an update by restarting the node, but I think it’s not urgent and you may simply wait until the graph updates automatically. US1 is not filled yet.

Well, it is weird when used space goes up all month while the monthly average keeps going down.
As you said, it is a monthly average and should follow the trend that used space is setting.

No, not urgent. It is just that, once again, we cannot rely on the data provided. This has to change.

Edit: That is also the kind of info I meant should have a dedicated place: Improve notifications relevant for node operators


For that I strongly recommend using Grafana:

Running chkdsk d: /f twice fixed some file system issues, and the Storj node log is now clean of error messages.

That’s what I had been trying for some time, but it only seemed to get better after I turned lazy mode on again. Not sure if I just overlooked something.

The node has made some progress and has collected trash for 1 satellite by now. The average space used chart got better too.

Overall it took me a long time just to start understanding how node-to-satellite data synchronization works. Maybe it’s just me not reading the documentation and FAQ, but when I ran into Disk usage discrepancy? - #61 by Alexey it felt like the piece I had been missing badly.

Well, this bug needs a permanent fix. Every month I see a few topics like this. It’s confusing for SNOs and also a waste of resources to handle those topics. So please issue a permanent fix.


Still not working. Is there any further progress on that topic?

As you can see, around 2TB are missing. In the Payout section they are still missing too. And the drop on June 7 is still there.

Do I need to restart something?