I’ve been tracking my disk usage for some time, and have just recently noticed that one of my nodes is currently at 122.8% disk usage, which equates to being overused by 2.28TB!
Now I can understand some MB, or even a few tens of GB, but 2.28TB is quite excessive, and it has filled up the buffer I keep on the node so that the disk never gets 100% full.
I haven’t changed my configuration for quite some time, and it’s always been 10TB max allocation, but this is a first for me. Some quick browsing reveals that small amounts are fine, but 2.28TB is quite the “over-use”. Not only is it taking up more disk than I’d like, I’m not even sure if I’m getting paid for that disk usage.
Thoughts on where I can find this and prevent this from ever happening again?
My disks are 14TB disks, which after formatting leaves 12.7TB free.
user@raspberrypi:~ $ df
Filesystem 1K-blocks Used Available Use% Mounted on
...
/dev/sdb1 13563485152 12462425408 417424340 97% /mnt/csrpistorj1
and
user@raspberrypi:~ $ df -h
Filesystem Size Used Avail Use% Mounted on
...
/dev/sdb1 13T 12T 399G 97% /mnt/csrpistorj1
12462425408/1024/1024/1024 = 11.6TB, so it’s not 1:1, but I’m not sure what that has to do with the overused disk. Regardless, whether the OS is seeing 10TB, or the storj application is seeing 10TB, it should halt any future ingress.
You need to use --si instead of -h if you want to compare with the dashboard. The Storj software uses SI units (as disk manufacturers do), not binary ones.
12462425408 1K-blocks × 1024 bytes = 12.76TB
So it looks like the dashboard shows used space pretty close to what the OS reports. The average used space for the last fully reported day, if you take a look at it, is likely around 7TB of confirmed used space. So the difference is about 5.76TB of uncollected garbage (see When will "Uncollected Garbage" be deleted?).
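To make the unit difference concrete, here is a minimal Python sketch that converts the Used figure from the plain df output into both unit systems. It assumes df is reporting its default 1K-blocks of 1024 bytes each; the function names are illustrative only.

```python
# Convert df's "1K-blocks" figures into SI (TB) and binary (TiB) units.
# Assumption: df is using its default block size of 1024 bytes.

KIB = 1024  # bytes per 1K-block as reported by df


def blocks_to_tb(blocks: int) -> float:
    """SI terabytes (10**12 bytes), the unit the dashboard uses."""
    return blocks * KIB / 10**12


def blocks_to_tib(blocks: int) -> float:
    """Binary tebibytes (2**40 bytes), the unit behind `df -h`."""
    return blocks * KIB / 2**40


used_blocks = 12462425408  # "Used" column for /dev/sdb1 in the df output

print(f"{blocks_to_tb(used_blocks):.2f} TB")    # 12.76 TB
print(f"{blocks_to_tib(used_blocks):.2f} TiB")  # 11.61 TiB
```

The same disk therefore reads as roughly 11.6 in binary units and roughly 12.8 in SI units, which is why the -h output and the dashboard cannot be compared directly.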
The overuse itself likely happened when you reduced the allocation to 10TB from a previous value of perhaps 12TB, am I correct?
If not, and you have always had 10TB allocated, then I guess the used space reported by the node was way off until the node was restarted and the used-space-filewalker calculated the used space and updated the databases, so now the node is aware that it has overused disk space.
When the database does not match the actual usage on the disk, the node does not report to the satellites that it’s full. Thus it can accept traffic and store more than was allocated.
We have a precautionary mechanism though: when the node detects that it has less than 5GB free on the disk or in the allocation, it will notify the satellites that it’s full and all ingress will stop, independently of the state of the databases.
The only way to correct the state of the databases is to let the used-space-filewalker finish. It starts on every restart.
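As a rough illustration of the behaviour described above, the following Python sketch models the decision as it is described in this post. It is not the storagenode code: the function name, its parameters, and the stale 9.5TB figure are hypothetical, and only the 5GB threshold, the 10TB allocation, and the df numbers come from this thread.

```python
# Hypothetical model of when a node tells the satellites it is full.
# Not the real storagenode implementation; names and values are illustrative.

GB = 10**9
TB = 10**12
MIN_FREE = 5 * GB  # safety threshold mentioned in the post


def reports_full(allocated: int, used_in_db: int, disk_free: int) -> bool:
    """Return True if the node would report itself as full.

    allocated  -- configured allocation in bytes
    used_in_db -- used space as recorded in the node's databases
    disk_free  -- free bytes the OS reports for the storage disk
    """
    free_in_allocation = allocated - used_in_db
    return free_in_allocation < MIN_FREE or disk_free < MIN_FREE


disk_free = 417424340 * 1024  # "Available" from the df output, in bytes

# Stale database (assumed 9.5TB recorded): the node still accepts ingress.
print(reports_full(10 * TB, int(9.5 * TB), disk_free))    # False

# After the database is updated to the real 12.28TB: reported as full.
print(reports_full(10 * TB, int(12.28 * TB), disk_free))  # True

# Disk nearly exhausted: full regardless of what the database says.
print(reports_full(10 * TB, int(9.5 * TB), 4 * GB))       # True
```

The first case is the one that allows overuse: as long as the recorded usage is too low and the disk itself still has more than 5GB free, nothing in this model stops ingress.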
Hi @Alexey, so the command doesn’t really show much of a difference (since it’s still a “human-readable” format):
user@raspberrypi:~ $ df --si
Filesystem Size Used Avail Use% Mounted on
..
/dev/sdb1 14T 13T 563G 96% /mnt/csrpistorj1
As for the allocation of my node, I never reduced it. It’s always been 10TB, similar to my other node in a similar setup. (They literally share the same config, except for ports and ddns entry.)
Regarding the used-space-filewalker, I’m guessing it’ll take a while, but I’m surprised that the dashboard is reporting the over-used space… isn’t that pulling from the database? If that’s the case, then the database is already up-to-date.
Yes, it is. What I was trying to say: some time ago the databases were not updated with the actual usage, and the node reported to the satellites that it still had free space in the allocation when it did not. Then, when the node was restarted some time ago, the used-space-filewalker successfully calculated the actual used space and updated the databases. Now it shows the overusage, which was not visible to the node before that.
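For readers unfamiliar with what such a scan involves, here is a minimal Python sketch of the general idea: walk the storage directory and add up the file sizes. It only illustrates the concept and is not how the storagenode's used-space-filewalker is implemented; the function name is made up, and the mount point is the one from the df output.

```python
import os
import stat


def scan_used_space(root: str) -> int:
    """Sum the apparent sizes (in bytes) of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so links are not counted
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


if __name__ == "__main__":
    used = scan_used_space("/mnt/csrpistorj1")
    print(f"{used / 10**12:.2f} TB")
```

A walk like this has to stat every file, so on a disk holding a very large number of files it can take a long time to complete.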