Disk usage discrepancy?

The solution is simple: restart the node. If the scan on startup is enabled and you do not hit a “database is locked” error along the way, the stats should be updated.
However, if you formatted the drive with a big cluster size (the default depends on the drive size: >=0 and <16TiB it’s 4KiB, >=16TiB and <32TiB it’s 8KiB, etc.), restarting will not change the math: you would still see e.g. 1TiB of data used while it takes e.g. 2TiB on the disk.
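To illustrate the cluster-size math: every file occupies a whole number of clusters, so many small pieces on a big-cluster volume inflate the on-disk usage. A minimal sketch with hypothetical sizes, not taken from any real node:

```python
import math

def on_disk_size(file_size: int, cluster_size: int) -> int:
    # A file always occupies a whole number of clusters on disk.
    return math.ceil(file_size / cluster_size) * cluster_size

# Hypothetical example: a small 1 KiB piece on volumes with different cluster sizes
piece = 1024
for cluster in (4096, 8192, 65536):
    print(f"cluster {cluster:>5} B -> {piece} B of data occupies {on_disk_size(piece, cluster)} B")
```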

Thanks a lot for helping.
Seems the cluster size is 4k?
Is that as it should be?

Yes, seems so. Now you need to check everything else from the suggestions.

I did this, now it’s working somehow.

I still have the disk usage discrepancy.

Dashboard:
Avg Disk space Used this month = 5.49TB
Used = 2.56TB
Trash = 0
OverUsed = 0

EXT4 Filesystem = 7.7TB Used

So funny how some languages sound:
Kommando prompt. :sweat_smile:
@HGPlays Your drive supports fastformat to 4K sectors: physically the sectors are 4K, but logically they are 512 bytes, and Windows just groups the 512-byte logical sectors back into 4K clusters. I am not an expert, but I believe it is better to have 4K physical and logical sectors.
But in order to fastformat the drive, all the data is lost. Maybe next time you buy new Seagate drives, fastformat them to 4Kn before you put anything on them. They offer a bootable stick with some Linux utility software. Let me check my notes, and I’ll get back to you.
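On Linux you can check whether a drive is 512e (4K physical, 512 B logical sectors) or native 4Kn before deciding to reformat. A minimal sketch reading the sector sizes from sysfs; the device name sda is an assumption, adjust it for your drive:

```python
from pathlib import Path

dev = "sda"  # hypothetical device name; adjust to your drive
queue = Path(f"/sys/block/{dev}/queue")
logical = int((queue / "logical_block_size").read_text())
physical = int((queue / "physical_block_size").read_text())

print(f"logical={logical} B, physical={physical} B")
if logical == 512 and physical == 4096:
    print("512e: 4K physical sectors exposed as 512 B logical sectors")
elif logical == 4096 and physical == 4096:
    print("4Kn: native 4K sectors")
```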

Do you still have errors related to databases/filewalkers? If yes, they need to be addressed.
Because the update would happen only under these conditions:

  1. The used-space filewalker is enabled (it’s enabled by default)
  2. You restarted the node
  3. All filewalkers finished successfully for all trusted satellites (and updated the database without ANY error)
  4. You removed the data of untrusted satellites
  5. You have no errors related to the databases (neither “malformed”, nor “not a database”, nor “database is locked”)

These are final and mandatory requirements.
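If you are unsure whether such errors are present, you can scan a saved copy of the log for the signatures listed above. A minimal sketch; the storagenode.log path is an assumption, so point it at wherever your node actually logs (with Docker, redirect the container logs to a file first):

```python
import re
from pathlib import Path

log_path = Path("storagenode.log")  # hypothetical path; use your node's actual log file
pattern = re.compile(r"database is locked|not a database|malformed|filewalker", re.IGNORECASE)

with log_path.open(errors="replace") as f:
    for line in f:
        if "ERROR" in line and pattern.search(line):
            print(line.rstrip())
```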

I did not check for errors; where do I look for them? Apologies, I spend minimal time on my node. Yes, I had a power failure recently and my UPS decided to fail on me at the same time. I did bring the node back up, but I’m not sure it is in a consistent state as far as the data goes. How do I ensure I have valid data and that all the other crap is removed?

Does version 107 solve all the trash bugs? Or are we still investigating a new bug?
I lost track of the trash problems, and I want to run the startup piece scan to correct the dashboard, but I’m waiting for the version that solves all the trash bugs.

We do not know yet, but we hope that it has addressed all of them.

I checked: all 4 used-space filewalkers completed successfully, but the node’s trash is as wrong as it was and stays wrong. Today it shows 1.48TB of trash, while the trash folder contains only around 500GB of real files, so the filewalker did not fix the problem at all.
So this roughly 1TB air bubble stayed the same. The node shows it is full, but on disk there is 1.12TB of free space (4TB disk). So after the weekend is over, tell me if you need some data from me (logs, DB, remote access) to address this issue as fast as possible.
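If it helps to quantify the gap, the real on-disk trash size can be summed directly and compared with the dashboard figure. A minimal sketch, assuming the node’s data lives under storage/ (adjust the path for your setup):

```python
import os
from pathlib import Path

trash = Path("storage/trash")  # hypothetical layout; point at your node's trash folder

total = 0
for root, _dirs, files in os.walk(trash):
    for name in files:
        try:
            total += (Path(root) / name).stat().st_size
        except OSError:
            pass  # a file may vanish while we walk

print(f"trash on disk: {total / 1e12:.2f} TB ({total / 2**40:.2f} TiB)")
```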

Maybe the startup piece scan doesn’t scan the new trash folder as it is structured today, or it doesn’t update the databases.
And maybe the TTL deletion doesn’t properly update the databases either. These are the main factors I can think of.
If this still isn’t a priority for Storj, I don’t know what we are all doing here.

It has priority. The team is working to change the way this information is processed.
I think they will figure out something more robust than the current system, but as always, it takes time. Right now nobody is around (US holidays + weekend), but I hope we will have some updates after Monday.

I just checked my nodes, and the discrepancy between the dashboard-reported occupied+trash space and the DSM-reported occupied space has increased significantly. A week ago it was about 700GB; now it has reached a 3.4TB difference on the 22TB drive. The dashboard says I have 21TB of data; DSM says 17.6TB. All processes finish, like collector, trash, retain. I’m not sure if activating the startup piece scan will help, but I will wait for the next versions.

I got an idea to check my logs for “database”, and of course bandwidth.db is locked. :man_facepalming:

```
ERROR   orders  failed to add bandwidth usage   {"Process": "storagenode", "satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "action": "GET_AUDIT", "amount": 104448, "error": "bandwidthdb: database is locked", "errorVerbose": "bandwidthdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:76\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders.func2:254\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
```

The databases are on a USB 3.0 SSD flash drive, but I disabled “Enable delayed allocation for ext4”. Bad move. I didn’t have a problem before, but I thought it couldn’t be such a bad thing and would protect the databases in case of a UPS failure. I’ll switch back to Enabled and hope it solves the issue.

Are you sure that DSM uses SI units?
If not, then 17.6TiB is 19.35TB. However, I see some discrepancy anyway. On 1.105.x there is a bug with updating the databases after TTL data is deleted; it’s fixed in the next release. So you need to restart the node with the scan on startup enabled to update the databases and reduce the discrepancy. Or, as you said, wait for the new version; however, it will still require running the scan at least once to fix the previous discrepancy anyway.
So, you may enable it in config.yaml but not restart the node; it will be restarted automatically on the next update.
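For reference, the TiB-to-TB conversion above is just powers of 1024 versus powers of 1000; a quick check:

```python
tib = 17.6
print(f"{tib} TiB = {tib * 2**40 / 1e12:.2f} TB")  # ≈ 19.35 TB
```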

Thanks, but in all honesty SNOs shouldn’t have to be dealing with this in the first place, in my opinion. If Storj says NTFS is the go-to filesystem on Windows, then it should work and not misuse our space like crazy. Just my opinion.

Yes, what I posted is converted to SI units. My SSD flash drive couldn’t handle the traffic as well, on top of that bug… so a bunch of causes for that discrepancy.

Hello :slight_smile:
my node has become very full thanks to the changes made by Storj. That’s great! Unfortunately, I now have a very large difference between the dashboard and the physical hard drive.
The dashboard shows 5.4TB used in total; the physical hard drive, with no other content, shows 11.5TB used. The drive is XFS-formatted and has 12TB. I never had such problems on other systems.
Another node with ext4 is very close to the dashboard value.
The ext4 node runs a newer version on the same system.

What could be the reason?

System:
Unraid 6.12.10
i3 CPU
8GB RAM
WD 12TB XFS v1.105.4
WD 10TB ext4 v1.107.3

Quick question here: how lazy is the lazy filewalker? I mean, on a node restart, I believe it starts all over again, right?
For, let’s say, 8TB of data on disk, how many days without interruption does it take to clean up trash and discrepancies?
After almost a month of uptime, my node shows my 13TB of data correctly, but after a restart it shows only 8TB, and I was curious why it acts like this.