Disk usage discrepancy?

I still have the disk space discrepancy, any suggestions?

Filesystem      Size  Used  Avail  Use%  Mounted on
/dev/sda1        15T  9.2T   4.6T   67%  /mnt/storj

Dashboard:
Total: 15.07TB
Used: 2.83TB
Free: 11.2TB
Trash: 1.05TB
Overused: 0B
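
One way to cross-check the dashboard against the filesystem is to measure what is actually on disk. A minimal sketch, assuming the default layout with the blobs and trash folders under /mnt/storj/storage (adjust the path to your setup; the scan reads every directory, so it can take hours on a disk this size):

# actual pieces held, one subfolder per satellite; --si reports decimal units like the dashboard
du -s --si /mnt/storj/storage/blobs/*

# pieces sitting in the trash and waiting for cleanup
du -s --si /mnt/storj/storage/trash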

Absolutely the same suggestions:

  1. Forget untrusted satellites using the --force flag: How To Forget Untrusted Satellites
  2. You need to enable the scan on startup, if you disabled it (it’s enabled by default)
  3. Restart the node
  4. Periodically check your logs for errors related to the databases and/or the filewalkers (search for "error" together with "database", and for "error" together with "filewalker").
  5. If you find errors, you need to fix them first; otherwise the usage will remain wrong or reset on restart.
  6. Wait for several days (maybe weeks, depending on how slowly your disks respond) for the used-space-filewalker to finish for each trusted satellite. To track progress, search your logs for used-space-filewalker and started|completed; see the sketch after this list.
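
A minimal sketch for items 4 and 6, assuming a Docker node named storagenode with logs going to the container output (adjust if you redirect logs to a file):

# errors mentioning the databases or the filewalkers
docker logs storagenode 2>&1 | grep -i error | grep -iE "database|filewalker" | tail -n 20

# per-satellite progress of the used-space scan
docker logs storagenode 2>&1 | grep used-space-filewalker | grep -E "started|completed"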

The HDD is internal. I have watched it for a few days; the HDD is at 100% utilization all the time.
I have reduced the allocated size to 10 TB, but the filewalker still doesn't work.

How can I solve this 100% HDD load? Or maybe it's at 100% because the filewalker is not working?

By letting the filewalker finish.

No, it's at 100% because the filewalker IS working. It might take several days for it to finish. When you keep restarting your node, it will start again from zero.

It never finishes; it can run for months and always returns ERROR.

I don't know about you, but it looks like the bigger the HDD and the more data there is, the more often it fails.

This is not my first node with this problem. Every time I reach 10 TB or more of data, the filewalker starts to fail. Even if the HDD is brand new, CMR, and optimised as fuck, it still happens sooner or later.

Yup… same here. I even set the available space to 600 GB when it's actually a 16 TB HDD; it still fails, and on one of the Windows nodes it has been running for over a week!

I have a few nodes, one with about 5 TB of used space. Everything is done after about 7 days and the disk usage drops down from 100%. The other nodes with 12 TB of used space couldn't complete it because the updates keep coming in like crazy; maybe the devs are doing too much work at this point.
So your node probably needs about 14 days to complete all the filewalkers, but if you are getting errors you probably have other problems in your system. I have already seen Alexey explaining how to fix that in about 10,000 forum threads here; he probably mentioned what to do here as well.

Yeah, I had no fixes to perform: the databases are OK and on an SSD, there are no disk errors, the write cache is enabled (I have a UPS)… the filewalkers just fail with exit code 1.
I've tried with lazy mode on and off, with the scan of pieces on startup and without it… I quit.
All the relevant lines in config.yaml are commented out (the default), and I'll just let the nodes run until the devs fix this once and for all, or until deletes shrink my data to 600 GB total… and then I'll just set the available storage to 15 TB.

Can Saltlake be forgotten on nodes that are not yet 6 months old?
I am thinking of doing a graceful exit.

@Nokotopulele @naxbc
Then you need to disable the lazy mode, because it cannot be handled properly in your setup.
Please enable the scan on startup (it's enabled by default) and disable the lazy mode, for example:
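
A minimal config.yaml sketch, assuming the current option names (check the commented defaults in your own config.yaml), followed by a node restart:

# run the used-space scan when the node starts (this is the default)
storage2.piece-scan-on-startup: true

# turn off the low-priority (lazy) filewalker so the scan runs with normal IO priority
pieces.enable-lazy-filewalker: false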

Right now it's not possible to call a GE if the node is less than 6 months old.
But you may add that satellite to the untrusted list (your node will eventually be disqualified on that satellite).
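
If you go the untrusted-list route, a hedged config.yaml sketch, assuming the storage2.trust.exclusions option; the satellite ID below is a placeholder, replace it with the real Saltlake node URL:

# exclude Saltlake from the trusted satellites (placeholder value, not a real ID)
storage2.trust.exclusions: "<satellite-id>@saltlake.tardigrade.io:7777"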

Is there any downside to performing a graceful exit or adding that satellite to the untrusted list, other than no longer receiving traffic from that satellite?

It doesn't work.
I tried it… it keeps failing with exit code 1.

curl localhost:5999/mon/stats

Can I check how far along the filewalker currently is with the command above? The discrepancy on my small-capacity nodes has been resolved, but it still persists on the large-capacity nodes where the reported usage dropped from 14 TB to 6 TB, so I would like to check when it will finish…

I would like to post the output of the command, but there are too many results, so I would appreciate it if you could tell me what to look for. :blush:

Try /mon/ps instead. That shows you only what is currently active.
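
For example, to see only the walker-related operations that are still running (debug port 5999 as in your command above):

curl -s localhost:5999/mon/ps | grep -i walker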

Please make sure that you saved the config and restarted the node.
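
For example, on a Docker node named storagenode (a sketch; on a Windows GUI node use Restart-Service storagenode instead):

docker restart -t 300 storagenode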

Your node would be disqualified on that satellite in the second case; the first one would require keeping the node online for the next 30 days, and after that the node wouldn't receive any ingress from the customers of that satellite. In either case, there is no turning back for this node.