Wrong used & remaining disk space

I also see no errors and have no idea how to fix this issue. :slight_smile:
How can I force storj to recalculate the free space and clean up expired pieces?
I’ve recreated the container but it does not help. I could probably recreate some of the databases to force their contents to be updated, but I’m not sure which ones store the corresponding data.

This error just means that your node was not fast enough to accept the piece. It’s pretty normal; there are 80 other nodes that managed to accept it.
You cannot be close to every customer in the world, so this is kind of expected.

Restart the node (and you need to enable the scan on startup if you disabled it; it’s enabled by default). Also make sure that it didn’t fail: you shouldn’t have any errors related to the filewalkers and/or databases in your logs.
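For example, something like this checks both from the host (the config path and the log patterns are just examples, adjust them for your setup):

# check that the startup scan was not disabled (true is the default)
grep piece-scan-on-startup /mnt/storj/config.yaml
# it should be absent or read: storage2.piece-scan-on-startup: true

docker restart storagenode

# after the restart, look for filewalker/database errors
docker logs storagenode 2>&1 | grep -i error | grep -iE "filewalker|database" | tail -n 20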

Hi guys, I’ve recently managed to run an 8TB node on a Raspberry Pi 4 after facing many compatibility problems. The disk space is a RAID0 array of 3 HDDs (2x 2TB Seagate IronWolf 3.5″ and 1x 4TB Seagate Surveillance 3.5″). It had been running smoothly for more than 48 hours when suddenly the config.yaml file got corrupted, and when I tried to run:

pi@raspberrypi:~ $ docker exec -it storagenode /app/dashboard.sh

the command was returning the following:

Error: While parsing config: yaml: invalid trailing UTF-8 octet

So I restored the config.yaml file I had backed up in another location and now the dashboard works again as before. No node interruptions were experienced due to this issue; the Cockpit dashboard and the Storj node dashboard were accessible the whole time.
However, despite the dashboard issue apparently being fixed, it now displays a wrong disk usage and free disk space, as follows:

Storage Node Dashboard ( Node Version: v1.122.9 )

======================

ID 1***HUp
Status ONLINE
Uptime 52h7m43s

               Available         Used       Egress     Ingress
 Bandwidth           N/A     24.53 MB     24.53 MB         0 B (since Mar 1)
      Disk      -7.78 EB      7.78 EB

Internal 127.0.0.1:7778
External ****:28967

The web dashboard seems to have the same issue and shows both values wrong, just like the CLI dashboard above.
Am I supposed to just wait some time for the dashboard to refresh and reflect the correct data, or could there be additional issues with my node synchronization which I cannot see from the logs?

pi@raspberrypi:~ $ docker logs storagenode | grep "ERROR" | wc -l

doesn’t show any relevant error that could explain the issue, so any hints?

Maybe more files are corrupted, aka some databases. Did you run a disk check?
Second, you have to wait for the startup file walker to finish.
If these don’t solve it, you can stop the node, delete the databases and run the file walker again.
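A rough sketch of that last step, assuming a docker node with the data mounted at /mnt/storj (back the databases up instead of deleting them, just in case):

docker stop -t 300 storagenode
mkdir -p /mnt/storj/db-backup
mv /mnt/storj/storage/*.db /mnt/storj/db-backup/
docker start storagenode
# the node recreates empty databases and the startup filewalker repopulates the usage numbers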
And… why would you use RAID? The best way to run nodes is one node per disk.

1 Like

Hello @0xDDoS,
Welcome to the forum!

Please redo it ASAP. With one disk failure the whole node is gone. It’s better to run 3 nodes instead: How to add an additional drive? - Storj Docs.

This means that you corrupted the config.yaml file. Please convert it to UTF-8 without BOM.
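For example, something like this produces a clean copy (assuming GNU sed and iconv are available on the Pi):

file config.yaml                   # a BOM shows up as "UTF-8 (with BOM) text"
sed '1s/^\xEF\xBB\xBF//' config.yaml | iconv -f UTF-8 -t UTF-8 -c > config.clean.yaml
mv config.clean.yaml config.yaml   # then restart the node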

Ok, I tried stopping the node and running fsck on the RAID /dev/md0 where the node was running. The process aborted shortly after, while it was fixing some checksums, and if I try to re-run fsck or e2fsck I get an error saying that the superblock doesn’t match and to try running the process with an alternative superblock, but none of them works.
So I’m now running badblocks on the RAID to see if the filesystem can be fixed; if there is no fix I will unmount the RAID and try running a single node on each HDD.
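For reference, this is roughly what I was trying (the block number comes from the dry-run output, 32768 is just one example):

mke2fs -n /dev/md0        # dry run, only prints the backup superblock locations
e2fsck -b 32768 /dev/md0  # retry the check against one of the listed backups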

I was reluctant to follow that path initially since I didn’t want to wait for 3 different identities to be generated (last time it took 37h :sweat_smile:), but I guess that it’s the only path that actually makes sense in terms of security.

Thanks for your opinions :muscle:t4:

You used the Raspberry Pi to generate the identity? As the setup documents say, you can create your identity on a more powerful machine and transfer it over.
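Roughly like this (the identity name and paths are just examples, based on the Linux defaults):

# on the faster machine
identity create storagenode2
identity authorize storagenode2 <email>:<token>
# then copy the finished identity over to the Pi
rsync -a ~/.local/share/storj/identity/storagenode2/ pi@raspberrypi:~/.local/share/storj/identity/storagenode2/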

1 Like

Yeah, you’re right, however I found it more comfortable to run it on the Pi through Termius on my phone, so I could create 2 identities at the same time without needing to keep my laptop running, since I could put the Termius session in the background and keep it active.

However, I removed the RAID0, re-created an ext4 filesystem on each disk, and managed to run the additional nodes separately and correctly. I also used Let’s Encrypt to sign the SSL certificate so the multinode dashboard runs correctly over HTTPS.
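For anyone curious, this is roughly what each additional node’s run command looks like (all values below are placeholders; each node gets its own identity, disk, and ports):

docker run -d --restart unless-stopped --stop-timeout 300 \
  -p 28968:28967/tcp -p 28968:28967/udp -p 14003:14002 \
  -e WALLET="0x..." -e EMAIL="you@example.com" \
  -e ADDRESS="mynode.example.com:28968" \
  -e STORAGE="3.5TB" \
  --mount type=bind,source=/home/pi/.local/share/storj/identity/storagenode2,destination=/app/identity \
  --mount type=bind,source=/mnt/disk2/storagenode,destination=/app/config \
  --name storagenode2 storjlabs/storagenode:latest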

And finally this is the result :face_holding_back_tears:

I’m planning to add another 4TB and 16 TB soon! :crossed_fingers:

The nodes have been running for more than 24 hours now and there seem to be no issues so far, except for the uptime and audit data of some satellites on 1 of the 3 nodes, which are reported as follows:

By running docker logs storagenode I cannot see any errors, data gets uploaded and downloaded correctly, and the node has never been rebooted since I first started it, so why does the uptime data show such a low online time for those satellites while saltlake.tardigrade.io:7777 shows 100% uptime instead?
Moreover, only 1 out of 3 nodes has this issue; the other 2 run smoothly with no errors or misreported data so far.

What could I do in order to check the root cause? Should I try to re-create the node from scratch?

There is no traffic from saltlake. It’s a test satellite.

Btw, you can paste images directly into forum posts. No need for slow external image hosting.

1 Like

It’s not recommended to publish your dashboards to the internet - you are exposing your private data, such as your wallet address and email address.
Please use this method instead:

Or this one:

For the last one, you can run the dashboard on any device which you want to use as a monitoring hotspot.
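For example, if you only need remote access from your own machine, an SSH tunnel keeps the web dashboard private (a sketch, assuming the default dashboard port 14002 on the Pi):

ssh -L 14002:localhost:14002 pi@raspberrypi
# then open http://localhost:14002 in a local browser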