Problem with nodes - values resetting in a loop?

Hey, so I noticed a problem with my nodes recently. I think it started happening with the .80 versions. I have a few nodes, and whenever they restart or update, the values reset: used storage, trash, etc.
Here is a screenshot from before the restart. The node had been running for almost 30 hours.

Here are the logs from when the restart happened.

And here is a screenshot of the values after the node restarted.

The values in the Total Disk Space graph basically reset to the same values every time: 8.01 TB used storage, 58 GB trash, etc.

After every reset they go back to 8.01 TB on this node.
I noticed it happens on pretty much all of my nodes after a restart. The values reset to a specific point in time; on another node they go back to 6.01 TB.

I’m not really sure what is happening here. The rest of the values progress fine; only the values in that ‘Total Disk Space’ graph/section keep looping back to the same numbers.

This means that the filewalker cannot finish the scan. If you disabled it, you need to enable it again.
Please also check your disk for errors and check your databases.
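
If you are not sure whether the scan was turned off, you can look for the option in your node's config. This is only a sketch: the path assumes a typical docker setup, and storage2.piece-scan-on-startup is the option that, to my knowledge, controls the used-space scan on startup.

```bash
# Example path for a docker setup -- adjust to your node.
grep -i "piece-scan-on-startup" /mnt/storj/config.yaml
# No output (or "true") means the startup scan is enabled; "false" means it was disabled.
```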

I didn’t disable the filewalker, and I don’t see any errors in my logs about a malformed database…
Also, I don’t think all of my nodes can have a corrupted database… one node, maybe, but this is happening on all of them.

Edit:

I will try that integrity check and report here.

Okay, so I ran the PRAGMA integrity_check with sqlite3:

Here are the databases I checked:

info.db
orders.db
satellites.db
secret.db
pricing.db
reputation.db
notifications.db
piece_spaced_used.db
used_serial.db
pieceinfo.db
heldamount.db
storage_usage.db
piece_expiration.db
bandwidth.db

All the checks return an ‘ok’ message, so I guess the databases are fine.
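
For reference, a loop along these lines runs the same check on every database file; the storage path here is only an example, adjust it to your node.

```bash
# Every database should answer with "ok".
for db in /mnt/storj/storage/*.db; do
    echo -n "$db: "
    sqlite3 "$db" "PRAGMA integrity_check;"
done
```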

Then you need to wait until the filewalker finishes its job.
To me it looks similar to this issue:

OK, thanks for the help. Is there any way to check whether the filewalker is done or still running? It would help with debugging.

You need to search for retain and filewalker in your logs.
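
For a docker node, something like this should work, assuming the default container name storagenode (adjust if yours differs):

```bash
# Show the most recent retain/filewalker log lines.
docker logs storagenode 2>&1 | grep -E "retain|filewalker" | tail -n 50
```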


OK, so when I check my logs for retain, this is what I get:


These are from 1-2 days ago. I count three satellites here; one ID repeats three times.
And when I grep for filewalker, I get the same errors you can see in the screenshot in the first post, just at different times.
Here is a screenshot.

Does that mean it’s still running, or what?

I think the node got updated and because of that the filewalker didn’t finish, or something like that?

OK, so I found something weird.
Two days ago I created a new node, and the same thing happens there… it had 80 GB of data, and when I restarted it today, it went back to 77 GB…

No, from the errors it looks like your disk cannot keep up with the requests, so it cannot finish the used-space calculation.
How is your disk connected to this device? What’s the filesystem?
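
If you are not sure, lsblk can show both; the TRAN column shows how each disk is attached (usb, sata, …):

```bash
lsblk -o NAME,TRAN,FSTYPE,SIZE,MOUNTPOINT
```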

Nothing strange, some data likely got deleted (moved to the trash).

The disk is in a USB tower connected via cable to the machine, and the filesystem is ext4. The node runs via docker.
How can I be sure the filewalker is done? When I see retain logs from all satellites? The error about calculating used space comes from the filewalker on node start; after that there are no more logs about it.

Yes.

Yes, it runs only on start. It also refreshes the cache every hour. If it cannot finish the calculation, I would suggest checking your disk for errors, making sure that your USB disk has an external power supply, and restarting the node.
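
For an ext4 disk, a check could look roughly like this; the device name is only a placeholder, and the filesystem must be unmounted before fsck runs (run as root):

```bash
# Stop the node gracefully, unmount, then check the filesystem.
docker stop -t 300 storagenode
umount /mnt/storj
e2fsck -f /dev/sdX1
```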

OK, I will monitor the node and check again in a few days. Thanks for the help.

You probably kill the filewalker with the restarts. The first screenshot is right before the shutdown (update).

Probably right before here, too.

With 8 TB it will very likely need more than 30 hours to run. If you kill the process, it resets to the last known values and starts over the next time, until it can finish and save new values.

I don’t know for sure whether there are save points between satellites, but during the process I think there are none.


Thanks for the explanation. I think so too. A new version update can also kill the filewalker because of the restart, which is a little annoying, but OK.