Node loses used space number

The space on one node has been reset to zero for the second time. The first time it happened was at the start of last month; now it has happened again at the start of this month.
Badger is on. After a restart the filewalker runs and updates the used space to the correct number. Badger seems reasonably healthy, as the filewalker completes fast.
On restart or during normal operation I do not see any unusual errors: no database errors, no badger errors or the like.
What else could I check?

Maybe you have had a “database is locked” error somewhere in between?
To me it sounds like bad records in the database though. Maybe you need to re-create all databases related to pieces, or reload them; then the bad records should be gone.
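
If you want to rule out outright corruption first, you could run an integrity check on the databases while the node is stopped. A minimal sketch, assuming sqlite3 is installed and /mnt/storj/storage is your storage location (adjust the path to yours):

# check every storagenode database; each should print "ok"
for db in /mnt/storj/storage/*.db; do
  echo "$db"
  sqlite3 "$db" "PRAGMA integrity_check;"
done

Anything other than “ok” points to a damaged file.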

Are these piece_spaced_used.db, storage_usage.db and used_space_per_prefix.db?

Then I would simply recreate them.

Yes, they are. This is why I suggested it as a first option. Of course you would need to restart the node with the scan on startup enabled to fill them back up.
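
For example, a sketch assuming a docker container named storagenode and /mnt/storj/storage as the storage location (both are assumptions, adjust to your setup):

docker stop -t 300 storagenode
cd /mnt/storj/storage
mv piece_spaced_used.db piece_spaced_used.db.bak
mv storage_usage.db storage_usage.db.bak
mv used_space_per_prefix.db used_space_per_prefix.db.bak
docker start storagenode

The node should recreate the missing database files on startup; just make sure the scan on startup is enabled (storage2.piece-scan-on-startup: true, which is the default) so the filewalker can repopulate them.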

I have replaced

piece_spaced_used.db
storage_usage.db
used_space_per_prefix.db

Filewalkers for all satellites already completed yesterday. However, the used space displayed is still completely wrong.

I see ERROR messages concerning the badger like this:

2025-07-02T22:26:56Z    ERROR   blobscache      satPiecesTotal < 0      {"Process": "storagenode", "satPiecesTotal": -768}

Should I remove the cache files and restart the node?

It should be updated automatically. Since I’m out of ideas why it doesn’t work on your node, you may try to recreate the badger cache.
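
Something like this (a sketch; the filestatcache folder name and the paths are assumptions based on a typical layout, adjust to yours):

docker stop -t 300 storagenode
mv /mnt/storj/storage/filestatcache /mnt/storj/storage/filestatcache.bak
docker start storagenode

The node should then rebuild the cache from scratch during the next scan.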

It goes up very slowly, like 1 GB per 3 hours or something.
However, as the used space filewalkers completed yesterday, I would expect the space calculation from the scan to be visible by now somehow.
The node is showing 7 GB used space now.

This could only be the case if the data is not updated in the databases for some reason, like a “database is locked” error.
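
You can check for that quickly (a sketch, assuming a docker setup with logs going to stdout; adjust the container name, or grep your log file instead):

docker logs storagenode 2>&1 | grep -iE "database is locked|malformed"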

I have checked the log once more; there is no locked error and no database error.
I have restarted the node again, and immediately the used space shows a much higher number, probably the correct one. So I don’t know what the issue is.

Maybe it re-reads the stat after some time (like once per 12 h or on the next restart), or maybe the filewalker failed last time and finished after the next restart (the badger cache allows the filewalker to finish the scan faster, because previously scanned pieces are already in the cache, so the second scan is faster).

It keeps losing it on a daily basis now. I just saw it again. There are no errors in the log regarding the database, locking or anything similar.

What I am seeing is the sudden appearance of lines concerning the badger cache:

2025-07-07T19:10:16Z    ERROR   blobscache      piecesTotal < 0 {"Process": "storagenode", "piecesTotal": -4957772731164254184}

I suspect this is when the used space has been lost. Before that there are only normal upload and download messages, and even bandwidth being persisted to the database.

It’s the blobscache, i.e. the databases, not the badger cache. It says that a row in a database has incorrect data, as if it wasn’t updated after the scan.

Please check that you are using the correct location for the databases (perhaps you have two sets? One is used by the node, and the second is the one you think it should use).
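
A quick way to check (a sketch; /mnt is an example search root and /mnt/storj/config.yaml an example config path): look for duplicate copies, see which one was written to recently, and compare against what the config points at:

find /mnt -name "piece_spaced_used.db" -exec ls -l {} \;
grep -E "storage2.database-dir|storage.path" /mnt/storj/config.yaml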

Thanks for clarifying. For some reason I thought this was the badger cache complaining. I thought I had read that somewhere.

I see the piece_spaced_used.db-wal file, and its modification timestamp is current, from 2 minutes ago.

I see my pieces used-space-filewalker completed messages for all satellites. Is there additional info expected in the log that indicates the database has been updated successfully? Something I could grep for?

Need to add: All this just started recently…

I do not remember seeing any messages like “the database is updated” at the info log level, so I think only errors can show that something is wrong.
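
So grepping for errors around the databases and the filewalker is the closest thing (a sketch, assuming docker; adjust to your log setup):

docker logs storagenode 2>&1 | grep ERROR | grep -iE "blobscache|filewalker|database"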
Could you please try to remove/rename only this database when the node is stopped?

It should force the filewalker to do a full scan. With the badger cache it shouldn’t be too long.
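
For example, a sketch with the same assumptions about the container name and paths as before:

docker stop -t 300 storagenode
mv /mnt/storj/storage/piece_spaced_used.db /mnt/storj/storage/piece_spaced_used.db.bak
docker start storagenode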

I have restarted it already as it provided at least a temporary fix.

The listing shows them as this currently:

-rw-r--r-- 1 root root    614400 Jul  7 08:40 used_space_per_prefix.db
-rw-r--r-- 1 root root     32768 Jul  8 06:42 used_space_per_prefix.db-shm
-rw-r--r-- 1 root root     32992 Jul  8 06:42 used_space_per_prefix.db-wal

At least the timestamps of the wal and shm files are up to date. I will rename all of these and see what happens.

There should be no .db-shm and .db-wal files left after the node is stopped.
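
If they ever do remain after a clean stop, the WAL was not checkpointed back into the main file; you can force it manually while the node is stopped (a sketch, assuming sqlite3 is available and the example path from before):

sqlite3 /mnt/storj/storage/used_space_per_prefix.db "PRAGMA wal_checkpoint(TRUNCATE);"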

Yes those went away after the node has been stopped.

Node is back up and running. And it seems that the badger cache is getting refilled:

DEBUG   db.filestatcache        writeRequests called. Writing to value log     {"Process": "storagenode"}

Actually these lines seem to come from the regular upload activity. But at least there is no indication of a badger malfunction here.

I do not think that the badger cache is the root cause; it’s still the databases which didn’t update for some reason.