Node consuming more blobs than allocated storage

Currently running version: v0.33.4
Allocated storage: 2TB

Storj reports: 222.1 GB Available / 1.8 TB Used
File system usage shows:

du -d1 -h
84G     ./trash
33K     ./garbage
3.0T    ./blobs
5.1M    ./temp
33K     ./blob
3.1T    .

I understand there is a ~10% overhead required, but this equates to a ~50% overhead. To me it seems that either garbage collection is only partially working, or the node is ignoring the allocated storage quota and under-reporting the space used.

I can confirm I see piecestore deletion messages in the node logs.

Is this a known issue?

6.4TB Allocated
Storj reports 3.4TB Available, 3.0TB used

Mine, measured with ncdu, reports a total of 2.8 TB:
2.7 TiB [##########] /blobs
62.8 GiB [ ] /trash

du -d1 -h rounds up, but it’s still 2.8TB total.

Hello @r0b0tn11k,
Welcome to the forum!

Did you have a previous node?

Thanks. Long time lurker and observer :wink:

No previous node, I’ve had the same ID since I got my alpha invite way back at the start. I moved all the data from one physical host to a new one with my same ID thirteen months ago. Back then I had the minimum storage allowance set but have since increased it to the 2TB set now.

Beware of the difference between TB and TiB.
Storj reports sizes in TB.
“df” reports sizes in TiB, unless you use “-H” instead of “-h”.

Thanks, I did not know that.

I'm using du and ncdu instead, though, because df only reports attached disk/mount sizes. We're looking for individual folder sizes.
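For comparing folder sizes against the dashboard, GNU du can print decimal (SI) units with `--si` instead of the binary units `-h` uses; a small self-contained sketch (the `/tmp` path and sample file are just for illustration):

```shell
# Create a file with an apparent size of exactly 10 MiB
mkdir -p /tmp/du-demo
truncate -s 10M /tmp/du-demo/sample

# -h prints binary units (MiB shown as "M"); --si prints decimal units,
# which is the style the Storj dashboard uses
du -h   --apparent-size /tmp/du-demo/sample   # prints 10M (10 MiB)
du --si --apparent-size /tmp/du-demo/sample   # prints 11M (~10.5 MB, du rounds up)

rm -rf /tmp/du-demo
```

Running `du -d1 --si` on the storage directory would give numbers directly comparable to the dashboard's TB/GB figures.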

@Alexey - should I open an issue on Github for this particular problem?

Since it's only one occurrence, I think it's a local problem with a database.
Opening an issue would not help.

Please try restarting the storagenode and see how the calculations change.

Results after a restart are roughly the same as before, although I noted a slight decrease due to some deletions:

du -d1 -h
12G     ./trash
33K     ./garbage
2.9T    ./blobs
5.1M    ./temp
33K     ./blob
3.0T    .

This situation continues to worsen. I have restarted the storage node and I continue to consume more storage than what has been allocated. Now 1.75x the amount!!!

du -hd1
8.5G    ./trash
33K     ./garbage
3.5T    ./blobs
5.1M    ./temp
33K     ./blob
3.5T    .

Meanwhile, Storj reports:

                   Available       Used       Egress     Ingress
     Bandwidth       23.9 TB     1.1 TB     339.5 GB      0.8 TB (since Mar 1)
          Disk        5.9 GB     2.0 TB
Internal 127.0.0.1:7778
External <redacted>:28967

It’s really disappointing that outlier issues are simply brushed off like this without further investigation. At this current rate, my node will be full within 1 month through no fault of my own as my hard limits are ignored.

I think your database has been corrupted and your node is not aware of its previous usage.
It will use the current limit as a threshold.

I advise lowering your limit to compensate for the data it’s unaware of. It’s not ideal but the alternative is starting over, which would be worse.
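The allocation is usually set via the `storage.allocated-disk-space` key in the node's `config.yaml` (or the `STORAGE` environment variable for Docker setups); a sketch, where the 1.5 TB value is purely illustrative and not a recommendation:

```yaml
# config.yaml — lowered allocation to absorb the data the node is unaware of
storage.allocated-disk-space: 1.5 TB
```

The node picks up the new limit after a restart.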

Isn't the node software checking all the files after a restart? Shouldn't it notice orphan files if there were any?

I guess it's also worth taking @Alexey's point that a DB corruption has likely happened. Is there a procedure, or are there instructions, on how to export the DB records for valid blobs/pieces and recover them into a new DB?

It sounds like I will always have this issue until the DB has been repaired. I can either:

a) start over,
b) lower the storage limit and get paid less despite consuming more storage, or
c) rescue the database somehow

Now would be a great time for one of the dev team to provide assistance on option (c), if that's possible.

Unfortunately, we do not have such a procedure at the moment.
The idea is to remove the dependency on databases as the storagenode improves, but we are not there yet.
