I’ll start with the background…
Since December 1, my total storage has grown by 100+ TB.
However!
My income did not grow in proportion to this storage growth.
If storage on December 26th is 100 TB more than on December 1st, that extra storage is worth roughly $150/month, so by December 26th I should be earning about $5/day more ($150 / 30 days) than I was on December 1st. But this did not happen. Then I started looking at my charts.
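The expected payout increase above can be sketched as a quick back-of-the-envelope calculation. The $1.50/TB-month storage rate is an assumption taken from the rate mentioned later in this thread:

```python
# Back-of-the-envelope check of the expected payout increase.
# Assumed rate: $1.50 per TB-month for storage (quoted elsewhere in the thread).
STORAGE_RATE_USD_PER_TB_MONTH = 1.50
extra_storage_tb = 100          # growth since December 1st
days_in_month = 30

extra_monthly_usd = extra_storage_tb * STORAGE_RATE_USD_PER_TB_MONTH  # $150/month
extra_daily_usd = extra_monthly_usd / days_in_month                   # $5/day

print(f"extra per month: ${extra_monthly_usd:.0f}, extra per day: ${extra_daily_usd:.2f}")
```

At full utilization this predicts roughly $5/day of additional earnings by the end of the month, which is the figure the post is comparing against.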
Let’s look at the previous month, November (one of many nodes; the size is not important, the proportion is).
Yeah, actually I keep asking the same question. The explanations so far aren’t satisfactory, to be honest.
Cluster size: if your clusters are too big, for example 2 kB+, you’re losing 1 kB on average for each file. Since the average file size is 16 kB (as I thought), this can explain up to a 12% discrepancy if you’re using 4 kB clusters. But I’m using 128 B clusters on ext4, so less than 1% of the discrepancy should be attributable to this.
Metadata: inode tables/trees, size, dates and so on; typically 1-2% of the file system.
Missed deletions: when nodes are down, deletions can be missed, and then it depends on the bloom filters (a general filter describing all file names of data on your node) to check whether files should already have been deleted in the past. Since bloom filters are overgeneralized (to make them more compact), they typically also match about 10% of deleted data. But all my nodes have 100% uptime.
Slow file systems not keeping up with the pace of deletions. But I have discrepancies of over 10% even on SSD file systems.
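The cluster-size argument above can be checked with a rough model: on average each file wastes about half a cluster of slack space. This is a sketch assuming the 16 kB average file size quoted in the post, not measured data from a node:

```python
# Rough allocation-overhead estimate: on average a file wastes
# about half a cluster (block) of slack at its tail.
def overhead_fraction(cluster_bytes: int, avg_file_bytes: int = 16 * 1024) -> float:
    wasted = cluster_bytes / 2          # average slack per file
    return wasted / avg_file_bytes

for cluster in (1024, 2048, 4096):
    print(f"{cluster} B clusters -> ~{overhead_fraction(cluster):.1%} overhead")
```

For 4 kB clusters this gives ~12.5%, matching the "up to 12%" figure above; for 1 kB clusters (the smallest ext4 block size) it is ~3%.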
It is not even theoretically possible to use 128-byte cluster sizes; the smallest ext4 supports is 1 kB. You are probably thinking of the inode size, which is a different thing.
The used space reported by your node (signed orders) may be different from the physically used space for many of the reasons described above, and also (mostly) because the filewalker didn’t finish its job and didn’t move the garbage to the trash.
You need to search your logs for errors related to gc-filewalker, lazyfilewalker and retain, and fix them if possible.
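A minimal sketch of that log search, assuming a plain-text log where error lines contain the word "ERROR" (the exact log path and message format depend on your setup; the sample lines here are invented for illustration):

```python
# Scan storagenode log lines for filewalker-related errors.
# Keywords are the ones named in the post; the log format is an assumption.
KEYWORDS = ("gc-filewalker", "lazyfilewalker", "retain")

def filewalker_errors(lines):
    return [ln for ln in lines
            if "ERROR" in ln and any(k in ln for k in KEYWORDS)]

# Hypothetical sample lines, just to show the filtering:
sample = [
    "2023-12-26T10:00:00Z INFO  gc-filewalker started",
    "2023-12-26T10:05:00Z ERROR lazyfilewalker subprocess exited",
]
print(filewalker_errors(sample))
```

In practice you would feed it `open("storagenode.log")` (or your container’s log output) instead of the sample list.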
If all filewalkers are working without issues, the discrepancy should be a few percent.
P.S. I do not think that creating more topics for the exact same problem will speed up fixing it.
Alexey!
The people here are mostly adults and some of those gathered are even quite educated.
These are very convenient answers. I have seen them all, but they do not answer the question I started this topic with: where did the income from the 100+ TB of storage added since the beginning of December disappear to?
For myself, I have concluded that such growth without income growth is not interesting to me; I am mothballing all my nodes.
Thank you for your attention.
PS: I think it would be nice to compensate operators for all these existing problems in the software, say by 25%.
Yeah, some extra tokens, like a one-time Christmas gift, or a New Year gift now.
I have some screenshots too, from the very beginning of 2023, when egress was still paid at $20/TB. Even then there was a discrepancy of about 1-2 TB between the average stored TB in the stats on the left side and the used space on the right side. But that really didn’t matter; it was peanuts when I got paid $22-26 for that 7 TB node. Now I get about $7.5-8 for the same node. I only recently managed to complete a full filewalker run, and it turned out there were 2 TB of files to be deleted that weren’t even in the trash. But now it’s fine.
(I mean, as fine as it can be at a $1.5/TB storage rate instead of the $2/TB that SNOs would much prefer, but hey!)
Instead we all complain endlessly about that discrepancy across many topics. Yes, sure, many nodes lost some pennies, like $1.5-3 of potential profit, but that’s what happens sometimes when you create great things and constantly make them better.
STORJ Inc., you could just, I don’t know, make some one-time extra surge payment and apologize for the inconvenience of the situation, and we will be one big happy family again, at least for some time! )
I do not have this information. The only known source is the node’s logs, which contain information about the bloom filter size and how many pieces were moved to the trash or deleted.
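The ~10% figure mentioned earlier in this thread is consistent with the standard bloom-filter false-positive model. This is a textbook-formula sketch; the parameter values below (filter size, hash count, piece count) are made-up assumptions, not the node’s actual configuration:

```python
import math

# Standard bloom-filter false-positive estimate for a filter of m bits
# with k hash functions holding n items: (1 - e^(-k*n/m))^k.
def bloom_false_positive(m_bits: int, k_hashes: int, n_items: int) -> float:
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes

# Hypothetical example: 5 Mbit filter, 3 hashes, 1M piece IDs.
fp = bloom_false_positive(m_bits=5 * 10**6, k_hashes=3, n_items=10**6)
print(f"estimated false-positive rate: {fp:.1%}")
```

With these example parameters the estimate comes out just under 10%, i.e. roughly that fraction of deleted pieces would still match the filter and escape garbage collection.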
Between which numbers? Satellite and Storagenode? Or Storagenode and disk usage? Did you exclude the trash usage?
GC might have failed last time… Did you see any errors in the log? Or metrics about processed pieces?
1 TB sounds fishy (unless you store 1 PB on one node). I would be happy to help debug further; just send me more information (here, in private, or marton@ and the storj domain).