Two weeks working for free in the waste storage business :-(

Storj devs have known about the bloom filter size issue for ages, actually since they introduced bloom filters. The current maximum bloom filter size seems to be suitable only for nodes smaller than ~8TB. The recommended setup is one HDD per node, and 24TB is the largest available HDD atm. Why is it not possible to make a 10% false positive rate bloom filter for such nodes? Isn’t this just a matter of computational resources dedicated to the filter generation?
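For anyone wondering how filter size relates to node size, here is a rough sketch using the textbook bloom filter sizing formula, m = -n·ln(p) / (ln 2)². It is not Storj's actual generation code. The function names and the 1 MB average piece size are assumptions made up for illustration; the real average piece size may be much smaller, which scales the result up proportionally.

```python
import math

def bits_per_piece(false_positive_rate: float) -> float:
    """Optimal bloom filter bits per element for a target false positive rate."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)

def bloom_filter_bytes(piece_count: int, false_positive_rate: float) -> int:
    """Filter size in bytes needed to cover piece_count elements at the target rate."""
    return math.ceil(piece_count * bits_per_piece(false_positive_rate) / 8)

# ASSUMPTION: hypothetical 24 TB node with an average piece size of 1 MB.
ASSUMED_AVG_PIECE_BYTES = 1_000_000
piece_count = 24 * 10**12 // ASSUMED_AVG_PIECE_BYTES

size = bloom_filter_bytes(piece_count, 0.10)
print(f"{piece_count:,} pieces -> about {size / 1e6:.1f} MB of filter")
```

A 10% target works out to a little under 5 bits per piece, so the filter grows linearly with the number of pieces on the node rather than with anything expensive to compute.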

1 Like

I define rubbish as uncollected trash data.

As I said in another thread, my used space keeps going up (verified against the actual on-disk space), my trash keeps hovering around the 10TB mark (please see the note below), yet my actual in-wallet payment stays the same month after month (actually drops a little). There is no estimation in any of this, nor am I relying on any satellite-reported data.

Note: It is not an issue with GCs, trash-cleanup or used space. Each of those is successfully completing and I’m verifying that they do in fact traverse directories using lsof. Used space was run 2 times in the past month for each node, and I’m currently re-running it once more, one node at a time, just in case (see github issue about lazy not updating DBs). Databases on SSD, no locked errors.

To put it in simpler words: actual used space up, monthly payment slightly down. If that doesn’t scream issues with collecting all the trash, I don’t know what will.

To my untrained eyes, that means that bloom filters don’t have the 10% fail rate that they are supposed to have. I see it as a limitation of how many pieces they can match, based on their size. They are leaving behind a lot more than they are supposed to. These observations aren’t a week old, I’ve been suspecting something is wrong for the past 6 months after I saw most of the space occupied, yet monthly payments drop by a couple % every month.
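To illustrate the "limited by size" point above: a sketch of how the false positive rate climbs once a filter of fixed size has to cover more pieces than it was sized for. It uses the standard formula p = (1 - e^(-k·n/m))^k. The function name is hypothetical, and the assumption that the number of hash functions k is re-tuned for the actual load is mine, not something confirmed about Storj's implementation.

```python
import math

def effective_false_positive_rate(filter_bits: int, piece_count: int) -> float:
    """False positive rate of a filter with filter_bits bits holding piece_count elements."""
    # ASSUMPTION: k (number of hash functions) is chosen optimally for the real load.
    k = max(1, round(filter_bits / piece_count * math.log(2)))
    return (1 - math.exp(-k * piece_count / filter_bits)) ** k

# Hypothetical filter sized for 10 million pieces at a 10% target rate.
design_pieces = 10_000_000
filter_bits = math.ceil(-design_pieces * math.log(0.10) / math.log(2) ** 2)

for pieces in (10_000_000, 20_000_000, 30_000_000):
    rate = effective_false_positive_rate(filter_bits, pieces)
    print(f"{pieces:>11,} pieces -> {rate:.0%} of garbage survives a GC pass")
```

Under these assumptions a node holding three times the pieces the filter was sized for keeps well over 40% of its garbage per pass instead of 10%, which would match the kind of buildup described here.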

3 Likes

That was fixed weeks ago, but since you are looking at payout data you will have to wait one more week.

So it wasn’t working for the past year?

I thought exactly this data reported by the satellites is used to calculate the Average Disk Space Used This Month, and that your script uses it too, so I expected it to be affected as well?

My script uses the last reported disk usage from the satellite to calculate uncollected garbage. Not the average. So it would only be impacted if that last reported disk usage is wrong.
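As a rough illustration of that idea (not the actual script): uncollected garbage can be estimated as the gap between what the node holds locally and what the satellite last reported. The function names and sample numbers below are hypothetical, and whether trash is already excluded from the local figure is an assumption.

```python
def uncollected_garbage_bytes(local_used_bytes: int, satellite_reported_bytes: int) -> int:
    """Space held on disk that the satellite does not account for (never negative)."""
    # ASSUMPTION: local_used_bytes already excludes the trash folder.
    return max(0, local_used_bytes - satellite_reported_bytes)

def garbage_share(local_used_bytes: int, satellite_reported_bytes: int) -> float:
    """Uncollected garbage as a fraction of locally used space."""
    if local_used_bytes == 0:
        return 0.0
    garbage = uncollected_garbage_bytes(local_used_bytes, satellite_reported_bytes)
    return garbage / local_used_bytes

# Hypothetical node: 10 TB used locally, 7 TB in the satellite's last report.
local, reported = 10 * 10**12, 7 * 10**12
print(uncollected_garbage_bytes(local, reported), f"{garbage_share(local, reported):.0%}")
```

Because this uses the last report rather than the monthly average, it is only thrown off if that single report is wrong.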

3 Likes

I see, then I will keep that in mind, thank you!
Especially for this useful script!

1 Like

Thanks to @BrightSilence, @ACarneiro and others for making this much clearer. I am still new to this but hopefully learning fast.

@Alexey, so now it’s much clearer. What I don’t want to happen is to have yet another thread with no resolution or way forward. So, same question really.

What is the plan to reduce the uncollected garbage, which in my experience is 30% of storage that is wasted by StorJ and unpaid? It’s been many weeks for me since this all started. We have looked at file walkers and other distractions like average disk usage. What I am referring to is the space used by StorJ vs what you are paying for. The rest you’re wasting and not releasing back to be used for paying data.

I don’t think it’s an unreasonable question: is there a plan? Do we know what the issue is? Is it a lack of timely bloom filters (my theory, as I never see many)? Can you find out please?

Thanks
CC

2 Likes

I believe the only option is to send BFs more regularly than now.
I have no solution for this, sorry.

Thanks @Alexey for the confirmation that there is nothing that can be done.

So there is nothing StorJ or anyone can do. Can you confirm that this is the official line please, so that I can make decisions not just about this node’s future, but about my wider support as a customer and reseller of StorJ to my clients?

Thanks
CC

1 Like

I would say that we have seen quite a state of flux with the network and some issues were identified.
The mitigations will take some time to come to full effect.

I can understand that, for a potential new starter like @Climbingkid, this may sound a bit off-putting, but the bloom filter issue is being sorted, so it’s a bit of a matter of waiting for the dust to settle.

The majority of my nodes haven’t got a huge amount of uncollected garbage, so I’m remaining optimistic.

Storj can do something here. However, it will not be immediate. I shared the issue with the team, so I hope it will get more attention.

1 Like

The “wasted space” issue doesn’t really impact on StorJ clients, though. It’s an economic consideration for SNOs.

@Alexey That’s what I was hoping to hear. Thank you.

1 Like

@ACarneiro

I understand that. My point was that if it starts to appear that StorJ is propping up its business by exploiting the goodwill of SNOs, that is not a sustainable business model long term, and I will pivot my business away from using StorJ for my customers.

Thanks
CC

1 Like

Well, my personal opinion is that StorJ have never seemed to be in any way exploiting the good will of SNOs. I doubt they would have survived this long if that were the case.

In my experience as an SNO over the last few years they are very acutely aware that they need SNOs and seem to go out of their way to listen to concerns and engage with them when making strategic company decisions (even unpopular ones).

So the good will to engage is very definitely there. I feel like I have mostly been treated in good faith (even if at times communications can be a tad opaque) and I am willing to extend them the same courtesy.

I accept I have little to lose if they fold tomorrow and this is just my own personal opinion which counts for very little, but there seems to be a lot of negativity going on and I just wanted to counter it a little bit :slight_smile:

2 Likes

I agree with balancing against negativity: and I know most SNOs are happy and we’re only hearing from the vocal few who complain (sometimes with good reason).

Did Storj pay me? Yes… OK I’ll run another month.

Did I fill a disk? Yes… OK I’ll add another

Did I get an email saying my node is offline? No… OK I’ll just live my life and ignore my node.

Keep it simple. Simple is happy! :smiling_face:

4 Likes

Of course Storj could make bloom filters bigger and send them more frequently. I even believe they have it on some todo list. But how SNOs fare is obviously the lowest priority.

@Roxor

Not being negative at all, I just want to get to a resolution over what needs to be done by whom. I think we got there: GC is broken, likely due to a lack of bloom filters, but Alexey is raising it with StorJ.

Would you really just keep adding more disks if they are inefficiently used, 30% filled with garbage that’s just stuck and that you are not being paid for? Nuhhhh

That is what I refer to when I talk about the goodwill of SNOs. It’s OK for a short while if there is an issue and it is being worked on, but it is not sustainable. If it goes on for a long time without being prioritised by StorJ, that starts to feel like they don’t care. It just feels like it has been ignored for many weeks now.

Nothing negative, just realistic.

Thanks
CC

4 Likes

It’s fair and I’m with you (I’m an SNO too). I believe it will be solved, because it may now affect prospective customers, since half of the network is filled with garbage.

2 Likes