Current situation with garbage collection

TBH, I don’t fully understand your comment; I think you misunderstand something (and sorry if I was not clear enough).

You can be an old SNO (like somebody who started 4 years ago), with a lot of pieces. Just be sure that your software (storagenode) is new enough (v1.101+).

This whole thread is about supporting “old” SNOs (i.e., storage nodes with a lot of pieces).

It is the other way around. Let’s say a big node is sitting on 20 TB of used space but only 10 TB is paid by the satellite. The bigger bloom filters don’t change anything about the payout, but they will remove a lot of the unpaid garbage and free up space for more paid data.

4 Likes

I see that in recent days (the last 1-2 days) significant amounts of data have been wiped on my side.

1 Like

Watching the problems with walkers and garbage collection these days, I wonder if Storj can really scale up to the exabyte range. The main bottleneck I see for now is the satellites. If generating a bloom filter for 30 PB of customer data takes so many resources, I can’t imagine how it will work for 1000 PB.
I still consider Storj to be in a beta phase, and I really hope that we will figure things out along the way, but I am still somewhat worried…

4 Likes

I think that’s easy: you just add more satellites, so that generating filters for each satellite stays manageable.

I’m sure there are other approaches the team could take for scaling, but it’s definitely not a priority at this stage.

2 Likes

Generating bloom filters isn’t even in their top-50 list of business concerns: it’s a problem you can easily throw cheap hardware at. The only thing preventing them from scaling to the exabyte range is a lack of customers to pay for that space.

It’s not even capacity: Storj has barely used half of what SNOs provide after years… and other projects (like Chia) have 1000x the capacity online… run by users who would gladly repurpose their disks for Storj if it were worth their time.

1 Like

Yup.
I want to believe that it’s intentional.
They could lower prices in the end; there is margin for good discounts.
But they just choose not to, I guess, because too many customers could onboard too fast at once, and that could be lethal.


And they are constantly improving. Everyone can see it: a new version every 1-2 weeks; they have been grinding away at the code lately!
Imagine onboarding too many customers too fast; that could overload anyone.
Only management knows how much fuel is left in the tank, and they set the strategy to play it safe:
to develop as much as possible before making any bold moves that might “break” the market. And you have to remember, the big players won’t watch idly; if you are about to piss them off, you had better be ready, development-wise, for a tough fight. So I guess that, overall, they are making the best possible decisions for the conditions and playing it relatively safe. It’s not easy to make something revolutionary in this field. You have to watch out and play smart; I believe we would do the same. Development just takes tremendous effort; just look at the list of changes in every single new version.

Yes, indeed.

And here I was halfway towards a Lambo

It’s rare that a software system can scale more than two orders of magnitude without at least some hurdles, and it is even more difficult to predict the bottlenecks for a system as innovative as Storj. However, when Storj starts approaching 1 EB of data, I’m pretty sure they will earn enough to fund engineering efforts to optimize the process. My experiment shows there are plenty of opportunities to do so, and they should not be difficult to put into production.

2 Likes

What evidence do you have for this? Do you have insight into the Storj sales team’s strategy?

2 Likes

:zipper_mouth_face:
Only presumptions :wink:
Frankly, there are only two strategies:
to compete on price, and to compete on value.
Competing on price is the last thing anyone wants.
Storj is able to pull out the big guns in this matter if necessary;
it’s a matter of the course taken by the leadership.
That’s my conviction, and I’ll stop at that.

I’m still seeing 6.79TB of trash on ~20TB used. Should I be worried? Some of the nodes were restarted 3 days ago, others were 8 days ago. All of them are on 1.101.3.

2 posts were merged into an existing topic: Are automatic node downgrades expected?

Should there be an official maximum node size recommendation based on the current bloom filter size?
Like: now we generate 10 MB bloom filters and we recommend that node operators not exceed xx TB per node.
Now we have increased bloom filters to 20 MB and nodes should not exceed yy TB.
And so on…
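
For reference, here is a rough back-of-the-envelope sketch (in Go) of how such a limit could be derived from the standard bloom filter sizing formula. The ~10% false-positive target and the 1 MB average piece size are my own assumptions for illustration, not official numbers:

```go
// Back-of-the-envelope estimate: how many pieces can a bloom filter of a
// given size cover before the false-positive rate gets worse than a target?
// Standard bloom filter sizing: n_max = m * (ln 2)^2 / ln(1/p), with m in bits.
// The 10% target false-positive rate and the 1 MB average piece size below
// are assumptions for illustration only.
package main

import (
	"fmt"
	"math"
)

// maxPieces returns how many pieces a filter of filterBytes bytes can cover
// at the given false-positive rate, assuming an optimal number of hash functions.
func maxPieces(filterBytes, falsePositiveRate float64) float64 {
	m := filterBytes * 8 // filter size in bits
	return m * math.Ln2 * math.Ln2 / math.Log(1/falsePositiveRate)
}

func main() {
	const avgPieceMB = 1.0 // assumed average piece size
	for _, sizeMB := range []float64{2, 10, 20} {
		n := maxPieces(sizeMB*1024*1024, 0.10)
		fmt.Printf("%2.0f MB filter -> ~%.1f million pieces -> ~%.0f TB at %.0f MB/piece\n",
			sizeMB, n/1e6, n*avgPieceMB/1e6, avgPieceMB)
	}
}
```

The real numbers obviously depend on the actual average piece size per satellite and on whatever false-positive target the satellites use, which I don’t know.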

This is getting a little off the main topic, though there is already a prerequisite about it in the documentation.

1 Like

Those recommendations were based on ingress/deletes, and they were at 84 TB, as I remember from an updated post.
But ingress/deletes are very random and shouldn’t be used as a basis for node size recommendations.
The bloom filter size instead (which is on topic) seems to be the real limiting factor.
If you manage to gather 80 TB of data and 60 TB of it is unpaid trash because the bloom filter is too small, you are banging your head against the wall.

I am not sure it is a technical limiting factor. The current situation is more that they did not foresee that node sizes would increase so much, and it requires dev work to make the code ready for that.
Now they are working on it and can increase the filters to 10 MB, and I am sure they would find ways to make them even bigger. So it may be not a technical but a strategic issue.
Same thing for the bloom filter: they did not anticipate that keeping it only in memory is not enough when, at the same time, you force updates on the nodes that require restarts roughly every 2 weeks.
This storagenode code design was fine for 500 GB nodes, but now, with nodes at 15 TB, you can see where the code and the way it was designed fall short.
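
To illustrate what I mean (this is only a hypothetical sketch in Go, not the actual storagenode code), the node could save the received filter and the walker’s progress to disk on shutdown and reload it after a restart, so a forced update does not throw that work away:

```go
// Hypothetical sketch only: persist a received bloom filter (and the
// garbage-collection walker's progress) to disk, so that a node restart,
// e.g. a forced update, does not throw away a filter that has not been
// fully processed yet. This is not the real storagenode implementation.
package main

import (
	"encoding/gob"
	"fmt"
	"os"
)

// savedFilter is a toy stand-in for a satellite bloom filter plus the
// position the garbage-collection walker had reached before shutdown.
type savedFilter struct {
	FilterBits []byte // raw bloom filter received from the satellite
	Processed  int64  // how many pieces were already checked against it
}

func saveFilter(path string, f savedFilter) error {
	file, err := os.Create(path)
	if err != nil {
		return err
	}
	defer file.Close()
	return gob.NewEncoder(file).Encode(f) // gob keeps the example short
}

func loadFilter(path string) (savedFilter, error) {
	var f savedFilter
	file, err := os.Open(path)
	if err != nil {
		return f, err // no saved filter: start garbage collection from scratch
	}
	defer file.Close()
	return f, gob.NewDecoder(file).Decode(&f)
}

func main() {
	// On shutdown: save the in-memory filter and progress.
	f := savedFilter{FilterBits: make([]byte, 10<<20), Processed: 1_500_000}
	if err := saveFilter("gc-filter.gob", f); err != nil {
		panic(err)
	}

	// On startup: resume where the walker left off instead of waiting for
	// the next filter from the satellite.
	restored, err := loadFilter("gc-filter.gob")
	if err != nil {
		panic(err)
	}
	fmt.Printf("restored filter: %d bytes, %d pieces already processed\n",
		len(restored.FilterBits), restored.Processed)
}
```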

More like speculations :wink:
Lowering prices is not the best move; usually it’s a road out of business, unless you are OK with not being paid (unlikely, right?).

I believe that targeting the right segment and the right marketing works much better than lowering the price to the bottom.
You know: quality, speed, low price; pick only two.

I do not think that the bloom filter size should be considered in these recommendations.

Yes, I would even offer free egress to that Nexus Mods or whoever,
just to make them use my free space, so I get paid $1.50/TB for HDD usage.
So Storj could offer them even less than $7/TB for egress.

I have 0.3 Gbps of upload to offer; I don’t mind if they saturate it, since I don’t use it anyway and I pay for it anyway.
I don’t think S3 is a problem either: the website can offer a small installer that the customer downloads first. You know, you want to get that 130 GB Fallout mod, so you first download a ~1 MB installer.exe that enables your PC to establish native Storj connections from your desktop and fetch the 130 GB mod from the Storj nodes.
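
Roughly, such an installer would just wrap a native download with the storj.io/uplink Go library, something like the sketch below. The access grant, bucket name, and object key are placeholders, and I’m writing the calls from memory, so treat the exact API usage as an assumption:

```go
// Sketch of what the installer could do: a native (non-S3) download using
// the storj.io/uplink Go library. The access grant, bucket name, and object
// key are placeholders.
package main

import (
	"context"
	"io"
	"os"

	"storj.io/uplink"
)

func main() {
	ctx := context.Background()

	// A restricted, read-only access grant would be embedded in the installer.
	access, err := uplink.ParseAccess("PLACEHOLDER_ACCESS_GRANT")
	if err != nil {
		panic(err)
	}

	project, err := uplink.OpenProject(ctx, access)
	if err != nil {
		panic(err)
	}
	defer project.Close()

	// Stream the big mod file directly from the storage nodes to local disk.
	download, err := project.DownloadObject(ctx, "mods", "fallout-mod-130gb.zip", nil)
	if err != nil {
		panic(err)
	}
	defer download.Close()

	out, err := os.Create("fallout-mod-130gb.zip")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	if _, err := io.Copy(out, download); err != nil {
		panic(err)
	}
}
```

The installer would ship a read-only, restricted access grant, so the download would go straight from the nodes to the customer’s machine instead of through an S3 gateway.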

3 Likes

Hm. I like your idea of a small pre-app that downloads natively!
Especially since most OSes have a feature like this: one-click for Windows, Flatpak for Linux, and (I forgot what it’s called, sorry) for macOS.

1 Like