The trash is unpaid?

For US1 my numbers are 6M pieces per TB. How big would the bloom filter for a 24 TB node be now?

I linked the calculator, you can just change the numbers.

1 Like

If I’m using the calculator that @Toyoo provided correctly, it comes out to ~82MB
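For reference, here is a minimal sketch of the standard bloom filter sizing formula behind that estimate, assuming ~6M pieces per TB and the ~10% false-positive target mentioned later in the thread (the exact output depends on which rounding the calculator uses):

```go
package main

import (
	"fmt"
	"math"
)

// bloomFilterBytes returns the size in bytes of an optimally sized bloom
// filter for n elements at false-positive rate p, using the standard
// formula m = -n*ln(p)/(ln 2)^2 bits.
func bloomFilterBytes(n, p float64) float64 {
	bits := -n * math.Log(p) / (math.Ln2 * math.Ln2)
	return bits / 8
}

func main() {
	// Assumption: ~6M pieces per TB (from the post above) and a 10%
	// false-positive rate for the garbage-collection filter.
	pieces := 24.0 * 6e6 // a 24 TB node -> ~144M pieces
	fmt.Printf("~%.0f MB\n", bloomFilterBytes(pieces, 0.10)/1e6)
	// Prints ~86 MB, in the same ballpark as the ~82 MB quoted above.
}
```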

Yeah, that’s why I said it’s tricky to estimate. It used to be simple to translate the average segment size into the average piece size by dividing by 29. That’s no longer the case. Right now I see an average segment size of 4.29 MB, which with the old RS numbers leads to a 148 kB piece size. Then 24 TB would store 162M pieces, which doesn’t sound plausible to me. Only @littleskunk knows how to estimate the relevant numbers now.

Why shouldn’t this be plausible? My real numbers for the average file size are close to that:

| Satellite | Average file size (bytes) |
|---|---|
| SLC | 141879 |
| AP1 | 253094 |
| US1 | 164628 |
| EU1 | 677965 |
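A quick cross-check of those figures, assuming the table lists the average piece (file) size in bytes that each satellite’s data has on the node:

```go
package main

import "fmt"

func main() {
	// Assumption: values are average piece sizes in bytes, per satellite.
	avgPieceSize := map[string]float64{
		"SLC": 141879, "AP1": 253094, "US1": 164628, "EU1": 677965,
	}
	for sat, size := range avgPieceSize {
		fmt.Printf("%s: ~%.1fM pieces per TB\n", sat, 1e12/size/1e6)
	}
	// US1 works out to ~6.1M pieces per TB, in line with the "6M pieces per
	// TB" figure, and ~148 kB/piece over 24 TB gives the ~162M-piece
	// estimate from the reply above.
}
```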

2 Likes

Ok, well, indeed… that’s scary then. And disappointing. And discouraging.

And people were right all along to complain. And something should be done about it at a high priority. And thanks to the community for identifying that a critical part of the software isn’t working as expected.

Please go on, don’t stop there :slight_smile:

1 Like

Not for the customer. They basically cannot undelete these pieces; that feature is usually implemented client-side, not server-side. But you are correct, these pieces will be permanently deleted only after a week at the earliest.

Our trash folder is a technical protection against any error in server-side deletion; it’s not offered as a backup option for the customer. If it were, the customer would pay for it and thus nodes would be paid too. Unfortunately that’s not the case: this is part of the protocol, not a feature for the customer.
As @BrightSilence said:

You are correct, the garbage situation should be fixed, and as fast as possible, because keeping removed data longer than designed is not a feature, it’s a bug.
And as far as I know, the bigger BF is already implemented.
We need to check whether it is enough for your big nodes now, but I do not know how to do that, since you have

You should start by seeing whether the theoretical limits (the numbers provided by others a couple of replies back) are actually workable for the nodes out there. If theory says the bloom filters can’t reliably give a 10% fail rate at the maximum expected node size, then in practice they can’t, and we don’t need to analyze any logs or node behavior as to why. They are simply too small.

If theory says “nope, the current bloom filters are undeniably within the 10% fail rate”, then I’m sure someone out there is running a 20TB node at the info log level and can share their logs.

I could have sworn that I saw a 24TB max node size being mentioned in the documentation but can’t find it now. If my car’s manufacturer says this car can go up to 220km/h but I can only get it up to 120km/h, then something is either wrong with the car or the documentation. That should be the node target size for estimating the size of the bloom filters. There is no point in concentrating on 500GB nodes if the documentation says up to 24TB, IMHO.

Actually, even for a required 82 MB we would need to send 6 BFs of 14 MB each, I think. And perhaps we already do so; at least I saw this in my logs:

Wait, how does splitting a BF work? That means each of those filters is basically telling the node to keep everything, since if the filter is that small, you can’t have the node deleting half of its stored data in one go, or am I misunderstanding something?

I read this explanation:

and all tasks there:

I assumed that it’s possible. However, I’d prefer to get an explanation from @elek

In the first reply on the GitHub issue tracker, the satellite value is 26_000_000 (which I’m assuming is the number of pieces). I know “pieces” probably isn’t the right word and someone will come and correct me as soon as possible instead of focusing on the rest of this reply.

Ok, sounds like it checks out based on two independent reports. An 8TB node that exited Saltlake is split basically evenly between US1 and EU1 (AP1 isn’t even worth mentioning). Let’s go with 4TB * 6M = 24M pieces (which is the wrong word, I know; someone will correct me soon, as I said).

Let’s be clear on this: the old 4MB situation fell apart for every node seeing about 4TB of usage per satellite. Beyond that, you are basically storing data indefinitely with no payout. This was true up until the bloom filter expansion.
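A hedged illustration of that claim, assuming the old filters were capped at roughly 4 MB and reusing the ~24M-piece figure from the previous reply (not Storj’s actual parameters):

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Why a fixed-size filter falls apart: the best achievable
	// false-positive rate for m bits over n kept pieces is roughly
	// p ≈ 0.6185^(m/n). Assumes a ~4 MB filter cap and ~24M pieces
	// (4 TB for one satellite at ~6M pieces/TB).
	const mBits = 4e6 * 8 // ~4 MB filter
	const n = 24e6        // ~24M pieces for one satellite
	p := math.Pow(0.6185, mBits/n)
	fmt.Printf("best-case false-positive rate: ~%.0f%%\n", p*100)
	// Prints ~53%: more than half of the garbage pieces would survive each
	// garbage-collection pass, so unpaid trash piles up instead of shrinking.
}
```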

If the blooms are getting bigger, that’s a step in the right direction. The GitHub issue, though, is 6 months old. I can understand “not a priority right now”, but that raises the question of when exactly it will be a priority. When Storj starts posting that “we asked everyone to add space and nobody added a single byte!!!”?

Is this too offensive and/or hostile or am I justified in being a tiny bit agitated?

We don’t really split the BF; it should be one big byte array. For technical reasons, big bloom filters are sent in multiple requests, but the last request contains a hash/checksum; the chunks are concatenated on the SN side and the hash is checked.

Logs show the processing of the full, concatenated BFs.
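A minimal sketch of that scheme (chunked send, concatenate, verify hash), not the actual satellite/storagenode code; the ~14 MB chunk size is taken from the earlier post and assumed to be a transport message-size limit, and `splitFilter`/`reassemble` are hypothetical names:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

const chunkSize = 14 << 20 // assumed per-request limit (~14 MB)

// splitFilter cuts one big filter into chunks and returns the checksum of
// the whole filter, to be delivered with the last chunk.
func splitFilter(filter []byte) (chunks [][]byte, sum [32]byte) {
	sum = sha256.Sum256(filter)
	for len(filter) > 0 {
		n := min(chunkSize, len(filter))
		chunks = append(chunks, filter[:n])
		filter = filter[n:]
	}
	return chunks, sum
}

// reassemble concatenates the received chunks and verifies the checksum
// before the node would use the filter for garbage collection.
func reassemble(chunks [][]byte, want [32]byte) ([]byte, error) {
	full := bytes.Join(chunks, nil)
	if sha256.Sum256(full) != want {
		return nil, fmt.Errorf("bloom filter checksum mismatch")
	}
	return full, nil
}

func main() {
	filter := make([]byte, 82<<20) // e.g. an ~82 MB filter
	chunks, sum := splitFilter(filter)
	full, err := reassemble(chunks, sum)
	fmt.Println(len(chunks), len(full), err) // 6 chunks, one full filter, <nil>
}
```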

You may see more frequent BF generation as we try to send them out more frequently, but there are new problems related to Saltlake: it has a lot of new segments, and the scheduling should be adjusted (we need enough time to delete the US1 data from the BF generation machine and restore SLC). That’s a technically easy adjustment, but it makes BF generation hard to predict.

6 Likes

Is the file under the retain directory that triggers the actual GC run the “complete” bloom filter? What’s the limitation on the node’s side to have those huge (100MB) bloom filters, RAM?

Judging by how many times you repeat that, it looks like you sincerely believe it. For some reason it reminds me of one of the characters from “Django Unchained”. :sweat_smile:

2 Likes

Yeah, I get it.
All I mean is: just make the customer pay for the protocol. In the end, it’s the customer who uses it, with all the good… and some bad, like this one.
Because there is no free lunch; someone always has to pay for it.

Well, if true, then wow. If a trash file’s time in the storagenode were, say, 11 days,
then imagine the havoc for SNOs if the majority of data becomes TTL data with mostly 30-day expiry: the constant rotation would keep a good chunk of the node permanently unpaid because of trash sitting there beyond 7 days. NOW that’s a little more serious, if 30% of a full disk happens to be unpaid, ya know? Hah… oh boy.

Besides, my full used-space filewalker now takes about 9.5h for 14TB of space (upgrades).
If the retain run for the bloom filter is similar in scanning all files (is it?),
then how often do you want to run it? If it runs often, it will impact the node’s disk access time for customers getting egress (read time from disk).

I am sure I have given the relevant numbers to all of you. Current RS numbers are 16/20/30/38. We might still change them a bit but for now that is what you get. (SLC only)
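As a rough illustration only, assuming those four values are the usual minimum/repair/success/total share counts and reusing the 4.29 MB average segment size quoted earlier in the thread:

```go
package main

import "fmt"

func main() {
	// Assumption: 16/20/30/38 = minimum/repair/success/total shares; each
	// piece is 1/minimum of the segment, and the worst-case expansion
	// factor is total/minimum. SLC only, per the post above.
	const minShares, total = 16, 38
	avgSegment := 4.29e6 // average segment size quoted earlier (bytes)
	pieceSize := avgSegment / minShares
	fmt.Printf("avg piece size: ~%.0f kB\n", pieceSize/1e3)           // ~268 kB
	fmt.Printf("expansion factor: %.2fx\n", float64(total)/minShares) // 2.38x
	fmt.Printf("pieces per TB: ~%.1fM\n", 1e12/pieceSize/1e6)         // ~3.7M
}
```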

3 Likes

Just to be clear for everyone, each satellite has its own bloom filters. So if you have a 24TB node and you run all 4 active sats, then you get 4 bloom filters, each covering one satellite’s data, that satellite’s slice of the 24TB pie. You could have 10TB stored for US1, 10TB for EU1, 1TB for AP1 and 3TB for SL.
You just need a BF big enough to cover those 10TB for the big sats, not the entire 24TB (see the sizing sketch below).
Stop making wrong calculations.
My question is: does the TTL data from Saltlake need a BF as well?
Because on my nodes it is starting to approach 20TB of test data. If that needs a BF, then that’s a problem.
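A per-satellite sizing sketch for that example split, assuming ~6M pieces/TB and a 10% false-positive target (the same formula as earlier in the thread):

```go
package main

import (
	"fmt"
	"math"
)

// bloomMB estimates the filter size in MB for the data one satellite stores
// on the node, assuming ~6M pieces/TB and a 10% false-positive rate.
func bloomMB(tb float64) float64 {
	n := tb * 6e6
	return -n * math.Log(0.10) / (math.Ln2 * math.Ln2) / 8 / 1e6
}

func main() {
	// The hypothetical 10/10/1/3 TB split from the post above.
	for sat, tb := range map[string]float64{"US1": 10, "EU1": 10, "AP1": 1, "SL": 3} {
		fmt.Printf("%s: %.0f TB -> ~%.0f MB filter\n", sat, tb, bloomMB(tb))
	}
	// US1 and EU1 each need a ~36 MB filter for their 10 TB; no single
	// filter ever has to cover the node's full 24 TB.
}
```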

1 Like