If the network is currently at 50% usage, I don’t quite understand the problem. Is that mystery customer going to use it all? Really?
Isn’t that what the current load is doing? It is way more precise than any number we could give you. And to make your estimation as accurate as possible, I can tell you that we have uploaded maybe 75% of our target. Hard to tell at the moment because of the trash folder, but still good enough for your calculations.
That certainly helps. But only for SNOs who had free space during the full testing period, which happen to be the nodes that likely don’t need to expand right away. Although the overwriting thing kind of throws a wrench into the numbers as well. There’s nothing wrong with mentioning that that is the best way to gauge it, but I still suggest adding estimates.
I forgot to mention that what would also help a lot is if GC were cleaning stuff up more reliably. I still have nodes with 25-30% of their data being uncollected garbage. I have much more capacity than is being usefully used right now. Should I count that towards the test data, since it’s caused by overwritten segments, or not?
We can’t. I can give you some total numbers. Let’s say our target is to upload 10 PB. Does that mean 10 PB × 1.875 / 20K nodes? That is about a TB per node, so why has my node grown way more than 1 TB? You see, this math doesn’t work. I can’t estimate how big your node will grow.
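(For reference, the naive version of that math would be 10 PB × 1.875 ≈ 18.75 PB of pieces on disk, and 18.75 PB / 20,000 nodes ≈ 0.94 TB per node on average. That is exactly where it breaks down: node selection is nowhere near even, so individual nodes can land far above or below that average.)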
The guess I’m going with is that I have about 2 more weeks of growth like the last 2 weeks… then all the TTL data will start to wrap… and I’ll level out. After that I’m back to natural slow growth.
Maybe that will fill a drive: we’ll see.
If you’re looking at the graph most of us check: it’s counting slightly different things, so we’re closer to 2/3rds full (assuming real customer data uses about 2.2x the free space when uploaded). Perhaps when capacity reservation is complete we’ll be near 3/4?
Please tell me it’s more than 10 petabytes.
Sure no problem. It is more than 10 petabytes.
I am wondering if you are aware that there might be other reasons for SNOs not adding capacity. I tried to make that clear in my post:
From what I see on GitHub, I would have to wait at least until my nodes are on version 1.106, and only then might I get a picture of what is currently really used and what’s not, with all the old deleted stuff finally cleaned out.
The unreliability of the numbers everywhere, but also of the storagenode software and the satellites, together with the slow rollout of fixes because they are “low priority”, is currently my main blocker to adding anything.
It is also what @BrightSilence has said:
There is a lot on the forum about issues with the databases, space discrepancies, problems with filewalkers, garbage not being deleted, and nodes restarting, and I see all of that on my nodes too. The way this all works with so many issues at the moment is beyond my comprehension.
So even if you made such a forum post today detailing a signed deal, I would not go for additional capacity or nodes.
I made my suggestion on adding more customers with large space requirements here:
Edit: Just to give you an idea of what I am talking about: if I look at the average used space over all nodes, it comes to only 40% of the space that my nodes claim they are using. And the trash over all nodes is reporting a size that is 1/3 of that reported average used space.
And this does not add up. Running du gives me completely different numbers than the nodes report. Some must have tons of still-uncollected garbage on them, while others have incorrectly updated used-space databases because filewalkers do not finish.
So basically, even if I wanted to, I could not tell how much space is used altogether, how much trash there is, how much garbage has not been collected, and so on.
So it is really impossible to make a decision to add even more terabytes in such an unclear situation.
Edit 2: The resume feature for the used-space filewalker is also one of the features I need in order to get the numbers correct.
Only online, not-full nodes are selected, also when using the choice of 2.
They are basically crappy USB sticks in an “SSD” case.
At least his USB stick is from Samsung…
I think there is a significant number of SNOs who are willing to upgrade as soon as their disks are filling up. But the speed of expansion depends… mostly, the subnet limitation will lead to the “one by one” approach which I already mentioned here.
So the first question from the SNO side is: do you just need one node with space continuously available at one IP, or do you need multiple nodes behind the same IP for distribution? If the latter, how many nodes make sense for you?
Because from the SNO perspective there is no difference in incoming traffic, only in the cost of disks.
Currently I see about 1.5 to 2.5 TB/day ingress per IP on my nodes; that will fill 18 TB disks (16 TB usable) in about 8 days, maybe more if some data gets deleted…
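(Roughly: 16 TB usable / ~2 TB per day ≈ 8 days, or closer to 10-11 days at 1.5 TB/day, before any deletes are taken into account.)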
For me, I can keep up the pace and add one more node per week (currently up to 104).
The second question is: Do you provide information about the duration of the signed contract?
If the contract was signed for, let’s say, 12 months, SNOs could run the math and decide whether it is worth it to spin up even more nodes.
Third and final question: how far in advance will we be informed before the customer data arrives?
The cover says Samsung, but my Syno identifies it as Silicon Motion Taiwan.
I don’t know who makes the chips and the controller when the digital ID says one thing and the cover says another, or whether it’s Samsung or just a rebranding.
Maybe you misread Samsung for Smasnug, and it’s just a knock-off too
Idea for trash optimisation:
- the trash is just a safety backup, right?
- why do we need to back up all the pieces?
- we just need the smallest number of pieces to recreate a segment or a file or whatever.
- so let’s store only the smallest number of pieces in trash, by cleverly choosing the nodes that store those pieces evenly, so as not to overload some and free up too much on others.
This should be part of the bloom filter generation process.
On 8 sticks bought in sealed retail packages from a reputable store, with all the serial numbers, logos, materials, etc.?
Just buy one and test it.
It likely has Samsung NAND chips (barely passing QA, bottom-of-the-barrel yield; instead of sending them to a landfill, they make cheap consumer thumb drives because nobody expects any reliability from them) and Silicon Motion controllers, because they are cheap. Again, cost is the reason nobody bothers with custom firmware to report “Samsung” for the final product… it’s garbage all around.
Generally, Samsung is great at manufacturing parts (NAND and DDR chips mostly), but that’s it. Every single end-user product made by Samsung is incoherent trash. These are no exception.
Yes… but some operators delete the trash manually or have disabled it in their config, so the remaining pieces might not be enough to recreate the missing ones.
Your node should store only one piece of any given segment, i.e. 1 out of the currently required amount. You cannot store “some” pieces in the trash and thereby protect only “some” segments; it doesn’t make sense.
You can certainly do a whole lot better than that. I was careful in my message to say subnets with free space, which wouldn’t be close to 20k. And yes, I realize that the choice-of-n node selection makes it more difficult, but you could calculate an average and mention that the fastest nodes would get n times more than the average. I’m betting if you do those things, your node will suddenly fall neatly within that range.
I’m happy to do my own calculation if, at the time, you can share the number of subnets with free space, the total amount of data expected to be uploaded, and the n you settled on. But it would be better to just provide the results as some guidance, because many will likely not know how to do that calculation or will simply not bother.
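Something like this minimal sketch is all I mean (the function name, the 1.875 expansion factor and the symmetric /n and ×n range are just assumptions for illustration, not Storj’s actual selection math):

```go
package main

import "fmt"

// Rough per-subnet estimate: spread the expanded data evenly over the
// subnets with free space, then widen the range by the choice-of-n factor,
// since the fastest nodes may win roughly n times the average (and the
// slowest roughly 1/n of it). Purely illustrative, not official math.
func estimate(totalTB, expansion float64, subnets int, n float64) (avg, slow, fast float64) {
	avg = totalTB * expansion / float64(subnets)
	return avg, avg / n, avg * n
}

func main() {
	// Hypothetical inputs: 10,000 TB target, 1.875 expansion factor,
	// 6,000 subnets with free space, choice-of-8 node selection.
	avg, slow, fast := estimate(10000, 1.875, 6000, 8)
	fmt.Printf("average %.1f TB, slow ~%.1f TB, fast ~%.1f TB per subnet\n", avg, slow, fast)
}
```

Plug in the real subnet count, target and n, and you could publish a range like that instead of everyone guessing.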
I’m not surprised. This testing pattern is significantly different from how the network used to behave, and this is what testing is for, right? So rather than airing my frustration and bashing the current situation, I just stick to calmly informing Storj that this does and will impact my behavior with regard to upgrading. That’s all they really need to know to understand that they should fix these things. I’m sure they will, but I’m also not surprised that issues popped up under such conditions.
This is not possible with how bloom filters work. They just know your node shouldn’t have a given piece, but they aren’t aware of which other nodes have pieces for the same segment. You can put a random file that was never even uploaded into the folder and the bloom filter will clean it up too. It’s not a list of files to be deleted; it’s a clever filter that matches all the files to keep.
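If it helps, here is a toy sketch of that idea (a simplified bloom filter for illustration only, not the actual storagenode implementation; the piece IDs and parameters are made up). The filter only sets bits for the pieces the node should keep, so garbage collection can answer “definitely not in the keep set” or “probably in it”, and it carries no information about which other nodes hold pieces of the same segment:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Toy bloom filter: k hash positions per piece ID in a fixed bit set.
type bloom struct {
	bits []bool
	k    int
}

// positions derives k bit positions for a piece ID by salting an FNV hash.
func (b *bloom) positions(pieceID string) []int {
	pos := make([]int, b.k)
	for i := 0; i < b.k; i++ {
		h := fnv.New64a()
		fmt.Fprintf(h, "%d:%s", i, pieceID)
		pos[i] = int(h.Sum64() % uint64(len(b.bits)))
	}
	return pos
}

// Add marks a piece as one the node should keep.
func (b *bloom) Add(pieceID string) {
	for _, p := range b.positions(pieceID) {
		b.bits[p] = true
	}
}

// Keep reports whether the piece matches the filter; anything that does not
// match (including files that were never uploaded at all) gets collected.
func (b *bloom) Keep(pieceID string) bool {
	for _, p := range b.positions(pieceID) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	f := &bloom{bits: make([]bool, 1024), k: 3}
	f.Add("piece-that-should-stay")
	fmt.Println(f.Keep("piece-that-should-stay")) // true
	fmt.Println(f.Keep("random-file-in-folder"))  // almost certainly false -> trash
}
```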
I thought so at the beginning, but it turns out any math I can come up with just gets slapped by reality. My growth rate is simply off the scale and I haven’t found any math that would explain it. Instead, measuring my growth rate for a day and extrapolating my size after 30 days seems to be very accurate.
Sure, try your luck: 6K at the moment, 20, 6-8. So just by the math this gives us a range from 0.8 TB for slow nodes up to 50 TB for fast nodes. And now? If you were in my situation, what kind of numbers (= promises) would you put out there? 0.8 TB? That is a bit low, isn’t it? But 50 TB would also be unfair. Please let me know if you can come up with numbers that work better than mine.
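(To spell that out, reading those numbers as roughly 6,000 subnets with free space, a 20 PB target and an n between 6 and 8, and applying the same 1.875 expansion as in the earlier example: 20 PB × 1.875 / 6,000 ≈ 6.25 TB per subnet on average; divide or multiply by n ≈ 8 and you land at roughly 0.8 TB for the slowest and about 50 TB for the fastest nodes, which is where that range comes from.)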