Updates on Test Data

They would be on separate VLANs, though. A switch should be able to handle that.

But yeah, the mods are going to have to split this into a new thread.

VLANs are L2, switches are L2. You can't change the ISP-provided router's MAC, so by cloning it your switch now sees the same MAC in its ARP table on two different interfaces (one on your uplink port, one on the downstream ISP-provided router). It doesn't know how to route things properly.

I get that, which is why we are discussing this: what has been working so far is having trouble keeping up, at least in some cases.

Be realistic: with 1 PB of data ($1500/month just in storage) there is literally nothing you can afford to switch from "broken" to "works perfectly"?
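(Quick sanity check on that figure, assuming the ~$1.50 per TB-month storage payout rate; the exact rate is my assumption here:)

```python
# Sanity check on the $1500/month figure above.
# Assumption: ~$1.50 per TB-month payout for stored data.
stored_tb = 1000   # 1 PB
rate = 1.50        # $/TB-month (assumed)
print(f"${stored_tb * rate:,.0f}/month in storage alone")  # -> $1,500/month
```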

Seems like the testing is over for the time being.
Making adjustments, or has "mystery contract" fallen by the wayside? :slight_smile:

The switch should see the same MAC on port 2/VLAN 10 and on port 3/VLAN 20. If a packet with that destination MAC comes in on VLAN 20 it should go to port 3, and if it comes in on VLAN 10 it should go to port 2.
I'm pretty sure switches should not be confused by this. Only if the duplicate MACs were on the same VLAN would there be a problem.

As for ARP - the switch would only need that for the management interface, which can be assigned on yet another VLAN to be in the internal network.
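To make that concrete, here's a toy sketch (purely illustrative; real switches do this in silicon) of independent VLAN learning, where the forwarding table is keyed by (VLAN, MAC), so the same MAC on two VLANs never collides:

```python
# Toy model of a switch doing independent VLAN learning: the
# forwarding table is keyed by (vlan, mac), so the same MAC learned
# on two different VLANs maps to two different ports with no conflict.
fdb = {}

def learn(vlan, mac, port):
    fdb[(vlan, mac)] = port

def forward(vlan, dst_mac):
    return fdb.get((vlan, dst_mac))  # None means flood the VLAN

learn(10, "aa:bb:cc:dd:ee:ff", 2)  # cloned MAC seen on port 2 / VLAN 10
learn(20, "aa:bb:cc:dd:ee:ff", 3)  # same MAC seen on port 3 / VLAN 20

assert forward(10, "aa:bb:cc:dd:ee:ff") == 2  # VLAN 10 -> port 2
assert forward(20, "aa:bb:cc:dd:ee:ff") == 3  # VLAN 20 -> port 3
```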

1 Like

If a packet comes in tagged for VLAN 1 for MAC abc on your upstream port, and needs to travel untagged to MAC abc on your downstream port, how are you going to move this packet? You can't. It's like looping an Ethernet cable from port 1 to port 2. Why would the switch pick to send this packet to your downstream port? The MAC is already on the upstream port.

I thought tapping out on the topic would do the trick since it was about my setup. Let's keep it on topic, people. We might need this thread for more testing.

3 Likes

In related news:

Wed Jun 05 14:43:22 CEST 2024 : NIC migrated from 10Gbit/s --> 25Gbit/s

Somebody is getting ready for the "new normal" :stuck_out_tongue_winking_eye:

The mods are going to split it anyway. We'll probably continue this in DMs.

Yes, but… if you followed the rule of 1 node per HDD, with basic ext4 and without caching, you'll realize that even if you max out the RAM on your server, something breaks or at least stops performing great (long GC and long FW runs piling up together).
Then you need to figure out whether you must change everything by adding caching support (and maybe changing filesystems), or just wait and see if this new pattern is going to be the real one.

That's the situation I'm currently in. So far the load on one of the node rigs was just fine. With the recent change in traffic pattern, incoming data takes precedence over running my lazy GCs and FWs. So I'm left with two choices: wait it out, or add more RAM (to speed up lazy). The latter isn't really an option, since that mobo doesn't support more sticks.

The only logical path forward for me is to budget in a new mobo+CPU+RAM upgrade. I was planning to do that anyway, so this only affects the "how fast should I do it" part. I'm still using what I have, but I've already hit the limit on that, so I need to upgrade. Is that against any of the guidelines? I don't think so; I don't expect Storj to spend any R&D time on making my rig cope with the traffic. If they tweak things a bit and it stays good, then everything is fine. If they don't and I still need to upgrade, everything is still fine. Either way I've outgrown that rig. And adding another rig moves me one step forward while freeing this rig up for a couple more nodes.

If filewalkers/GC run in the background… is it important that they be fast? Like, if their runtime doesn't affect your payouts, there's no reason to upgrade? Recent trash handling seems way better, so it shouldn't be causing nodes to appear artificially full (and rejecting uploads) anymore.

The thing is that lazy (= running in the background) is exactly that: lower IO priority. As long as something with higher priority needs the disks (i.e. incoming data), lazy can't run. You can only run it between tests, or when the satellite scales back the traffic. I can speed lazy up by adding RAM (so more metadata is cached) or plan the whole "mortgage the house and move to ZFS" scenario. To me, RAM is easier and more cost-effective.
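For a sense of scale, a rough back-of-envelope on the RAM side (both numbers are assumptions for illustration, not measurements):

```python
# Back-of-envelope: RAM needed to keep filesystem metadata for all
# pieces cached, so lazy mostly hits RAM instead of the disks.
pieces = 10_000_000       # piece files on the node (assumed)
bytes_per_entry = 1024    # inode + dentry + slab overhead (assumed)
gib = pieces * bytes_per_entry / 2**30
print(f"~{gib:.0f} GiB of page cache to keep metadata hot")  # ~10 GiB
```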

No, I will not consider LVM cache :rofl:

I get it… but is that important? Does it affect payouts? It sounds like a node would have to be so busy, 24x7, that it completely fills its space because it a) didn't get an accurate report of the space it's already using, so didn't know to reject uploads… or b) didn't keep up with deleting trash.

So… SNOs would want to improve FW/GC performance to avoid nodes falling over due to filled disks? That's the problem you're having? Storj is throwing data your way… money your way… faster than you can catch it?

I don't have that problem :slight_smile: - FW/GC can take days… my disks aren't full… I'm not paying anything to speed up background housecleaning.

Literally my problem: can't catch enough data because I can't keep up with the deletes.

Shouting "but they told us to use what we have!!!" isn't going to fix this.
Spending $$$ will.

Off topic (again): the new Zen 5 looks perfect on paper.

2 Likes

I understand your issue now: sorry to push. If I could start clean… maybe with a 4U 24-bay case… then yeah, stuff in 128GB (or more) of RAM for ZFS ARC, plus some mirrored SSDs for a special metadata device: FW/GC would always run in seconds… and the HDDs would only receive the lightest occasional kiss :kissing_heart:

But… for now I tend to the :potato: … and spend my nights dreaming of the day I get to add a 20TB drive just for Storj… :zzz:
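For anyone sizing that dream build: a rough sketch, using the commonly quoted ~0.3% metadata-to-data rule of thumb (not something I've measured; it varies with recordsize and file counts):

```python
# Rough sizing for a mirrored ZFS special (metadata) vdev on the
# 24-bay dream build above. 0.3% is a rule of thumb, not a law.
bays, tb_per_drive = 24, 20
pool_tb = bays * tb_per_drive            # 480 TB raw
special_gb = pool_tb * 1000 * 0.003      # ~0.3% of data as metadata
print(f"{pool_tb} TB pool -> ~{special_gb:.0f} GB special vdev")  # ~1440 GB
```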

1 Like

If you cap out your system RAM you need to move to the second step… and believe me… you'll cap out on a basic system! :slight_smile: How far can you scale with a simple config (dual CPU, 1.5TB RAM, basic ext4, basic 6G/12G JBOD, basic HBA controller)? 100? 150? 1000 nodes? Few SNOs today have the answer. The new pattern cuts the maximum number of nodes you can manage by a lot.
Again… I'm not talking about a couple of nodes.

No, I'll add a second rig at the 24-node (= 24 drives) mark. Why would I need to run 1000 nodes off a single host? So I can justify daisy-chaining whole racks of JBODs? I'd be capped at the controller level anyway.
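Quick back-of-envelope on that controller cap (per-drive throughput and HBA bandwidth are assumed round numbers):

```python
# When does a single HBA become the bottleneck?
# Assumptions: ~200 MB/s sequential per HDD, PCIe 3.0 x8 HBA (~7.9 GB/s).
mb_per_drive = 200
hba_mb = 7900
print(f"one HBA saturates around {hba_mb / mb_per_drive:.0f} drives")  # ~40
# 24 drives fit comfortably; whole racks of daisy-chained JBODs won't.
```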

1 Like

Cost efficiency… we are talking about kWs of hardware.

I think with ZFS you need about 1GB of RAM per TB of disk space. So you would have to put in a lot more RAM if you want to run that many disks.
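Applying that rule of thumb to the 24-drive example above (and it is only a rule of thumb; a plain ARC needs far less than, say, dedup would):

```python
# The oft-quoted "1 GB RAM per TB of disk" ZFS rule of thumb applied
# to a 24 x 20 TB host. Illustrative only; real ARC needs depend on
# workload and on whether features like dedup are enabled.
drives, tb_each = 24, 20
print(f"{drives * tb_each} TB raw -> ~{drives * tb_each} GB RAM by that rule")
```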