Updates on Test Data

They would be on separate VLANs, though. A switch should be able to handle that.

But yeah, the mods are going to have to split this into a new thread.

VLANs are L2, switches are L2. You can't change the ISP-provided router's MAC, so by cloning it your switch now sees the same MAC in its ARP table on two different interfaces (one on your uplink port, one on the downstream ISP-provided router). It doesn't know how to route things properly.

I get that, which is why we are discussing this: what has been working so far is having trouble keeping up, at least in some cases.

Be realistic: with 1 PB of data ($1500/month just in storage) there is literally nothing you can afford to switch from "broken" to "works perfectly"?
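(Quick sanity check on that figure, assuming the ~$1.50 per TB-month storage payout rate; the exact rate is my assumption here:)

```python
# Sanity check on the $1500/month figure above.
# Assumption: ~$1.50 per TB-month payout for stored data.
stored_tb = 1000   # 1 PB
rate = 1.50        # $/TB-month (assumed)
print(f"${stored_tb * rate:,.0f}/month in storage alone")  # -> $1,500/month
```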

Seems like the testing is over for the time being.
Making adjustments, or has "mystery contract" fallen by the wayside? :slight_smile:

The switch should see the same MAC on port 2/VLAN 10 and on port 3/VLAN 20. If a packet with that destination MAC comes in on VLAN 20 it should go to port 3, and if it comes in on VLAN 10 it should go to port 2.
I'm pretty sure switches should not be confused by this. Only if the duplicate MACs were on the same VLAN would there be a problem.

As for ARP - the switch would only need that for the management interface, which can be assigned on yet another VLAN to be in the internal network.
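To make that concrete, here's a toy sketch (purely illustrative; real switches do this in silicon) of independent VLAN learning, where the forwarding table is keyed by (VLAN, MAC), so the same MAC on two VLANs never collides:

```python
# Toy model of a switch doing independent VLAN learning: the
# forwarding table is keyed by (vlan, mac), so the same MAC learned
# on two different VLANs maps to two different ports with no conflict.
fdb = {}

def learn(vlan, mac, port):
    fdb[(vlan, mac)] = port

def forward(vlan, dst_mac):
    return fdb.get((vlan, dst_mac))  # None means flood the VLAN

learn(10, "aa:bb:cc:dd:ee:ff", 2)  # cloned MAC seen on port 2 / VLAN 10
learn(20, "aa:bb:cc:dd:ee:ff", 3)  # same MAC seen on port 3 / VLAN 20

assert forward(10, "aa:bb:cc:dd:ee:ff") == 2  # VLAN 10 -> port 2
assert forward(20, "aa:bb:cc:dd:ee:ff") == 3  # VLAN 20 -> port 3
```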

1 Like

If a packet comes in tagged for VLAN 1 for MAC abc on your upstream port, and needs to travel untagged to MAC abc on your downstream port, how are you going to move this packet? You can't. It's like looping an Ethernet cable from port 1 to port 2. Why would the switch pick to send this packet to your downstream port? The MAC is already on the upstream port.

I thought tapping out on the topic would do the trick since it was about my setup. Let's keep it on topic, people. We might need this thread for more testing.

3 Likes

In related news:

Wed Jun 05 14:43:22 CEST 2024 : NIC migrated from 10Gbit/s --> 25Gbit/s

Somebody is getting ready for the "new normal" :stuck_out_tongue_winking_eye:

The mods are going to split it anyway. We'll probably continue this in DMs.

Yes, but… if you followed the rule of 1 node per HDD, with basic ext4 and without caching, you'll realize that even if you max out the RAM on your server, something breaks or at least stops performing great (long GC and long FW runs piling up together).
Then you need to figure out whether you must change everything by adding caching support (and maybe changing filesystems), or just wait and see if this new pattern is going to be the real one.

That's the situation I'm currently in. So far the load on one of the node rigs was just fine. With the recent change in traffic pattern, incoming data takes precedence over running my lazy GCs and FWs. So I'm left with two choices: wait it out, or add more RAM (to speed up lazy). The latter isn't really an option, since that mobo doesn't support more sticks.

The only logical path forward for me is to budget in a new mobo+CPU+RAM upgrade. I was planning to do that anyway, so this only affects the "how fast should I do it" part. I'm still using what I have, but I've already hit the limit on that, so I need to upgrade. Is that against any of the guidelines? I don't think so; I don't expect Storj to spend any R&D time on making my rig cope with the traffic. If they tweak things a bit and it stays good, then everything is fine. If they don't and I still need to upgrade, everything is still fine. Either way I've outgrown that rig. And adding another rig moves me one step forward while freeing this rig up for a couple more nodes.

If filewalkers/GC run in the background… is it important that they be fast? Like, if their runtime doesn't affect your payouts, there's no reason to upgrade? Recent trash handling seems way better, so it shouldn't be causing nodes to appear artificially full (and rejecting uploads) anymore.

The thing is that lazy (= running in the background) is exactly that: lower IO priority. As long as something with higher priority needs the disks (i.e. incoming data), lazy can't run. You can only run it between tests, or when the satellite scales back the traffic. I can speed lazy up by adding RAM (so more metadata is cached) or plan the whole "mortgage the house and move to ZFS" scenario. To me, RAM is easier and more cost-effective.
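For a sense of scale, a rough back-of-envelope on the RAM side (both numbers are assumptions for illustration, not measurements):

```python
# Back-of-envelope: RAM needed to keep filesystem metadata for all
# pieces cached, so lazy mostly hits RAM instead of the disks.
pieces = 10_000_000       # piece files on the node (assumed)
bytes_per_entry = 1024    # inode + dentry + slab overhead (assumed)
gib = pieces * bytes_per_entry / 2**30
print(f"~{gib:.0f} GiB of page cache to keep metadata hot")  # ~10 GiB
```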

No, I will not consider LVM cache :rofl:

I get it… but is that important? Does it affect payouts? It sounds like a node would have to be so busy, 24x7, that it completely fills its space because it a) didn't get an accurate report of the space it's already using, so didn't know to reject uploads… or b) didn't keep up with deleting trash.

So… SNOs would want to improve FW/GC performance to avoid nodes falling over due to filled disks? That's the problem you're having? Storj is throwing data your way… money your way… faster than you can catch it?

I don't have that problem :slight_smile: - FW/GC can take days… my disks aren't full… I'm not paying anything to speed up background housecleaning.

Literally my problem: can't catch enough data because I can't keep up with the deletes.

Shouting "but they told us to use what we have!!!" isn't going to fix this.
Spending $$$ will.

Off topic (again): the new Zen 5 looks perfect on paper.

2 Likes

I understand your issue now: sorry to push. If I could start clean… maybe with a 4U 24-bay case… then yeah, stuff in 128GB (or more) of RAM for ZFS ARC, plus some mirrored SSDs for a special metadata device: FW/GC would always run in seconds… and the HDDs would only receive the lightest occasional kiss :kissing_heart:

But… for now I tend to the :potato: … and spend my nights dreaming of the day I get to add a 20TB drive just for Storj… :zzz:
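For anyone sizing that dream build: a rough sketch, using the commonly quoted ~0.3% metadata-to-data rule of thumb (not something I've measured; it varies with recordsize and file counts):

```python
# Rough sizing for a mirrored ZFS special (metadata) vdev on the
# 24-bay dream build above. 0.3% is a rule of thumb, not a law.
bays, tb_per_drive = 24, 20
pool_tb = bays * tb_per_drive            # 480 TB raw
special_gb = pool_tb * 1000 * 0.003      # ~0.3% of data as metadata
print(f"{pool_tb} TB pool -> ~{special_gb:.0f} GB special vdev")  # ~1440 GB
```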

1 Like

If you cap out your system RAM you need to move to the second step… and believe me… you'll cap out on a basic system! :slight_smile: How far can you scale with a simple config (dual CPU, 1.5TB RAM, basic ext4, basic 6G/12G JBOD, basic HBA controller)? 100? 150? 1000 nodes? Few SNOs today have the answer. The new pattern cuts the maximum number of nodes you can manage by a lot.
Again… I'm not talking about a couple of nodes.

No, I'll add a second rig at the 24-node (= 24 drives) mark. Why would I need to run 1000 nodes off a single host? So I can justify daisy-chaining whole racks of JBODs? I'd be capped at the controller level anyway.
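Quick back-of-envelope on that controller cap (per-drive throughput and HBA bandwidth are assumed round numbers):

```python
# When does a single HBA become the bottleneck?
# Assumptions: ~200 MB/s sequential per HDD, PCIe 3.0 x8 HBA (~7.9 GB/s).
mb_per_drive = 200
hba_mb = 7900
print(f"one HBA saturates around {hba_mb / mb_per_drive:.0f} drives")  # ~40
# 24 drives fit comfortably; whole racks of daisy-chained JBODs won't.
```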

1 Like

Cost efficiency… we are talking about kWs of hardware.

I think with ZFS you need about 1GB of RAM per TB of disk space. So you would have to put in a lot more RAM if you want to run that many disks.
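Applying that rule of thumb to the 24-drive example above (and it is only a rule of thumb; a plain ARC needs far less than, say, dedup would):

```python
# The oft-quoted "1 GB RAM per TB of disk" ZFS rule of thumb applied
# to a 24 x 20 TB host. Illustrative only; real ARC needs depend on
# workload and on whether features like dedup are enabled.
drives, tb_each = 24, 20
print(f"{drives * tb_each} TB raw -> ~{drives * tb_each} GB RAM by that rule")
```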