Today will start a bit slower. Before we can go on with full benchmark tests, we need to deploy the power of 2 node selection to all production satellites first. Any satellite that keeps running the old node selection will get upload errors the moment SLC pushes the network to the limit. We need a few more code changes, so this will take at least a few hours, if not days.
In the meantime we will slow down on the benchmark tests so we don't impact the other satellites. The first test today will be a comparison of upload duration between the old and new node selection. That is a different test focus than throughput. I am sure you will notice the drop in traffic. Don't panic.
Satellites are all running the new choice of 2 node selection. (I used the wrong term "power of 2" earlier. Let's call it choice of 2 from now on.)
Time to find out the maximum throughput we can reach now. So this time there will be a higher load again.
I've added some logging to my nodes and now I see that pieces uploaded with TTL=30 days (I presume: test traffic) have a fixed order size of 2319360 bytes, even if the piece itself is only 2048 bytes. This is worrying, as it would make traffic accounting on 1.104.5 nodes way off. This is exactly the part of my patch that I wanted to investigate before making a pull request.
At least the patch was reverted later, so up-to-date nodes will again have the right numbers.
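To illustrate why a fixed order size is worrying for accounting, here is a minimal sketch (the struct and field names are made up, not the actual storagenode types): if bandwidth were accounted from the order amount instead of the bytes actually written, a 2048-byte piece carried by a 2319360-byte order would be overcounted by a factor of more than a thousand.

```go
package main

import "fmt"

// Hypothetical illustration of the accounting concern; these are not
// the real storagenode types or field names.
type upload struct {
	orderAmount int64 // size allocated in the order limit
	pieceSize   int64 // bytes actually written to disk
}

func main() {
	u := upload{orderAmount: 2319360, pieceSize: 2048}

	fmt.Println("accounted from order amount:", u.orderAmount)
	fmt.Println("accounted from bytes written:", u.pieceSize)
	fmt.Printf("overcount factor: %.0fx\n",
		float64(u.orderAmount)/float64(u.pieceSize))
}
```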
Just wanted to reply to this - this is only kind of true.
It is true that the Satellite is now going to keep track of recent success rates in a system-wide data structure (a speed profile, I guess), but we actually have many of these speed profiles to choose from. We are going to be choosing which speed profile to use based on where the request is coming from.
Right now, this functionality isn't enabled, so yes, your description is true /this week/.
Our next step is to have every region's S3 gateway (where most of our traffic comes from anyway) get its own speed profile (once I can figure out why https://review.dev.storj.io/c/storj/edge/+/13309 isn't passing the build). Once every gateway region is handled, we'll explore what we can do for the remaining native integrations.
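For anyone trying to picture what a per-origin speed profile could look like, here is a minimal sketch in Go (the names and the decay factor are made up for illustration; this is not the satellite's actual success tracker): one exponentially decayed success rate per request origin, with a global fallback profile when the origin is unknown.

```go
package main

import (
	"fmt"
	"sync"
)

// speedProfile tracks a decayed success rate for one traffic origin
// (for example, a regional S3 gateway). All names here are illustrative.
type speedProfile struct {
	mu        sync.Mutex
	successes float64
	attempts  float64
}

func (p *speedProfile) Record(ok bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	const decay = 0.99 // assumed decay factor so recent uploads dominate
	p.successes *= decay
	p.attempts *= decay
	p.attempts++
	if ok {
		p.successes++
	}
}

func (p *speedProfile) Rate() float64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.attempts == 0 {
		return 1 // no data yet: don't penalize the node
	}
	return p.successes / p.attempts
}

// tracker keeps one profile per request origin plus a global fallback.
type tracker struct {
	mu       sync.Mutex
	profiles map[string]*speedProfile
	global   speedProfile
}

func (t *tracker) profileFor(origin string) *speedProfile {
	t.mu.Lock()
	defer t.mu.Unlock()
	if p, ok := t.profiles[origin]; ok {
		return p
	}
	return &t.global
}

func main() {
	t := &tracker{profiles: map[string]*speedProfile{"gateway-eu1": {}}}
	t.profileFor("gateway-eu1").Record(true)
	t.profileFor("gateway-eu1").Record(false)
	t.profileFor("unknown-origin").Record(true) // falls back to the global profile
	fmt.Printf("gateway-eu1 rate: %.2f\n", t.profileFor("gateway-eu1").Rate())
	fmt.Printf("global rate:      %.2f\n", t.profileFor("unknown-origin").Rate())
}
```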
So I'm watching this and the storagenode binary, during these tests, is running at a CPU load of over 700%, with IOwait% being pretty much insignificant. This is on an x86 CPU with plenty of RAM, an SSD cache and logs on a ramdisk.
It looks like the process is quite CPU heavy; I assume the smaller the segments get, the heavier it is.
Are there any plans to optimise the storagenode code further, maybe building binaries for different, more advanced CPU feature levels, if such a thing would help?
With such high CPU load and PPS, even the ping to the local gateway is quite erratic and the node is losing a significant number of races.
I can probably throw more cores at it, but that probably isn't an option for someone running this on a low-powered device or for someone running this at scale.
I wonder how Synos and similar devices are coping with this load.
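On the CPU flags question above: the storagenode is a Go binary, so one knob that already exists is the GOAMD64 microarchitecture level (for example, a GOAMD64=v3 build may assume AVX2). Whether that would actually help here is unknown; purely as a sketch, assuming you build from source, you could check what your CPU offers beyond baseline x86-64 with golang.org/x/sys/cpu:

```go
package main

import (
	"fmt"
	"runtime"

	"golang.org/x/sys/cpu"
)

func main() {
	// Report CPU features relevant to the GOAMD64 levels; this is only a
	// diagnostic sketch, not anything the storagenode does today.
	fmt.Println("GOARCH: ", runtime.GOARCH)
	fmt.Println("SSE4.2: ", cpu.X86.HasSSE42)
	fmt.Println("AVX2:   ", cpu.X86.HasAVX2)
	fmt.Println("AVX512F:", cpu.X86.HasAVX512F)
}
```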
We can control the piece size by changing the RS numbers. The last test, which just ended, should have used twice the piece size and half the number of connections. According to our results, that increased performance. This isn't a solution; it is just an observation for now.
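To make that trade-off concrete, here is a rough sketch with made-up RS numbers (these are not the production values): the piece size is roughly the segment size divided by the number of required shares k, while the number of upload connections tracks the total share count n, so halving both k and n doubles the piece size and halves the connections for the same user data.

```go
package main

import "fmt"

// pieceStats returns the approximate piece size and the number of upload
// connections for a segment, given illustrative RS parameters k (required
// shares) and n (total shares). These are not the production RS numbers.
func pieceStats(segmentSize int64, k, n int) (pieceSize int64, connections int) {
	return segmentSize / int64(k), n
}

func main() {
	const segment = 64 << 20 // assume a 64 MiB segment for the example

	p1, c1 := pieceStats(segment, 32, 80) // hypothetical baseline
	p2, c2 := pieceStats(segment, 16, 40) // half the shares
	fmt.Printf("baseline: piece %d bytes, %d connections\n", p1, c1)
	fmt.Printf("adjusted: piece %d bytes, %d connections\n", p2, c2)
	// The adjusted run uploads pieces twice as large over half as many
	// connections for the same amount of user data.
}
```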
I can confirm. Windows GUI, Win 10 Pro.
2 cores per node, and an hour ago 80-100% CPU usage.
Those are 2 cores of an 8-core/16-thread CPU (AMD Zen 2, AM4 platform) delegated per VM instance. If someone is going to say "oh, it's the VM's fault", I'd say: imagine what would happen if this wasn't in a VM.
A potential CPU usage leak would eat all the CPU of the whole workstation.
With a VM, at least I can limit the cores per node like that (1 node, 1 HDD, 2 CPU cores).
Edit:
Normally one VM Storj instance uses 20-40% CPU load, even with quite high traffic like now (ingress is currently 30-45% of the network link, which is high; in past months normal was more like 3-10%).
That is with 2 cores per VM.
The high CPU % occurs only when the network is pushed to 100% of its capacity, like yesterday, so it is not something that happens often, rather rare, but still.
That's interesting. Mine had a peak of bandwidth use but CPU and IOWait didn't really go very high.
I wonder if my nodes were selected less often with the new choice of 2 selection criteria…
In my setup I do not see big CPU usage on the nodes themselves, but the antivirus eats significantly more than usual, and RAM consumption is higher than the usual 300-500 MB. With this many connections that is OK for me, as the node buffer in my setup is 4 MB.
There are bare metal nodes too, as far as I know. My observation is the same: the Windows service node doesn't consume much CPU (unfortunately on Windows there is no easy way to get CPU usage as a percentage from the CLI), but Task Manager shows ~0.12% (8 CPU cores),
while the docker nodes (they run in a VM because it's Docker Desktop for Windows) show:
```
CONTAINER ID   NAME           CPU %    MEM USAGE / LIMIT      MEM %   NET I/O           BLOCK I/O   PIDS
29ce6d785933   wireguard      0.17%    24.16MiB / 24.81GiB    0.10%   1.98kB / 183B     0B / 0B     21
3d20fef76e67   storagenode5   6.69%    75.09MiB / 24.81GiB    0.30%   2.63MB / 78.5MB   0B / 0B     54
9fb28e5cf48a   storagenode2   12.23%   668.8MiB / 24.81GiB    2.63%   10.3GB / 723MB    0B / 0B     229
```
And Docker itself shows about 30% CPU usage (8 CPU cores).
Test results are a step closer to our target, and this time we can keep the test running for hours without any errors on the other satellites. That is good. Our target is getting more and more within reach.
We are working on a different success tracker. Deploying on a Friday is a bad idea, so it will have to wait for Monday. There is a good description in the PR of what the differences are: https://review.dev.storj.io/c/storj/storj/+/13308
To me it looks like we might have a problem with the number of connections now. If I run a single node, I have an almost 100% success rate. If I add a second node on the same IP, they split the load and still have almost 100% success rates. I can continue like that up to 4 nodes. If I add more nodes, they start to get long tail canceled. My reading of this is that TCP fast open or connection pooling gives me a 100% success rate for a single node, but the more nodes I add, the higher the chance that a new connection needs to be established, and that makes me lose the race. It isn't much, so I could just ignore it, but it is visible on my Grafana dashboard.
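For context on the TCP fast open half of that theory, here is a minimal sketch of enabling it on a Go listener (Linux only, using golang.org/x/sys/unix; the port and queue length are just example values, and this is not the storagenode's actual listener code). The handshake round trip is only saved once the client already holds a fast-open cookie for that node, which is more likely when the same endpoint keeps reconnecting to the same small set of nodes.

```go
package main

import (
	"context"
	"log"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

func main() {
	// Enable TCP Fast Open on the listening socket (Linux only).
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				// Allow up to 256 pending fast-open connections;
				// the queue length is arbitrary for this sketch.
				sockErr = unix.SetsockoptInt(int(fd),
					unix.IPPROTO_TCP, unix.TCP_FASTOPEN, 256)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}

	ln, err := lc.Listen(context.Background(), "tcp", ":28967")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	log.Println("listening with TCP_FASTOPEN on", ln.Addr())
}
```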