Benchmark nodes

Hi, I have two nodes, each behind a different VPN (I have to use one because I'm on a CGNAT network). One of the nodes seems very slow during downloads. Speedtest looks fine, but I think a latency or DNS problem is slowing everything down.
My question is: can I benchmark my nodes? Can I simulate Storj gateway requests against my nodes?

Personally I might try to switch the VPNs around on the nodes…

That would indicate whether one VPN is performing worse than the other.
If that is not the case then it has to be that one node itself, which could be on an SMR HDD or something.

Latency is usually an indicator of a bad connection, or of a stressed CPU, HDD or the like…
VPNs use encryption, and not all CPUs can hardware-offload that work.
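If you suspect the CPU is the limit, a rough sanity check is to measure how fast it can chew through a cryptographic hash. This Python sketch uses SHA-256, which is not the cipher your VPN uses, so treat it only as a proxy for raw crypto throughput; a result far below your link speed hints that the CPU may also struggle with VPN encryption:

```python
import hashlib
import time

def hash_throughput(total_mb=64, chunk_kb=64):
    """Hash total_mb MiB of zeros with SHA-256 and return MiB/s.
    Rough proxy only: a VPN uses a different cipher, but a very slow
    result here suggests the CPU will struggle with encryption too."""
    chunk = b"\x00" * (chunk_kb * 1024)
    iterations = (total_mb * 1024) // chunk_kb
    h = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(iterations):
        h.update(chunk)
    elapsed = time.perf_counter() - start
    return total_mb / elapsed

if __name__ == "__main__":
    print(f"SHA-256 throughput: {hash_throughput():.0f} MiB/s")
```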

If you are on Linux I would recommend netdata; it is very good for finding issues.

It's an amazing tool and has helped me many times. It can be a bit resource-heavy on smaller/older systems, though, so you might want to turn it off again when you are done.

I'm sure there are similar good tools for Windows, but nothing comes to mind right now.

I don't think I can emulate Storj clients with netdata, only monitor my connection. I need to simulate client requests to understand whether I'm serving files with decent responsiveness or not.

You could try capturing some incoming packets with Wireshark, then replaying them with hping. Your node will obviously fail the request, as the crypto will not match, but at least you can benchmark TCP connection establishment / UDP routing and the first reply.
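As a simpler variant of that idea, you can at least measure TCP connection establishment time against the node's port from a remote machine. A small Python sketch (the host and port in the comment are placeholders):

```python
import socket
import statistics
import time

def tcp_connect_latency(host, port, samples=10):
    """Return the median time in ms to complete a TCP handshake with
    host:port. The node never sees a valid request this way, but it
    shows how quickly connections get established, e.g. through a VPN."""
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass  # handshake done; close immediately
        times.append((time.perf_counter() - start) * 1000)
    return statistics.median(times)

# Example (hypothetical address; 28967 is the default node port):
#   print(f"{tcp_connect_latency('my-node.example.com', 28967):.1f} ms")
```

Run it from a machine outside your own network, otherwise you only measure your LAN.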

I think it would be a very good feature from Storj if it were possible to benchmark nodes from the different satellites. Then node operators could think about what they can do better, or whether there is a bottleneck somewhere. This would be very healthy for the whole Storj system, as node operators would start to understand why their node performs well or badly.

Data is transferred directly to and from the customers, not the satellites. A measurement from the satellites' locations would be useless.

Yes, netdata is a monitoring tool, but it can usually tell you if the hardware is misbehaving or showing other odd behavior.

Alternatively you could run read, write and IO tests on the HDDs and see how they perform.
In most cases where one node performs worse than another, it's the HDDs that are the major factor, even though other things can impact ingress and downloads.
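For a quick and dirty sequential read/write test without installing fio, something like this Python sketch works (the temp-dir target is just an example; point the path at the drive the node actually uses):

```python
import os
import tempfile
import time

def disk_write_read(path, size_mb=64):
    """Sequentially write then read a size_mb MiB test file and return
    (write_mib_s, read_mib_s). Crude compared to fio; note the read may
    be served from the page cache, so treat it as an upper bound."""
    chunk = os.urandom(1024 * 1024)  # 1 MiB of random data
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure data actually hit the disk
    write_mib_s = size_mb / (time.perf_counter() - start)

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(1024 * 1024):
            pass
    read_mib_s = size_mb / (time.perf_counter() - start)

    os.remove(path)
    return write_mib_s, read_mib_s

if __name__ == "__main__":
    target = os.path.join(tempfile.gettempdir(), "storj-bench.bin")
    w, r = disk_write_read(target)
    print(f"write {w:.0f} MiB/s, read {r:.0f} MiB/s")
```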

Using Storj itself as a benchmark could in theory be done with the private version of the Storj network, but I don't think anyone has really tried that, and you would basically have to build a mini Storj network to do it… so it's just not really viable.

Besides, though it would be cool to have an official benchmark tool from Storj, so far we haven't really had much practical use for one, as most things can be benchmarked just fine with third-party software.

But maybe you are right; it's about time we got an official Storj recommendation of which third-party software to use for benchmarking node performance,
maybe with a bit of a how-to manual for the individual tests.

Then it could be possible to send data to specific nodes for test purposes, and read it back.

It would help operators improve their nodes. It's difficult to know whether you are serving your files fast enough across the whole path from client back to client.
It's time to improve our performance to compete with S3 :slight_smile:

With programming it could of course be possible to have temporary benchmark pieces, so a node could request a benchmark that other nodes then perform against it.

Of course this sort of feature would need some limitations, and probably protections against exploits.

I'm not sure such a feature would be high on Storj Labs' to-do list…
There is also the question of difficulty, as it might involve rather deep changes to the code and to the options the node software provides.

No, it could not. Again, data is transferred to and from the customers, not the satellites. If we ran our own uplinks for that purpose, you could only measure something for that one location. Such a benchmark would be useless. It would be like testing your internet speed with Speedtest: many ISPs optimize their routes to get the fastest result with Speedtest, and this would be the same. You would optimize your nodes for that one location, but get worse results with real customers from other locations.

A benchmark would be nice; it would probably show that there is no CPU or HDD bottleneck. Both CPU and HDD utilization are extremely low. I hate SMR drives, but to be fair, even those do not get saturated by Storj.

Much more likely it is what @Alexey is trying to explain: data comes from peers. Ask yourself, how does data get from person A to person B, assuming person A uses ISP A and person B uses ISP B?

Maybe ISP A and B have a direct connection. That could be a single 10Gbit link that connects ALL customers from ISP A to ISP B. Or they don’t even have direct peering and they are only connected by a central exchange like DE-CIX. Which brings you to the next question, what connection does ISP A have to DE-CIX? And again, remember you share that connection with ALL other customers.

All that stuff is called peering, and I would argue it is more important than raw speed/bandwidth.

Let's use a real-world example: downloading Cyberpunk 2077 at launch. A friend of mine has 10 Gbit fiber. Another friend has 1 Gbit from a different ISP. Guess who was faster? Yup, the 1 Gbit one. Why? Bad peering. Your ISP also has connections to other networks: CDNs, content providers and so on. So it could be that when you download a game from GOG, you share one single 1 Gbit connection from your ISP to the GOG servers with EVERY other customer of your ISP! To keep it short, we ignore cache servers and XGS-PON.
That is why in the end my 1 Gbit friend got between 800 and 900 Mbit/s download speed, while the poor 10 Gbit guy got 3 Mbit/s. No, this is not a typo! He really got 3 Mbit/s on a 10 Gbit fiber connection!
So while his connection is in theory 10 times faster, in real life it was 300 times slower!

How can you test or benchmark peering?
Well, that is pretty hard. You can look at PeeringDB, but that will not get you far. More realistic is to switch to a real ISP with good peering. How can you spot a good ISP? Not that easy, but for one, they offer real dual stack and not CGNAT. What definitely will not perform well is a VPN: it will either cost way more than you earn with Storj, or it will be overbooked and have bad peering.

What does speedtest.net by Ookla really benchmark?
If you go to that webpage and look at the test server, you can see that it is probably a server from your own ISP. So what you are testing is an iperf-style speed test between your device and a server in your "local ISP network". More interesting would be: how fast is the connection between your device and mine? That would be an interesting benchmark. I, for example, am in Europe and have a 1 Gbit fiber connection. My speed test to Wyyerd fiber in California is 25 Mbit/s down and 400 Mbit/s up. That gives me a good estimate of how fast a Wyyerd customer could upload and download data from me.
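For the curious, a do-it-yourself version of that device-to-device test is not hard: run a sink on one machine and push data at it from the other. A minimal Python sketch (the loopback demo at the bottom is only for illustration; in a real test the two halves run on different machines):

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # 64 KiB per send

def serve(port_holder, ready):
    """Accept one connection and discard everything it sends."""
    srv = socket.socket()
    srv.bind(("0.0.0.0", 0))  # in practice, pick a fixed reachable port
    srv.listen(1)
    port_holder.append(srv.getsockname()[1])
    ready.set()
    conn, _ = srv.accept()
    while conn.recv(CHUNK):
        pass
    conn.close()
    srv.close()

def send(host, port, total_mb=32):
    """Push total_mb MiB to host:port and return the achieved MiB/s."""
    chunk = b"\x00" * CHUNK
    with socket.create_connection((host, port)) as s:
        start = time.perf_counter()
        for _ in range(total_mb * 1024 * 1024 // CHUNK):
            s.sendall(chunk)
        elapsed = time.perf_counter() - start
    return total_mb / elapsed

if __name__ == "__main__":
    # Loopback demo; for a real test, run serve() on one machine and
    # send() from the other, pointing at that machine's address.
    port_holder, ready = [], threading.Event()
    threading.Thread(target=serve, args=(port_holder, ready), daemon=True).start()
    ready.wait()
    print(f"{send('127.0.0.1', port_holder[0]):.0f} MiB/s")
```

In practice iperf3 does the same thing better; this just shows there is no magic involved.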

To come back to @agente: why do you need a VPN for CGNAT? I assume you have a public IPv6? Does Storj fully support IPv6?

IPv6 works, but then you will serve only IPv6 clients, if I understand correctly.

My thought about performance testing is to start simple and then try to test peering conditions. We could just start by testing a file request from a specific location (maybe one of the gateways). This would help you understand whether you have a bottleneck somewhere in your configuration (VPN, SMR drive, a USB channel kept busy by other devices that slows everything down, etc.).
I think many node operators are not system administrators. We need simple tools for better Storj speed tuning. If speed matters…

The most important metric to test is peering, and you can't test peering because peering is different for every customer. All the other metrics are not important because they are not a bottleneck. Your idea of testing peering between your node and a gateway would not give you any insight, because you get data from multiple users, not from a single gateway.

If speed matters

That is the thing, it does not :grin:
It could matter if utilization were WAY higher.
Because the data comes from multiple (underutilized) nodes, there is already enough speed.

IPv6 works, but then you will serve only IPv6 clients, if I understand correctly

That is correct. Hopefully one day utilization will be so high that you don't need IPv4 users to fill your disks :slight_smile:

I think an idea was posted on the forum some time ago to build a benchmark into the storage node code. As in, each node would periodically test other nodes and share the results, either with a satellite or directly with the peer node. As there are now ~15k nodes all over the world, each node operator would get a lot of data.

It would be quite a lot of coding though, as well as adding some sort of social contract (how much do you trust other nodes to report true results?).

Up/dl?
And who will pay for this?

This could be implemented with a few lines of code, but why?

So we can waste bandwidth testing how fast my connection to other random nodes is, while those aren't even the peers that will download from or upload to me?

There is a wordplay in German for that:
Wer misst, misst Mist.
Translation: he who measures, measures rubbish.

New social contract.

Yeah. It would still be more useful than a synthetic benchmark from a single location.

There's a saying: you never have the right data, so you need to leverage the data you have (paraphrased).

And… who needs it? If you have one node with 2–5 TB, you don't need another benchmark; you can benchmark your network and disk with standard utilities found via Google. If you have 10+ nodes and 50+ TB, you don't need a benchmark either, because you can (and know how to) benchmark your hardware with standard utilities.

Network benchmark: torrents.
HDD benchmark: torrents with few seeders and many leechers, plus a little script to create, read, and delete thousands of small files.
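A sketch of such a small-files script in Python might look like this (the counts and sizes are arbitrary; run it on the drive that hosts the node data, where SMR drives and slow filesystems show up far more clearly than in sequential tests):

```python
import os
import shutil
import tempfile
import time

def small_file_bench(count=2000, size=4096):
    """Create, read back, then delete `count` files of `size` bytes and
    report the wall time of each phase in seconds."""
    root = tempfile.mkdtemp(prefix="storj-smallfile-")
    payload = os.urandom(size)
    timings = {}
    try:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(root, f"{i}.bin"), "wb") as f:
                f.write(payload)
        timings["create"] = time.perf_counter() - start

        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(root, f"{i}.bin"), "rb") as f:
                f.read()
        timings["read"] = time.perf_counter() - start

        start = time.perf_counter()
        for i in range(count):
            os.remove(os.path.join(root, f"{i}.bin"))
        timings["delete"] = time.perf_counter() - start
    finally:
        shutil.rmtree(root, ignore_errors=True)  # clean up on any error
    return timings

if __name__ == "__main__":
    for phase, secs in small_file_bench().items():
        print(f"{phase}: {secs:.2f} s")
```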

I don't know who needs it. I know I'd gladly trade off some of my bandwidth in exchange for other nodes giving me additional insight into how well my nodes function.