Upcoming storage node improvements including benchmark tool

We already have a script any SNO can run, whenever they want, that shows if they’re winning most upload/download races… or have areas to improve. And it’s based on the success of client requests: the actions that directly influence payouts.
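For anyone curious what that kind of script boils down to: it parses the node's logs and counts won vs. lost transfer races. A minimal sketch of the idea, with sample log lines standing in for the real `docker logs storagenode 2>&1` output (the actual script handles far more message variants):

```shell
#!/bin/sh
# Sample lines standing in for real storagenode log output.
# The "uploaded" / "upload canceled" / "upload failed" wording is how the
# node reports race outcomes; the exact fields here are simplified.
logs='2024-05-01 INFO piecestore uploaded
2024-05-01 INFO piecestore upload canceled
2024-05-01 INFO piecestore uploaded
2024-05-01 INFO piecestore uploaded'

# Count won races (successful uploads) vs. lost ones.
ok=$(printf '%s\n' "$logs" | grep -c ' uploaded$')
lost=$(printf '%s\n' "$logs" | grep -cE ' upload (canceled|failed)$')
total=$((ok + lost))

echo "upload success rate: $ok/$total ($((100 * ok / total))%)"
```

Running it against the sample lines prints `upload success rate: 3/4 (75%)`; pointed at real logs, the same counting gives the win/loss picture the script reports.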

I can understand Storj needing internal rankings though: that could help their dev and sales teams.

Yeah, but no one wants to run it. I tried, got some PowerShell errors, and gave up. I remember it worked back in the day, but then it stopped, and I'm currently unable to run any test. So I'd welcome with open arms any statistics Storj can gather for me.

I have Go, I have gcc, but what do I enter into cmd to run the test?
Additionally, for the sqlite3 step:

We don’t have access to your node and are unable to run it for you. Running a storage node is easy; optimizing it requires additional knowledge, and there is no shortcut. We can help with setting up Grafana or the new benchmark.

Installed 1.104.0-rc and skipped lazy used space because I saw the comment above about trash being fixed later today.

Installed 1.104.1 and enabled lazy used. All I can say is it flies compared to before. ~1h for ~5TB (dbs still on disk). Well done.

Without the fsync patch? Maybe less than 5 MiB…

You mean the lazy file walker is faster now? How long did it take before? This is just a side effect: we didn’t touch the lazy file walker itself. We made uploads cheaper so that the lazy file walker has more IOPS available and can run a little faster.

Yeah, it’s a lot faster since it doesn’t have to pause all the time waiting for I/O. I don’t remember exact figures, but it was more than 3 hours on that node for sure (the node runs on an array that is also used for other things).

Awesome. That’s a nice side effect.

I’ve also manually upgraded one of my nodes on XFS to v1.104.1, and you can clearly see the difference (the upgrade happened at around 00:43):


And this is with both garbage collection and trash emptying taking place, and garbage collection was already running before the upgrade. The read operations are getting a big boost. The intermittency in the reads seems related to when data is flushed from memory, though I'm not 100% sure.

Unfortunately, I think a side effect of the bandwidth DB change is that the existing Prometheus exporter no longer gives nice bandwidth charts in Grafana.

This graph explains the improvement better than any test could.

PS: How long does it usually take for v1.104 to go live?

I don’t see that on my Grafana or storage node dashboard. To me it looks like it should continue to work just fine.

For reference I am using this exporter: GitHub - anclrii/Storj-Exporter: Prometheus exporter for monitoring Storj storage nodes

My graphs look like this:

It’s only picking up the periodic flushes. Are you using the same exporter?

No. I don’t like the idea of running a third-party tool that basically gets full access to my storage node. Instead I use the built-in metrics endpoint, which works without any additional log-scraping tool.

Ah gotcha. I will have to check that out.

In the meantime I’ll see if I can patch this.

Any hints on how to graph JSON data in Grafana? The exporter mentioned here reads the storagenode API and exposes the data for Prometheus to scrape, but what would be the best way to do the same directly with the storagenode API?

Not the storagenode API. Just the metrics endpoint with Prometheus.

Thank you, I didn’t know about that.
For anyone else wondering, it is at /metrics on the debug.addr port.
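If it helps anyone wiring this up: the debug server binds a random port unless you pin it, so a Prometheus scrape job against it might look like the following sketch. The port 5999 and the job name are assumptions, not defaults:

```yaml
# prometheus.yml (fragment) — scrape the storagenode debug endpoint.
# Port 5999 is an assumption: pin it with `debug.addr: ":5999"` in the
# node's config.yaml (or --debug.addr=:5999) so the target stays stable.
scrape_configs:
  - job_name: storagenode   # hypothetical name
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:5999"]
```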

try this suggestion: