Upcoming storage node improvements including benchmark tool

I just found something interesting. (I was not sure whether to post it in this thread or in the one about test data, but it is about performance, so…)

I upgraded my node VM from Debian 10 to Debian 12. Apparently something changed in the newer version, because the VM started experiencing load spikes every 20 minutes, which caused the incoming traffic to almost stop.

graph_image-3
(note the log scale)

I tried a bunch of things, like reducing the queue or core count, but it did not help. The load would shoot up for a short time, with not much indicating what was causing it.
And then I found it (I think). Apparently Debian 12 does async writes differently. My guess is that it accumulates a bunch of dirty data and then tries to write it all at the same time, causing the load spike.
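
For anyone who wants to check whether the same thing might be happening on their node, here is a minimal Python sketch that reads the `Dirty` and `Writeback` counters from `/proc/meminfo` on Linux. The function names are my own, and this only shows how much unwritten data the kernel is holding; it does not prove what causes the spikes.

```python
import re


def parse_dirty_kb(meminfo_text):
    """Return the Dirty and Writeback values (in kB) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        # Match only the exact "Dirty:" and "Writeback:" fields (not "WritebackTmp:").
        match = re.match(r"^(Dirty|Writeback):\s+(\d+)\s+kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def read_dirty_kb(path="/proc/meminfo"):
    """Read the current counters from the running kernel (Linux only)."""
    with open(path) as f:
        return parse_dirty_kb(f.read())
```

Calling `read_dirty_kb()` every few seconds and logging the result alongside the load graph would show whether `Dirty` climbs steadily and then drops sharply around each spike, which would be consistent with my guess.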

I restarted my node with --filestore.force-sync=true added to the command line, and this fixed the problem.

graph_image-3

No more load spikes. I went back to 8 queues/8 cores at 13:00 and my node seems to work well, even getting slightly more traffic (I have no idea if this is the max or if my node is still limited).
I probably should change the pool into mirrors, but cannot do so at the moment.
