Benchmark different ZFS pools

The way I understand it, @s-t-o-r-j-user stated a hypothesis, not a fact. Note how they say:

cant say much about its influence on storagenode operations

I assume at this point you believe everyone who states a hypothesis also has to test it?

That is where we differ. I think it is bad form to just bombard others with papers or a 1h Youtube link. Don’t get me wrong, I very much like sources, but in my opinion that does not prevent you from writing your own thoughts in one or two sentences.

Not gonna do that. You don’t have a say in that.

Yeah, at first. Then I said “My guess none (influence)” to which he replied with “you are probably very wrong”

I made a statement regarding my hypothesis after conducting preliminary tests with RDP protocols, which are a mix of TCP, UDP, and QUIC, along with several SSH streams. A quick look at my dashboard potentially implied that everything was becoming much cozier again. :- ) EDIT: Of course, Storj load was also present on the machines, along with a few other things.

I consider you a friend of mine, so I had to think for a moment. After that moment, and I hope you do not mind me asking: what’s the problem, @Toyoo?

Just that I’m rather allergic to confident technological statements that are not supported by measurements. It’s really easy enough to add a “My guess is…” or “I believe…” to a statement to make it clear it’s just an opinion, and not absolute truth.


That’s one way of doing it, out of the many correct ones. And if I am recalling this whole conversation correctly, one should not have any reservations about me in this regard. I am not very enthusiastic about providing completely unnecessary explanations. Anyway, this whole conversation has unfortunately gone off-topic.

If not @Toyoo then maybe @BrightSilence could save the situation. All in all we are all neighbors.

Technically we’re competitors trying to get as much Storj data as possible :stuck_out_tongue:

TBH, I am mostly in it for some IT fun, so kindly speak for yourself. :- )

Back to topic, here is how I plan on moving forward.

To speed up the transfer, I will stop the node. That should not get me disqualified.
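
For reference, stopping and restarting it is roughly this, assuming the usual Docker setup with a container named storagenode (adjust to your own setup):

    # stop the storagenode container gracefully before moving data
    docker stop -t 300 storagenode
    # ... do the copy ...
    # start it again once the data is back in place
    docker start storagenode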

I have a group of 4 drives, two of which are identical. I will use single drives for the pools.

pool1 is just a plain disk, ARC only. This is where I will initially copy the data to with rsync.
pool2 has a special vdev for metadata.
pool3 has a reboot-persistent L2ARC.
pool4 is ext4. Not really a pool :slight_smile:
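
Roughly how I intend to create them. This is only a sketch: the /dev/sdX names are placeholders, and /dev/sde stands for the SSD that is first used as the special vdev and later reused as L2ARC:

    # pool1: plain single-disk pool, ARC only
    zpool create pool1 /dev/sda

    # pool2: single disk plus the SSD as a special vdev for metadata
    zpool create pool2 /dev/sdb special /dev/sde

    # pool3: single disk plus the same SSD as L2ARC
    # (created later, after pool2 is destroyed and the SSD is free again)
    zpool create pool3 /dev/sdc cache /dev/sde

    # pool4: not ZFS at all, just plain ext4 on the fourth disk
    mkfs.ext4 /dev/sdd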

All ZFS pools are set up with recordsize=1M, sync=disabled, atime=off, and lz4 compression.
ext4 with defaults and atime off? Any suggestions?
The database is always on the node’s SSD.
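
The ZFS settings as commands, just as a sketch; poolN/storagenode is an example dataset name:

    # applied to each ZFS pool's dataset that holds the node data
    zfs create pool1/storagenode
    zfs set recordsize=1M pool1/storagenode
    zfs set sync=disabled pool1/storagenode
    zfs set atime=off pool1/storagenode
    zfs set compression=lz4 pool1/storagenode

    # pool4 (ext4): defaults plus noatime, e.g. in /etc/fstab:
    # /dev/sdd  /mnt/pool4  ext4  defaults,noatime  0 2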

Steps:

  • copy data to pool1 with rsync (see the sketch after this list)
  • reboot
  • run test twice on pool1.
  • copy data to pool2
  • reboot
  • run test on pool2
  • destroy pool2 to get the SSD back and create pool3 with said SSD.
  • make L2ARC reboot persistent (also in the sketch after this list)
  • copy data from pool2 to pool3
  • copy data from pool3 to pool4 to warm up L2ARC
  • reboot
  • run test on pool3 twice.
  • run test on pool4
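
A sketch of the two less obvious items above, the initial rsync and the persistent L2ARC. Paths are placeholders; persistent L2ARC needs OpenZFS 2.0+ and is controlled by the l2arc_rebuild_enabled module parameter (enabled by default on recent versions):

    # initial copy of the node data onto pool1
    rsync -aH --info=progress2 /mnt/old-node/storage/ /pool1/storagenode/

    # make sure the L2ARC is rebuilt after a reboot
    echo "options zfs l2arc_rebuild_enabled=1" >> /etc/modprobe.d/zfs.conf
    # can also be toggled at runtime without reloading the module
    echo 1 > /sys/module/zfs/parameters/l2arc_rebuild_enabled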

If this is a vdev, it’s a pool, but stacked: ext4 above ZFS :man_facepalming:

None of this is a vdev!

These will all be datasets. I don’t believe zvols to be useful for any kind of data except VMs.
The fixed volblocksize has too many downsides. No, these are all datasets. Except “pool4”, which is just an ext4 drive.
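
To make the distinction concrete: a dataset is a plain ZFS filesystem with a variable block size up to recordsize, while a zvol is a block device with a fixed volblocksize that cannot be changed afterwards. Rough examples, names are made up:

    # a dataset: what the node data goes on
    zfs create -o recordsize=1M pool2/storagenode

    # a zvol: fixed volblocksize, only really useful for VM disks
    zfs create -V 100G -o volblocksize=16K pool2/vmdisk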

Updated steps:

  • copy data to pool2 with replication (see the send/receive sketch after this list)
  • reboot
  • run test twice on pool2.
  • copy data to pool1
  • reboot
  • run test on pool1
  • destroy pool2 to get the SSD back and create pool3 with said SSD.
  • make L2ARC reboot persistent
  • copy data from pool1 to pool3
  • copy data from pool3 to pool4 to warm up L2ARC
  • reboot
  • run test on pool3 twice.
  • run test on pool4
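
The “replication” copies above, assuming they mean zfs send/receive of a snapshot; oldpool/storagenode is a placeholder for wherever the data currently lives (it has to be a ZFS dataset for this to work), and the last hop to pool4 is plain rsync since that disk is ext4:

    # initial copy onto pool2
    zfs snapshot oldpool/storagenode@copy1
    zfs send oldpool/storagenode@copy1 | zfs receive pool2/storagenode

    # pool2 -> pool1, and later pool1 -> pool3, work the same way
    zfs snapshot pool2/storagenode@copy2
    zfs send pool2/storagenode@copy2 | zfs receive pool1/storagenode

    # pool3 -> pool4 is ext4, so just rsync
    rsync -aH /pool3/storagenode/ /mnt/pool4/storagenode/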