I don’t think you could even start a pool with a missing vdev to recover some data. But yes, your node would be disqualified, for two reasons:
1. The pool would not start.
2. Even if you managed to start it, you would likely end up with lots of damaged files rather than mostly intact files plus a few missing ones. ZFS stripes large files across all vdevs, so each of those files would be missing a piece. At worst, every file would have a piece missing.
Oh, and the Storj network would repair the data just fine, but your node would be DQed.
The data should be recovered by Storj, but all repaired pieces will go to other nodes, never yours. So, kind of “yes” and “no” for your setup. Your node(s) will be permanently disqualified.
I think there is some misunderstanding here: the recommendation is not to run more than 12 disks in one vdev.
A pool can have as many vdevs as necessary.
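For example, a minimal sketch (pool name and device paths are placeholders): a single pool built from two 6-disk raidz2 vdevs stays under the per-vdev width guideline while the pool itself keeps growing.

```sh
# Hypothetical layout: one pool, two raidz2 vdevs of 6 disks each
zpool create tank \
  raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
  raidz2 /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl

# Later, the same pool can be expanded by adding another vdev
zpool add tank raidz2 /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr
```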
I had a random thought about a ZFS property that might improve performance:
`redundant_metadata=some` (instead of the default `all`)
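For reference, a minimal sketch of setting it, assuming a hypothetical dataset named `tank/storj`:

```sh
# Check the current value (defaults to "all")
zfs get redundant_metadata tank/storj

# Keep only the most critical metadata redundant
zfs set redundant_metadata=some tank/storj

# Note: this only affects metadata written after the change;
# existing metadata keeps its extra copies until it is rewritten
```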
In theory this makes files more fragile, because an entire file could be lost if one bad block lands in its now non-redundant metadata. But in the context of Storj, losing one file may not even be noticeable.
It seems like it might speed up metadata-heavy operations. Maybe.
I have it enabled on my drives and they are working normally, but I can’t tell whether it’s actually any faster.
Yes, I’m pretty sure the RAID 0 x500 drives comment was sarcasm, Alexey.
As for redundant_metadata=some: I think the idea is that with this setting the metadata really isn’t redundant any more. I’ve used it for about 24 hours now and haven’t really noticed a huge difference in my life.
Doesn’t sound like a profound research project. But since you don’t mention a complete rebuild of the metadata, like a send/recv operation, I think such a test is fairly useless.
(dragging my response to this over here: as mods are spicy about off-topic comments today… )
So it looks like used-space-filewalker started at 17:42… and all four satellites were completed by 18:03. So 21 minutes for around 1TB, which is probably 3-4 million files: very impressive! And thank you for sharing your logs.
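If anyone wants to check their own timings, a rough sketch, assuming a Docker node named `storagenode` (the exact log wording varies between versions, so match loosely):

```sh
# Pull the used-space filewalker lines and look for start/finish markers
docker logs storagenode 2>&1 \
  | grep used-space-filewalker \
  | grep -Ei "start|finish|complet"
```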
question here:
I’ve got a Seagate Exos X18, which is pretty loud. Which type of additional vdev could reduce IO and make that drive quieter? When I look at the statistics there are more writes than reads, so a ZIL (SLOG device) could help in this case. Or am I wrong?
A ZFS Intent Log (ZIL) is only used for synchronous writes, and a couple of months ago Storj nodes were switched to async: so adding a separate log device won’t help. Eventually writes must make it to disk, and they tend to be small, so you may not be able to reduce overall IO much… but you can still use an SSD with ZFS to speed up housekeeping tasks like used-space-filewalker.
Quite a few here have used a ZFS “special” metadata device: the millions of filenames/sizes can then be queried quickly from SSD instead of hitting your HDD, which reduces most Storj internal tasks to seconds (instead of hours/days). I think 5GB of SSD per TB of HDD is a common sizing: so even small SSD partitions/devices make a difference. Search the forum for more ZFS posts. Good luck!
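A hedged sketch of adding one to an existing pool (pool/dataset names and device paths are placeholders):

```sh
# Add a mirrored special vdev for metadata
# (losing the special vdev loses the pool, so mirror it)
zpool add tank special mirror /dev/nvme0n1p1 /dev/nvme1n1p1

# Optional: also store very small data blocks on the SSD
zfs set special_small_blocks=4K tank/storj

# Only newly written metadata lands on the special vdev;
# existing metadata moves over only as data is rewritten
```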
Or use an SSD as an L2ARC cache and set it to metadata only. The benefit is much the same, but you can remove it at any time and no redundancy is needed.
Over time it becomes similar, yes. But L2ARC only fills itself from ARC evictions… which is purposely slow, so it can take a long time (whereas a special metadata vdev has everything, 100% of the time). And by default L2ARC doesn’t survive reboots… so it will take time to warm up again after each restart.
You can enable persistent L2ARC now… but that can really slow down boot times. And yes, you want to mirror the special metadata vdev for durability, while L2ARC is disposable. In all cases, if metadata is already in ARC it will be served from there first anyway. RAM always wins.
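For completeness, a sketch of the L2ARC variant under the same assumed pool/dataset names:

```sh
# Add a cache (L2ARC) device; it is disposable, so no redundancy needed
zpool add tank cache /dev/nvme0n1p2

# Cache only metadata in L2ARC for this dataset
zfs set secondarycache=metadata tank/storj

# Persistent L2ARC (OpenZFS 2.0+): rebuild the cache contents after reboot
echo 1 > /sys/module/zfs/parameters/l2arc_rebuild_enabled
```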