It is tempting. I will keep that in mind as a backup plan. But first I will try this out: storagenode/blobstore: blobstore with caching file stat information (… · storj/storj@2fceb6c · GitHub
In which version will it start working?
Found it, looks like version 1.108.
But what does "hot cache" mean? Will it live in RAM, or is it possible to put it on NVMe?
And the winner is the new blobstore cache. On a small node it gives me the same gain, but it isn't limited by RAM.
In about 2 weeks I will have a 5TB node migrated to ext4. Let's see how that performs in comparison.
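For reference, as far as I understand it the new cache keeps the file-stat data on disk rather than in RAM, which is why it doesn't hit the same ceiling. If I remember the option name correctly (please verify against the 1.108 changelog before copying), enabling it looks roughly like this:

# config.yaml snippet -- option name recalled from memory, not verified
pieces.file-stat-cache: badger

For docker setups the equivalent run flag would be --pieces.file-stat-cache=badger, again assuming I have the name right.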
So you updated to 1.108? It runs faster because of the cache?
5 Gb NIC on a Raspberry Pi: Full 5 Gigabit Ethernet on Raspberry Pi 5 with iocrest Realtek RTL8126 adapter - Jiri Brejcha
I'm sorry, but what for? Your bottleneck is the USB-to-HDD middleware…
Depends on what you mean by that, but it needs to be assembled first just like L2ARC.
I reverted to the mirrored special device. And it's working marvellously, to be honest:
root@VM-HOST:~# lsblk -o PATH,SIZE,LABEL,PARTLABEL,UUID,VENDOR,FSTYPE,rota
PATH SIZE LABEL PARTLABEL UUID VENDOR FSTYPE ROTA
/dev/sda 931,5G ATA 0
/dev/sda1 111,8G VM-HOST VM-HOST-1 9547310d-8093-4e66-accc-f9a7d608bee4 btrfs 0
/dev/sda15 15G storjdata4 STORJ4-META 12932962326981665985 zfs_member 0
/dev/sda16 5G storjdata22 STORJ22-META 17156441812472959081 zfs_member 0
/dev/sda17 15G storjdata10 STORJ10-META 9667372469621822347 zfs_member 0
/dev/sda18 3G storjdata11 STORJ11-META 3131107259782492802 zfs_member 0
/dev/sda19 3G storjdata16 STORJ16-META 10625861142284363320 zfs_member 0
/dev/sda20 15G storjdata18 STORJ18-META 6868956751607392789 zfs_member 0
/dev/sda21 8G storjdata6 STORJ6-META 5312916681478188263 zfs_member 0
/dev/sda22 8G storjdata9 STORJ9-META 5253881928466273503 zfs_member 0
# cut some irrelevant stuff
/dev/sdf 1,4T SAMSUNG 1
/dev/sdf1 1,4T storjdata6 zfs-1391287a2e5fd9d2 5312916681478188263 zfs_member 1
/dev/sdf9 8M 1
/dev/sdg 1,4T Hitachi 1
/dev/sdg1 1,4T storjdata9 zfs-ace96bd612c38442 5253881928466273503 zfs_member 1
/dev/sdg9 8M 1
# irrelevant
/dev/sdi 2,7T WD 1
/dev/sdi1 2,7T storjdata4 zfs-07f56700262ef24b 12932962326981665985 zfs_member 1
/dev/sdi9 8M 1
/dev/sdj 2,7T WD 1
/dev/sdj1 2,7T storjdata18 zfs-13156e834a813ec9 6868956751607392789 zfs_member 1
/dev/sdj9 8M 1
/dev/sdk 931,5G Samsung 1
/dev/sdk1 931,5G storjdata22 zfs-7627b31b68853fd3 17156441812472959081 zfs_member 1
/dev/sdk9 8M 1
/dev/sdl 477,5G Mass 1
/dev/sdl1 477,5G storjdata11 zfs-46912d350c39ebb9 3131107259782492802 zfs_member 1
/dev/sdl9 8M 1
/dev/sdm 2,7T BUFFALO 1
/dev/sdm1 2,7T storjdata10 zfs-0e9b458b17d1abae 9667372469621822347 zfs_member 1
/dev/sdm9 64M 1
/dev/sdn 476,9G Mass 1
/dev/sdn1 476,9G storjdata16 zfs-685857c77212e77b 10625861142284363320 zfs_member 1
/dev/sdn9 8M 1
/dev/sdo 476,9G Realtek 0
/dev/sdo1 476,9G STORJ-DATA STORJ17-DATA 57ce79f6-203e-442c-bfc8-22a9e9e75c1a ext4 0
/dev/zram0 44,3G 0
/dev/nvme0n1 1,8T 0
/dev/nvme0n1p1 112,2G VM-HOST VM-HOST 9547310d-8093-4e66-accc-f9a7d608bee4 btrfs 0
/dev/nvme0n1p2 512M EFI 1066-2F4F vfat 0
/dev/nvme0n1p3 1,3T STORJ-DATA STORJ23-DATA 72a134ac-e240-4477-8e0d-f5d2acb36ccf xfs 0
/dev/nvme0n1p15 15G storjdata4 STORJ4-METAD 12932962326981665985 zfs_member 0
/dev/nvme0n1p16 5G storjdata22 STORJ22-METAD 17156441812472959081 zfs_member 0
/dev/nvme0n1p17 15G storjdata10 STORJ10-METAD 9667372469621822347 zfs_member 0
/dev/nvme0n1p18 3G storjdata11 STORJ11-METAD 3131107259782492802 zfs_member 0
/dev/nvme0n1p19 3G storjdata16 STORJ16-METAD 10625861142284363320 zfs_member 0
/dev/nvme0n1p20 15G storjdata18 STORJ18-METAD 6868956751607392789 zfs_member 0
/dev/nvme0n1p21 8G storjdata6 STORJ6-METAD 5312916681478188263 zfs_member 0
/dev/nvme0n1p22 8G storjdata9 STORJ9-METAD 5253881928466273503 zfs_member 0
root@VM-HOST:~# zpool list -v
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
storjdata10 2.73T 2.35T 393G - - 6% 85% 1.00x ONLINE -
zfs-0e9b458b17d1abae 2.73T 2.34T 385G - - 6% 86.2% - ONLINE
special - - - - - - - - -
mirror-1 14.5G 6.77G 7.73G - - 62% 46.7% - ONLINE
STORJ10-METAD 15G - - - - - - - ONLINE
STORJ10-META 15G - - - - - - - ONLINE
storjdata11 479G 403G 76.2G - - 41% 84% 1.00x ONLINE -
zfs-46912d350c39ebb9 477G 401G 74.7G - - 41% 84.3% - ONLINE
special - - - - - - - - -
mirror-1 2.75G 1.25G 1.50G - - 75% 45.3% - ONLINE
STORJ11-METAD 3G - - - - - - - ONLINE
STORJ11-META 3G - - - - - - - ONLINE
storjdata16 479G 401G 77.6G - - 26% 83% 1.00x ONLINE -
zfs-685857c77212e77b 477G 400G 76.0G - - 26% 84.0% - ONLINE
special - - - - - - - - -
mirror-1 2.75G 1.17G 1.58G - - 63% 42.4% - ONLINE
STORJ16-METAD 3G - - - - - - - ONLINE
STORJ16-META 3G - - - - - - - ONLINE
storjdata18 2.73T 2.35T 397G - - 7% 85% 1.00x ONLINE -
zfs-13156e834a813ec9 2.73T 2.34T 387G - - 7% 86.1% - ONLINE
special - - - - - - - - -
mirror-1 14.5G 5.16G 9.34G - - 69% 35.6% - ONLINE
STORJ18-METAD 15G - - - - - - - ONLINE
STORJ18-META 15G - - - - - - - ONLINE
storjdata22 932G 805G 127G - - 5% 86% 1.00x ONLINE -
zfs-7627b31b68853fd3 932G 804G 124G - - 5% 86.6% - ONLINE
special - - - - - - - - -
mirror-1 4.50G 1.60G 2.90G - - 70% 35.5% - ONLINE
STORJ22-METAD 5G - - - - - - - ONLINE
STORJ22-META 5G - - - - - - - ONLINE
storjdata4 2.73T 2.35T 394G - - 3% 85% 1.00x ONLINE -
zfs-07f56700262ef24b 2.73T 2.34T 386G - - 3% 86.2% - ONLINE
special - - - - - - - - -
mirror-1 14.5G 6.34G 8.16G - - 64% 43.7% - ONLINE
STORJ4-METAD 15G - - - - - - - ONLINE
STORJ4-META 15G - - - - - - - ONLINE
storjdata6 1.37T 1.19T 179G - - 3% 87% 1.00x ONLINE -
zfs-1391287a2e5fd9d2 1.36T 1.19T 175G - - 3% 87.4% - ONLINE
special - - - - - - - - -
mirror-1 7.50G 3.56G 3.94G - - 64% 47.4% - ONLINE
STORJ6-METAD 8G - - - - - - - ONLINE
STORJ6-META 8G - - - - - - - ONLINE
storjdata9 1.37T 1.17T 206G - - 29% 85% 1.00x ONLINE -
zfs-ace96bd612c38442 1.36T 1.16T 203G - - 29% 85.4% - ONLINE
special - - - - - - - - -
mirror-1 7.50G 4.39G 3.11G - - 74% 58.6% - ONLINE
STORJ9-METAD 8G - - - - - - - ONLINE
STORJ9-META 8G - - - - - - - ONLINE
root@VM-HOST:~# iostat -x
Linux 6.1.0-22-amd64 (VM-HOST) 13-07-24 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
2,73 0,11 2,88 4,00 0,00 90,29
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
nvme0n1 195,54 1132,49 0,22 0,11 0,20 5,79 104,35 1336,26 2,78 2,59 0,22 12,81 0,00 0,00 0,00 0,00 0,00 0,00 1,90 0,31 0,06 1,63
sda 168,33 930,11 0,26 0,16 0,24 5,53 93,55 1335,05 3,41 3,52 0,22 14,27 0,00 0,00 0,00 0,00 0,00 0,00 2,16 0,73 0,06 1,83
sdb 0,76 85,09 0,01 1,00 1,71 111,38 0,89 129,21 0,01 0,60 2,02 144,49 0,00 0,00 0,00 0,00 0,00 0,00 0,02 0,86 0,00 0,05
sdc 0,79 75,01 0,01 1,80 1,97 94,87 0,92 132,53 0,01 1,02 2,23 144,68 0,00 0,00 0,00 0,00 0,00 0,00 0,02 0,90 0,00 0,06
sdd 4,38 131,04 0,01 0,22 14,50 29,91 9,86 737,84 0,06 0,59 23,17 74,83 0,00 0,00 0,00 0,00 0,00 0,00 0,04 47,91 0,29 5,26
sde 2,07 77,29 0,01 0,41 18,63 37,25 9,86 737,84 0,06 0,60 22,12 74,84 0,00 0,00 0,00 0,00 0,00 0,00 0,04 26,56 0,26 3,60
sdf 2,44 952,01 0,00 0,00 4,26 390,72 0,75 16,70 0,00 0,01 10,21 22,28 0,00 0,00 0,00 0,00 0,00 0,00 0,21 31,74 0,02 1,69
sdg 2,32 476,72 0,00 0,00 4,03 205,39 0,71 41,62 0,00 0,01 9,67 59,03 0,00 0,00 0,00 0,00 0,00 0,00 0,17 26,50 0,02 1,40
sdh 0,48 55,18 0,01 2,78 2,62 115,67 0,28 86,69 0,00 1,76 3,29 313,58 0,00 0,00 0,00 0,00 0,00 0,00 0,02 1,08 0,00 0,04
sdi 3,00 1053,39 0,00 0,02 8,32 351,33 0,76 68,36 0,00 0,02 1,40 90,53 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,03 2,38
sdj 2,92 818,77 0,00 0,02 10,10 280,35 0,94 96,93 0,00 0,02 1,21 103,42 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,03 2,83
sdk 2,25 195,91 0,02 0,73 10,40 87,13 0,62 39,08 0,00 0,00 2,34 63,02 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,02 1,58
sdl 2,58 235,84 0,00 0,02 10,24 91,30 0,48 6,90 0,00 0,00 14,62 14,41 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,03 2,10
sdm 5,07 1862,27 0,00 0,02 4,50 367,24 0,84 23,57 0,00 0,01 0,96 28,04 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,02 2,05
sdn 1,56 145,52 0,00 0,03 10,02 93,12 0,24 5,49 0,00 0,01 12,06 23,02 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,02 1,14
sdo 1,51 21,19 1,87 55,34 2,77 14,06 8,32 566,73 6,70 44,61 2,96 68,11 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,03 2,43
zram0 30,72 122,90 0,00 0,00 0,01 4,00 50,50 224,88 0,00 0,00 0,02 4,45 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,18
Before I reverted, I had an IO-wait of 60+% and an idle time of 15%. Utilization of the hard drives was all 70+%. This is a huge improvement; I honestly couldn't have predicted it beforehand.
As you can see, I only use a 5 GB/TB ratio for the special devices with a recordsize of 512 kB. And as you can see, they are only filled to about half the level of the data disks.
This is how I created them:
# data pool: single HDD vdev, metadata-only ARC, 512k recordsize, sync writes disabled
zpool create -o ashift=12 -O compress=lz4 -O atime=off -O primarycache=metadata -O sync=disabled -m /storj/nd10 -O xattr=off -O redundant_metadata=some -O recordsize=512k storjdata10 /dev/sdn -f
# attach the mirrored special vdev (one NVMe partition + one SATA SSD partition) for metadata
zpool add storjdata10 -o ashift=12 special mirror /dev/disk/by-partlabel/STORJ10-METAD /dev/disk/by-partlabel/STORJ10-META -f
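If you want to check whether metadata reads actually land on the special vdev, plain ZFS tooling is enough; per-vdev stats should show the mirror-1 special vdev taking the small reads:

zpool iostat -v storjdata10 10   # per-vdev read/write ops and bandwidth, refreshed every 10 s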
Why? That actually sounds nonsensical to me, because 5 Gbps = 625 MB/s. Given that a storage node is almost all random IO, you could easily add 20 hard drives (probably even 60) before saturating the USB bandwidth.
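A rough back-of-the-envelope, assuming ~200 random IOPS per HDD and ~64 KiB per piece (both just ballpark guesses, not measured):

echo $(( 5 * 1000 / 8 ))      # 5 Gbps is roughly 625 MB/s of line rate
echo $(( 200 * 64 / 1024 ))   # one HDD at 200 random IOPS x 64 KiB is roughly 12 MB/s
echo $(( 625 / 12 ))          # so on the order of 50 drives to saturate the link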
What if you had a special device in front of them?
But you are probably right, if they now use different lanes for USB and the other devices.
Even if not, there are not that many devices aside from SSDs, video (cards), HDMI over USB, … that can saturate the whole bandwidth; none of which applies here, even on a Pi <= 4.
So, do you truly believe that your HDDs can handle 5 Gbps of random I/O? Even if we assume that all 5 Gbps of NIC traffic would go to the node(s), without being slowed down by other devices?
Did I say so somewhere?
I only said that you probably need at least 20, but more likely over 60, HDDs to saturate USB 3.0 bandwidth with random IO.
Probably more than that. I've found a benchmark showing you can do 50k IOPS over USB 3 with a single flash storage device. That would translate to about 200 HDDs' worth of IOPS. A more interesting question would be whether the RPi5 is capable of generating that kind of traffic.
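For the record, the 200 figure assumes roughly 250 random IOPS per spinning disk, which is already on the optimistic side for 7200 rpm drives:

echo $(( 50000 / 250 ))   # ≈ 200 HDDs' worth of random IOPS behind one USB 3 link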