Isn’t recordsize just a maximum size if compression is enabled? So how do we know how much is really needed?
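It is a cap even without compression: a file smaller than the recordsize is stored in a single block sized to fit it. A quick sketch to see this, reusing the /storj/mountpoint path from the create command further down (the testfile name is made up):

    dd if=/dev/urandom of=/storj/mountpoint/testfile bs=4k count=1   # 4 KiB of incompressible data
    zpool sync                                                       # flush the txg so the allocation shows up
    du -h /storj/mountpoint/testfile                                 # a few KiB, not 512K: recordsize only caps block size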
Not sure if that still counts as a potato node.
It would be nice if STORJ added a node performance indicator along with the uptime indicator; this would give a real idea of how STORJ values your node.
It looks like we can compare our nodes by success rate.
By this:
root@HOST:~# zpool list -v
NAME                                             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP  HEALTH  ALTROOT
storjdata11                                      479G   416G  62.7G        -         -    5%    86%  1.00x  ONLINE  -
  sdj                                            477G   415G  61.0G        -         -    5%  87.2%      -  ONLINE
special                                             -      -      -        -         -     -      -      -       -
  mirror-1                                      2.75G  1.01G  1.74G        -         -   33%  36.8%      -  ONLINE
    STORJ11-METAD                                  3G      -      -        -         -     -      -      -  ONLINE
    STORJ11-META                                   3G      -      -        -         -     -      -      -  ONLINE
storjdata16                                      479G   449G  29.6G        -         -    5%    93%  1.00x  ONLINE  -
  sdb                                            477G   448G  28.1G        -         -    5%  94.1%      -  ONLINE
special                                             -      -      -        -         -     -      -      -       -
  mirror-1                                      2.75G  1.25G  1.50G        -         -   26%  45.5%      -  ONLINE
    STORJ16-METAD                                  3G      -      -        -         -     -      -      -  ONLINE
    STORJ16-META                                   3G      -      -        -         -     -      -      -  ONLINE
storjdata6                                       932G   800G   133G        -         -    1%    85%  1.00x  ONLINE  -
  ata-APPLE_HDD_HTS541010A9E662_JD8002D825PKGD   932G   798G   130G        -         -    1%  86.0%      -  ONLINE
special                                             -      -      -        -         -     -      -      -       -
  mirror-1                                      4.50G  1.77G  2.73G        -         -   58%  39.4%      -  ONLINE
    STORJ6-METAD                                   5G      -      -        -         -     -      -      -  ONLINE
    STORJ6-META                                    5G      -      -        -         -     -      -      -  ONLINE
root@HOST:~# zfs get recordsize
NAME         PROPERTY    VALUE  SOURCE
storjdata11  recordsize  512K   local
storjdata16  recordsize  512K   local
storjdata6   recordsize  512K   local
Pools 11 and 16 have ashift=9, while the metadata vdevs and pool 6 have ashift=12.
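For reference, the ashift in use can be read back on recent OpenZFS (pool names as above):

    zpool get ashift storjdata11       # pool-level ashift property (0 = auto-detect)
    zdb -C storjdata11 | grep ashift   # per-vdev ashift from the cached pool config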
And this:
root@N100-DEV:~# zpool list -v
NAME                                                  SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP  HEALTH  ALTROOT
storjdata1                                           1.37T  1.25T   116G        -         -    5%    91%  1.00x  ONLINE  -
  sdi                                                1.36T  1.25T   112G        -         -    5%  92.0%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           7.50G  3.40G  4.10G        -         -   38%  45.4%      -  ONLINE
    sda3                                                8G      -      -        -         -     -      -      -  ONLINE
    sdb9                                                8G      -      -        -         -     -      -      -  ONLINE
storjdata3                                           4.57T   637G  3.95T        -         -    0%    13%  1.00x  ONLINE  -
  sdk1                                               4.55T   635G  3.93T        -         -    0%  13.6%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           24.5G  1.65G  22.9G        -         -   31%  6.73%      -  ONLINE
    sda12                                              25G      -      -        -         -     -      -      -  ONLINE
    sdb12                                              25G      -      -        -         -     -      -      -  ONLINE
storjdata5                                           3.64T   439G  3.21T        -         -    0%    11%  1.00x  ONLINE  -
  sdm                                                3.64T   438G  3.20T        -         -    0%  11.8%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           19.5G  1.14G  18.4G        -         -   35%  5.83%      -  ONLINE
    STORJ5-METAD                                       20G      -      -        -         -     -      -      -  ONLINE
    STORJ5-META                                        20G      -      -        -         -     -      -      -  ONLINE
storjdata7                                            932G   798G   135G        -         -    6%    85%  1.00x  ONLINE  -
  usb-WD_Elements_10A2_575835314141325835363531-0:0   931G   795G   133G        -         -    6%  85.7%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           4.50G  2.71G  1.79G        -         -   75%  60.2%      -  ONLINE
    sda7                                                5G      -      -        -         -     -      -      -  ONLINE
    sdb11                                               5G      -      -        -         -     -      -      -  ONLINE
storjdata9                                           1.37T  1.26T   112G        -         -    8%    91%  1.00x  ONLINE  -
  sdh                                                1.36T  1.25T   109G        -         -    8%  92.2%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           7.50G  4.31G  3.19G        -         -   52%  57.5%      -  ONLINE
    sda4                                                8G      -      -        -         -     -      -      -  ONLINE
    sdb10                                               8G      -      -        -         -     -      -      -  ONLINE
root@N100-DEV:~# zfs get recordsize
NAME        PROPERTY    VALUE  SOURCE
storjdata1  recordsize  512K   local
storjdata3  recordsize  512K   local
storjdata5  recordsize  512K   local
storjdata7  recordsize  256K   local
storjdata9  recordsize  256K   local
As you can see, the pools with recordsize 512K have a special dev : data dev ratio of 1 : 3, while with 256K it is 2 : 3.
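If anyone wants to pull the same numbers out of their own pools, here is a rough sketch; it assumes the zpool list -v layout shown above (pool summary on the second line, the special mirror right after the "special" marker line), with pool names taken from the paste:

    for p in storjdata1 storjdata7; do
      zpool list -v "$p" | awk -v pool="$p" '
        NR == 2              { data = $3 }                 # ALLOC of the whole pool
        sp && $1 ~ /^mirror/ { meta = $3; sp = 0 }         # ALLOC of the special mirror
        $1 == "special"      { sp = 1 }
        END { printf "%s: special %s of %s total allocated\n", pool, meta, data }'
    done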
This is my create:
zpool create -o ashift=(9|12) \
  -O compress=lz4 -O atime=off -O primarycache=metadata -O sync=disabled \
  -O xattr=off -O redundant_metadata=some -O recordsize=512k \
  -m /storj/mountpoint \
  storjdata[1-9][0-9]* /dev/disk/by-whatever
zpool add storjdata[1-9][0-9]* -o ashift=12 special mirror \
  /dev/disk/by-partlabel/STORJ[1-9][0-9]*-METAD \
  /dev/disk/by-partlabel/STORJ[1-9][0-9]*-META -f
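The resulting properties can be read back afterwards to verify, e.g. with one of the pool names from above:

    zfs get recordsize,redundant_metadata,primarycache,sync,compression storjdata11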
Mind the ‘some’ redundant metadata here.
Uhm… How do you mean…?
Depends on your definition. Both of these devices have quite a meager CPU (an N100 and an i3-10110T) and run many nodes, with USB-attached drives, most of which are SMR.
And I’m quite sure the potato is for the most part defined by its drives.
So far I assumed that a potato node is one which doesn’t have the means to cache file system metadata, e.g., for default-formatted ext4 nodes, less than 1 GB of RAM or a good SSD cache per 1 TB of pieces.
But I guess the ideas differ here.
Even if it has the means but not the IOPS to run the first filewalker and fill the cache, you’re toast.
I believe you need some kind of metadata cache these days. I don’t see a setting that would allow a Pi 5 with low RAM and no SSD to run bigger nodes. You need RAM or SSD caching, otherwise garbage collection will take forever.
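For the ZFS setups in this thread, a metadata cache doesn’t have to be a special vdev. A minimal sketch of a metadata-only L2ARC, assuming a spare SSD partition (the STORJ1-L2ARC partlabel is made up):

    zpool add storjdata1 cache /dev/disk/by-partlabel/STORJ1-L2ARC   # hypothetical partition
    zfs set secondarycache=metadata storjdata1                       # SSD caches metadata only

Unlike a special vdev, an L2ARC device can die or be removed without endangering the pool, so it’s the safer experiment on a potato.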
Requirements were recently updated, yet this was not added. Maybe it’s time?
I’m still not convinced. I believe that even a Pi 3 can survive. It would not win all races, of course, but it should be operational without all these complications like ZFS, SSD caches and so on.
Well… that was in the “old days” with much less network load, though…
Perhaps. However, I do not see any evidence here on the forum that these nodes are failing completely.
Indeed, it’s not about the processing power. And I’m inclined to think it’s also not about memory (you can run 40+ TB with 16 GB of memory). It’s mainly the random IO that makes the nodes struggle: especially SMR, but I even have some CMR drives that don’t keep up the pace and get so much IO that the walkers don’t finish in time and other timeouts are triggered.
I do. I see many topics coming along, all about the same subject: a discrepancy between satellite-reported data and on-disk data (= unfinished garbage collection / retain), logs with only the updater repeating every hour (= node killed due to some timeout, plus a STORJ bug that lets it happen unnoticed), …
It all comes down to too few IOPS, especially for reading metadata.
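For anyone wanting to confirm that on their own pool, the zpool iostat histograms give a quick look (pool name from the earlier paste; reading them is a rule of thumb, not a hard diagnosis):

    zpool iostat -v storjdata1 5   # per-vdev ops/s and bandwidth, 5-second intervals
    zpool iostat -r storjdata1     # request-size histograms: many tiny reads hint at metadata churn
    zpool iostat -w storjdata1     # latency histograms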
Unfortunately, these discrepancy problems may occur even on more powerful setups.