Isn’t recordsize just a maximum size if compression is enabled? So how do we know how much is really needed?
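It is a cap even without compression: a file smaller than the recordsize is stored in a single block sized to fit it. A quick sketch to see this, reusing the /storj/mountpoint path from the create command further down (the testfile name is made up):

    dd if=/dev/urandom of=/storj/mountpoint/testfile bs=4k count=1   # 4 KiB of incompressible data
    zpool sync                                                       # flush the txg so the allocation shows up
    du -h /storj/mountpoint/testfile                                 # a few KiB, not 512K: recordsize only caps block size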
Not sure if that still counts as a potato node.
It would be nice if STORJ added a node performance indicator along with the uptime indicator; this would give a real idea of how STORJ values your node.
It looks like we can compare our nodes by success rate.
By this:
root@HOST:~# zpool list -v
NAME                                             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP  HEALTH  ALTROOT
storjdata11                                      479G   416G  62.7G        -         -    5%    86%  1.00x  ONLINE  -
  sdj                                            477G   415G  61.0G        -         -    5%  87.2%      -  ONLINE
special                                             -      -      -        -         -     -      -      -       -
  mirror-1                                      2.75G  1.01G  1.74G        -         -   33%  36.8%      -  ONLINE
    STORJ11-METAD                                  3G      -      -        -         -     -      -      -  ONLINE
    STORJ11-META                                   3G      -      -        -         -     -      -      -  ONLINE
storjdata16                                      479G   449G  29.6G        -         -    5%    93%  1.00x  ONLINE  -
  sdb                                            477G   448G  28.1G        -         -    5%  94.1%      -  ONLINE
special                                             -      -      -        -         -     -      -      -       -
  mirror-1                                      2.75G  1.25G  1.50G        -         -   26%  45.5%      -  ONLINE
    STORJ16-METAD                                  3G      -      -        -         -     -      -      -  ONLINE
    STORJ16-META                                   3G      -      -        -         -     -      -      -  ONLINE
storjdata6                                       932G   800G   133G        -         -    1%    85%  1.00x  ONLINE  -
  ata-APPLE_HDD_HTS541010A9E662_JD8002D825PKGD   932G   798G   130G        -         -    1%  86.0%      -  ONLINE
special                                             -      -      -        -         -     -      -      -       -
  mirror-1                                      4.50G  1.77G  2.73G        -         -   58%  39.4%      -  ONLINE
    STORJ6-METAD                                   5G      -      -        -         -     -      -      -  ONLINE
    STORJ6-META                                    5G      -      -        -         -     -      -      -  ONLINE
root@HOST:~# zfs get recordsize
NAME         PROPERTY    VALUE  SOURCE
storjdata11  recordsize  512K   local
storjdata16  recordsize  512K   local
storjdata6   recordsize  512K   local
Pools 11 and 16 have ashift=9, while the metadata vdevs and pool 6 have ashift=12.
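For reference, the ashift in use can be read back on recent OpenZFS (pool names as above):

    zpool get ashift storjdata11       # pool-level ashift property (0 = auto-detect)
    zdb -C storjdata11 | grep ashift   # per-vdev ashift from the cached pool config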
And this:
root@N100-DEV:~# zpool list -v
NAME                                                  SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP  HEALTH  ALTROOT
storjdata1                                           1.37T  1.25T   116G        -         -    5%    91%  1.00x  ONLINE  -
  sdi                                                1.36T  1.25T   112G        -         -    5%  92.0%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           7.50G  3.40G  4.10G        -         -   38%  45.4%      -  ONLINE
    sda3                                                8G      -      -        -         -     -      -      -  ONLINE
    sdb9                                                8G      -      -        -         -     -      -      -  ONLINE
storjdata3                                           4.57T   637G  3.95T        -         -    0%    13%  1.00x  ONLINE  -
  sdk1                                               4.55T   635G  3.93T        -         -    0%  13.6%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           24.5G  1.65G  22.9G        -         -   31%  6.73%      -  ONLINE
    sda12                                              25G      -      -        -         -     -      -      -  ONLINE
    sdb12                                              25G      -      -        -         -     -      -      -  ONLINE
storjdata5                                           3.64T   439G  3.21T        -         -    0%    11%  1.00x  ONLINE  -
  sdm                                                3.64T   438G  3.20T        -         -    0%  11.8%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           19.5G  1.14G  18.4G        -         -   35%  5.83%      -  ONLINE
    STORJ5-METAD                                       20G      -      -        -         -     -      -      -  ONLINE
    STORJ5-META                                        20G      -      -        -         -     -      -      -  ONLINE
storjdata7                                            932G   798G   135G        -         -    6%    85%  1.00x  ONLINE  -
  usb-WD_Elements_10A2_575835314141325835363531-0:0   931G   795G   133G        -         -    6%  85.7%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           4.50G  2.71G  1.79G        -         -   75%  60.2%      -  ONLINE
    sda7                                                5G      -      -        -         -     -      -      -  ONLINE
    sdb11                                               5G      -      -        -         -     -      -      -  ONLINE
storjdata9                                           1.37T  1.26T   112G        -         -    8%    91%  1.00x  ONLINE  -
  sdh                                                1.36T  1.25T   109G        -         -    8%  92.2%      -  ONLINE
special                                                  -      -      -        -         -     -      -      -       -
  mirror-1                                           7.50G  4.31G  3.19G        -         -   52%  57.5%      -  ONLINE
    sda4                                                8G      -      -        -         -     -      -      -  ONLINE
    sdb10                                               8G      -      -        -         -     -      -      -  ONLINE
root@N100-DEV:~# zfs get recordsize
NAME        PROPERTY    VALUE  SOURCE
storjdata1  recordsize  512K   local
storjdata3  recordsize  512K   local
storjdata5  recordsize  512K   local
storjdata7  recordsize  256K   local
storjdata9  recordsize  256K   local
As you can see, the pools with recordsize 512K have a special dev : data dev ratio of 1 : 3, while with 256K it is 2 : 3.
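If anyone wants to pull the same numbers out of their own pools, here is a rough sketch; it assumes the zpool list -v layout shown above (pool summary on the second line, the special mirror right after the "special" marker line), with pool names taken from the paste:

    for p in storjdata1 storjdata7; do
      zpool list -v "$p" | awk -v pool="$p" '
        NR == 2              { data = $3 }                 # ALLOC of the whole pool
        sp && $1 ~ /^mirror/ { meta = $3; sp = 0 }         # ALLOC of the special mirror
        $1 == "special"      { sp = 1 }
        END { printf "%s: special %s of %s total allocated\n", pool, meta, data }'
    done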
This is my create:
zpool create -o ashift=(9|12) \
  -O compress=lz4 -O atime=off -O primarycache=metadata -O sync=disabled \
  -O xattr=off -O redundant_metadata=some -O recordsize=512k \
  -m /storj/mountpoint \
  storjdata[1-9][0-9]* /dev/disk/by-whatever
zpool add storjdata[1-9][0-9]* -o ashift=12 special mirror \
  /dev/disk/by-partlabel/STORJ[1-9][0-9]*-METAD \
  /dev/disk/by-partlabel/STORJ[1-9][0-9]*-META -f
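The resulting properties can be read back afterwards to verify, e.g. with one of the pool names from above:

    zfs get recordsize,redundant_metadata,primarycache,sync,compression storjdata11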
Mind the ‘some’ redundant metadata here.
Uhm… How do you mean…?
Depends on your definition. Both of these devices have quite a meager CPU (an N100 and an i3-10110T) and run many nodes, with USB-attached drives, most of which are SMR.
And I’m quite sure the potato is for the most part defined by its drives.
So far I assumed that a potato node is one which doesn’t have the means to cache file system metadata, e.g., for default-formatted ext4 nodes, less than 1 GB of RAM or a good SSD cache per 1 TB of pieces.
But I guess the ideas differ here.
Even if it has the means but not the IOPS to run the first filewalker and fill the cache, you’re toast.
I believe you need some kind of metadata cache these days. I don’t see a setting that would allow a Pi 5 with low RAM and no SSD to run bigger nodes. You need RAM or SSD caching, otherwise garbage collection will take forever.
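For the ZFS setups in this thread, a metadata cache doesn’t have to be a special vdev. A minimal sketch of a metadata-only L2ARC, assuming a spare SSD partition (the STORJ1-L2ARC partlabel is made up):

    zpool add storjdata1 cache /dev/disk/by-partlabel/STORJ1-L2ARC   # hypothetical partition
    zfs set secondarycache=metadata storjdata1                       # SSD caches metadata only

Unlike a special vdev, an L2ARC device can die or be removed without endangering the pool, so it’s the safer experiment on a potato.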
Requirements were recently updated, yet this was not added. Maybe it’s time?
I’m still not convinced. I believe that even a Pi 3 can survive. It would not win all races, of course, but it should be operational without all these complications like ZFS, SSD caches and so on.
Well… that was in the “old days” with much less network load, though…
Perhaps. However, I do not see any evidence here on the forum that these nodes are failing completely.
Indeed, it’s not about the processing power. And I’m inclined to think it’s also not about memory (you can run 40+ TB with 16 GB of memory). It’s mainly the random IO that makes the nodes struggle: especially SMR, but I even have some CMR drives that don’t keep up the pace and get so much IO that the walkers don’t finish in time and other timeouts are triggered.
I do. I see many topics coming along, all about the same subject: a discrepancy between satellite-reported data and on-disk data (= unfinished garbage collection / retain), logs with only the updater repeating every hour (= node killed due to some timeout, plus a STORJ bug that lets it happen unnoticed), …
It all comes down to too few IOPS, especially for reading metadata.
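For anyone wanting to confirm that on their own pool, the zpool iostat histograms give a quick look (pool name from the earlier paste; reading them is a rule of thumb, not a hard diagnosis):

    zpool iostat -v storjdata1 5   # per-vdev ops/s and bandwidth, 5-second intervals
    zpool iostat -r storjdata1     # request-size histograms: many tiny reads hint at metadata churn
    zpool iostat -w storjdata1     # latency histograms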
Unfortunately, these discrepancy problems may occur even on more powerful setups.