Bivvo
February 12, 2024, 9:52pm
1
I have the following setup:
Icy Box IB-3740-C31
2x WD Red PLUS NAS, 10 TB, HDD, 3.5"
RPi 4B with SanDisk MAX ENDURANCE 32 GB
One HDD is at 95%, the other 80%.
Issue: the CPU is constantly at 100%. The HDDs themselves seem to work fine. Since the beginning of the project, reads have topped out at about 10 MB/s.
For better durability I’ll switch from the SD card to an SSD.
How can I fix the I/O issue?
Roxor
February 12, 2024, 10:03pm
2
10 MB/s sounds like the drives are plugged into the RPi’s USB 2 (black) ports instead of the USB 3 (blue) ports?
2 Likes
Ambifacient
3
Re: the 100% CPU usage, I’m guessing the bulk of that is iowait, so the CPU is mostly idle, waiting for disk I/O to finish.
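A quick way to confirm that (the wa / %iowait figures are time spent waiting on disk):
top -bn1 | head -n 3      # the "%Cpu(s): ... wa" value is iowait
iostat -c 5 3             # CPU-only report every 5 s, including %iowait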
Maybe you can try moving databases to an external USB disk.
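If it is a standard storagenode setup, the databases can be pointed at another drive via storage2.database-dir in config.yaml. A rough sketch only (the SSD mount point is hypothetical, and the node has to be stopped while the *.db files are copied):
# with the node stopped, copy the databases to the other disk
mkdir -p /mnt/ssd/storj-db
cp /path/to/storage/*.db /mnt/ssd/storj-db/
# then in config.yaml (restart the node afterwards):
storage2.database-dir: /mnt/ssd/storj-db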
2 Likes
Toyoo
February 12, 2024, 11:34pm
4
How many IOPS do you see? Can you show the output of iostat -dmxst 30 with, let’s say, three sets of measurements (so, waiting a minute or so)? It will look like:
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 15.37 0.61 3.36 8.78 40.69 0.13 10.98
sdb 666.30 7.80 837.99 1.88 11.99 1.25 68.21
sdc 223.57 2.26 122.04 1.63 10.35 0.36 38.04
02/13/2024 12:32:13 AM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 8.43 0.41 0.47 85.03 49.33 0.72 7.13
sdb 269.83 3.01 196.67 1.24 11.41 0.34 35.13
sdc 22.37 0.41 13.07 2.44 18.72 0.05 6.35
02/13/2024 12:32:43 AM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 7.63 0.26 0.30 91.13 35.27 0.70 6.52
sdb 201.43 2.04 133.87 1.11 10.37 0.22 27.96
sdc 21.23 0.46 26.40 2.35 22.18 0.05 6.05
The rqm/s column will tell you the IOPS.
2 Likes
knewman
February 13, 2024, 2:31am
5
A USB 2.0 port was my first thought as well.
Although, seeing as these are WD Reds, I wonder if these are SMR disks that are hitting a particularly challenging workload scenario. Can you post the model number of your disks?
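One quick way to pull the exact model strings (assuming smartmontools is installed; lsblk alone works too):
lsblk -d -o NAME,MODEL,SIZE,ROTA
sudo smartctl -i /dev/sda | grep -i -E 'model|rotation'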
2 Likes
daki82
February 13, 2024, 6:40am
6
Bivvo:
WD Red PLUS NAS, 10 TB
Assuming WD101EFAX or WD101EFBX, they are CMR.
Are there any 10 TB SMR drives at all?
2 Likes
Bivvo
February 13, 2024, 7:32am
7
Thank you all!!
@Roxor @knewman USB 3.0, cross-checked a minute ago.
@daki82 plus another 1 TB CMR drive for logging.
@Ambifacient the DBs are placed on the SD card.
@Toyoo will check as soon as I’m at my desk.
Meanwhile, two screenshots:
Bivvo
February 13, 2024, 11:46am
8
02/13/2024 12:40:43 PM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
mmcblk0 75.98 1.97 4.89 7.23 26.55 0.55 11.70
sda 7.20 0.82 2.87 2.50 116.69 0.03 2.48
sdc 147.85 5.51 1089.27 12.29 38.18 1.89 99.86
sdd 139.02 5.31 1133.26 12.75 39.10 1.81 99.87
02/13/2024 12:41:27 PM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
mmcblk0 75.98 1.97 4.89 7.23 26.55 0.55 11.70
sda 7.20 0.82 2.87 2.50 116.69 0.03 2.48
sdc 147.85 5.51 1089.28 12.29 38.18 1.89 99.86
sdd 139.02 5.31 1133.26 12.75 39.10 1.81 99.87
02/13/2024 12:46:08 PM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
mmcblk0 75.97 1.97 4.89 7.23 26.55 0.55 11.70
sda 7.20 0.82 2.87 2.50 116.68 0.03 2.48
sdc 147.85 5.51 1089.31 12.29 38.18 1.89 99.86
sdd 139.03 5.31 1133.28 12.75 39.11 1.81 99.87
@Toyoo
Bivvo
February 13, 2024, 3:26pm
9
@Toyoo
The migration from SD to SSD succeeded; iostat looks similar:
02/13/2024 04:24:20 PM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 205.24 9.34 373.87 2.71 46.62 0.56 21.07
sdb 3.41 0.06 6.56 7.58 18.81 0.03 2.24
sdc 118.39 2.73 506.86 13.21 23.62 1.58 81.30
sdd 125.86 3.44 679.36 12.70 27.97 1.62 83.67
02/13/2024 04:25:45 PM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 190.11 8.43 333.74 2.65 45.43 0.51 19.69
sdb 3.32 0.06 5.95 7.34 18.24 0.03 2.14
sdc 121.89 2.76 509.87 13.19 23.20 1.63 83.33
sdd 129.92 3.43 671.47 12.59 27.02 1.65 85.44
02/13/2024 06:15:23 PM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 58.51 2.70 40.02 3.80 47.19 0.22 5.21
sdb 3.77 0.25 1.84 3.18 68.58 0.02 1.50
sdc 133.70 4.87 998.72 13.27 37.31 1.80 98.19
sdd 159.27 6.03 1265.61 11.57 38.76 1.86 98.42
Toyoo
February 13, 2024, 10:10pm
12
Typical HDDs are capable of around 250 IOPS. Your system seems to demand more than 1k IOPS on the sdc and sdd drives, and this is a problem. Moving databases seems to have helped a bit, but it’s still way above the capacity.
One potential reason for an elevated number of I/O operations is the file walker; in that case you would see the queued IOPS drop after it finishes. Another is simply not having enough RAM to cache file metadata; with ext4 you should aim for roughly 1 GB of RAM per 1 TB of node files. There are probably more.
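With two 10 TB drives, that rule of thumb asks for far more RAM than a Pi 4B has, so it is worth checking how much memory is actually left for caching (standard tools, nothing node-specific):
free -h                                      # "available" is what can still go to caches
grep -E 'Slab|SReclaimable' /proc/meminfo    # reclaimable slab holds cached inodes/dentries
sudo slabtop -o | head -n 15                 # ext4_inode_cache / dentry lines show metadata caching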
2 Likes
Bivvo
February 14, 2024, 3:22pm
13
From internet research, I think this is due to the USB 3 limitations of the RPi 4B in general. I don’t think I’ll be able to solve that at all…
I remember that I disabled swap and reduced logging to almost zero while the SD card was still in use. Just as a reference: here and here.
UPDATE: reverted the settings linked above a few minutes ago.
Questions:
Could ZRAM help?
Could the noatime flag help in fstab?
UUID=60f9452a-c2da-48fd-a156-53d15e67102c /mnt/WD1003 ext4 defaults 0 2
UUID=2c691f8a-637d-49a5-8198-82b34661e658 /mnt/WD1001 ext4 defaults 0 2
UUID=e7d6dabc-57da-481d-aee0-29cf106966ec /mnt/WD1002 ext4 defaults 0 2
Journalling seems to be enabled. Can I disable it even with an almost-full HDD? That way?
sudo dumpe2fs /dev/sdb1 | more
dumpe2fs 1.46.2 (28-Feb-2021)
Filesystem volume name: <none>
Last mounted on: /config
Filesystem UUID: 60f9452a-c2da-48fd-a156-53d15e67102c
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags: unsigned_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 61054976
Block count: 244190385
Reserved block count: 12209519
Overhead clusters: 4114684
Free blocks: 239531055
Free inodes: 61054912
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 1024
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Tue Feb 8 00:22:42 2022
Last mount time: Wed Feb 14 19:07:33 2024
Last write time: Wed Feb 14 19:07:33 2024
Mount count: 52
Maximum mount count: -1
Last checked: Tue Feb 8 00:22:42 2022
Check interval: 0 (<none>)
Lifetime writes: 420 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 7bee9531-3ead-43f8-bd13-382e10c5fe63
Journal backup: inode blocks
Checksum type: crc32c
Checksum: 0xe151e9ff
Journal features: journal_incompat_revoke journal_64bit journal_checksum_v3
Total journal size: 1024M
Total journal blocks: 262144
Max transaction length: 262144
Fast commit length: 0
Journal sequence: 0x0095f4c2
Journal start: 1
Journal checksum type: crc32c
Journal checksum: 0x4233dd82
Group 0: (Blocks 0-32767) csum 0x4bca [ITABLE_ZEROED]
Toyoo
February 14, 2024, 10:30pm
14
USB 3 may have its own bottlenecks, but even if the drive were connected through SATA, it would still be limited to about 250 IOPS. This limit comes from the basic physical properties of HDDs, regardless of the connection.
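As a rough back-of-the-envelope estimate (typical 7200 rpm figures, not measured values for these exact drives):
average rotational latency ≈ 60 s / 7200 rpm / 2 ≈ 4.2 ms
average seek time ≈ 8.5 ms
random IOPS at queue depth 1 ≈ 1000 ms / (4.2 ms + 8.5 ms) ≈ 80
Command queueing and shorter seeks push that into the low hundreds, which is roughly where the 250 figure comes from; nowhere near the 1k+ IOPS the node is currently asking for.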
Bivvo:
Could ZRAM help?
If your CPU is fast enough, then maybe.
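A minimal sketch for trying it on Raspberry Pi OS / Debian, assuming the zram-tools package (check the comments in its /etc/default/zramswap file for the exact variable names; the values below are only examples):
sudo apt install zram-tools
# /etc/default/zramswap:
#   ALGO=zstd      # compression algorithm
#   SIZE=1024      # compressed swap size in MiB
sudo systemctl restart zramswap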
Bivvo:
Could the noatime flag help in fstab?
Yes.
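For example, the first of your fstab entries would become (only the mount options change; the same applies to the other two lines):
UUID=60f9452a-c2da-48fd-a156-53d15e67102c /mnt/WD1003 ext4 defaults,noatime 0 2
It can then be applied without a reboot with sudo mount -o remount,noatime /mnt/WD1003.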
Bivvo:
Journalling seems to be enabled. Can I disable that with an almost full HDD? That way ?
You can, but given you have an SSD, it might be better to move the journal there.
2 Likes
Here, for comparison, a chart from my odroid-hc2 (a board with a SATA connector):
02/15/2024 12:09:05 AM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 96.91 1.74 760.13 11.55 18.34 1.50 94.24
02/15/2024 12:09:35 AM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 151.23 6.27 1390.00 23.82 42.43 3.71 97.93
02/15/2024 12:10:05 AM
Device tps MB/s rqm/s await areq-sz aqu-sz %util
sda 138.33 5.58 1257.37 28.95 41.29 4.16 99.50
The disk is 6 TB (SMR?) with about half used.
With Storj the usage is so high that I had to lower storage2.max-concurrent-requests, and even that doesn’t seem to help that much, as the filewalker and gc-filewalker run for days.
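For reference, that is a config.yaml setting and needs a node restart to take effect (the value below is just an example):
# config.yaml: cap how many transfers the node accepts at once (0 = unlimited)
storage2.max-concurrent-requests: 10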
Bivvo
February 15, 2024, 7:44am
16
Toyoo:
Bivvo:
Could the noatime flag help in fstab?
Yes.
Done.
How? Is there a safe way to do that (move the journal to the SSD) without formatting the HDDs or compromising the data on them?
Toyoo
February 15, 2024, 9:33am
17
This is a sketch of what you would need to do, but it skips necessary changes to fstab.
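A minimal outline, assuming the node filesystem is /dev/sdX1 (mounted at /mnt/WD100x), a spare SSD partition /dev/sdY1 is free for the journal, and both use a 4K block size; the device names are placeholders, and the filesystem must be unmounted first:
sudo umount /mnt/WD100x                        # filesystem must be offline
sudo tune2fs -O ^has_journal /dev/sdX1         # remove the internal journal
sudo e2fsck -f /dev/sdX1                       # check the filesystem afterwards
sudo mke2fs -O journal_dev -b 4096 /dev/sdY1   # turn the SSD partition into a journal device
sudo tune2fs -j -J device=/dev/sdY1 /dev/sdX1  # attach it as the external journal
sudo mount /mnt/WD100x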
2 Likes
Craig
February 16, 2024, 3:32pm
18
I don’t have a lot of detail to offer right now as I’m working off the top of my head, but what is the chipset in the dock? I know there are issues with certain USB-to-SATA bridge chipsets and the RPi boards, particularly JMicron chipsets (JMS567 maybe?), where they can’t use the full bandwidth.
Wish I could offer more right now, but hopefully it’s a good lead.
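One way to check which bridge the Icy Box uses, and, if it is one of the problematic JMicron ones, to fall back from UAS to plain usb-storage (use whatever VID:PID lsusb actually reports; 152d:0578 below is only an example):
lsusb                                   # look for the JMicron / ASMedia bridge entry
# then add a quirk to /boot/cmdline.txt (on the same line, before rootwait) and reboot:
#   usb-storage.quirks=152d:0578:u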
2 Likes
daki82
February 16, 2024, 7:28pm
19
Is it still so?
It can rarely be more than that via Storj:
the filewalker consumes about 1 MB/s, plus reads of at most your maximum internet upload speed.
I see no scenario where Storj produces more than a couple of MB/s of reads per drive.
2 Likes
Bivvo
February 16, 2024, 8:35pm
20
daki82:
Is it still so?
Usually this:
And recently surprised with this:
daki82
February 16, 2024, 8:42pm
21
Running one node on an Intel Jxxxx NUC, I would not be surprised by these stats.
Does it run? If yes, there is nothing to fix.
My mini PC would barely cope with a second drive. 1x 12 TB, nearly full.
Bivvo
February 16, 2024, 9:35pm
22
daki82:
does it run?
Here and there a “file not found” or “tcp” error, but the scores are at 100%.
The “file not found” errors come from historic issues long ago; nothing to worry about.
1 Like