High I/O usage each day

Hi there,

image

I'm an SNO running my node in Docker, and I get 100% I/O usage once a day. This disk holds only the stored files; the databases and containers are stored elsewhere.
Is there some kind of action I can take?

Why do you want to do any action though?

If the high IO is disruptive for other things you are using the disk for (which you aren't), you can consider adding a caching solution so that metadata fetches don't result in disk IO.

This will depend on which OS, filesystem, etc. you are using.

Linux (Debian), Docker …
The HDD handling storage is connected over USB 3.
I'm just a poor dev, not really experienced with this, that's why I'm asking :slight_smile:

No problem :slight_smile: This looks like periodic file scanning, and it goes as fast as it can, limited by the disk seek latency. During this time the node is of course less responsive, but it's short enough not to worry too much about it.
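To put rough numbers on "limited by the disk seek latency", here is a back-of-the-envelope sketch. It assumes each file's metadata lookup costs about one random seek; the file count and the 10 ms latency are illustrative assumptions, not measurements from this node.

```python
def estimate_scan_seconds(file_count: int,
                          seek_latency_ms: float = 10.0,
                          seeks_per_file: float = 1.0) -> float:
    """Rough duration of a metadata scan dominated by random seeks.

    All defaults are assumptions for illustration; real HDDs, caches and
    filesystems will behave differently.
    """
    return file_count * seeks_per_file * seek_latency_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical node holding 200,000 files on a spinning disk.
    seconds = estimate_scan_seconds(200_000)
    print(f"roughly {seconds / 60:.0f} minutes of seek-bound scanning")
```

The point is only that scan time grows linearly with the number of files when nothing is cached, which is why the burst gets longer as the node fills up.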

If you increase the cache size (I don't know how to do it on Linux) so that all of the metadata fits in it, the first time it's accessed it is ingested, and the second time it is fetched from RAM or from your cache device. In reality, even the first pass will be accelerated thanks to prefetching.
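To get a feel for whether "the entire amount of metadata" could fit in RAM, you can count the entries under the storage directory and multiply by a per-entry cost. This is only a sketch: the 1 KiB per entry figure is an assumed ballpark for a cached inode plus directory entry, not a measured value, and the path in the usage note is hypothetical.

```python
import os


def count_entries(root: str) -> int:
    """Count files and directories below root (root itself excluded)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def estimate_cache_bytes(entries: int, bytes_per_entry: int = 1024) -> int:
    """Estimate RAM needed to keep metadata cached.

    bytes_per_entry is an assumption; adjust it for your filesystem.
    """
    return entries * bytes_per_entry
```

For example, `estimate_cache_bytes(count_entries("/mnt/storagenode"))` would give a rough byte count for a node mounted at that (hypothetical) path. Note that the walk itself reads all the metadata, so running it on a cold cache causes the same kind of IO burst as the scan.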

Some filesystems, like ZFS, take it a step further, letting you store metadata on an entirely separate drive (usually a fast SSD), making even the first access fast, regardless of caching.

As a side effect of not having those high IO bursts during scans, your node becomes more responsive. Time to first byte is almost halved, because there is no latency associated with the metadata fetch, only with data retrieval from the disk. This is further improved by allowing small files to be stored with the metadata, and storagenode stores a lot of small files. All of this has the potential to earn you more money through winning more races.

The scans then become a few minutes of very high read IO from the SSD (thousands of IOPS) and have no effect on performance.

In my opinion, everyone who is using a separate HDD to store a node should switch to ZFS with a special vdev. (Nodes of those who share extra unused storage from the main array, as was intended, already benefit from all the optimizations done on the main array.)

Oh, fancy graph. What tools do you use to monitor?

You may use

or Prometheus exporter
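If you just want a quick number without setting up a full monitoring stack, a small script can derive disk busy time from `/proc/diskstats` on Linux. This is a minimal sketch, not what any particular exporter does internally: it uses the cumulative "milliseconds spent doing I/O" counter, and the function names are made up for illustration.

```python
import time


def io_ticks_ms(diskstats_text: str, device: str) -> int:
    """Return the cumulative ms-spent-doing-I/O counter for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, then the per-device counters;
        # index 12 is the total milliseconds the device spent doing I/O.
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(f"device {device!r} not found in diskstats")


def utilization_percent(ticks_before: int, ticks_after: int,
                        elapsed_s: float) -> float:
    """Share of the interval during which the device was busy."""
    return 100.0 * (ticks_after - ticks_before) / (elapsed_s * 1000.0)


def sample_utilization(device: str, interval_s: float = 5.0) -> float:
    """Read /proc/diskstats twice and report busy percentage in between."""
    with open("/proc/diskstats") as fh:
        before = fh.read()
    start = time.monotonic()
    time.sleep(interval_s)
    with open("/proc/diskstats") as fh:
        after = fh.read()
    elapsed = time.monotonic() - start
    return utilization_percent(io_ticks_ms(before, device),
                               io_ticks_ms(after, device), elapsed)
```

Calling `sample_utilization("sda")` (replace `sda` with your storage disk) during the daily scan should report a value close to 100, matching what the graph shows.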

I'll have a look at ZFS. I'm still using ext4; maybe performance is also better with kernel 6.

ZFS requires more memory and tuning. With a single-disk ZFS volume it will be slower than ext4 despite the tuning, and almost useless because a single-disk setup has no automatic error correction. It makes sense only in RAID configurations (at least for storagenodes), though it will deliver fewer IOPS anyway. But if you want to use the ZFS features, you could sacrifice the performance.
But please do your own research first.