HDD high load - DB problem?

Hi team,

I am running the Docker image (v1.29.3) on a Synology DS220+.
From time to time there is 100% load on the HDD, which does not seem to come from Synology's own processes.

Is there a possibility that the DB has a problem?
Or are there recurring checks on it?

Also, it takes about 15 seconds to open the dashboard in the browser.
Other nodes I have open it in about 3 seconds (on a Pi).

Thanks!

How often does that happen? Storj storage media do get hammered by massive I/O requests when the filewalker re-scans the entire set of stored pieces. But this happens only after:

  • A restart of the node.
  • An update to the latest version of the Storj node software… because it restarts the node.

If this happens more often than regular updates, then it could be that your node is regularly crashing and restarting. This can happen if your system runs out of RAM and kills any greedy process (a quick way to check for that is shown right after the list), which could be caused by:

  • A hard disk drive that cannot keep up, making the node stack incoming pieces in a wait queue until there is no RAM left
  • A failing hard drive that is not responding
  • Not enough RAM… ^^
  • …

It could also restart because of an error within the Storj node software itself (unlikely).
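
If you want to confirm an out-of-memory kill, one option (assuming you can SSH into the Synology with root access; this is just a generic Linux check, nothing Storj-specific) is to search the kernel log:

dmesg -T | grep -iE "out of memory|killed process"

Drop the -T if your dmesg build does not support human-readable timestamps. If this prints lines mentioning the storagenode process, RAM pressure is the likely cause of the restarts.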

What does your dashboard show in the “uptime” section? If it’s abnormally low compared to your other nodes, it means it unexpectedly restarted.
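
If the web dashboard is too slow to be practical, a rough alternative (assuming the standard Docker setup and a container named storagenode, adjust to yours) is to check how long the container itself has been running, since the container typically exits and gets restarted when the node process dies:

docker ps --filter name=storagenode

The STATUS column shows something like “Up 3 hours”; a surprisingly low value points at unexpected restarts.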

If you configured your node to redirect its logs to a file, you could check whether it contains obvious restart sequences, at what dates, and after which operations/errors.
If you did not, you should configure this so you can check whether the log file contains useful info just before a crash.

Here is how to configure that:
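
As a minimal sketch (assuming the standard Docker setup where you bind-mount your storage directory to /app/config, and a container named storagenode): add or uncomment these two lines in the config.yaml that lives at the root of that directory:

log.level: info
log.output: "/app/config/node.log"

Then restart the container (-t 300 gives the node time to shut down cleanly):

docker restart -t 300 storagenode

The log then ends up as node.log in that same directory on the host, where you can read or rotate it without going through docker logs.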

I am not monitoring it all the time, but I have seen this many times now.
In fact, a restart of the container is not the issue; it has actually been online for 383h.
Logs have not been redirected.

RAM is available, about 1 GB free.
It fills up and goes back down normally.

I have migrated from an SMR disk to a brand-new IronWolf,
because I thought this issue could point to SMR technology.

It could have been the issue, good call.
Although the network hasn’t really been putting high pressure on our disks lately, so SMR disks have been coping with the network activity for months now (mine at least, YMMV).

If your node isn’t unexpectedly restarting then I’m not sure :thinking:

I don’t know what can or cannot be installed on a Synology system, but it’d probably be worth checking what’s causing the high I/O pressure on your disk when the issue is happening, with a tool like iotop (I think it needs root access).
Doesn’t your Synology’s web interface provide tools for checking its status and realtime activity?

It’s not included. I tend to just use docker for such commands.

docker run --rm -ti --privileged --net=host --pid=host esnunes/iotop:latest

That will just pull a small image and run a container in which it launches iotop. The container will be removed automatically when you’re done.
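
If the default output is too noisy, iotop’s -o option only shows processes that are actually doing I/O, and -a accumulates totals, which makes spikes easier to attribute. Assuming the image’s entrypoint passes its arguments straight through to iotop (I haven’t double-checked that Dockerfile), you could try:

docker run --rm -ti --privileged --net=host --pid=host esnunes/iotop:latest -o -a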

Edit: I should add that you should be careful what you run with those settings, as they basically give the container full access to the host. This image is based on this Dockerfile: docker-iotop/Dockerfile at f9b96da3c2e3dc13fceaf092f9b5054ffefd313c · esnunes/docker-iotop · GitHub
