Node monitoring

Someone was talking about monitoring for a Storj node, so I decided to share what I personally find interesting or useful to monitor.

First of all, basic stuff - CPU load, memory usage, and disk space:

Next, the delta (change per minute) of the filesystem dedicated to Storj:

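
For anyone who wants to see the idea outside Zabbix, here is a minimal Python sketch of a per-minute filesystem delta. It is not my actual Zabbix item, just an illustration: the function names are made up, and the path you pass in is whatever mount point you dedicated to the node. Inside Zabbix this would normally be an ordinary filesystem-size item with a change/delta setting rather than a script.

```python
import shutil
import time


def used_bytes(path):
    """Return the bytes currently used on the filesystem that holds `path`."""
    return shutil.disk_usage(path).used


def usage_delta(previous_used, current_used):
    """Change in used bytes between two samples (negative if space was freed)."""
    return current_used - previous_used


def sample_deltas(path, interval_seconds=60, samples=3):
    """Yield the change in used bytes once per interval.

    `interval_seconds=60` gives the "change per minute" view; `samples`
    only bounds the loop so the sketch terminates.
    """
    previous = used_bytes(path)
    for _ in range(samples):
        time.sleep(interval_seconds)
        current = used_bytes(path)
        yield usage_delta(previous, current)
        previous = current
```

A positive delta means the node stored more than it deleted during that minute, and a negative one means the opposite.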

Network traffic, in/out:

Count of new lines per minute in node log:
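
A rough sketch of how such a counter could work, assuming the node writes to a plain log file: remember the byte offset from the previous check and count the newlines appended since then. The function name and the rotation handling are my own illustration, not something Zabbix or the node provides.

```python
import os


def count_new_lines(log_path, last_offset=0):
    """Count newline characters appended since `last_offset`.

    Returns (line_count, new_offset). Pass `new_offset` back in on the
    next call, e.g. once per minute.
    """
    size = os.path.getsize(log_path)
    if size < last_offset:
        # The file shrank, so assume it was rotated or truncated
        # and start again from the beginning.
        last_offset = 0
    with open(log_path, "rb") as handle:
        handle.seek(last_offset)
        chunk = handle.read()
    return chunk.count(b"\n"), last_offset + len(chunk)
```

A count that drops to zero for several minutes is usually a sign that the node has stalled, which is why this metric is worth graphing.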

Socket reachable & process running:
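
Both checks are simple enough to sketch in Python. This is an illustration, not my Zabbix configuration: the host, port, and PID are whatever applies to your node (28967 is the usual default node port, but treat that as an assumption for your setup), and the process check is POSIX-only.

```python
import os
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def process_running(pid):
    """Return True if a process with this PID exists (POSIX only).

    Signal 0 performs the existence check without delivering a signal.
    Do not use this on Windows, where os.kill behaves differently.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

To test reachability the way the outside world sees it, run the port check from a machine outside your network against your public address.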

Battery:

For the last three, I have automated SMS alerts whenever something goes wrong.

Finally, the parsed output of earnings.py:

And the raw data from the same earnings.py:

All of this is set up in Zabbix, under one dashboard.

I’m using the InfluxData TICK stack to monitor my home systems.

Zabbix looks awesome, and I see it runs on a Pi!

I have MRTG for comparatively crude monitoring, so it's tempting to try this. Can I ask how hard it is to set up?
I don't want to have to dedicate too much time to it.

Thanks for sharing !

Installation is quick. For the configuration, expect to spend a few hours understanding and setting up Zabbix.
There is also Netdata, which is very easy to install.

I have a similar setup with Zabbix. Metrics are gathered from the SNO API, the log files, and an API proxy I wrote to smooth out differences between versions, add some extra reporting against the database, and add satellite discovery via Zabbix LLD (new satellites are automatically added to Zabbix as soon as they appear in the SNO API). I have a dynamic screen I can use to view a summary of each node.
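
To give an idea of what the LLD part involves, here is a minimal Python sketch that turns a parsed node API response into the JSON that a Zabbix discovery rule consumes. This is not the proxy described above. It assumes the payload has a `satellites` list whose entries carry `id` and `url` fields, and the macro names `{#SATELLITE_ID}` and `{#SATELLITE_URL}` are made up for the example.

```python
import json


def satellites_to_lld(sno_payload):
    """Build Zabbix low-level discovery JSON from a parsed API payload.

    Assumed input shape (check it against your node version):
        {"satellites": [{"id": "...", "url": "..."}, ...]}
    """
    rows = []
    for satellite in sno_payload.get("satellites") or []:
        rows.append({
            # Hypothetical LLD macro names; use whatever your template expects.
            "{#SATELLITE_ID}": satellite["id"],
            "{#SATELLITE_URL}": satellite.get("url", ""),
        })
    # Zabbix 4.2 and later accept a plain JSON array here;
    # older versions expect the array wrapped as {"data": [...]}.
    return json.dumps(rows)
```

Item and trigger prototypes in the template can then reference those macros, so each satellite gets its own items as soon as the discovery rule sees it.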

I have a trigger for vetting status. This has the benefit of alerting me when a node becomes vetted on a specific satellite, as well as letting me see an overview of vetting status by looking at the problem matrix (but you have to get used to reading it as a negation – blue boxes show node/satellite combinations that are NOT vetted):

I do not use Watchtower, so I additionally have a trigger to tell me when each node can be updated.

I really like your setup @cdhowie! Can you share the Zabbix template?