Node monitoring

Someone was asking about monitoring for a Storj node, so I decided to share what I personally find interesting or useful to monitor.

First of all, basic stuff - CPU load, memory usage, and disk space:

Next, the delta (change per minute) of the filesystem dedicated to Storj:
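As a rough illustration of what a "change per minute" item measures, here is a minimal Python sketch. It is not the Zabbix configuration used in the dashboard above; the function names are hypothetical, and a monitoring system would normally compute the difference between two samples itself.

```python
import shutil


def used_bytes(path):
    """Return the bytes currently used on the filesystem that contains `path`."""
    return shutil.disk_usage(path).used


def delta_per_minute(prev_used, curr_used, elapsed_seconds):
    """Change in used bytes between two samples, normalised to one minute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (curr_used - prev_used) * 60.0 / elapsed_seconds
```

Sampling `used_bytes()` on the Storj mount point at a fixed interval and feeding consecutive values into `delta_per_minute()` gives a growth rate; a negative value means data was deleted faster than it arrived.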

Network traffic, in/out:

Count of new lines per minute in node log:
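One possible way to produce such a counter outside of Zabbix is to remember the byte offset reached on the previous run and count only what was appended since. This is a sketch under that assumption; `count_new_lines` is a hypothetical helper, not something shipped with the node or with Zabbix.

```python
import os


def count_new_lines(log_path, last_offset):
    """Count complete lines appended since `last_offset`.

    Returns (line_count, new_offset). If the file is now smaller than
    `last_offset`, it is assumed to have been rotated or truncated and
    is read again from the start.
    """
    if os.path.getsize(log_path) < last_offset:
        last_offset = 0
    with open(log_path, "rb") as handle:
        handle.seek(last_offset)
        data = handle.read()
    # Ignore a trailing partial line; it is counted on the next run.
    end = data.rfind(b"\n") + 1
    return data[:end].count(b"\n"), last_offset + end
```

Calling it once per minute and storing the returned offset between runs yields the per-minute line count. A value of zero for several minutes in a row is a reasonable hint that the node has stopped logging.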

Socket reachable & process running:
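The "socket reachable" part can be approximated with a plain TCP connect. The sketch below is an assumption about how such a check might look, not the check used in the dashboard above. The host and port are up to you (28967 is the usual Storj node port), and it does not cover the "process running" half.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Note that a check run from inside the same network only shows that the node is listening. To confirm that port forwarding works, it has to be run from a machine outside the network against the public address.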

Battery:

For the last three, I get an automated SMS whenever something goes wrong.

Finally, the parsed output of earnings.py:

And raw data for the same earnings.py:

All of this is set up in Zabbix, under one dashboard.

I’m using the InfluxData TICK stack to monitor my home systems.

Zabbix looks awesome and runs on a Pi I see!

I have MRTG for comparatively crude monitoring, so it's tempting to try this. Can I ask how hard it is to set up?
I don't want to have to dedicate too much time to it.

Thanks for sharing !

Installation is quick. For the configuration, expect to spend a few hours understanding and configuring Zabbix.
There is also Netdata, which is very easy to install.

I have a similar setup with Zabbix. Metrics are gathered from the SNO API, the log files, and an API proxy I wrote to smooth out differences between versions, add some extra reporting against the database, and add satellite discovery via Zabbix LLD (new satellites are automatically added to Zabbix as soon as they appear in the SNO API). I have a dynamic screen I can use to view a summary of each node.
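For readers unfamiliar with Zabbix low-level discovery (LLD), a discovery rule consumes JSON in which each object maps `{#MACRO}` names to values. The sketch below shows how a list of satellites could be turned into that shape. It is not the proxy described above: the payload layout (a `satellites` list with `id` and `url` fields) and the macro names are assumptions made for illustration.

```python
import json


def satellites_to_lld(sno_payload):
    """Convert a dashboard-style payload into Zabbix LLD JSON.

    `sno_payload` is assumed to be a dict with a "satellites" list whose
    entries carry "id" and "url" keys (an assumption, not a documented
    contract). The macro names are arbitrary choices for this example.
    """
    rows = [
        {"{#SATELLITE_ID}": sat["id"], "{#SATELLITE_URL}": sat["url"]}
        for sat in sno_payload.get("satellites") or []
    ]
    return json.dumps(rows, sort_keys=True)


if __name__ == "__main__":
    sample = {"satellites": [{"id": "sat-A", "url": "a.example:7777"}]}
    print(satellites_to_lld(sample))
```

Recent Zabbix versions accept a plain JSON array like this; older ones expect the array wrapped in an object under a `data` key. Item and trigger prototypes can then refer to the macros, so a new satellite gets its items created on the next discovery run.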

I have a trigger for vetting status. This has the benefit of alerting me when a node becomes vetted on a specific satellite, as well as letting me see an overview of vetting status by looking at the problem matrix (but you have to get used to reading it as a negation – blue boxes show node/satellite combinations that are NOT vetted):

I do not use Watchtower, so I additionally have a trigger to tell me when each node can be updated.

I really like your setup @cdhowie! Can you share the Zabbix template?

I was just looking at this :-

Very nice, thank you!
Just 2 questions:

  • Why monitor the battery? I wonder whether a laptop is really good hardware for an SNO.
  • Any insights on how to monitor disk health (e.g. SMART)? Especially on a virtual machine hosted in a homelab?

Thanks!

In favour: a Storj node is very lightweight, so it's not going to be a problem for a laptop.
Against: laptops are not really built with 24/7 operation in mind, but as long as one stays relatively cool, most seem to manage it.
WiFi is not recommended, but most laptops have an Ethernet port for a wired connection.

I think the big plus is the battery; it's like having a built-in UPS.

My point is that a laptop requires more power than a headless computer, mostly due to its screen. Of course, it's possible to disable the screen and customize things a little, and it may be a good way to reuse old hardware. From my point of view, though, it is not the best-suited hardware for the Storj use case.

I was thinking of the battery as a drawback because you need to keep the computer plugged in all the time. I didn't think the battery could be seen as a built-in UPS. You are right, that's a very good point!

Hi, I want the same thing for "Count of new lines per minute in node log".
For example: "if a line does not occur within 20 minutes, send me an SMS".
Did you use the Zabbix agent or the agent 2 installation?
Is the SMS feature built in, and is it cheap?
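The "no matching line for 20 minutes" rule can be sketched in Python as below. This is an illustration, not the setup from the first post. It assumes every log line starts with an ISO 8601 timestamp followed by whitespace, which may not match your log format, and the function names are hypothetical. Zabbix also has a `nodata` trigger function that may cover this without custom code.

```python
from datetime import datetime, timedelta, timezone


def last_match_time(lines, needle):
    """Return the timestamp of the newest line containing `needle`, or None."""
    latest = None
    for line in lines:
        parts = line.split(None, 1)
        if not parts or needle not in line:
            continue
        try:
            # Assumed format: 2020-05-01T12:00:00.000Z as the first field.
            stamp = datetime.fromisoformat(parts[0].replace("Z", "+00:00"))
        except ValueError:
            continue  # first field is not a timestamp
        if stamp.tzinfo is None:
            stamp = stamp.replace(tzinfo=timezone.utc)
        if latest is None or stamp > latest:
            latest = stamp
    return latest


def is_stale(latest, now, max_age=timedelta(minutes=20)):
    """True if no match was seen at all, or the newest one is older than `max_age`."""
    return latest is None or now - latest > max_age
```

A scheduled job could read the tail of the log, call `is_stale(last_match_time(lines, "uploaded"), datetime.now(timezone.utc))`, and hand a `True` result to whatever alerting channel is available.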

And if you feel you can share more about how to install and configure SMS, please do.
I'm going to test it on Windows 10 Pro.
I'm having a problem with the Zabbix server: I don't know where to install it. I only have Windows 10 and Windows 7 computers and no web server. Do you have any ideas for an easy server setup?