I received an email from StorjLabs once because my node had gone offline.
So I thought “great, we might not need UptimeRobot or similar services anymore”.
However, my node crashed yesterday.
As I hadn’t stopped my UptimeRobot monitoring, it eventually notified me that my node went down.
It’s now 24 hours later, and I still haven’t received anything from StorjLabs letting me know my node’s unreachable. Is this normal?
I am away from home, and my node is completely unreachable… I’ll be able to check what’s going on tomorrow. By the way, that’s another good example of a situation where enforcing a maximum downtime of 5 hours would have made me very sad.
If you want to check whether or not your node is available online, you can use the openssl command-line program…
openssl s_client -connect [your.node.ip]:28967
This command will connect to your node’s storage node port and output information regarding your node’s TLS cert.
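If you want to see just the useful part of that output, here’s a small sketch that prints the subject and validity window of the cert the node presents (the node_cert_info name and the address are my placeholders, not anything Storj-specific):

```shell
# Print the subject and validity dates of the TLS cert presented on the
# storage node port. $1 is your node's WAN address (placeholder).
node_cert_info() {
    echo | openssl s_client -connect "$1:28967" 2>/dev/null |
        openssl x509 -noout -subject -dates 2>/dev/null
}
```

Against a reachable node this prints the cert subject plus the notBefore/notAfter dates; against an offline node it prints nothing.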
If you are running GNU/Linux, you could combine openssl s_client with a text-based MUA such as mutt to email you if no connection is found… run a check every 30 minutes using cron or systemd, and you’re all set with a simple node status checker. You could even run the check from a host on the same LAN, since the openssl command would connect to your WAN address.
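As a rough sketch of that idea (the function name, schedule, and addresses are my own placeholders, and it assumes mutt is already configured for outbound mail):

```shell
# Succeeds iff a TLS handshake to the node's port completes.
# $1 = WAN address, $2 = port (defaults to 28967). Both are placeholders.
check_node() {
    host="$1"
    port="${2:-28967}"
    if echo | openssl s_client -connect "$host:$port" >/dev/null 2>&1; then
        echo "node online"
    else
        echo "node offline"
        return 1
    fi
}
```

Wrapped in a script, a hypothetical crontab entry could then mail you on failure every 30 minutes:
*/30 * * * * /path/to/check_node.sh your.wan.ip || echo "node down" | mutt -s "Node DOWN" you@example.com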
Only if the router is able to do hairpin NAT, and not all routers are. Even then, it only proves that the node is reachable locally.
For example, you can specify the node’s internal IP as your “external” address and the command will tell you that your node is available. Likewise, you can specify your node’s local domain name as the “external” address and get a successful result.
The same goes for router-specific DDNS hostnames (some routers have integrated DDNS services hosted by their vendors): such a name may be resolved by the router’s DNS cache but never propagated to outside DNS servers.
In such cases the node will not be reachable from outside. So it’s better to check your node only from outside your network.
True… but the commands above will ensure that a node is running and available on the WAN interface, even if the router resolves the address locally.
It may be that the rest of the Internet can’t connect to the node’s WAN interface, but that’s not likely the fault of the node operator…
In the OP’s example, the node crashed. Checking for the TLS cert availability on the node’s external IP address would have caught OP’s problem… whether or not the rest of the Internet could have connected.
You can copy your identity.cert to a laptop or something. It’s perfectly safe to have just the cert; don’t carry around the private keys, though. You can then run this script from any Internet-connected host that has the openssl command-line client and sha256sum… these are fairly standard for most GNU/Linux distros.
The script hashes your locally stored server-level identity.cert, retrieves the node’s server cert and hashes that… then compares the hashes and gives you a top-level message.
There’s no error checking on the openssl commands. So, unexpected errors should be expected if your node is offline for whatever reason.
Basic Script below:
#!/bin/bash
# Check that both arguments were supplied
if [ -z "$2" ]; then
    echo "Usage: $0 [path/to/identity.cert] [WAN.ip.address]"
    exit 1
fi
# Hash the local identity.cert
local=$(openssl x509 -inform pem -in "$1" | sha256sum)
# Fetch and hash the cert the node presents on its storage port
remote=$(openssl s_client -showcerts -connect "$2":28967 </dev/null 2>/dev/null | openssl x509 -inform pem | sha256sum)
if [ "$local" = "$remote" ]; then
    message=$(echo -e "TLS Certs Match\nNode Online")
else
    message=$(echo -e "TLS Certs don't Match\nNode Offline")
fi
echo "$message"