I received an email from StorjLabs once because my node had gone offline.
So I thought “great, we might not need UptimeRobot or similar services anymore”.
However, my node crashed yesterday.
As I hadn’t stopped my UptimeRobot monitoring, it eventually notified me that my node went down.
It’s now 24 hours later, and I still haven’t received anything from StorjLabs letting me know my node’s unreachable. Is this normal?
I am away from home, and my node is completely unreachable… I’ll be able to check what’s going on tomorrow. By the way, that’s another good example of a situation where enforcing a maximum downtime of 5 hours would have made me very sad.
If you want to check whether or not your node is available online, you can use the openssl command-line program…
openssl s_client -connect [your.node.ip]:28967
This command will connect to your node’s storage node port and output information regarding your node’s TLS cert.
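If you want to see just the useful part of that output, here’s a small sketch that prints the subject and validity window of the cert the node presents (the node_cert_info name and the address are my placeholders, not anything Storj-specific):

```shell
# Print the subject and validity dates of the TLS cert presented on the
# storage node port. $1 is your node's WAN address (placeholder).
node_cert_info() {
    echo | openssl s_client -connect "$1:28967" 2>/dev/null |
        openssl x509 -noout -subject -dates 2>/dev/null
}
```

Against a reachable node this prints the cert subject plus the notBefore/notAfter dates; against an offline node it prints nothing.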
If you are running GNU/Linux, you could combine openssl s_client with a text-based MUA such as mutt to email you if no connection is found… run a check every 30 minutes using cron or systemd, and you’re all set with a simple node status checker. You could even run the check from a host on the same LAN, since the openssl command would connect to your WAN address.
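As a rough sketch of that idea (the function name, schedule, and addresses are my own placeholders, and it assumes mutt is already configured for outbound mail):

```shell
# Succeeds iff a TLS handshake to the node's port completes.
# $1 = WAN address, $2 = port (defaults to 28967). Both are placeholders.
check_node() {
    host="$1"
    port="${2:-28967}"
    if echo | openssl s_client -connect "$host:$port" >/dev/null 2>&1; then
        echo "node online"
    else
        echo "node offline"
        return 1
    fi
}
```

Wrapped in a script, a hypothetical crontab entry could then mail you on failure every 30 minutes:
*/30 * * * * /path/to/check_node.sh your.wan.ip || echo "node down" | mutt -s "Node DOWN" you@example.com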
Only if the router is able to do hairpin NAT, and not all routers are. Even then, it only proves that the node is reachable locally.
For example, you can specify the node’s internal IP as your “external” address and the command will tell you that your node is available. Likewise, you can specify your node’s local domain name as the “external” address and get a successful result.
The same goes for router-specific DDNS hostnames (some routers have integrated DDNS services hosted by their vendors): such a name may be resolved by the router’s DNS cache but never propagated to outside DNS servers.
In such cases the node will not be reachable from outside. So it’s better to check your node only from outside your network.
True… but the commands above will ensure that a node is running and available on the WAN interface, even if the router resolves the address locally.
It may be that the rest of the Internet can’t connect to the node’s WAN interface, but that’s not likely the fault of the node operator…
In the OP’s example, the node crashed. Checking for the TLS cert availability on the node’s external IP address would have caught OP’s problem… whether or not the rest of the Internet could have connected.
You can copy your identity.cert to a laptop or something. It’s perfectly safe to have just the cert; don’t carry around the private keys, though. You can then run this script from any Internet-connected host that has the openssl command-line client and sha256sum… these are fairly standard for most GNU/Linux distros.
The script hashes your locally stored server-level identity.cert, retrieves the node’s server cert and hashes that… then compares the hashes and gives you a top-level message.
There’s no error checking on the openssl commands. So, unexpected errors should be expected if your node is offline for whatever reason.
Basic Script below:
#!/bin/bash
# Check that both arguments were supplied
if [ -z "$2" ]; then
    echo "Usage: $0 [path/to/identity.cert] [WAN.ip.address]"
    exit 1
fi
# Hash the local identity.cert
local=$(openssl x509 -inform pem -in "$1" | sha256sum)
# Fetch and hash the cert the node presents on its storage port
remote=$(openssl s_client -showcerts -connect "$2":28967 </dev/null 2>/dev/null | openssl x509 -inform pem | sha256sum)
if [ "$local" = "$remote" ]; then
    message=$(echo -e "TLS Certs Match\nNode Online")
else
    message=$(echo -e "TLS Certs don't Match\nNode Offline")
fi
echo "$message"