Can someone help me create a CURL command to check if nodes are online?

For each line in docker ps that begins with storj*, I want to curl the RPC port, and if it is reporting OFFLINE, I want it to give me an alert (let’s just say write it to a file).

That way I could run this on each of the storj machines instead of trying to do some centralized panel.

Any ideas? My bash skills suck.

Thanks in advance!

Try this:

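#!/bin/bash
# Ask the local node dashboard API when the node was last pinged
lastPinged=$(/usr/bin/curl -s "127.0.0.1:14002/api/sno/" | /usr/bin/jq -r ".lastPinged")
# Convert the timestamp to epoch seconds (GNU date)
lastTS=$(date --date "$lastPinged" +%s)
now=$(date +%s)
# Seconds elapsed since the last ping
lastContact=$((now - lastTS))

echo "$lastContact"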

It tells you how long ago (in seconds) the last ping was. If it’s longer than a few seconds, then your node is offline.

I tested it on Debian 10. If your OS is different, you may have to change the paths.
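
To turn that number into the file-based alert asked for in the first post, the seconds value could be compared against a threshold. A minimal sketch, assuming two hypothetical helpers (`check_last_contact` and `alert_if_offline`), a 60-second default threshold, and an alert file path of your choosing; none of these come from the script above:

```shell
#!/bin/bash
# Hypothetical helper: prints ONLINE or OFFLINE for a last-contact age.
# $1 = seconds since last ping, $2 = threshold in seconds (assumed default: 60)
check_last_contact() {
    local age="$1" threshold="${2:-60}"
    if [ "$age" -gt "$threshold" ]; then
        echo "OFFLINE"
    else
        echo "ONLINE"
    fi
}

# Hypothetical helper: appends a timestamped line to an alert file when offline.
# $1 = seconds since last ping, $2 = alert file path (assumed)
alert_if_offline() {
    local age="$1" file="$2"
    if [ "$(check_last_contact "$age")" = "OFFLINE" ]; then
        echo "$(date) node offline, last contact ${age}s ago" >> "$file"
    fi
}
```

Calling something like `alert_if_offline "$lastContact" /tmp/storj_alerts.log` at the end of the script would then write a line only when the node looks offline. The right threshold probably depends on how often the node is pinged.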


Thanks!!

Is it possible to put some sort of for loop, where it would look at docker ps and run this on the varying ports?

So like these are two examples of nodes running on one machine.

d6de00994941 storjlabs/storagenode:latest "/entrypoint" 9 minutes ago Up 9 minutes 0.0.0.0:14956->14002/tcp, 0.0.0.0:23876->28967/tcp storj-q156
3d411d8269d0 storjlabs/storagenode:latest "/entrypoint" 9 minutes ago Up 9 minutes 0.0.0.0:14955->14002/tcp, 0.0.0.0:23875->28967/tcp storj-q155

I would like the bash script to automatically load the port 14956 and 14955 in this case, then curl them…

Is there a preference for the output format?
Or would something like:

14956: 0
14955: 1

be ok?

port=28967
host="x.x.x.x"
for b in $host
do
    # Count the nmap output lines that report the port as open
    a=$(nmap -v "$b" -Pn -p "$port" | grep -c "tcp open")
    if [ "$a" -eq 0 ]
    then
        ....
    fi
done

I really just want a list of node names that are offline so I can pipe it into an email or a Telegram message or something. Have it run every hour and get real alerts when things are offline.

oh and thanks for the help hah
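
A list of offline node names like that could be produced by filtering per-node results. The sketch below assumes each input line has the form `name seconds` (container name, then seconds since the last ping). That input format, the `offline_names` function name and the 60-second default are assumptions, not something taken from this thread:

```shell
#!/bin/bash
# Hypothetical filter: reads "name seconds" lines on stdin and prints
# the names whose last contact is older than the threshold.
# $1 = threshold in seconds (assumed default: 60)
offline_names() {
    awk -v limit="${1:-60}" '$2 + 0 > limit + 0 { print $1 }'
}

# Example with made-up values for the two containers shown earlier.
printf '%s\n' "storj-q156 3" "storj-q155 4200" | offline_names 60
```

The resulting list can then be piped to whatever sends the email or Telegram message. An hourly cron entry such as `0 * * * * /path/to/check.sh` (path hypothetical) would cover the scheduling.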

Thanks for the reply.

I’m more interested in curling the actual node’s API/RPC/whatever you call it, as I have ports open and such and sometimes the node still shows offline. Usually stopping, rm’ing, and restarting the Docker instance fixes this, but I need some sort of alert for it.

Try this:

#!/bin/bash

ports=`docker ps | tr "," "\n" | grep 14002 | cut -d"-" -f1 | cut -d":" -f2`

for port in $ports; do

lastPinged=`/usr/bin/curl -s "127.0.0.1:$port/api/sno/" | /usr/bin/jq ".lastPinged" | cut -d'"' -f2`
lastTS=`date --date "$lastPinged" +%s`
now=`date +%s`
lastContact=`expr $now - $lastTS`

echo $port: $lastContact
done

I cannot test it fully because my VM has only one node.
It should display port number and how many seconds since last contact.


Why not use Uptime Robot?
It does the job pretty well. I’ve been using it for several months and it has allowed me to handle downtimes quickly.


Thanks!

This is what I get:

bash check.sh

latest: 66497
"/entrypoint": 66497
6: 66497
hours: 66497
ago: 66497
Up: 66497
6: 66497
hours: 66497
0.0.0.0: 66497
latest: 66497
"/entrypoint": 66497
6: 66497
hours: 66497
ago: 66497
Up: 66497
6: 66497
hours: 66497
0.0.0.0: 66497
latest: 66497
"/entrypoint": 66497
6: 66497
hours: 66497
ago: 66497
Up: 66497
6: 66497

Getting closer, I think those are the uptimes. So now I just need to find a way to echo the name of the Docker instance that does not return an uptime (i.e. a node that is offline).

Please note that even though a node may show that it is up some hours, it may still not be online.

As stated before, this is a very simple check that ensures only that the machine is pingable. It does not check each node running within a network or on a certain machine.

This is interesting, probably some difference between your system and mine

OK, can you post the outputs of these commands:
docker ps

docker ps | tr "," "\n" | grep 14002 | cut -d"-" -f1 | cut -d":" -f2


latest "/entrypoint" 57 minutes ago Up 40 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 40 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0
latest "/entrypoint" 8 hours ago Up 39 minutes 0.0.0.0

And for example here is a full output of docker ps

ed47cf21a9d2 storjlabs/storagenode:latest "/entrypoint" 58 minutes ago Up 40 minutes 0.0.0.0:14960->14002/tcp, 0.0.0.0:23880->28967/tcp storj-q160

(well the first line of it at least)

OK, I know the problem now. I have debug port enabled so the port column has three values.

#!/bin/bash

ports=`docker ps | tr " " "\n" | grep 14002 | cut -d"-" -f1 | cut -d":" -f2`

for port in $ports; do

lastPinged=`/usr/bin/curl -s "127.0.0.1:$port/api/sno/" | /usr/bin/jq ".lastPinged" | cut -d'"' -f2`
lastTS=`date --date "$lastPinged" +%s`
now=`date +%s`
lastContact=`expr $now - $lastTS`

echo $port: $lastContact
done

One character was changed: tr "," "\n" -> tr " " "\n"
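
Splitting the default `docker ps` table on a single character is sensitive to how many port mappings each container has. A possibly more robust sketch asks Docker for just the name and ports with `docker ps --format '{{.Names}} {{.Ports}}'` and extracts the host port mapped to 14002. The function names `dashboard_port` and `list_dashboard_ports` are hypothetical, and the sketch assumes mappings in the `0.0.0.0:HOST->14002/tcp` form shown in this thread:

```shell
#!/bin/bash
# Hypothetical helper: given a container's port list, print the host port
# that is mapped to container port 14002 (empty output if there is none).
dashboard_port() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed -n 's/.*:\([0-9][0-9]*\)->14002\/tcp.*/\1/p' \
        | head -n 1
}

# Print "name port" for each running container (needs a Docker host).
list_dashboard_ports() {
    docker ps --format '{{.Names}} {{.Ports}}' | while read -r name ports; do
        port=$(dashboard_port "$ports")
        if [ -n "$port" ]; then
            echo "$name $port"
        fi
    done
}
```

Because the container name travels with the port, the loop can report `storj-q160` rather than just `14960` when a node looks offline.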


Great! That seems to be working.

Why am I getting such large values?

14960: 63738988884

Thanks a lot, p100!

I don’t know. What does this give you?
/usr/bin/curl -s "127.0.0.1:14960/api/sno/" | /usr/bin/jq ".lastPinged"

"0001-01-01T00:00:00Z"

It’s also very strange, because just pinging the API like this seems to have brought nodes that were offline back online?!

The really long strings seem to correlate to offline nodes that display last contact as a long time ago (incorrectly) such as what I’m asking about here

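The length of time is actually measured from the 0001-01-01T00:00:00Z timestamp shown above, not from a real last contact. date converts that timestamp to a large negative number of seconds relative to 1 January 1970, which is where computer time starts, so the subtraction gives a huge value. In my experience the last contact doesn’t display the correct time, so I disregard it; I usually take from it that the node is offline and nothing more.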
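
Since an offline node can report `0001-01-01T00:00:00Z`, a script may want to handle that value explicitly instead of printing a huge number. A minimal sketch, assuming this value means "never pinged" (an interpretation, not confirmed in this thread) and using a hypothetical `describe_last_ping` helper:

```shell
#!/bin/bash
# Assumed "never pinged" sentinel, taken from the API output earlier in the thread.
ZERO_TIME="0001-01-01T00:00:00Z"

# Hypothetical helper: prints NEVER for the sentinel, otherwise the age in seconds.
# $1 = lastPinged string, $2 = current time as epoch seconds
describe_last_ping() {
    if [ "$1" = "$ZERO_TIME" ]; then
        echo "NEVER"
    else
        # GNU date syntax, as used in the scripts above
        echo $(( $2 - $(date --date "$1" +%s) ))
    fi
}

describe_last_ping "0001-01-01T00:00:00Z" "$(date +%s)"   # prints NEVER
```

In the loop above, `NEVER` could be treated the same as an age over the threshold, which matches the advice to read the huge number simply as "offline".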