Storj API - Checking if node is online

Eleos · March 25, 2024, 7:57pm

Is there a way to check if the node is online or not via the API? As far as I can tell, checking the last pinged within a week would not be reliable.

JWvdV · March 25, 2024, 8:08pm

No, a week is quite long. But what’s wrong with checking whether the last ping was no more than an hour or so ago?

Eleos · March 25, 2024, 8:15pm

What I am wondering is whether the last pinged value is a real indication of whether a node is online or not.

JWvdV · March 25, 2024, 8:17pm

As far as I can tell: yes, it is. If there’s no contact at all, the date defaults to the epoch or something (in any case: ages ago).

Alexey · March 26, 2024, 3:54am

The API is not a good check; you need to check availability from outside of your network, with something like UptimeRobot or Kuma.
You may also use a script similar to:

It will detect dates when your node missed audits, but it’s not real-time monitoring.

No, but the difference could be. This is the behavior of the dashboard; the API will return the last successful contact date, and then you may decide how long ago it was. But you likely want to see it updated at least once every few hours.
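A minimal sketch of the "decide how long ago it was" step, in Python. It assumes the last contact value (the `lastPinged` field mentioned later in this thread) arrives as an RFC 3339 timestamp string; that format, the function names, and the thresholds are assumptions for illustration, not something confirmed here. How you fetch the value from the node API is left out.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed format: RFC 3339, e.g. "2024-03-26T03:54:10.123456789Z"
_RFC3339 = re.compile(
    r"(\d{4}-\d{2}-\d{2})[Tt ](\d{2}:\d{2}:\d{2})(?:\.(\d+))?([Zz]|[+-]\d{2}:\d{2})"
)


def parse_rfc3339(ts: str) -> datetime:
    """Parse an RFC 3339 timestamp, tolerating 'Z' and nanosecond fractions."""
    m = _RFC3339.fullmatch(ts.strip())
    if m is None:
        raise ValueError(f"not an RFC 3339 timestamp: {ts!r}")
    date, clock, frac, offset = m.groups()
    # datetime only supports microseconds, so cut the fraction to 6 digits.
    micro = (frac or "0")[:6].ljust(6, "0")
    if offset in ("Z", "z"):
        offset = "+00:00"
    return datetime.strptime(
        f"{date}T{clock}.{micro}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z"
    )


def contact_is_recent(
    last_pinged: str, max_age: timedelta, now: Optional[datetime] = None
) -> bool:
    """Return True if the last contact is no older than max_age."""
    now = now or datetime.now(timezone.utc)
    return now - parse_rfc3339(last_pinged) <= max_age
```

A date that was never set (the epoch, or anything else "ages ago") simply comes out as not recent, so it needs no special case. For example, `contact_is_recent(value, timedelta(hours=3))` would match the "at least once every few hours" rule of thumb.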

Eleos · March 26, 2024, 4:21am

I already have UptimeRobot running for my node. I’m just making a bash script node monitor that runs every hour, so I don’t have to keep going to the dashboard, and that shows some info the dashboard doesn’t show at a glance.

nerdatwork · March 26, 2024, 5:04am

Use UptimeRobot’s API to check the online status of your node.

JWvdV · March 26, 2024, 6:21am

Can you explain why? I’ve never seen an online node with a last ping more than a few minutes ago.

I’ve also never seen an offline node keeping low ping times.

Besides, this is quite a cumbersome way. As far as I know, the API offers no means to find out the external address of the node. This means you have to set this all up by hand, with static IPs or DDNS (which may give false alarms if the update is too slow or the robot isn’t clearing DNS caches often enough).

Alexey · March 26, 2024, 7:34am

Because your node could be available for online checks from the satellites but not for the customers (a wrong firewall rule, problems with routing, blocked IP, etc.).
Maybe the API status could be used for almost all cases, but not all.

Eleos · March 26, 2024, 7:38am

So in this case it’s best to check the DNS name and port from an outside source/service? Like UptimeRobot, as you said.

Not even checking by pinging the DNS+port from the node device itself via a script?

Alexey · March 26, 2024, 7:56am

This would only show that the interface is working. To check the node you need to get a response via the protocol, at least something like a GET for http://external.address.tld:28967; it will give you a JSON response. You may open it in your browser.

However, it’s useful only if you open it from outside of your network. Otherwise you may use the API response as a workaround and rely only on connectivity between the node and the satellites.

Eleos · March 26, 2024, 7:58am

Would this be run on the node device or off the node device? Would it be off the home network too?

Alexey · March 26, 2024, 8:01am

If either the device or the node is off, it will time out. However, checking from the same network is not so effective:

1. You may not have external connectivity, but your node would still respond to internal requests.
2. Not all routers support hairpin NAT (the ability to route traffic back to the internal network when the request comes from the internal network but targets the external interface).

If you are limited to using the same network, then you would likely use the API response; however, it’s not a 100% guarantee that the node is actually serving customers, as explained above.

JWvdV · March 26, 2024, 5:21pm

The fact that the node is available for online checks essentially tells you:

The external address is apparently resolvable (so, no DNS issues or something).
The port is open (so no routing issues).

The fact that you can get, via the API, the time at which this was last verified to be true tells you:

The aforementioned facts can only have become false in the time since then, which is less likely the more recent that moment is.
There’s a running storagenode service (implying a readable / writable storage and such, at least for the timeout used for this).

So, technically, I don’t see how your statement can be true. Firewall issues are a special category here, for which even checking from the outside doesn’t guarantee anything, for example in the case of region-based blocking and so on. Blocking of outside IPs can be very selective or even time/interval-based, like DDoS prevention.

So essentially, is your statement really true?
Do you have any topic in which there was a recent ping, but the node was actually offline?

JWvdV · March 26, 2024, 5:27pm

Ad 1.
This will only be true if you’re not using the external IP address / domain, because usually your router is ‘aware’ of having no connectivity. And most routers, especially consumer routers, have IP addresses assigned by their ISP instead of a fixed one. A fixed address is the only situation I can think of in which this statement could be true.

Ad 2.
This isn’t a problem at all, because if your own router has no hairpin mode, the request is sent up to your ISP and then reflected back. Considering your first ‘fear’, it would even be a benefit if there was no hairpin mode.

Alexey · March 27, 2024, 7:40am

You may use the external domain, but if you are behind CGNAT, it can be updated to the WAN IP (but not the external one). The other way to break it is to have two routers in front of you. With high probability the DDNS will be updated to the local IP of the internet-facing router, and requests will be successfully routed back to the network.
However, I believe we need to test that. I hope that lastPinged actually means pinged by a satellite.

…and blocked by a router because of a conflict in the NAT route table… As a result, no response at all will be routed to the requesting client; I saw this on some consumer routers.

JWvdV · March 27, 2024, 11:55am

Usually, there is a DDNS server like DuckDNS or, in my case, asuscomm.com. Periodically, the router connects to the server. So that server only sees the outermost address and updates the DNS record as soon as it changes.

So, the router doesn’t update the DNS record itself but the dynamic DNS-service does.

The case where this probably won’t work is when no port forwarding or DMZ has been configured.

So being behind CG-NAT and being behind multiple routers are exactly the same thing. In both cases DNS is updated to the outermost address, but the forwarding doesn’t happen anyhow.

In all these cases, pinging will not succeed either.

You can certainly check this by blocking port forwarding. In that case the node can still ping/reach the satellite, but the satellite can’t reach the node, so ping errors will show up in your error logs.

arrogantrabbit · March 27, 2024, 5:30pm

Both are horrible. I’d suggest picking literally any other. Ideally – Cloudflare. Google Domains used to be good too, but was subsequently sold to Squarespace. You want to stick with a large, well-established platform that knows what it is doing. DuckDNS is literally two students in a dorm room.

Right. And the node won’t work either. For these cases you can connect the node via another server, that is publicly available.

On topic – the best way to check the node’s health is via an external service that checks the “AllHealthy” record that the node reports on the advertised address and port.

True – great. Not true – investigate. Kuma, already suggested above, does it perfectly.
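A minimal Python sketch of such a check, meant to be run from a machine outside the node’s network. The `AllHealthy` key is taken from the post above; that it sits at the top level of the JSON response as a boolean is an assumption, and the function names and timeout are hypothetical.

```python
import http.client
import json
import urllib.request


def parse_all_healthy(body: bytes) -> bool:
    """Return True only if the body is a JSON object with AllHealthy == true.

    Assumption: the key is at the top level of the response and is a boolean.
    """
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(data, dict) and data.get("AllHealthy") is True


def node_is_healthy(url: str, timeout: float = 10.0) -> bool:
    """GET the node's advertised address and evaluate the health flag.

    Any network failure, timeout, HTTP error, or malformed URL counts as
    "not healthy", which is the "Not true: investigate" branch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_all_healthy(resp.read())
    except (OSError, http.client.HTTPException, ValueError):
        return False
```

Usage would look like `node_is_healthy("http://external.address.tld:28967")`, with the placeholder address from earlier in the thread replaced by your own. Run from inside the same network, it is subject to the hairpin NAT caveats discussed above.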

Still does not guarantee that you are not blocking out part of the internet – you may want to run Kuma on a variety of cloud services across the world if you prefer. But that would be overkill.

JWvdV · March 27, 2024, 6:12pm

For what it’s worth: it’s free, and it has been running stably for me for over three years. I don’t see the horrible part here. Apparently two students in a dorm room seem to do a good job. Like that big company that once started in a garage.

That’s exactly what the defined problem is: if routing is a problem, then it won’t work for any network traffic relying on the routing. So a successful ping initiated by the storagenode itself might be a reliable proxy.

Indeed, you can use a VPN or something (work and/or costs involved). Or just don’t put multiple NATs behind each other, whether CG-NAT, multiple routers, or any other combination (the simplest solution, often feasible). Or make them route the traffic properly (work involved).

The question wasn’t whether it was the best way, which it may be. But it’s costly in terms of manual labour, especially if you’re running many nodes. You also need an external device checking your nodes.

The question, however, was whether the API was usable for this purpose. And I’m very inclined to say yes, based on the latest ping time. Another way could be checking the logs for successfully finished uploads and downloads, and processing the timestamps of those log lines.
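A rough Python sketch of the log-based idea. The log line format is an assumption for illustration: each line is taken to start with an ISO 8601 / RFC 3339 timestamp, and a finished transfer is taken to contain the word `uploaded` or `downloaded`. Check your own log output before relying on either; the function name is hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed line shape: "<timestamp> <level> ... uploaded|downloaded ..."
_TRANSFER = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:?\d{2})\s"
    r".*\b(?:uploaded|downloaded)\b"
)


def last_successful_transfer(lines: Iterable[str]) -> Optional[datetime]:
    """Return the newest timestamp among lines that report a finished transfer."""
    latest = None
    for line in lines:
        m = _TRANSFER.match(line)
        if m is None:
            continue  # not a finished upload/download, or an unexpected format
        stamp, frac, offset = m.groups()
        # datetime only supports microseconds, so cut the fraction to 6 digits.
        micro = (frac or "0")[:6].ljust(6, "0")
        when = datetime.strptime(
            f"{stamp}.{micro}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z"
        )
        if latest is None or when > latest:
            latest = when
    return latest
```

The result can then be compared against the current time in the same way as the last ping: if the newest finished transfer is older than some threshold you choose, the node deserves a closer look.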

And there might be more ways to Rome.

arrogantrabbit · March 27, 2024, 6:26pm

You got lucky: Search results for 'duckdns' - Storj Community Forum (official). This also aligns with my personal experience with the service. BTW, Cloudflare is also free. (as was Google Domains)

Right. But they did not produce an iPhone out of that garage. They almost went bankrupt.

It’s not under your control. Going forward, more and more customers will be behind NAT. It makes no sense to give everyone a routable IP. And not all providers will support adding ports.

The amount of effort does not scale with the number of nodes. You set it up once and monitor all the nodes. This has the benefit of actually monitoring external connectivity – the thing that matters.

It just seems much harder to do. You need to give your monitor access to the internal node API, for every node. On the other hand, the external API is open by definition – otherwise the node won’t work.

Sure. I’ve read the whole topic twice, and I still don’t think I grasp the point. It seems monitoring the external endpoint is easier and more reliable.