Node Online status

marksorder · June 30, 2021, 4:43pm

I noticed the online score of one of my nodes dropping even though its status shows as online. How do I check what is causing it? My other two nodes are at 100% and also show as online.

I do have the ports forwarded in the router and the port configured on the server. I also just rebooted the server to see if that corrects it.

Alexey · June 30, 2021, 5:43pm

Your node was offline. You can see the history of audit requests:

marksorder · June 30, 2021, 5:57pm

What would have caused them to report as offline? Certainly no configuration issues as all 3 are not reporting the same - the other two are fine - as noted.

What should I be looking for in that url?

Alexey · June 30, 2021, 7:04pm

Times and dates. If you use a browser, open it in Firefox, which will generate a pretty view for you.
Or you can use curl and jq in bash.
Here is an example for PowerShell:

For bash:

curl http://localhost:14002/api/sno/satellite/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs | jq .auditHistory.windows

marksorder · June 30, 2021, 8:36pm

Ok. I see time and dates. Again, why would it report as offline when it was not?

Alexey · June 30, 2021, 8:53pm

Because your node didn’t answer audits. Look for the dates where the total number of audits doesn’t match the number of online audits.
At those dates and times your node was not reachable from outside.
It could be your ISP, or your dynamic IP changed but your DDNS hostname was not updated in time.
I would suggest using Uptimerobot.com for monitoring.
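To illustrate what such a monitor does, here is a minimal Python sketch that only tests whether a TCP connection to the node's port succeeds. The function name is made up for illustration, and the hostname and port in the usage comment are placeholders, not values from this thread. It would have to run from a machine outside your network to say anything about external reachability.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the hostname and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Hypothetical usage (placeholder DDNS hostname and forwarded port):
# print(port_is_open("mynode.example.com", 28967))
```

A successful connection only shows that the port answers at that moment. It does not prove the node passed audits.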

marksorder · July 4, 2021, 2:04pm

This is my point - there is no indication they did not match.

If it were, as you say, the ISP or a dynamic IP change that left the DDNS hostname stale, all the servers would be experiencing the same thing, which has not happened.

That said, what are the consequences of the lower online scores (which have not yet changed for some of the satellites since reporting)?

Stob · July 5, 2021, 8:45am

The online percentage is a rolling 30 day window, so it can take up to 30 days for the outage/downtime to no longer be reflected in the figures.

Make sure the percentages don’t continue to drop, otherwise you may have a problem which needs fixing!
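As a rough illustration of why an outage lingers in the figure for up to 30 days, here is a Python sketch. It assumes the score is simply the mean of the per-window online ratios inside the last 30 days, which may not match the satellite's actual formula. `totalCount` and `onlineCount` are the field names used in the commands in this thread. `windowStart` is assumed to be an ISO 8601 timestamp on each window.

```python
from datetime import datetime, timedelta


def online_score(windows, now, days=30):
    """Mean of onlineCount/totalCount over windows that started within `days` of `now`.

    Simplified model for illustration only, not the satellite's real calculation.
    """
    cutoff = now - timedelta(days=days)
    ratios = []
    for w in windows:
        # Assumed field: windowStart such as "2021-06-16T00:00:00Z".
        start = datetime.fromisoformat(w["windowStart"].replace("Z", "+00:00"))
        if start >= cutoff and w["totalCount"] > 0:
            ratios.append(w["onlineCount"] / w["totalCount"])
    # With no audited windows in range there is nothing to count against the node.
    return sum(ratios) / len(ratios) if ratios else 1.0
```

In this model a window with missed audits keeps lowering the score until it is more than 30 days old, and then the score recovers without any further action.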

marksorder · July 5, 2021, 4:28pm

Clearly there is a problem that needs fixing - as of today, some are lower - so clearly there is an issue with something and the storj connection.

Certainly wish someone from storj would assist with resolving the issue.

Stob · July 5, 2021, 4:45pm

You don’t appear to have provided any log files or audit history scores. Those would help Storj employees and other forum members diagnose the problem further.

Edit - Using my own node as an example the audithistory shows an issue on the 16th June 2021:

With that knowledge I can then go to the full log file and check for ERROR or FATAL:

The log file quite clearly shows a problem. In my case I found out my router stopped port forwarding, so I had to reboot the router.
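If the log is large, the ERROR/FATAL check can be scripted. This is a minimal Python sketch, assuming each log line starts with a timestamp and contains the level as a separate word. The function name and the log path in the usage comment are hypothetical.

```python
import re

# Match ERROR or FATAL as whole words only.
SEVERE = re.compile(r"\b(ERROR|FATAL)\b")


def severe_lines(lines, date_prefix=""):
    """Yield lines mentioning ERROR or FATAL, optionally only those starting with date_prefix."""
    for line in lines:
        if line.startswith(date_prefix) and SEVERE.search(line):
            yield line.rstrip("\n")


# Hypothetical usage (the path is a placeholder):
# with open("node.log", encoding="utf-8", errors="replace") as fh:
#     for hit in severe_lines(fh, "2021-06-16"):
#         print(hit)
```

Passing the date taken from the audit history as `date_prefix` narrows the output to the day of the outage.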

Alexey · July 5, 2021, 7:57pm

Could you please provide results of audit checks?

marksorder · July 7, 2021, 10:44am

Am I supposed to have done so? Did not see any request for that, unless I missed it.

marksorder · July 7, 2021, 10:45am

Audits across all servers show 100% (under Suspension & Audit), as reflected in the original post attachment.

Alexey · July 7, 2021, 7:48pm

Please, provide a list of audit checks, returned by the command for the satellite in question.

marksorder · July 8, 2021, 7:24pm

It is like pulling teeth with storj - half a dozen posts and still at step 1 of the concern. I’ve seen a substantial amount of people frustrated with storj and the whole process - but now that I am experiencing it, I understand why people leave all the time.

Again, as reflected in the original post attachment.

Alexey · July 8, 2021, 7:43pm

Please run the command for your OS on your PC (I do not know which OS you use) and post the results:

PowerShell

((curl http://localhost:14002/api/sno/satellite/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs).Content | ConvertFrom-Json).auditHistory.windows | where{$_.totalCount -ne $_.onlineCount}

bash

curl -s http://localhost:14002/api/sno/satellite/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs | jq '.auditHistory.windows[] | select(.totalCount != .onlineCount)'

The longer you wait, the more data will disappear, because it is a 30-day rolling window.
I added a filter to show only the windows where the counts differ.
On those dates your node was not fully available.
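For anyone without PowerShell or jq, the same filter can be sketched in Python. It assumes the response has the shape the commands above rely on: an `auditHistory` object with a `windows` list whose entries carry `totalCount` and `onlineCount`. The function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_satellite(url):
    """Download and decode the JSON returned by the node's local dashboard API."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def offline_windows(satellite_data):
    """Return the audit windows where not every audit found the node online."""
    windows = (satellite_data.get("auditHistory") or {}).get("windows") or []
    return [w for w in windows if w["totalCount"] != w["onlineCount"]]


# Usage on the machine running the node (same URL as in the commands above):
# url = "http://localhost:14002/api/sno/satellite/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"
# for w in offline_windows(fetch_satellite(url)):
#     print(w)
```

An empty result means no window in the current 30-day history recorded a missed audit for that satellite.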

Alexey · July 8, 2021, 8:49pm

Your node has all the needed data. It was not reachable by external services, and you can request this data to see when that happened. You can also configure uptimerobot.com monitoring to get a nice GUI.

On Windows, could you please open a PowerShell window, paste the command there, and then copy the results?
It will not work in cmd.exe.

Sorry, I do not know which OS you are running; you didn’t mention it in this thread, so I’m forced to guess.

The command will show you when your node was offline, but not why.
It can be your ISP, router, PC, a power outage, an OS upgrade, a BSOD or kernel panic, anything.
The reason is impossible to figure out with the storagenode alone. You need to check all of your infrastructure. The storagenode can only show when the problem occurred.

The storagenode stores detailed data for audit checks in its databases, and that data has a rolling window.
As of today the earliest record would be 2021-06-08T21:07:00Z, so all records before that are already gone.

soysaws · July 9, 2021, 4:52am

Does it mean you went offline if the uptime tracker near the top of your dashboard resets?

Alexey · July 9, 2021, 8:10am

No. It is reset only when your service is restarted (update, reboot, power failure, etc.).
However, with Docker for Windows with Hyper-V or Docker for Mac, a reboot of the host may not affect the uptime, because these versions of Docker use a VM, and the hypervisor can save the state of the VM across reboots. In that case Docker doesn’t actually restart, so all running containers continue to run.