Troubleshooting a node

BrightSilence · September 28, 2020, 11:12am

You say your forwards are correct, but did you actually verify this using https://www.yougetsignal.com/tools/open-ports/?port=28967 ?

If so, please restart the node, wait 30 seconds and post the logs since restart. That should hopefully contain a little more useful information.
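
For a quick scripted version of the port check, here is a minimal sketch using only Python's standard library. The function name `port_is_open` and the timeout are illustrative choices, not part of any Storj tooling.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from inside the same network, this only shows that something is listening on the port. It does not prove the router forwards the port from outside, so an external check like the one linked above is still needed.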

jammerdan · September 28, 2020, 11:12am

I don’t know. The Pastebin log does not look complete.
Maybe post the log from node startup. You can skip anything concerning satellite “118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW”.

For how long did you not receive data from the other satellites?

jeremyfritzen · September 28, 2020, 11:14am

Ok. So could you give your node version? Or a screenshot of the CLI dashboard?

joesmoe · September 28, 2020, 11:18am

Yes, I have tried with yougetsignal. I have also stopped, rm’d, and then restarted the node multiple times over the past few days (maybe 3 or 4 times in total).

joesmoe · September 28, 2020, 11:18am

The node has been online approx 3 days.

The log is complete.

joesmoe · September 28, 2020, 11:19am

Sure, here is the CLI dashboard.

[Screenshot of the CLI dashboard: Screen Shot 2020-09-28 at 1.19.11 PM]

joesmoe · September 28, 2020, 11:19am

The node version is the latest that I can easily pull from Docker using the :latest tag.

jeremyfritzen · September 28, 2020, 11:20am

The dashboard says your node is offline.

As suggested by others, could you restart your node right now (even if you already did) and get the logs again?

jammerdan · September 28, 2020, 11:21am

The log does not show the startup sequence. So something is missing.

Your dashboard says the node is offline. So maybe something is wrong with the identity, the mountpoint or the databases.

nerdatwork · September 28, 2020, 11:21am

Can you check that you aren’t using curly quotes?
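
A shell treats typographic quotes as ordinary characters, not as quoting syntax, so they end up inside the values passed to the container. A minimal sketch for spotting them in a saved command string (the function name `find_curly_quotes` is just an illustration):

```python
# Typographic single and double quotes that often sneak in via copy and paste.
CURLY_QUOTES = {"\u2018", "\u2019", "\u201c", "\u201d"}


def find_curly_quotes(command):
    """Return (index, character) pairs for every curly quote in the command."""
    return [(i, ch) for i, ch in enumerate(command) if ch in CURLY_QUOTES]
```

An empty list means the command contains only straight quotes. Any hits give the character positions to fix.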

joesmoe · September 28, 2020, 11:23am

They are not curly quotes, the forum did that for me.

I have many nodes running on the same machine (1 hdd per node per TOS). They all use -very- similar commands.

joesmoe · September 28, 2020, 11:23am

I truly believe this to be the full log. I restarted the node AFTER I took the dump of the log. I will restart the node and dump the log again. Ignore the logging stuff; it’s sending all my nodes’ logs to one centralized SQL database for simplicity’s sake. The logging system is in fact working fine.

joesmoe · September 28, 2020, 11:26am

Okay, I ran docker stop -t 300, then docker rm.

Then ran the same docker command as listed here before.

Then I dumped the logs again and used a different pastebin, in case that was messing up the logs.

BrightSilence · September 28, 2020, 11:26am

Your node shows an external address without an IP or hostname. Which means it’s for some reason not getting the parameters from the run command. I suggest copying the run command from the documentation again and filling in the values. Keep it as a multiline command in a shell script so you can easily start the node again should the need arise.
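
To compare many nodes at once, the same check can be scripted. This is a minimal sketch, assuming the external address has already been extracted as a plain `host:port` string. The log line format is not shown in this thread, so the function name and its input are hypothetical.

```python
def external_address_has_host(address):
    """Return True if a 'host:port' string has a non-empty host and a numeric port."""
    # rpartition splits on the last colon, so the port is always the final field.
    host, sep, port = address.strip().rpartition(":")
    return bool(sep) and bool(host) and port.isdigit()
```

A value such as `:28967` fails the check because the host part is empty, which matches the symptom described here.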

joesmoe · September 28, 2020, 11:26am

I don’t use the multiline, and it can listen on all IPs on that external port and forward it to the internal port 28967.

joesmoe · September 28, 2020, 11:27am

I see what you mean. On working nodes, it’s actually showing the full external IP. On this node it is not.

Let me figure out why.

joesmoe · September 28, 2020, 11:28am

Also, of around 16 nodes I set up for new hard drives a few days ago, I’m seeing this error on 3 of them. All the docker run commands are identical except for 6 unique values (mainly IP, ports, identity and the storage files).

So why do some of them error and not the others? That’s the part I’m confused about.

BrightSilence · September 28, 2020, 11:29am

Your timestamps are in completely random order and the restart is not part of this log. I’m not sure what you’re doing to get these logs, but these seem like just random log lines in random order and they don’t include the restart.

I noticed, I’m saying please do use the multiline. The closer you stick to the documentation, the less chance of an error sneaking in.

Additionally, did you make any changes to the config.yaml?

joesmoe · September 28, 2020, 11:29am

It seems like it’s caching this value in the config.yaml even after I docker rm and docker image prune -a or whatever.

joesmoe · September 28, 2020, 11:30am

Never changed config.yaml.

I use Graylog to aggregate the logs. Let me figure out why it’s spitting out stuff in completely random order; that’s not how I see it in the web interface.
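
Until the export order is fixed, the dump can be sorted before posting. This is a minimal sketch, assuming every line starts with an ISO 8601 UTC timestamp such as `2020-09-28T11:26:01.123Z`. That format is an assumption, as are the helper names, so adjust `TIMESTAMP_FORMAT` to whatever the export actually produces.

```python
from datetime import datetime

# Assumed timestamp layout at the start of each log line (not confirmed here).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def line_timestamp(line):
    """Parse the leading whitespace-delimited token of a log line as a datetime."""
    token = line.split(maxsplit=1)[0]
    return datetime.strptime(token, TIMESTAMP_FORMAT)


def is_chronological(lines):
    """Return True if the lines are already in non-decreasing time order."""
    stamps = [line_timestamp(line) for line in lines]
    return all(a <= b for a, b in zip(stamps, stamps[1:]))


def sort_chronologically(lines):
    """Return the lines sorted by their leading timestamp."""
    return sorted(lines, key=line_timestamp)
```

Blank lines, or lines without a timestamp, need to be filtered out before calling these helpers, since parsing them raises an error.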