Troubleshooting a node

You say your forwards are correct, but did you actually verify this using https://www.yougetsignal.com/tools/open-ports/?port=28967 ?
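If you would rather script the check than use the website, here is a minimal sketch of the same idea: try to open a TCP connection to the node's port. The function name `port_is_open` and the 3-second timeout are my own choices, not anything from the Storj tooling. Note that this only tells you something about the port forward if you run it from a machine outside your own network.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `port_is_open("node.example.com", 28967)` (with a placeholder hostname) should return True when the forward works and the node is listening.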

If so, please restart the node, wait 30 seconds, and post the logs since the restart. Those should hopefully contain a little more useful information.

I don’t know. The Pastebin log does not look complete.
Maybe post the log from node startup. You can skip anything concerning satellite “118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW”.

For how long did you not receive data from the other satellites?

Ok. So could you give your node version? Or a screenshot of the CLI dashboard?

Yes, I have tried with yougetsignal. I have also stopped, rm’d, and then restarted the node multiple times over the past few days (maybe 3 or 4 times in total).

The node has been online for approximately 3 days.

The log is complete.

Sure, here is the CLI dashboard.

[Screenshot of the CLI dashboard: Screen Shot 2020-09-28 at 1.19.11 PM]

The node version is the latest that I can easily pull from Docker using the :latest tag.

The dashboard says your node is offline.

As suggested by others, could you restart your node right now (even if you already did) and get the logs again?

The log does not show the startup sequence. So something is missing.

Your dashboard says the node is offline. So maybe something is wrong with the identity, the mountpoint, or the databases.

Can you check that you aren’t using curly quotes?
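A quick way to rule this out is to scan the saved run command for typographic quotes, which a shell does not treat as quoting characters. This is just a rough sketch; the helper names `find_curly_quotes` and `straighten_quotes` are made up for illustration.

```python
# Typographic quotes mapped to the plain ASCII quotes a shell understands.
CURLY_QUOTES = {
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
}


def find_curly_quotes(command: str) -> list:
    """Return (index, character) pairs for every curly quote in the command."""
    return [(i, ch) for i, ch in enumerate(command) if ch in CURLY_QUOTES]


def straighten_quotes(command: str) -> str:
    """Return the command with curly quotes replaced by straight ones."""
    return command.translate(str.maketrans(CURLY_QUOTES))
```

If `find_curly_quotes` returns an empty list for the exact text you paste into the terminal, curly quotes are not the problem.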

They are not curly quotes, the forum did that for me.

I have many nodes running on the same machine (1 HDD per node, per the ToS). They all use -very- similar commands.

I truly believe this to be the full log. I restarted the node AFTER I took the dump of the log. I will restart the node and dump the log again. Ignore the logging stuff; it’s sending all my nodes’ logs to one centralized SQL database for simplicity’s sake. The logging system is in fact working fine.

Okay, I ran docker stop -t 300
then docker rm

Then ran the same docker command as listed here before.

Then I dumped the logs again, and used a different pastebin in case that was messing up the logs.

Your node shows an external address without an IP or hostname, which means that for some reason it’s not getting the parameters from the run command. I suggest copying the run command from the documentation again and filling in the values. Keep it as a multiline command in a shell script so you can easily start the node again should the need arise.
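To catch this before starting a node, you could sanity-check the address value you are about to pass in. The sketch below assumes the external address has the form host:port (IPv4 or hostname only, no IPv6 handling); the function name `check_external_address` is hypothetical.

```python
def check_external_address(address: str) -> list:
    """Return a list of problems found in a host:port address; empty if none."""
    problems = []
    # Split on the last colon so the port is whatever follows it.
    host, sep, port = address.strip().rpartition(":")
    if not sep:
        return ["missing ':port' suffix"]
    if not host:
        # This is the symptom described above: a port with nothing in front of it.
        problems.append("missing IP or hostname before the port")
    if not port.isdecimal() or not 1 <= int(port) <= 65535:
        problems.append("port is not a number between 1 and 65535")
    return problems
```

An address like ":28967" would be flagged as missing its IP or hostname, which matches what the dashboard of the broken node shows.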

I don’t use the multiline, and it can listen on all IPs on that external port and forward it to the internal port 28967.

I see what you mean. On working nodes, it’s actually showing the full external IP. On this node it is not.

Let me figure out why.

Also, of around 16 nodes I set up for new hard drives a few days ago, I’m seeing this error on 3 of them. All the docker run commands are identical except for 6 unique values (mainly IP, ports, identity, and the storage files).

So why do some of them error and not the others? That’s the part I’m confused about.

Your timestamps are in completely random order and the restart is not part of this log. I’m not sure what you’re doing to get these logs, but these seem like just random log lines in random order, and they don’t include the restart.
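If the export can't be fixed at the source, the lines can be put back in order before posting. This is a minimal sketch that assumes every log line starts with an ISO 8601 timestamp (something like 2020-09-28T17:19:11.123Z) followed by whitespace; the function names are made up, and lines without a parseable timestamp are returned separately rather than guessed at.

```python
from datetime import datetime, timezone


def parse_timestamp(line: str):
    """Parse the leading ISO 8601 timestamp of a log line, or return None."""
    first = line.split(maxsplit=1)[0] if line.strip() else ""
    if first.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        first = first[:-1] + "+00:00"
    try:
        ts = datetime.fromisoformat(first)
    except ValueError:
        return None
    # Treat timestamps without an offset as UTC so all values are comparable.
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def sort_log_lines(lines):
    """Return (lines sorted by timestamp, lines with no parseable timestamp)."""
    stamped = [(parse_timestamp(line), line) for line in lines]
    dated = [(ts, line) for ts, line in stamped if ts is not None]
    undated = [line for ts, line in stamped if ts is None]
    dated.sort(key=lambda pair: pair[0])  # stable sort keeps ties in input order
    return [line for _, line in dated], undated
```

After sorting, the first lines following the restart time should be the startup sequence; if they are still missing, the export is dropping lines and not just shuffling them.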

I noticed. I’m saying, please do use the multiline version. The closer you stick to the documentation, the less chance of an error sneaking in.

Additionally, did you make any changes to the config.yaml?

It seems like it’s caching this value in the config.yaml even after I docker rm and docker image prune -a or whatever.

Never changed config.yaml.

I use Graylog to aggregate the logs. Let me figure out why it’s spitting out stuff in completely random order; that’s not how I see it in the web interface.