Of course it is stuck because the node is not running. Just check the timestamps.
These are the last log lines these nodes are currently showing.
I have the same log entries and the node is running. They tell you that: it downloads the version number (that formulation is somewhat misleading); your versions of storagenode and storagenode-updater; and that the new version is rolled out for both services but the rollout pointer hasn't reached your node ID yet. It does not tell you that the update started for your node.
And it seems that your nodes are checking for new versions only once per day. I remember the default was 1h; I set mine to 6h.
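If you want to double-check how often your own node checks, one hedged way is to grep the docker logs for the rollout message quoted later in this thread and compare the timestamps (the container name storagenode is an assumption, adjust it to yours):

# Assumption: the container is named "storagenode"; change the name to match your setup.
docker logs storagenode 2>&1 | grep "rolled out" | tail -n 5
# The gaps between the timestamps of these lines show how often the version check actually runs.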
That's great for you. These nodes are not running.
Something else is wrong then.
What does the dashboard say?
What log options do you use?
Even with log.level on fatal, these entries are still logged.
Yes and I am telling you again that these nodes are not running.
What is so hard about that to understand?
These are the ends of the current logs. No activity after that, no more log lines.
If you compare, these are basically the same messages and the same situation that OP has reported.
Next time that happens, can you see if the inbound port is still open?
Most log entries are from inbound connections asking for something: and if inbound port-forwarding stops working… my guess is you may be left with those sparse log entries (from the few automated tasks a node decides to do for itself, like looking for upgrades). So maybe when you're restarting your nodes you're also restarting a VPN connection?
I'm guessing…
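One rough way to check, run from a machine outside your network (hedged: 28967 is only the default node port and the address is a placeholder, use your real external address and port):

# Succeeds only if the port is reachable from the internet.
nc -vz your.external.address 28967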
I am sure a restart will help.
But first I must fix something else, then I will try it.
I can't suggest anything based on those log entries, because they don't show that anything is wrong. And you don't provide anything else. I don't know your setup or config parameters. I can't do the debugging for you.
Reading above I see that someone found a database lock entry. So there was a database problem.
Here, I just see normal log entries and the fact that you say the nodes don't work.
If Windows doesn't log things the way Linux does, maybe the service really gets stuck without any info to work with.
Try what others suggest: restart the machine, services, VPN, etc., and modify the parameters of the storagenode service so it retries more than once if it doesn't start. I know I had problems with the service starting when I ran a Windows node 3 years ago, and I modified the start/restart options to something like: try 5 restarts, 1 min apart.
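For reference, a sketch of how that can be set from an elevated prompt with sc.exe; the service name storagenode and the exact timings are assumptions, adjust them to your install:

# Restart the service automatically on failure: three attempts, 60 s apart,
# and reset the failure counter after one day (86400 s).
sc.exe failure storagenode reset= 86400 actions= restart/60000/restart/60000/restart/60000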
For database locks, and in general, moving the databases to an SSD or a USB 3 SSD stick helps with everything.
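A hedged sketch of the config.yaml entry I mean (storage2.database-dir is the option for relocating the databases; the path is only an example):

# config.yaml: keep the SQLite databases on a faster disk.
# Stop the node, move the existing *.db files to this folder, then start it again.
storage2.database-dir: "D:\\storagenode-dbs"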
Also, Windows has a very complex logging system. See if you get some warnings or errors there, in Computer Management (Event Viewer) I believe.
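If you prefer PowerShell over clicking through Computer Management, something like this should surface relevant entries (hedged: the filter string is just a guess at what the service writes):

# Show recent Application-log events that mention the storagenode service.
Get-WinEvent -LogName Application -MaxEvents 200 |
    Where-Object { $_.Message -match "storagenode" } |
    Format-Table TimeCreated, LevelDisplayName, Message -AutoSize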
It is exactly the situation that the OP has described: the node gets stuck, and the last things it logs are the log lines that he and I have posted.
There is nothing else after that. Docker says the container is still running, but it does not do anything.
It is not the first time this has happened and usually a restart brings the container back to a working state.
Since this problem seems to be related to auto-update, it might help to set AUTO_UPDATE to false.
How is this related to auto-update? No line says "update started" or something.
What log level do you run? Any custom level settings? I'm trying to figure out whether your log level filters out the important entries.
The last line in the log comes from the storagenode-updater and the next (missing) line would also be from the storagenode-updater, so how could this not be related to auto-update?
I don't think this is a bug in the updater. This is quite surely a stopped node. Probably somewhere earlier in the log you got a timeout error. But one way or another, the storagenode doesn't always restart. The only thing working now is the updater, which is being restarted every now and then, as set in the config.
I have had them on Linux as well. It had to do with a slow file system.
That is a great analysis of the situation: it is not the updater that causes the problem, it's rather the only thing left that is still running and still logging stuff.
Ok then, the question would be why the storagenode did not restart. But that is another story.
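One way to confirm that picture the next time it happens: list which processes are actually alive inside the container while only the updater keeps logging (the container name is an assumption):

# If only the supervisor and storagenode-updater show up here,
# the storagenode process itself has died and was not restarted.
docker top storagenode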
I stopped asking, because I got unreadable logs with a timeout error some hours before this happened. See one of my latest topics.
It probably has to do with a slow filesystem. Therefore I'm migrating to ZFS, in order to split metadata from the actual data and speed a few things up.
The question whether the node doesn't restart after a timeout… Well… I don't know. Actually I don't care either. Because if they did restart, they would be restarted at least once a day, which would also break every filewalker. So this problem needs to be solved on my side instead of Storj's side. Maybe the choice-of-best-n is going to solve some things.
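For the metadata split, the ZFS feature I mean is the special allocation class; a rough sketch with pool, dataset and device names as placeholders:

# Add a mirrored special vdev that will hold the pool's metadata (new writes only).
zpool add tank special mirror /dev/disk/by-id/ssd-1 /dev/disk/by-id/ssd-2
# Optionally also push small blocks (here: up to 64K) of the node's dataset to the SSDs.
zfs set special_small_blocks=64K tank/storagenode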
These logs don't have entries from the storagenode process at all; this usually means that you redirected the logs to a file. In that case the docker logs command will show only logs from the supervisor and the storagenode-updater, but the storagenode logs will be in the file.
So, please check your redirected log file for errors explaining why your node crashed.
However, I would also prefer to know why it's not restarted after…
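For example, if the log was redirected via log.output in config.yaml, something like this should surface the crash (the path /app/config/node.log is an assumption, use whatever you configured):

# Last lines of the redirected storagenode log, then a filter for fatal entries.
docker exec storagenode tail -n 100 /app/config/node.log
docker exec storagenode grep -E "ERROR|FATAL|Unrecoverable" /app/config/node.log | tail -n 20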
storagenode2-storagenode-1 | 2024-06-25T07:45:32Z INFO New version is being rolled out but hasn't made it to this node yet {"Process": "storagenode-updater", "Service": "storagenode-updater"}
I keep getting this error on one of my two nodes. This makes the dashboard go down and the entire service go down. I don't run watchtower as it didn't work for some reason. How can I fix this?
Do you have something in the storagenode logs? If you redirected the logs to a file, you need to check that file instead of docker logs storagenode.
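If you are not sure whether (or where) the logs were redirected, checking the log.output setting should tell you (the path assumes the standard docker setup where config.yaml is mounted at /app/config):

# "stderr" means docker logs has everything; a file path means you must read that file instead.
docker exec storagenode grep "log.output" /app/config/config.yaml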