Windows GUI node started crashing

Twice in the last few days my node has suddenly gone offline, and I found that the Windows service was stopped. Here are the relevant log entries from the last crash (I have anonymized my domain name):

2020-07-24T06:13:47.199+0200	ERROR	Invalid configuration.	{"error": "invalid contact.external-address: lookup \"my-address.no-ip.biz\" failed: lookup my-address.no-ip.biz: no such host", "errorVerbose": "invalid contact.external-address: lookup \"my-address.no-ip.biz\" failed: lookup my-address.no-ip.biz: no such host\n\tstorj.io/storj/storagenode.(*Config).Verify:149\n\tmain.cmdRun:142\n\tstorj.io/private/process.cleanup.func1.4:359\n\tstorj.io/private/process.cleanup.func1:377\n\tgithub.com/spf13/cobra.(*Command).execute:840\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:945\n\tgithub.com/spf13/cobra.(*Command).Execute:885\n\tstorj.io/private/process.ExecWithCustomConfig:88\n\tstorj.io/private/process.Exec:65\n\tmain.(*service).Execute.func1:66\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-07-24T06:13:47.200+0200	FATAL	Unrecoverable error	{"error": "invalid contact.external-address: lookup \"my-address.no-ip.biz\" failed: lookup my-address.no-ip.biz: no such host", "errorVerbose": "invalid contact.external-address: lookup \"my-address.no-ip.biz\" failed: lookup my-address.no-ip.biz: no such host\n\tstorj.io/storj/storagenode.(*Config).Verify:149\n\tmain.cmdRun:142\n\tstorj.io/private/process.cleanup.func1.4:359\n\tstorj.io/private/process.cleanup.func1:377\n\tgithub.com/spf13/cobra.(*Command).execute:840\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:945\n\tgithub.com/spf13/cobra.(*Command).Execute:885\n\tstorj.io/private/process.ExecWithCustomConfig:88\n\tstorj.io/private/process.Exec:65\n\tmain.(*service).Execute.func1:66\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

I have no idea why this happens in the first place. My first guess is that it’s just my IP changing, but with a dynamic DNS that ought to be a normal event. Perhaps something is wrong with no-ip.biz instead.

But regardless of the cause, when it does happen, having the service crash and burn does not seem like the best reaction.

The first time it happened, the node crashed while I slept, and it took 4 hours before I saw that the node was down. This time, the node crashed while I slept, and it took 3 hours before I saw it. If downtime could disqualify, I would be out of the game by now.

Check whether your No-IP hostname is working properly. The hostname needs to be reactivated every 30 days, otherwise it gets deactivated. You can get a list of DDNS providers from this thread

The node worked as soon as I manually restarted the service. I did not need to renew my no-ip address. Still, I will check out other DDNS alternatives.

Regardless of which DDNS service is used, it would be nice if the node would handle an invalid DNS address in a more graceful way than just crapping out.

If the DNS is down for one minute, the node goes down for eternity, until manually restarted (or perhaps until restarted when Windows update reboots the computer automatically). I don’t see why that is necessary.

The node could wait for the DNS to become valid again, instead of waiting for the SNO to 1) notice that the service is down and 2) manually get it running again.

You can enable automatic restart on failure in the properties of that service.


Thank you, that’s very useful to know. I have now set the service to restart after 5 minutes indefinitely. I figure that ought to do the trick.
