Sudden spike in suspended Nodes

Hi,

I wanted to post a quick follow-up to explain the full extent of the problem and how it was fixed:

Short Version:
DNS failed to respond correctly at least once, and the router's DoS protection kicked in.
The Asus DDNS service was unstable, which caused occasional additional problems, so I had to switch to No-IP.

Longer Version:
Two problems could be identified:

  1. The log showed that DNS did not answer on at least one occasion.
  2. The router itself had DoS protection enabled.

It seems that the second problem was the main cause of the missed online checks. The access pattern of Storj, in conjunction with the other services on my network, caused my router to hold back packets and answer only with extremely high delays. UptimeRobot pings sometimes took 15 seconds. The delay could get so large that the online checks failed.

The problem was hard to identify because my router (Asus AX56U) has two settings related to DoS protection. The first is an AI DoS protection, which I had identified and disabled earlier. The second setting is separate and located directly in the firewall tab. At the same time, the router was not overloaded: CPU usage was 1-5% and RAM also had spare capacity. As such, I didn’t initially blame the router itself. After I disabled the second DoS protection setting, the UptimeRobot delays dropped immediately and are now at 100-200 ms instead of 5000-15000 ms. Access to other services like Nextcloud also improved drastically.
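
For anyone who wants to check for this kind of delay without an external monitor, here is a minimal Python sketch that times a TCP connection to a service. It is only a rough stand-in and not how UptimeRobot measures; the function names, the host and port in the usage comment, and the 1000 ms threshold are all assumptions chosen for illustration.

```python
import socket
import time


def connect_latency_ms(host, port, timeout_s=20.0):
    """Return the TCP connect time in milliseconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections and name resolution errors.
        return None


def is_slow(latency_ms, threshold_ms=1000.0):
    """Treat a failed connection or one above the (assumed) threshold as slow."""
    return latency_ms is None or latency_ms > threshold_ms


# Example usage (hypothetical host and port, replace with your own):
# latency = connect_latency_ms("my-node.example.net", 28967)
# print(latency, is_slow(latency))
```

Run from outside the home network, readings in the range of seconds rather than 100-200 ms would point to the kind of throttling described above.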

After this change I haven’t missed any uptime checks, and log entries related to “Service Ping Satellite Failed” no longer appear. As such, I will not switch my DNS provider (Asus) immediately. Instead I will run both No-IP and Asus in parallel and monitor the DNS queries in a constant loop, looking for problems (a PowerShell script that does an nslookup on both the No-IP and the Asus DDNS hostname every few seconds and logs failures). If No-IP seems more stable than Asus, I will update this post and switch to it. Otherwise I will stay with Asus.
Edit: Asus ran stable for 17 days and did not miss an online check, but today Asus just died for around 2 hours while No-IP kept running stable. So I have now switched all my home server stuff, including Storj, to No-IP.
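
A DNS monitoring loop of this kind can be sketched roughly as follows. This is written in Python rather than PowerShell and is not the script that was actually used; the two hostnames are hypothetical placeholders, and the function names, log file name and 5 second interval are assumptions. Note that `socket.gethostbyname` goes through the operating system's resolver, so a local DNS cache may hide short outages that a direct nslookup would catch.

```python
import socket
import time
from datetime import datetime, timezone

# Hypothetical placeholders, substitute your own DDNS hostnames.
HOSTNAMES = ["example.asuscomm.com", "example.ddns.net"]


def check_host(hostname, resolver=socket.gethostbyname):
    """Return (True, ip) if the name resolves, otherwise (False, error text)."""
    try:
        return True, resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, str(exc)


def monitor(hostnames, interval_s=5.0, rounds=None, log_path="dns_failures.log",
            resolver=socket.gethostbyname, sleep=time.sleep):
    """Resolve every hostname once per round and append failures to a log file.

    Runs forever when rounds is None. Returns the number of failures seen.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        for name in hostnames:
            ok, detail = check_host(name, resolver)
            if not ok:
                failures += 1
                stamp = datetime.now(timezone.utc).isoformat()
                with open(log_path, "a", encoding="utf-8") as log:
                    log.write(f"{stamp} {name} FAILED: {detail}\n")
        done += 1
        sleep(interval_s)
    return failures


# Example usage: runs until interrupted with Ctrl+C.
# monitor(HOSTNAMES)
```

Comparing the number of logged failures per hostname over a few weeks gives the kind of side-by-side stability picture described above.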
