Issues with US Servers

Hello everyone,

Am I the only one experiencing further downtime, and only on the US servers, now that the downtime issues of April 14th have been resolved?

I experienced another downtime period between yesterday and today and lost about 2% of the uptime rate across all my nodes.

I can’t see any additional report on status.storj.io after April 14th, so I thought I’d check whether the issue is affecting everyone or not.

I’m not seeing any issues. What are you using for your DNS? Is it possible you’re having an issue similar to what people are experiencing with DuckDNS?


I actually use No-IP; however, the downtime has been experienced only with the US servers. For the rest of the servers, all the nodes have 100% uptime.

I had the same issue, but just for one of my nodes; now it’s slowly coming back to 100%. I don’t know what it was.

You may use these scripts to figure out when your node was not available for the satellite:

You can then check your router/firewall logs for those dates to see if connections were blocked and possibly why.

You may also have messages like “ping satellite failed” in the node’s logs. In the latter case, please check that the DDNS updater is configured on your router (usually in the DDNS section) instead of via the app on your PC.
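For example, assuming a container named storagenode (the node writes its log to stderr, hence the redirect):

docker logs storagenode 2>&1 | grep -i "ping satellite failed"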


Node 1

for item in `curl -sL http://localhost:14002/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14002/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[] | select(.totalCount != .onlineCount)]}'
done

Returns the following:

{
  "id": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE",
  "auditHistory": []
}
{
  "id": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
  "auditHistory": []
}
{
  "id": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
  "auditHistory": [
    {
      "windowStart": "2025-04-13T12:00:00Z",
      "totalCount": 8,
      "onlineCount": 7
    },
    {
      "windowStart": "2025-04-19T00:00:00Z",
      "totalCount": 12,
      "onlineCount": 11
    },
    {
      "windowStart": "2025-04-21T00:00:00Z",
      "totalCount": 11,
      "onlineCount": 10
    }
  ]
}
{
  "id": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs",
  "auditHistory": []
}

Node 2

for item in `curl -sL http://localhost:14003/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14003/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[] | select(.totalCount != .onlineCount)]}'
done

Returns the following:

jq: error (at <stdin>:1): Cannot iterate over null (null)
{
  "id": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
  "auditHistory": []
}
{
  "id": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
  "auditHistory": [
    {
      "windowStart": "2025-04-21T00:00:00Z",
      "totalCount": 7,
      "onlineCount": 6
    }
  ]
}
jq: error (at <stdin>:1): Cannot iterate over null (null)
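The jq errors above (and in the outputs below) presumably appear because .auditHistory.windows is null for that satellite in the API response. A variant of the same command using jq’s optional iterator ([]?) skips those entries instead of erroring; a sketch for Node 2, reusing the same port 14003:

for item in `curl -sL http://localhost:14003/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14003/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[]? | select(.totalCount != .onlineCount)]}'
done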

Node 3

for item in `curl -sL http://localhost:14004/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14004/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[] | select(.totalCount != .onlineCount)]}'
done

Returns the following:

jq: error (at <stdin>:1): Cannot iterate over null (null)
{
  "id": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
  "auditHistory": []
}
{
  "id": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
  "auditHistory": [
    {
      "windowStart": "2025-04-13T12:00:00Z",
      "totalCount": 5,
      "onlineCount": 4
    }
  ]
}
{
  "id": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs",
  "auditHistory": []
}

Node 4

for item in `curl -sL http://localhost:14005/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14005/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[] | select(.totalCount != .onlineCount)]}'
done

Returns the following:

{
  "id": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE",
  "auditHistory": []
}
{
  "id": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
  "auditHistory": []
}
{
  "id": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
  "auditHistory": [
    {
      "windowStart": "2025-04-13T12:00:00Z",
      "totalCount": 4,
      "onlineCount": 3
    },
    {
      "windowStart": "2025-04-19T00:00:00Z",
      "totalCount": 9,
      "onlineCount": 8
    },
    {
      "windowStart": "2025-04-21T00:00:00Z",
      "totalCount": 12,
      "onlineCount": 11
    },
    {
      "windowStart": "2025-04-22T00:00:00Z",
      "totalCount": 4,
      "onlineCount": 3
    }
  ]
}
{
  "id": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs",
  "auditHistory": []
}

Node 5

for item in `curl -sL http://localhost:14006/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14006/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[] | select(.totalCount != .onlineCount)]}'
done

Returns the following:

jq: error (at <stdin>:1): Cannot iterate over null (null)
{
  "id": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
  "auditHistory": []
}
{
  "id": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
  "auditHistory": [
    {
      "windowStart": "2025-04-19T00:00:00Z",
      "totalCount": 9,
      "onlineCount": 8
    }
  ]
}
{
  "id": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs",
  "auditHistory": []
}

Node 6

for item in `curl -sL http://localhost:14007/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14007/api/sno/satellite/$item | \
    jq '{id: .id, auditHistory: [.auditHistory.windows[] | select(.totalCount != .onlineCount)]}'
done

Returns the following:

{
  "id": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE",
  "auditHistory": []
}
{
  "id": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
  "auditHistory": []
}
{
  "id": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
  "auditHistory": [
    {
      "windowStart": "2025-04-19T00:00:00Z",
      "totalCount": 6,
      "onlineCount": 5
    }
  ]
}
{
  "id": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs",
  "auditHistory": []
}

You can then check your router/firewall logs for those dates to see if connections were blocked and possibly why.

I tried sudo grep "Lun 21 00:00:00" /var/log/ufw.log and sudo grep "2025-04-21 00:00:00" /var/log/ufw.log, as well as sudo grep "Lun 21 00:00" /var/log/ufw.log and sudo grep "2025-04-21 00:00" /var/log/ufw.log, but no logs were printed.

So I tried a more specific command, such as sudo grep "Lun 21 00:" /var/log/ufw.log | awk '{if ($1 == "Lun" && $2 == "21" && $3 >= "00:00" && $3 <= "00:15") print $0}', but again without success.
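For reference, entries in /var/log/ufw.log normally start with a syslog-style timestamp such as "Apr 21 00:03:12 hostname kernel: [UFW BLOCK] ..." (English month abbreviation, no weekday), assuming the default rsyslog setup, so a pattern along these lines may match better and can also cover rotated logs:

sudo grep "Apr 21 00:0" /var/log/ufw.log                  # 00:00 to 00:09 on April 21
sudo zgrep "Apr 21" /var/log/ufw.log* | grep "UFW BLOCK"  # include rotated logs as well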

You may also have messages like “ping satellite failed” in the node’s logs. In the latter case, please check that the DDNS updater is configured on your router (usually in the DDNS section) instead of via the app on your PC.

I tried sudo docker logs <Node Name 1-2-3-4-5-6> | grep "Ping satellite failed", but again no logs were printed.

Are we sure that the issue isn’t directly related to the servers themselves?
All the other servers have never experienced any downtime, and all the nodes still have 100% uptime.




PS: Also, does anyone know why my 3rd node was already updated to 1.126.2 more than 3 days ago, while the other nodes are still running 1.125.2?
Watchtower is running correctly and apparently working.

The error is “ping satellite failed”, in lowercase; grep is case-sensitive, so match the exact case or use grep -i.

Yes, otherwise there would be a registered issue on status.storj.io.

It seems something is periodically blocking connections from the auditors/repair workers of the us1 satellite. It looks very similar to Suricata or similar DDoS-protection software.

It’s also interesting to see that the script did check the other satellites as well, but there are no records for them.
Thus I can assume that your nodes are new (younger than a month); they have had only a small number of audits and store only a small amount of data, so the online score will be very sensitive until they become older.
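As a rough illustration of that sensitivity (a simplification, not the exact scoring formula): with only 8 audits in a window, a single missed audit already drops that window’s online ratio to 7/8 ≈ 87.5%, whereas with hundreds of audits per window it would barely move. The per-window ratios can be printed with the same dashboard API as above (port 14002 assumed):

for item in `curl -sL http://localhost:14002/api/sno | jq '.satellites[].id' -r`; do
    curl -s http://localhost:14002/api/sno/satellite/$item | \
    jq '{id: .id, windows: [.auditHistory.windows[]? | {windowStart, ratio: (.onlineCount / .totalCount)}]}'
done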

I tried again with sudo docker logs <Node Name 1-2-3-4-5-6> | grep "ping satellite failed" (with the case fixed), but same result: only empty output.

So I checked the Suricata rules and found that sid 2058007 and a few others were unintentionally blocking the Storj service, flagging it as a false positive.
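As an aside, if the ruleset is managed with suricata-update, an alternative to deleting rules by hand is to list the offending sids in /etc/suricata/disable.conf and regenerate the ruleset (paths here assume the default layout):

echo 2058007 | sudo tee -a /etc/suricata/disable.conf
sudo suricata-update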

I deleted the aforementioned rules and added the rules below to a new /etc/suricata/rules/local.rules file:

# pass (rather than drop) tells Suricata to allow the matched traffic and skip further inspection
pass dns $HOME_NET any -> any any (msg:"Whitelist Storj DNS"; dns.query; content:"storj.io"; nocase; sid:1000001; rev:1;)
pass tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"Whitelist Storj TLS"; tls.sni; content:"storj.io"; nocase; sid:1000002; rev:1;)
# ignore traffic to and from the node ports in either direction
pass ip any any -> any [28967:28972] (msg:"Ignoring custom service ports 28967-28972 (TCP/UDP)"; sid:1000003; rev:1;)
pass ip any [28967:28972] -> any any (msg:"Ignoring custom service ports 28967-28972 (TCP/UDP)"; sid:1000004; rev:1;)
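Before restarting, the rules can be validated with Suricata’s configuration test mode (assuming the default config path and that local.rules has been added to the rule-files list in suricata.yaml):

sudo suricata -T -c /etc/suricata/suricata.yaml -v
sudo systemctl restart suricata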

Restarted Suricata and voilà, no more false-positive logs. Let’s hope the issue won’t happen again.
I’ll keep you updated, thanks @Alexey!


You are welcome!
I would suggest disabling it for the storagenode completely, unless you want your node to be seen as offline or even get disqualified.
It will also likely reduce your payout significantly, because customers (basically, any random address and port with a random access pattern) will be blocked from reaching their data.

Thanks for the tip, Alexey. I uninstalled Suricata completely and will stick with Crowdsec instead :handshake:

By the way, is there any update regarding the delayed v1.126.2 for the rest of my nodes?
Is anyone experiencing the same issue? (Only 1 out of 6 nodes has been updated to 1.126.2, more than 5 days ago; the rest of them are still on v1.125.2.)

Uh… the problem is not only in Suricata. The problem is with any such blocker: you cannot easily configure it to support the P2P activity produced by the node.
I would suggest removing any such tools, or adding the node (or its host) to a full exceptions list; otherwise your node is doomed. A usual firewall alone is enough.

There are no issues; your nodes will be updated when they are allowed to. We do not want to shut down the whole network when a new version becomes available; nodes update when their NodeID falls within the current rollout cursor on https://version.storj.io. So please, just wait.
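For illustration, the rollout state can be inspected directly; the exact JSON shape is an assumption here, but something along these lines shows the current storagenode rollout cursor:

curl -s https://version.storj.io | jq '.processes.storagenode.rollout'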
