QUIC Misconfigured in v1.67.1

I guess now is a good time to explain how the QUIC configuration check works starting from the release of v1.67.1.

We reimplemented the QUIC check in this commit: storagenode: overhaul QUIC check implementation · storj/storj@59b37db · GitHub.
The commit message explains, at a high level, what was wrong and what we did to fix it.

The current implementation blocks the startup until one or none
of the trusted satellites is able to reach the node via QUIC.
This can cause delayed startup. Also, the QUIC check is done
once during startup, and if there is a misconfiguration later,
SNOs would have to restart the node.

In this change, we reuse the contact service which pings the satellite
periodically for node checkin. During checkin the satellite tries
pinging the node back via both TCP and QUIC and reports both statuses.
With this, we are able to get a periodic update of the QUIC status
without restarting the node.
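
To make the "periodic update without restarting" part more concrete, here is a minimal Go sketch of the idea. It is not the storagenode code: `PingBack`, `QUICTracker`, `Update` and `errCheckinFailed` are hypothetical names made up for illustration, and treating a failed check-in as "QUIC not OK" until the next attempt is an assumption of this sketch. Each simulated check-in simply overwrites the last known QUIC result.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// PingBack is a hypothetical summary of what the satellite reports after
// trying to reach the node during one check-in.
type PingBack struct {
	TCPOK  bool
	QUICOK bool
}

// errCheckinFailed is a hypothetical error for a check-in that did not complete.
var errCheckinFailed = errors.New("checkin failed")

// QUICTracker keeps only the most recent QUIC result, so the status can
// change between check-ins while the node keeps running.
type QUICTracker struct {
	mu     sync.Mutex
	quicOK bool
}

// Update runs one check-in and records whether QUIC was reachable.
// Assumption: a failed check-in counts as "not OK" until a later one succeeds.
func (t *QUICTracker) Update(checkin func() (PingBack, error)) {
	result, err := checkin()
	t.mu.Lock()
	defer t.mu.Unlock()
	t.quicOK = err == nil && result.QUICOK
}

// QUICOK returns the result of the latest check-in.
func (t *QUICTracker) QUICOK() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.quicOK
}

func main() {
	tracker := &QUICTracker{}

	// Simulated check-in cycles: UDP blocked, check-in error, then fixed.
	cycles := []func() (PingBack, error){
		func() (PingBack, error) { return PingBack{TCPOK: true, QUICOK: false}, nil },
		func() (PingBack, error) { return PingBack{}, errCheckinFailed },
		func() (PingBack, error) { return PingBack{TCPOK: true, QUICOK: true}, nil },
	}
	for i, checkin := range cycles {
		tracker.Update(checkin)
		fmt.Printf("cycle %d: QUIC OK = %v\n", i+1, tracker.QUICOK())
	}
}
```

In a running node something like a timer would presumably drive `Update`; the sketch calls it in a plain loop so the behavior is easy to follow.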

To be clear, it makes sense that a hard refresh changes the QUIC status. Unlike before this change, you don’t need to restart the node for the status to update. We are reusing the contact service, which periodically sends a check-in request to the satellite. The satellite then tries to ping the node back via both TCP and QUIC. If any of the satellites fails to ping the node back using QUIC, the dashboard will show the misconfigured status message. That message can mean one of three things: your node was offline, the check-in failed and will be retried, or the UDP service port is actually misconfigured.
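
The "any satellite fails" rule can be sketched in a few lines of Go. This is only an illustration under assumptions: `SatelliteResult`, `DashboardQUICStatus`, the status strings and the satellite names `sat-a` and `sat-b` are hypothetical, not taken from the storagenode code. It also assumes that a failed check-in is treated the same way as a failed QUIC ping-back.

```go
package main

import "fmt"

// SatelliteResult is a hypothetical record of the latest check-in with one satellite.
type SatelliteResult struct {
	Satellite string
	CheckinOK bool // false if the check-in itself failed (for example, node offline)
	QUICOK    bool // true if the satellite reached the node over QUIC
}

// Hypothetical status labels; the real dashboard wording may differ.
const (
	StatusOK            = "OK"
	StatusMisconfigured = "Misconfigured"
)

// DashboardQUICStatus reports Misconfigured as soon as one satellite could
// not confirm QUIC, whatever the underlying reason was.
// An empty slice yields StatusOK here; what the real dashboard shows before
// the first check-in is not covered by this sketch.
func DashboardQUICStatus(results []SatelliteResult) string {
	for _, r := range results {
		if !r.CheckinOK || !r.QUICOK {
			return StatusMisconfigured
		}
	}
	return StatusOK
}

func main() {
	results := []SatelliteResult{
		{Satellite: "sat-a", CheckinOK: true, QUICOK: true},
		{Satellite: "sat-b", CheckinOK: true, QUICOK: false},
	}
	fmt.Println(DashboardQUICStatus(results)) // prints "Misconfigured"
}
```

Note that the status alone cannot tell you which of the three causes applies, which is why a single failed ping-back is not proof that the UDP port is misconfigured.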
