My Storage Node is still trying to contact the 118U (Stefan B) satellite

Sasha · October 4, 2020, 7:33am

This is filling up the logs unnecessarily.

Alexey · October 4, 2020, 9:06am

It’s shut down. Those errors should be gone once that satellite is removed from the trusted list.
This change requires a fix in the storagenode’s code.
I have no ETA when it will be fixed.
You can ignore those errors for now.

Toyoo · October 4, 2020, 10:51am

I hope the satellite will be removed from the trusted list soon. As it is now, it teaches SNOs a bad habit of ignoring ERRORs.

deathlessdd · October 4, 2020, 1:22pm

This is just an opinion, but we have been ignoring certain errors since we started as SNOs. There are worse errors you do need to worry about, though, like audit errors and missing-file errors. A satellite being offline after you have been told it’s shut down shouldn’t worry you.

SGC · October 4, 2020, 6:38pm

i like how you think… still doesn’t keep my color log from flashing red once in a while…
but been learning to ignore that too… lol

Toyoo · October 4, 2020, 7:53pm

Yeah. Marking log entries as ERROR is now useless, because a lot of entries are not actually errors.

stefanbenten · October 4, 2020, 8:58pm

There is no way to make it not an error besides making an exception in the code. That sounds like a worse option, likely to cause issues in the future, than waiting for the next release, which should contain the fix.
After that is live, we can safely untrust.

Sasha · October 5, 2020, 2:29am

If you check the timestamps in the screenshot, this error appears in the log 3 to 4 times a minute. That is a lot of errors in the log file.

@Alexey, hopefully this is addressed quickly.
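
For scale, the 3 to 4 entries a minute mentioned above can be extrapolated to a full day. This is only back-of-the-envelope arithmetic and assumes the rate stays constant around the clock; `entries_per_day` is a name made up for the illustration.

```python
def entries_per_day(per_minute):
    """Extrapolate a constant per-minute rate to 24 hours."""
    return per_minute * 60 * 24


low, high = entries_per_day(3), entries_per_day(4)
print(low, high)  # 4320 5760
```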

stefanbenten · October 5, 2020, 6:26am

@Sasha do not get me wrong, I operate nodes myself and find the error message not great. There is simply a balance to keep in mind between “we need to fix this immediately” and “it’s OK to roll it out with the next release”.

If I am not mistaken, the release started today, so by the end of the week your node might have the fix.

SGC · October 5, 2020, 7:20am

i agree… it’s certainly not a real issue, but it doesn’t look pretty, nor am i convinced it would be considered an error for the node…

if seen from a node’s perspective it would make it question the internet connection maybe… ofc that looks fine, but still it would be mechanisms like that which would eventually be able to error correct and problem solve for all the minor rookie issues… like say somebody set a wrong ip in the run command, but “127.0.0.1” works for the system, so it just keeps using that until the configuration is actually good enough that it will work without reverting to hard defaults

not exactly problem solving, but for a rookie it might be…

Toyoo · October 5, 2020, 1:09pm

On the contrary, this is an issue to any serious sysadmin. Clarity of error messages is a reliability issue, because it should be as easy as possible to set up alerting with no false alerts and no missed alerts. Usually it’s as simple as considering all messages above certain logging level as alerts. This is unfortunately not the case with storage nodes.
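
A minimal sketch of that kind of level-based alerting, including the ignore list that becomes necessary once ERROR entries are not all real errors. It assumes a tab-separated line of the form timestamp, level, message, which is a hypothetical layout and not confirmed for storagenode logs. The names `LEVELS`, `KNOWN_NOISE`, `parse_line` and `alerts` are invented for the illustration; the satellite ID is the one quoted in this thread.

```python
# Hypothetical layout: "<timestamp>\t<LEVEL>\t<message>"; adjust to the real log format.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40, "FATAL": 50}

# Messages mentioning the shut-down satellite are treated as known noise.
KNOWN_NOISE = ("118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW",)


def parse_line(line):
    """Split a line into (timestamp, level, message); return None if it does not match."""
    parts = line.rstrip("\n").split("\t", 2)
    if len(parts) != 3 or parts[1] not in LEVELS:
        return None
    return parts[0], parts[1], parts[2]


def alerts(lines, threshold="ERROR", ignore=KNOWN_NOISE):
    """Yield lines at or above the threshold, skipping messages containing an ignored substring."""
    floor = LEVELS[threshold]
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, level, message = parsed
        if LEVELS[level] < floor:
            continue
        if any(token in message for token in ignore):
            continue
        yield line.rstrip("\n")
```

With `ignore=()` this is the simple "everything at ERROR or above is an alert" rule; the ignore list is the extra special-casing an operator has to maintain when the log level alone cannot be trusted.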

SGC · October 5, 2020, 5:36pm

i fully agree with that…

however i also understand that the storj team will have a list of priorities, and sorting out what is essentially spelling or wording in log files might not be very high on that list…

but it’s open source… you could fix it and submit it

Toyoo · October 7, 2020, 7:11pm

Sure. Though, I’d consider this a regression. In the place I work, regressions are usually high-priority bugs.

Oh, indeed, I considered that. Go is scary though ^^ I understand it enough to read the code, but the syntax feels so foreign…

stefanbenten · October 8, 2020, 9:15pm

Here is the PR that makes the error message go away

felixbrucker · October 8, 2020, 9:52pm

Ah, this is the reason why the dashboard api stopped working with a 500 error "storageNode console web error: storage node dashboard service error: trust: satellite \"118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW\" is untrusted"

Will there be an update for the docker image of storagenodes, so that the web UI works again?

stefanbenten · October 8, 2020, 10:07pm

The above PR was just merged. All versions from v1.14.1 on will have the fix included and should work.
As one can see via https://version.storj.io, the rollout has not yet finished (the cursor is not at the end), and thus docker nodes might not be updated yet.

felixbrucker · October 9, 2020, 5:39am

Docker images need to be tagged and do not use the cursor to my knowledge, thus none of the docker nodes were using a version with the fix. I see the commit got reverted since: https://github.com/storj/tardigrade.io/pull/169

SGC · October 9, 2020, 6:24am

docker nodes are like 1 week behind windows nodes in updates, to ensure the entire network isn’t shut down by mistake or update

so updating your docker images now would have put you back at an unmodified v1.13.3 image

you can keep up with the docker releases here…

stefanbenten · October 9, 2020, 6:27am

The docker images get updated at the end of the rollout.
We would have liked to keep the Satellite removed from the list, but a non-working dashboard has higher priority than verbose error logs.

Sasha · November 15, 2020, 6:24am

Just a reminder: I am still getting this error, even though 118U is not visible on the dashboard UI.

Running 1.16.1.